How to Monitor Caching Database Performance

Redis is an open source, key-value database built with an in-memory design that emphasizes speed. It has support for rich data types, atomic operations, and Lua scripting.


DigitalOcean Managed Databases include metrics visualizations so you can monitor the performance and health of your database cluster.

Cluster metrics monitor the performance of the nodes in a database cluster. Cluster metrics cover primary and standby nodes; metrics for each read-only node are displayed independently. This data can help guide capacity planning and optimization. You can also set up alerting on cluster metrics.

Database metrics monitor the performance of the database itself. This data can help assess the health of the database, pinpoint performance bottlenecks, and identify unusual use patterns that may indicate an application bug or security breach.

View Caching Metrics

To view performance metrics for a Caching database cluster, click the name of the database to go to its Overview page, then click the Insights tab.

The Insights tab of a Managed Database cluster

The Select object drop-down menu lists the cluster itself and all of the databases in the cluster. Choose the database to view its metrics.

In the Select Period drop-down menu, you can choose a time frame for the x-axis of the graphs, ranging from 1 hour to 30 days. Each line in the graphs displays about 300 data points.
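As a rough worked example of what "about 300 data points" means for resolution, the sketch below divides the selected period by 300. It assumes the points are evenly spaced, which the text above does not state, and the function name `seconds_per_point` is purely illustrative.

```shell
# Approximate spacing between data points on a graph line.
# Assumption: roughly 300 evenly spaced points per line.
seconds_per_point() {
  period_seconds=$1
  points=${2:-300}
  echo $(( period_seconds / points ))
}

seconds_per_point 3600      # 1 hour   -> 12 (seconds per point)
seconds_per_point 2592000   # 30 days  -> 8640 (seconds, or 2.4 hours, per point)
```

Under that assumption, short spikes are visible at the 1-hour setting but are likely averaged away at the 30-day setting.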

By default, the summary to the right shows the most recent metrics values. When you hover over a different time in a graph, the summary displays the values from that time instead.

Note
You may notice gaps in your metrics data caused by outages, platform maintenance, or a database failover or migration. You can check DigitalOcean’s status page for outages, review the cluster maintenance window, or visit the cluster’s Settings > Logs (or Logs & Queries) page to look for failovers and migrations.

If you recently provisioned the cluster or changed its configuration, it may take a few minutes for the metrics data to finish processing before you see it on the Insights page.

Caching Metrics Details

Caching databases have the following metrics:

  • Connection Status: the number of successful and rejected client connections

  • Connected Clients: the number of connected clients

  • Throughput: the rate of commands processed per second

  • Key Evictions: the number of keys removed by Caching due to memory constraints

  • Memory Fragmentation: the ratio of the memory usage measured by the operating system to the memory allocated by Caching

  • Cache Hit Ratio: the ratio of keyspace hits to the number of keyspace hits and misses, which is a measure of cache usage efficiency

  • Replication Status: the number of connected standby nodes

Warning
If you have 200 or more databases on a single cluster, you may be unable to retrieve their metrics. If you reach this limit, create any additional databases in a new cluster.

Connection Status

The connection status plot displays the rate of new connections being received and rejected per second.

Redis connection status plot

If the number of connected clients regularly approaches or exceeds the connection limit, or if you often see an unacceptable number of rejected connections, consider upgrading your database plan to increase your connection limit.

Connected Clients

The connected clients plot displays the number of clients currently connected to your cluster.

Redis connected clients plot

Throughput

The throughput plot displays the overall rate of all Redis operations on the main server, expressed as a moving average of operations per second.

Redis operations throughput plot

You can compare this plot with node performance metrics to identify potential resource constraints. For more insights, look at the query statistics on the Logs & Queries page.
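To illustrate what a moving average does to a throughput series, here is a minimal sketch that smooths a column of operations-per-second samples over a fixed window. The window size the Insights plot actually uses is not documented here, so the window argument and the function name `moving_average` are assumptions made for illustration.

```shell
# Simple moving average over the last N samples read from stdin (one number per line).
# The window size N is a hypothetical parameter, not a documented value.
moving_average() {
  window=$1
  awk -v w="$window" '{
    buf[NR % w] = $1                  # ring buffer holding the last w samples
    n = (NR < w) ? NR : w             # fewer than w samples at the start
    sum = 0
    for (i in buf) { sum += buf[i] }
    printf "%.1f\n", sum / n
  }'
}

# Example: four samples smoothed with a window of 2
printf '%s\n' 100 200 300 400 | moving_average 2
# prints 100.0, 150.0, 250.0, 350.0 (one per line)
```

A larger window hides short bursts, which is worth keeping in mind when comparing the plot with node-level CPU or memory spikes.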

Key Evictions

By default, the Caching key eviction policy is set to noeviction. If you set the eviction policy (on the Settings page) to something other than noeviction, Caching evicts keys when it is constrained for memory. The key evictions plot displays the number of evicted keys.

Redis key evictions plot

This metric is useful when you use Caching as a cache or want to assess the impact of key evictions on overall key retrieval efficiency. Consider increasing the memory of your Caching cluster if the number of key evictions stays well above zero.

Learn more about key eviction policies and tuning in Using Redis as an LRU cache in the Redis documentation.

Memory Fragmentation

The memory fragmentation plot displays the efficiency of memory mapping, which is defined as the ratio of memory usage measured by the operating system to memory allocated by Caching.

Caching memory fragmentation plot

When adjacent memory blocks are not available, Caching requires additional memory overhead to allocate memory across the non-contiguous blocks, so this ratio is an indication of memory mapping efficiency:

  • Ratios over 1.0 indicate that memory fragmentation is very likely.
  • Ratios under 1.0 indicate that Caching likely has insufficient memory available. Consider optimizing memory usage or upgrading to a plan with more memory.

Note
If your peak memory usage is much higher than your current memory usage, the memory fragmentation ratio may be unreliable.
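The interpretation above can be sketched as a small helper that computes the ratio from two byte counts and labels it. It assumes you take the inputs from the `used_memory_rss` and `used_memory` fields of the Redis `INFO memory` output; the function name and the label for a ratio of exactly 1.0 are illustrative, not part of the product.

```shell
# Classify a memory fragmentation ratio.
# Arguments: bytes seen by the OS (e.g. used_memory_rss), bytes allocated by Redis (e.g. used_memory).
# The label text, including "balanced" for exactly 1.0, is illustrative only.
classify_fragmentation() {
  awk -v rss="$1" -v used="$2" 'BEGIN {
    if (used <= 0) { print "n/a"; exit }
    ratio = rss / used
    if (ratio > 1.0) {
      label = "fragmentation likely"
    } else if (ratio < 1.0) {
      label = "memory likely insufficient"
    } else {
      label = "balanced"
    }
    printf "%.2f %s\n", ratio, label
  }'
}

classify_fragmentation 1500 1000   # prints: 1.50 fragmentation likely
classify_fragmentation 800 1000    # prints: 0.80 memory likely insufficient
```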

Learn more about memory allocation and fragmentation in the Redis documentation on memory optimization.

Cache Hit Ratio

The cache hit ratio plot displays the efficiency of key retrieval from the Caching cache, which is defined as the ratio of key hits to the total number of key hits and misses. A key miss occurs when the requested key has expired, has been evicted from the cache, or never existed.

Redis cache hit ratio plot

For optimal responsiveness, keep your cache hit ratio at 0.8 or higher.
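As a worked example of the definition above, the following sketch computes hits / (hits + misses) from two counters. It assumes the counters come from the `keyspace_hits` and `keyspace_misses` fields of the Redis `INFO stats` output; the function name `cache_hit_ratio` is made up for this example.

```shell
# Cache hit ratio = hits / (hits + misses).
# Prints "n/a" when there have been no lookups at all.
cache_hit_ratio() {
  awk -v h="$1" -v m="$2" 'BEGIN {
    total = h + m
    if (total == 0) { print "n/a"; exit }
    printf "%.2f\n", h / total
  }'
}

cache_hit_ratio 800 200   # prints 0.80, right at the suggested threshold
cache_hit_ratio 600 400   # prints 0.60, below the suggested threshold
```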

Replication Status

The replication status plot displays the count of connected standby nodes if replication is enabled.

Redis replication status plot

Access the Metrics Endpoint

You can also view your database cluster’s metrics programmatically via the metrics endpoint. This endpoint exposes more than twenty times as many metrics as the Insights tab in the control panel.

You can access the metrics endpoint with a cURL command or a monitoring system like Prometheus.

Get Hostname and Credentials

First, retrieve your cluster’s metrics hostname by sending a GET request to https://api.digitalocean.com/v2/databases/${UUID}. In the following example, the target database cluster has a standby node, so the response includes a second host/port pair:

curl --silent -XGET --location "https://api.digitalocean.com/v2/databases/${UUID}" --header 'Content-Type: application/json' --header "Authorization: Bearer $RO_DIGITALOCEAN_TOKEN" | jq '.database.metrics_endpoints'

Which returns the following host/port pairs:

[
  {
    "host": "db-test-for-metrics.c.db.ondigitalocean.com",
    "port": 9273
  },
  {
    "host": "replica-db-test-for-metrics.c.db.ondigitalocean.com",
    "port": 9273
  }
]
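If you want to feed these pairs straight into later commands, one option is to turn the array into full metrics URLs with jq. This is a minimal sketch: it assumes the JSON array shown above arrives on standard input, and the function name `endpoints_to_urls` is hypothetical.

```shell
# Convert a metrics_endpoints JSON array (read from stdin) into one metrics URL per line.
# Requires jq, which the commands in this guide already use.
endpoints_to_urls() {
  jq -r '.[] | "https://\(.host):\(.port)/metrics"'
}

# Usage (commented out because it needs a live API token and cluster UUID):
# curl --silent --location "https://api.digitalocean.com/v2/databases/${UUID}" \
#   --header "Authorization: Bearer $RO_DIGITALOCEAN_TOKEN" \
#   | jq '.database.metrics_endpoints' | endpoints_to_urls
```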

Next, you need your cluster’s metrics credentials. You can retrieve these by making a GET request to https://api.digitalocean.com/v2/databases/metrics/credentials with an admin or write token:

curl --silent -XGET --location 'https://api.digitalocean.com/v2/databases/metrics/credentials' --header 'Content-Type: application/json' --header "Authorization: Bearer $RW_DIGITALOCEAN_TOKEN" | jq '.'

Which returns the following credentials:

{
  "credentials": {
    "basic_auth_username": "...",
    "basic_auth_password": "..."
  }
}

Access with cURL

To access the endpoint using cURL, make a GET request to https://$HOST:9273/metrics, replacing the hostname, username, and password variables with the values you retrieved in the previous steps:

curl -XGET -k -u "$USERNAME:$PASSWORD" "https://$HOST:9273/metrics"
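Because Prometheus scrapes this endpoint, its output is in the Prometheus text format, where lines beginning with `#` are HELP and TYPE comments. The sketch below drops those comments and keeps only the samples whose names start with a given prefix. The function name `filter_metrics` and the `$PREFIX` placeholder are assumptions; check the raw output for the metric names your cluster actually exposes.

```shell
# Keep only sample lines (not "#" comments) whose metric name starts with a literal prefix.
# Reads Prometheus text-format output on stdin.
filter_metrics() {
  awk -v p="$1" 'index($0, p) == 1 && $0 !~ /^#/'
}

# Usage (commented out because it needs live credentials; PREFIX is a placeholder you choose):
# curl --silent -k -u "$USERNAME:$PASSWORD" "https://$HOST:9273/metrics" | filter_metrics "$PREFIX"
```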

Access with Prometheus

To access the endpoint using Prometheus, first copy the following configuration into a file named prometheus.yml, replacing the hostname, username, password, and path to the CA certificate. This configures Prometheus with the credentials it needs to access the endpoint:

# prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'dbaas_cluster_metrics_svc_discovery'
    scheme: https
    tls_config: 
      ca_file: /path/to/ca.crt
    dns_sd_configs:
    - names:
      - $TARGET_ADDRESS
      type: 'A'
      port: 9273
      refresh_interval: 15s
    metrics_path: '/metrics'
    basic_auth:
      username: $BASIC_AUTH_USERNAME
      password: $BASIC_AUTH_PASSWORD

Then, copy the following connection script into a file named up.sh. This script runs envsubst and starts a Prometheus container with the config from the previous step:

#!/bin/bash
envsubst < prometheus.yml > /tmp/dbaas-prometheus.yml

docker run -p 9090:9090 \
  -v /tmp/dbaas-prometheus.yml:/etc/prometheus/prometheus.yml \
  prom/prometheus
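One thing to watch for: envsubst replaces variables that are not set in the environment with an empty string, so a missing export produces a config that looks valid but cannot authenticate. A small pre-flight check like the sketch below can catch that before you start the container. The function name `require_vars` is hypothetical; the variable names are the ones used in prometheus.yml above.

```shell
# Fail if any named variable is missing from the environment or empty.
# printenv only sees exported variables, which is also what envsubst reads.
require_vars() {
  missing=0
  for name in "$@"; do
    if [ -z "$(printenv "$name")" ]; then
      echo "missing or empty environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (commented out so that this snippet has no side effects):
# require_vars TARGET_ADDRESS BASIC_AUTH_USERNAME BASIC_AUTH_PASSWORD && ./up.sh
```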

Go to http://localhost:9090/targets in a browser to confirm that multiple hosts are up and healthy.

The Prometheus dashboard

Then, navigate to http://localhost:9090/graph to query Prometheus for metrics.

A Prometheus graph

For more details, see the Prometheus DNS SD docs and TLS config docs.

Additional Resources

For more details on each available metric, see the Redis documentation.