Kafka is an open-source distributed event and stream-processing platform built to process demanding real-time data feeds. It is inherently scalable, with high throughput and availability.
DigitalOcean Managed Databases include metrics visualizations so you can monitor performance and health of your database cluster. There are two kinds of metrics:
Cluster metrics monitor the performance of the nodes in a database cluster. Cluster metrics cover primary and standby nodes; metrics for each read-only node are displayed independently. This data can help guide capacity planning and optimization. You can also set up alerting on cluster metrics.
Database metrics monitor the performance of the database itself. This data can help assess the health of the database, pinpoint performance bottlenecks, and identify unusual use patterns that may indicate an application bug or security breach.
To view Kafka performance metrics, click the name of the database to go to its Overview page, then click the Insights tab.
The Select object drop-down menu lists the cluster itself and all of the databases in the cluster. Choose the cluster to view its metrics.
In the Select Period drop-down menu, you can choose a time frame for the x-axis of the graphs, ranging from 1 hour to 30 days. Each line in the graphs will display about 300 data points.
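As a rough illustration of what about 300 data points per line means for resolution, the sketch below divides a selected period by the point count. It assumes the points are spread evenly over the period, which the page does not state; `approx_point_spacing` and `APPROX_POINTS_PER_LINE` are hypothetical names used only for this example.

```python
from datetime import timedelta

# Assumption: roughly 300 points, spread evenly over the selected period.
APPROX_POINTS_PER_LINE = 300


def approx_point_spacing(period: timedelta,
                         points: int = APPROX_POINTS_PER_LINE) -> timedelta:
    """Return the approximate time between adjacent data points."""
    return period / points


for label, period in [("1 hour", timedelta(hours=1)),
                      ("30 days", timedelta(days=30))]:
    print(label, "->", approx_point_spacing(period))
```

Under that assumption, a 1-hour view has a point about every 12 seconds, while a 30-day view has a point about every 2.4 hours, so short spikes may be smoothed out in longer time frames.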
By default, the summary to the right shows the most recent metrics values. If you hover over a different time in a graph, the summary will display the values from that time instead.
If you recently provisioned the cluster or added nodes, it may take a few minutes for the metrics data to finish processing before you see it on the Insights page.
Clusters have the following cluster metrics:
The CPU usage plot shows, for all nodes in the cluster, the minimum, maximum, and average percentage of processing power being used across all cores.
If you experience a significant increase in CPU usage, check the throughput plot and query statistics to look for unexpected usage patterns or long-running queries.
Learn more about CPU usage in the Droplet metrics definitions.
The load average plot displays 1-, 5-, and 15-minute load averages, averaged across all nodes in the cluster. Load average measures the processes that are either being handled by the processor or are waiting for processor time.
The three time-based load average metrics are calculated as an exponentially weighted moving average over the past 1, 5, and 15 minutes. This metric does not adjust for multiple cores. Learn more about load averages in the Droplet metrics definitions.
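The exponentially weighted moving average described above can be sketched as follows. This is a minimal illustration, not the platform's implementation: the 5-second sampling interval is an assumption borrowed from how Linux commonly updates load averages, and `update_load_average` and `per_core_load` are hypothetical helper names. Because the metric does not adjust for multiple cores, the second helper shows one way to normalize a reading yourself.

```python
import math

SAMPLE_INTERVAL_S = 5.0  # assumption: one sample every 5 seconds


def update_load_average(previous: float, runnable: int, window_s: float,
                        interval_s: float = SAMPLE_INTERVAL_S) -> float:
    """Fold one sample of running-or-waiting processes into the average."""
    decay = math.exp(-interval_s / window_s)
    return previous * decay + runnable * (1.0 - decay)


def per_core_load(load_average: float, cores: int) -> float:
    """Normalize a load average by core count (not done by the plot)."""
    return load_average / cores


windows = {"1m": 60.0, "5m": 300.0, "15m": 900.0}
loads = {name: 0.0 for name in windows}

# Hypothetical workload: 4 runnable processes at every sample for 10 minutes.
for _ in range(120):
    for name, window_s in windows.items():
        loads[name] = update_load_average(loads[name], 4, window_s)

for name, value in loads.items():
    print(f"{name}: {value:.2f} (per core on 2 cores: {per_core_load(value, 2):.2f})")
```

In this sketch the 1-minute average reacts to the new workload fastest and the 15-minute average slowest, which is why comparing the three lines can indicate whether load is rising or falling.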
The memory usage plot presents the minimum, maximum, and average percentage of memory consumption across all nodes in the cluster. Because cached memory can be released on demand, it is not considered in use.
Learn more about memory usage in the Droplet metrics definitions.
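To make the treatment of cached memory concrete, here is a small sketch that excludes cached memory from usage and then aggregates the minimum, maximum, and average across nodes. The exact formula behind the plot is not given on this page, so the calculation, the function name `memory_usage_percent`, and the per-node figures are all assumptions for illustration.

```python
from statistics import mean


def memory_usage_percent(total_mib: int, free_mib: int, cached_mib: int) -> float:
    """Percent of memory in use, treating cached memory as available."""
    used_mib = total_mib - free_mib - cached_mib
    return 100.0 * used_mib / total_mib


# Hypothetical per-node readings: (total, free, cached), all in MiB.
nodes = [
    (8192, 1024, 3072),
    (8192, 2048, 2048),
    (8192, 512, 1536),
]

usage = [memory_usage_percent(*node) for node in nodes]
print("min:", min(usage), "max:", max(usage), "avg:", round(mean(usage), 2))
```

A large gap between the minimum and maximum lines suggests memory pressure concentrated on particular nodes rather than spread across the cluster.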
The log size plot presents the log size for each of your cluster’s largest topics.
The disk I/O plot presents the overall amount of data being written to and read from all nodes in the cluster.
The messages-per-second plot presents the messages per second per node in the cluster.
The incoming messages plot presents the total number of messages received across all nodes in the cluster.
The bytes-in-and-out plot presents the number of bytes sent and received by the cluster, broken down into client bytes and replication bytes.
The network requests per operation plot presents the number of network requests across all nodes in the cluster for each of the following operations: FetchConsumer, FetchFollower, and Produce.
The controller offline partitions plot presents the number of offline partitions per node in the cluster.