DigitalOcean Monitoring is a free, opt-in service that gathers metrics about Droplet-level resource utilization. It provides additional Droplet graphs and supports configurable metrics alert policies with integrated email and Slack notifications to help you track the operational health of your infrastructure.
When working with monitoring technology, it can be helpful to have some familiarity with common terminology. These are some of the most frequently used concepts that are relevant to DigitalOcean Monitoring:
Resource: In computing, a resource is a basic component with limited availability. Resources include CPU, memory, disk space, and available bandwidth.
Metric: In computing, a metric is a standard for measuring a computer resource. The term can refer either to the resource and the unit used to measure it, or to the data collected about that resource.
Units: Units are standard quantities used to express and compare values.
Percentage units: Percentage units specify a value in relationship to the total available quantity, which is typically set at 100%. Percentages are useful for quantities with a known limit, like disk space.
Rate units: Rate units specify a value in relation to another measure (most frequently time). Rate units usually tell you how often something occurs over a set time period so that you can compare magnitudes. Rate units are useful when there is no easy-to-understand upper boundary that indicates total use, or when it is more helpful to examine how heavily a resource is being used, as with incoming bandwidth.
Data point: A data point, or value, is a number and unit representing a single measurement.
Data set: A data set is a collection of related data points.
Time series data: Time series data is data collected at regular intervals and arranged chronologically to examine changes over time.
Trend: A trend indicates a general tendency in a data set over time. Trends are useful for recognizing changes and for predicting future behavior.
Monitoring: In computing, monitoring is the process of gathering and visualizing data to improve awareness of system health and minimize response time when usage is outside of expected levels.
System usage monitoring: System usage monitoring is a type of monitoring that involves tracking system resources.
Alerting: Alerting within a computer monitoring system is the ability to send notifications when certain metrics fall outside of expected ranges.
Threshold: In alerting, a threshold is a value that defines the boundary between normal and abnormal usage.
Alert interval: An alert interval is the period of time that average usage must exceed a threshold before triggering an alert.
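To make a few of these terms concrete, the following Python sketch works through percentage units, rate units, and the way a threshold and an alert interval combine to decide whether an alert fires. It is an illustration only, not a description of how DigitalOcean Monitoring is implemented: the function names (`percent_used`, `rate_per_second`, `alert_triggered`), the sample CPU values, and the choice to average only the most recent interval are all assumptions made for this example.

```python
from statistics import mean


def percent_used(used, total):
    """Percentage unit: a value expressed relative to a known upper limit."""
    return 100.0 * used / total


def rate_per_second(start_count, end_count, elapsed_seconds):
    """Rate unit: the change in a counter divided by the elapsed time."""
    return (end_count - start_count) / elapsed_seconds


def alert_triggered(data_points, threshold, interval_seconds):
    """Return True if the average value over the most recent alert
    interval exceeds the threshold.

    data_points is a chronological list of (timestamp_seconds, value)
    pairs, i.e. a small piece of time series data. (Hypothetical helper.)
    """
    if not data_points:
        return False
    latest = data_points[-1][0]
    # Keep only the data points that fall inside the alert interval.
    window = [v for t, v in data_points if t > latest - interval_seconds]
    return mean(window) > threshold


# Hypothetical CPU usage (percent), sampled once per minute.
cpu_series = [(0, 40), (60, 55), (120, 72), (180, 85), (240, 90)]

disk_percent = percent_used(15, 60)              # 15 GB used of 60 GB
inbound_rate = rate_per_second(1000, 4000, 60)   # packets counted 60 s apart
short_alert = alert_triggered(cpu_series, threshold=70, interval_seconds=180)
long_alert = alert_triggered(cpu_series, threshold=70, interval_seconds=300)
```

With this sample data, the three-minute interval averages only the last three data points and exceeds the 70% threshold, while the five-minute interval averages all five and stays below it. A longer alert interval therefore smooths out short spikes at the cost of slower notification.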