# Monitoring Features

DigitalOcean Monitoring is a free, opt-in service that lets you track Droplet resource usage in real time, visualize performance metrics, and receive alerts via email or Slack to proactively manage your infrastructure’s health.

## GPU Observability

GPU Observability extends DigitalOcean Insights to display GPU-level metrics for DOKS clusters that include GPU node pools created with AI/ML Ready images for AMD and NVIDIA GPUs. It provides a monitoring experience for GPU workloads, so you can track utilization, temperature, memory usage, and performance directly in the **Insights** tab.

`do-agent` automatically detects the GPU type on each node and enables the correct exporter (`DCGM` for NVIDIA GPUs or `ROCm` for AMD GPUs). Metrics are collected locally on each GPU worker node.

AMD GPUs are available by request only. [Contact support to request access](https://cloudsupport.digitalocean.com).

GPU Observability is available on DOKS 1.33.1-do.5 or higher and is automatically enabled when you select **Improved metrics and monitoring** during cluster creation. For security, GPU exporters listen only on `127.0.0.1` to prevent external access.

**Note**: The power throttling GPU metric is currently available only for AMD GPUs. NVIDIA support is planned but not yet available.

- **AI/ML Ready Droplets:** GPU metrics are enabled automatically when you select **Improved Metrics and Monitoring** during Droplet creation.
- **Basic Images:** GPU metrics are not enabled by default.

For **Basic Images**, you can enable GPU metrics by [manually installing the exporter](https://docs.digitalocean.com/products/droplets/how-to/gpu/enable-metrics/index.html.md), binding it to `127.0.0.1`, reconfiguring `do-agent` to scrape it, and restarting `do-agent`.

## Droplet Graphs

Droplet graphs provide visual representations of system-level metrics.
Use them to monitor resource usage over time and understand how it correlates to performance.

By default, Droplet graphs show public and private bandwidth usage, CPU usage, and disk I/O. By installing the [DigitalOcean metrics agent](https://docs.digitalocean.com/products/monitoring/how-to/install-metrics-agent/index.html.md), you also gain access to load averages (1-, 5-, and 15-minute), memory usage, and disk usage.

## Alert Policies

[Alert policies](https://docs.digitalocean.com/products/monitoring/how-to/manage-alerts/index.html.md) let you define thresholds for resource usage. When usage exceeds these thresholds, notifications are sent through email or [Slack](https://slack.com/). You can set alerts for total CPU usage, incoming and outgoing bandwidth, disk read and write operations, memory usage, and disk usage.
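
To make the threshold idea concrete, here is a minimal Python sketch that assembles the kind of JSON body an alert policy might be described with. It only builds and prints the data and sends no request. The helper `build_alert_policy`, the `METRIC_TYPES` table, and the example email address and Droplet ID are hypothetical names introduced for illustration. The field names and metric type strings reflect our understanding of the DigitalOcean API v2 alert policy format and should be treated as assumptions to verify against the API reference before use.

```python
import json

# Assumed metric type identifiers; confirm against the DigitalOcean API reference.
METRIC_TYPES = {
    "cpu": "v1/insights/droplet/cpu",
    "memory": "v1/insights/droplet/memory_utilization_percent",
    "disk": "v1/insights/droplet/disk_utilization_percent",
}


def build_alert_policy(metric, threshold_percent, emails, droplet_ids, window="5m"):
    """Return a dict describing an alert that fires when usage exceeds a threshold."""
    if metric not in METRIC_TYPES:
        raise ValueError(f"unsupported metric: {metric!r}")
    if not 0 < threshold_percent <= 100:
        raise ValueError("threshold_percent must be in (0, 100]")
    return {
        "type": METRIC_TYPES[metric],
        "description": f"{metric} usage above {threshold_percent}%",
        "compare": "GreaterThan",          # alert when usage exceeds the value
        "value": threshold_percent,
        "window": window,                  # assumed evaluation window format
        "entities": [str(d) for d in droplet_ids],
        "tags": [],
        "alerts": {"email": list(emails), "slack": []},
        "enabled": True,
    }


# Hypothetical example: alert when CPU usage on one Droplet exceeds 80%.
policy = build_alert_policy("cpu", 80, ["ops@example.com"], [12345])
print(json.dumps(policy, indent=2))
```

Keeping the payload construction separate from any HTTP call makes the threshold logic easy to review and test before a policy is actually created.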