DigitalOcean Kubernetes (DOKS) is a managed Kubernetes service. Deploy Kubernetes clusters with a fully managed control plane, high availability, autoscaling, and native integration with DigitalOcean Load Balancers and volumes. You can add node pools with shared CPUs, dedicated CPUs, or NVIDIA H100 GPUs in single-GPU or 8-GPU configurations. DOKS clusters are compatible with standard Kubernetes toolchains and the DigitalOcean API and CLI.
Kubernetes is an open-source system for managing containerized applications in a clustered environment. Its focus is to improve how you manage related, distributed components and services across varied infrastructure.
DigitalOcean Kubernetes is a managed Kubernetes service that lets you deploy scalable and secure Kubernetes clusters without the complexities of administering the control plane. We manage the Kubernetes control plane and the underlying containerized infrastructure.
You retain full access to the cluster with existing toolchains. You have cluster-level administrative rights to create and delete any Kubernetes API objects through the DigitalOcean API and doctl.
There are no restrictions on the API objects you can create as long as the underlying Kubernetes version supports them. We offer the latest version of Kubernetes as well as earlier patch levels of the latest minor version for special use cases. You can also install popular tools like Helm, metrics-server, and Istio.
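For example, a typical workflow with standard tooling might look like the following (the cluster name is a placeholder):

```
# Download the cluster's credentials and merge them into ~/.kube/config
doctl kubernetes cluster kubeconfig save example-cluster

# Work with the cluster using the standard Kubernetes CLI
kubectl get nodes

# Install ecosystem tooling, for example metrics-server from its official Helm chart
helm repo add metrics-server https://kubernetes-sigs.github.io/metrics-server/
helm upgrade --install metrics-server metrics-server/metrics-server
```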
We only support features that are in the beta or general availability stage in upstream Kubernetes. See the Kubernetes documentation to check which features are in the alpha, beta, or general availability stage.
For updates on DOKS’s latest features and integrations, see the DOKS release notes. For a full list of changes for each available version of Kubernetes, including updates to the backend, API, and system components, see the DOKS changelog.
DOKS conforms to the Cloud Native Computing Foundation’s Kubernetes Software Conformance Certification program and is proud to be a CNCF Certified Kubernetes product.
In addition, we run our own extended suite of end-to-end tests on every DOKS release to ensure stability, performance, and upgrade compatibility.
Worker nodes are built on Droplets and can use shared CPUs, dedicated CPUs, or GPUs. Unlike standalone Droplets, worker nodes are managed with the Kubernetes command-line client, kubectl, and are not accessible with SSH. On both the control plane and the worker nodes, DigitalOcean maintains the system updates, security patches, operating system configuration, and installed packages.
All the worker nodes within a node pool have identical resources, but each node pool can have a different worker configuration. This lets you have different services on different node pools, where each pool has the RAM, CPU, and attached storage resources the service requires.
You can create and modify node pools at any time. Worker nodes are automatically deleted and recreated when needed, and you can manually recycle worker nodes. Nodes inherit the node pool’s naming scheme when you first create the pool; however, renaming a node pool does not rename existing nodes. Nodes adopt the new naming scheme only when they are recycled or the node pool is resized, which creates new nodes.
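As a sketch, you might add and later resize a dedicated-CPU node pool with doctl (cluster, pool, and size values are placeholders):

```
# Add a dedicated-CPU node pool to an existing cluster
doctl kubernetes cluster node-pool create example-cluster \
  --name workers-dedicated --size c-4 --count 3

# Resize the pool later; newly created nodes use the pool's current naming scheme
doctl kubernetes cluster node-pool update example-cluster workers-dedicated --count 5
```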
Kubernetes role-based access control (RBAC) is enabled by default. See Using RBAC Authorization for details.
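For example, a minimal namespaced Role and RoleBinding that grant read-only access to pods (the namespace, names, and user are illustrative):

```
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: example-app
  name: pod-reader
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: example-app
subjects:
- kind: User
  name: jane@example.com
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```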
GPU worker nodes are built on GPU Droplets which are powered by NVIDIA’s H100 GPUs and can be in a single GPU or 8 GPU configuration.
Using GPU worker nodes in your cluster, you can run GPU-accelerated workloads such as training and fine-tuning models.
You do not need to specify a Runtime Class to run GPU workloads, which simplifies setting them up. DigitalOcean installs and manages the drivers required to enable GPU support on the GPU worker nodes. However, for GPU discovery, health checks, configuration of GPU-enabled containers, and time slicing, you need an additional component, the NVIDIA device plugin for Kubernetes. You can configure and deploy the plugin using helm, as described in the README file of the plugin’s GitHub repository. Additionally, to monitor your cluster using Prometheus, you need to install the NVIDIA DCGM Exporter.
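As a sketch, the plugin can be installed with the Helm commands from its README, after which workloads request GPUs through the nvidia.com/gpu resource (the release and namespace names follow the README; the pod below is illustrative):

```
helm repo add nvdp https://nvidia.github.io/k8s-device-plugin
helm repo update
helm upgrade -i nvdp nvdp/nvidia-device-plugin \
  --namespace nvidia-device-plugin \
  --create-namespace
```

```
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test
spec:
  restartPolicy: Never
  containers:
  - name: cuda
    image: nvidia/cuda:12.4.1-base-ubuntu22.04   # example CUDA base image
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1   # schedules the pod onto a node advertising a GPU
```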
For multiple 8-GPU H100 worker nodes, we support high-speed networking between GPUs on different nodes. High-speed communication uses 8x Mellanox 400GbE interfaces. To enable this, contact us.
You can also use the cluster autoscaler to scale a GPU node pool down to 1, or use the DigitalOcean CLI or API to manually scale the node pool down to 0. This is useful when you use GPUs on demand and for jobs like training and fine-tuning.
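For example, using doctl (cluster and pool names are placeholders):

```
# Manually scale a GPU node pool down to zero between jobs
doctl kubernetes cluster node-pool update example-cluster gpu-workers --count 0

# Or let the cluster autoscaler manage the pool within a range
doctl kubernetes cluster node-pool update example-cluster gpu-workers \
  --auto-scale --min-nodes 1 --max-nodes 3
```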
You can persist data in DigitalOcean Kubernetes clusters to volumes using the DigitalOcean CSI plugin. (See the feature overview page to learn which volume features are available on DigitalOcean Kubernetes.) We do not recommend using HostPath volumes because nodes are frequently replaced and any data stored on a node is lost when that happens.
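For example, a minimal PersistentVolumeClaim that the CSI plugin satisfies by provisioning a volume (the claim name and size are illustrative; do-block-storage is the default storage class on DOKS):

```
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-data
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: do-block-storage
  resources:
    requests:
      storage: 5Gi
```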
You can also persist data to DigitalOcean object storage by using the Spaces API to interact with Spaces from within your application.
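Because Spaces is S3-compatible, any S3 client works when pointed at the Spaces endpoint for your region. A sketch using the AWS CLI (the Space name and region are placeholders; credentials are your Spaces access keys, supplied here through the standard AWS environment variables):

```
export AWS_ACCESS_KEY_ID=<your Spaces access key>
export AWS_SECRET_ACCESS_KEY=<your Spaces secret key>

# Upload an object to a Space in the nyc3 region
aws s3 cp ./backup.tar.gz s3://example-space/backups/ \
  --endpoint-url https://nyc3.digitaloceanspaces.com
```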
The DigitalOcean Kubernetes Cloud Controller supports provisioning DigitalOcean Load Balancers.
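For example, exposing an application through a Service of type LoadBalancer provisions a DigitalOcean Load Balancer automatically (the selector and ports are illustrative; the cloud controller also honors optional service.beta.kubernetes.io/do-loadbalancer-* annotations for further configuration):

```
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: LoadBalancer
  selector:
    app: web
  ports:
  - port: 80
    targetPort: 8080
```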
Clusters are added to a VPC network for the datacenter region by default. This keeps traffic between clusters and other applicable resources from being routed outside the datacenter over the public internet.
Cluster networking is preconfigured with Cilium, which provides the overlay network and supports Kubernetes network policies.
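For example, a standard Kubernetes NetworkPolicy that Cilium enforces, allowing only frontend pods to reach the API pods (namespace, labels, and port are illustrative):

```
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-api
  namespace: example-app
spec:
  podSelector:
    matchLabels:
      app: api
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
```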
VPC-native cluster networking allows customers to route traffic directly between pods and other resources on VPC networks such as Droplets and managed databases. Kubernetes services can also be exposed to the VPC network using internal load balancers.
In traditional DOKS clusters, nodes are added to a VPC network, but pods and services operate on a separate virtual network. As a result, pods and services cannot communicate directly with resources in the VPC or peered VPCs, requiring a network translation step that can introduce inefficiencies or inconvenience.
When creating VPC-native clusters, customers provide two additional subnet ranges that are used for pod and service networking. These subnet ranges must not overlap with each other or with any VPCs or VPC-native clusters on the team. With these subnets, VPC-native clusters enable transparent communication between the pod network and other peered VPC networks, including the node VPC, without requiring network translation.
VPC-native networking is available on Kubernetes version 1.31 and above, using Cilium’s full kube-proxy replacement based on eBPF. Cluster-internal services function the same as traditional clusters, but are fully managed by eBPF instead of kube-proxy.
You cannot convert existing clusters to VPC-native because Kubernetes does not support changing the networking stack of a running cluster.
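As a sketch, a VPC-native cluster can be created through the DigitalOcean API by supplying the two subnet ranges at creation time (the subnet values, region, version string, and node pool are examples; check the API reference for the exact field names and current versions):

```
curl -X POST "https://api.digitalocean.com/v2/kubernetes/clusters" \
  -H "Authorization: Bearer $DIGITALOCEAN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "example-vpc-native",
    "region": "nyc1",
    "version": "1.31.1-do.0",
    "cluster_subnet": "192.168.0.0/20",
    "service_subnet": "192.168.16.0/22",
    "node_pools": [{ "size": "s-2vcpu-4gb", "count": 3, "name": "default-pool" }]
  }'
```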
Clusters are automatically tagged with k8s and the specific cluster ID, like k8s:EXAMPLEc-3515-4a0c-91a3-2452eEXAMPLE. Worker nodes are additionally tagged with k8s:worker.
You can add custom tags to a cluster and its node pools. Any custom tags added directly to worker nodes in a node pool (for example, from the Droplets page) are deleted to maintain consistency between the node pool and its worker nodes.
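As a sketch, custom tags can be applied when creating a cluster with doctl (the tag values and node pool specification are placeholders; the --tag flag can be repeated):

```
doctl kubernetes cluster create example-cluster \
  --region nyc1 \
  --tag production --tag team:web \
  --node-pool "name=default-pool;size=s-2vcpu-4gb;count=3"
```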