Validated on 1 Oct 2024 • Last edited on 1 Oct 2024
DigitalOcean Kubernetes (DOKS) is a managed Kubernetes service. Deploy Kubernetes clusters with a fully managed control plane, high availability, autoscaling, and native integration with DigitalOcean Load Balancers and volumes. You can add node pools using shared and dedicated CPUs, and NVIDIA H100 GPUs in a single GPU or 8 GPU configuration. DOKS clusters are compatible with standard Kubernetes toolchains and the DigitalOcean API and CLI.
Add a Node Pool to a Cluster Using Automation
How to Add a Node Pool to a Kubernetes Cluster Using the DigitalOcean CLI
You can also add a GPU node pool to an existing cluster running version 1.30.4-do.0, 1.29.8-do.0, 1.28.13-do.0, or later.
Note
In rare cases, it can take several hours for a GPU Droplet to provision. If creation takes unusually long, open a support ticket.
How to Add a GPU Worker Node to a Cluster Using the DigitalOcean CLI
To add a GPU worker node to an existing cluster, run doctl kubernetes cluster node-pool create and specify the GPU machine type with the --size flag. For example, to add a node pool of four worker nodes in single GPU configuration with 80 GB of memory to a cluster named gpu-cluster, pass the single-GPU size slug and a node count of 4.
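The command described above can be sketched as a small shell wrapper. This is a minimal sketch, not the official example: the function name add_gpu_node_pool, the pool name, and the size slug gpu-h100x1-80gb are assumptions for illustration, so confirm the slug with doctl kubernetes options sizes before relying on it.

```shell
# Sketch: add a node pool of GPU worker nodes to an existing DOKS cluster.
# Assumptions: the size slug gpu-h100x1-80gb (single H100, 80 GB) and the
# pool name passed by the caller are illustrative, not taken from the source.
add_gpu_node_pool() {
  cluster="$1"      # cluster name or ID, for example gpu-cluster
  pool_name="$2"    # hypothetical node pool name
  node_count="$3"   # number of GPU worker nodes in the pool
  doctl kubernetes cluster node-pool create "$cluster" \
    --name "$pool_name" \
    --size gpu-h100x1-80gb \
    --count "$node_count"
}

# Usage:
#   add_gpu_node_pool gpu-cluster gpu-worker-pool-1 4
```

Keeping the count as a parameter makes it explicit that the value sets the number of nodes in one pool, not the number of pools.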
How to Add a GPU Worker Node to a Cluster Using the DigitalOcean API
To add a GPU worker node to an existing cluster, send a POST request to https://api.digitalocean.com/v2/kubernetes/clusters/{cluster_id}/node_pools, where cluster_id is the ID of the existing cluster. For example, to add a node pool of four worker nodes in single GPU configuration with 80 GB of memory, set the name, size, and count fields in the request body.
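The request described above can be sketched with curl. This is a hedged sketch: it assumes the DIGITALOCEAN_TOKEN environment variable holds a valid API token, and the function name add_gpu_node_pool_api, the pool name gpu-worker-pool-1, and the size slug gpu-h100x1-80gb are illustrative choices rather than values confirmed by this page.

```shell
# Sketch: add a GPU node pool to an existing cluster through the API.
# Assumptions: DIGITALOCEAN_TOKEN is set; the pool name and size slug are
# illustrative and should be checked against your account's available sizes.
add_gpu_node_pool_api() {
  cluster_id="$1"   # ID of the existing cluster
  node_count="$2"   # number of GPU worker nodes in the pool
  curl -X POST \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer ${DIGITALOCEAN_TOKEN}" \
    -d "{\"name\": \"gpu-worker-pool-1\", \"size\": \"gpu-h100x1-80gb\", \"count\": ${node_count}}" \
    "https://api.digitalocean.com/v2/kubernetes/clusters/${cluster_id}/node_pools"
}

# Usage:
#   add_gpu_node_pool_api "<your-cluster-id>" 4
```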
To run GPU workloads after you create a cluster, use the labels and taint specific to GPU nodes in your workload specifications so that pods are scheduled onto matching nodes. In practice, a workload spec selects GPU worker nodes by label and adds a toleration for their taint. For example, a pod that runs NVIDIA’s CUDA image can use the labels and taint for GPU worker nodes in this way, and you can adapt the same pattern for your actual workloads.
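A pod spec of the kind described above can be sketched as follows, applied through kubectl from a heredoc. Everything specific here is an assumption for illustration: the node label doks.digitalocean.com/gpu-brand: nvidia, the taint key nvidia.com/gpu, the pod and container names, and the CUDA sample image tag. Check the real labels and taints with kubectl describe node on one of your GPU nodes.

```shell
# Sketch: schedule a pod onto GPU worker nodes.
# Assumptions: the label, taint key, names, and image tag below are
# illustrative; verify them against your own GPU nodes before use.
run_gpu_pod() {
  kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: gpu-workload
spec:
  restartPolicy: Never
  nodeSelector:
    doks.digitalocean.com/gpu-brand: nvidia
  containers:
    - name: cuda-container
      image: nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda10.2
  tolerations:
    - key: nvidia.com/gpu
      operator: Exists
      effect: NoSchedule
EOF
}

# Usage:
#   run_gpu_pod
```

The nodeSelector restricts the pod to nodes carrying the GPU label, and the toleration lets it land on nodes that repel untolerated pods through the GPU taint.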
You can use the cluster autoscaler to scale the GPU node pool down to 1 or use the DigitalOcean CLI or API to manually scale the node pool down to 0. For example:
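A manual scale-down with the CLI can be sketched like this. The function name scale_gpu_pool_to_zero is hypothetical, and the cluster and pool identifiers are placeholders you supply.

```shell
# Sketch: manually scale a GPU node pool down to 0 nodes with doctl.
# Assumption: the wrapper name is hypothetical; the arguments are the
# cluster name or ID and the node pool name or ID.
scale_gpu_pool_to_zero() {
  cluster="$1"   # cluster name or ID
  pool="$2"      # node pool name or ID
  doctl kubernetes cluster node-pool update "$cluster" "$pool" --count 0
}

# Usage:
#   scale_gpu_pool_to_zero "<your-cluster-id>" "<your-pool-id>"
```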