Dedicated Inference

Validated on 9 Oct 2024 • Last edited on 23 Mar 2026

Dedicated Inference delivers scalable production-grade LLM hosting on DigitalOcean. Create, list, get, update, and delete Dedicated Inference instances; manage accelerators, CA certificate, sizes, GPU model config, and access tokens.

Base URL https://api.digitalocean.com

GET List Dedicated Inferences

/v2/dedicated-inferences
Authorizations: bearer_auth (1 scope)
HTTP: Bearer
Required scopes: dedicated_inference:read

OAuth Authentication

To interact with the DigitalOcean API, you or your application must authenticate.

The DigitalOcean API handles this through OAuth, an open standard for authorization. OAuth lets you delegate access to your account, and scopes let you grant full access, read-only access, or access to a specific set of endpoints.

You can generate an OAuth token by visiting the Apps & API section of the DigitalOcean control panel for your account.

An OAuth token functions as a complete authentication request. In effect, it substitutes for a username and password pair.

Because of this, keep your OAuth tokens secure. To reduce the risk of compromise, the web interface displays each token only once, when it is generated.

DigitalOcean access tokens begin with an identifiable prefix that distinguishes them from other similar tokens.

  • dop_v1_ for personal access tokens generated in the control panel
  • doo_v1_ for tokens generated by applications using the OAuth flow
  • dor_v1_ for OAuth refresh tokens
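
As a rough illustration of how the prefixes can be used, the sketch below labels a token by its prefix, for example to catch a refresh token pasted where an access token is expected. The function name and the label strings are hypothetical; only the three prefixes come from the list above.

```python
# Prefixes are taken from the list above; the labels are illustrative only.
TOKEN_PREFIXES = {
    "dop_v1_": "personal access token",
    "doo_v1_": "OAuth application token",
    "dor_v1_": "OAuth refresh token",
}


def classify_token(token: str) -> str:
    """Return a human-readable label for a token based on its prefix."""
    for prefix, label in TOKEN_PREFIXES.items():
        if token.startswith(prefix):
            return label
    # Anything else (including legacy or third-party tokens) is not recognized.
    return "unknown"
```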

Scopes

Scopes act like permissions assigned to an API token. These permissions determine what actions the token can perform. You can create API tokens that grant read-only access, full access, or limited access to specific endpoints by using custom scopes.

Generally, scopes are designed to match HTTP verbs and common CRUD operations (Create, Read, Update, Delete).

HTTP Verb    CRUD Operation    Scope
GET          Read              <resource>:read
POST         Create            <resource>:create
PUT/PATCH    Update            <resource>:update
DELETE       Delete            <resource>:delete

For example, creating a new Droplet by making a POST request to the /v2/droplets endpoint requires the droplet:create scope while listing Droplets by making a GET request to the /v2/droplets endpoint requires the droplet:read scope.

Each endpoint below specifies which scope is required to access it when using custom scopes.
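
The general verb-to-scope convention in the table can be expressed as a small lookup. This is a sketch of the convention only, not an authoritative rule: the helper name is hypothetical, and the scope listed on each endpoint remains the source of truth.

```python
# Mapping of HTTP verbs to scope actions, following the table above.
SCOPE_ACTIONS = {
    "GET": "read",
    "POST": "create",
    "PUT": "update",
    "PATCH": "update",
    "DELETE": "delete",
}


def required_scope(resource: str, method: str) -> str:
    """Return the conventional '<resource>:<action>' scope for an HTTP verb."""
    action = SCOPE_ACTIONS[method.upper()]  # raises KeyError for other verbs
    return f"{resource}:{action}"
```

For instance, `required_scope("dedicated_inference", "GET")` yields `dedicated_inference:read`, which matches the scope listed for the list endpoint.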

How to Authenticate with OAuth

To make an authenticated request, include a bearer-type Authorization header containing your OAuth token. All requests must be made over HTTPS.

Authenticate with a Bearer Authorization Header

curl -X $HTTP_METHOD -H "Authorization: Bearer $DIGITALOCEAN_TOKEN" "https://api.digitalocean.com/v2/$OBJECT"
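
For readers working in Python rather than curl, the following is a minimal sketch of the same header using only the standard library. The helper name `build_request` is hypothetical; the sketch only constructs the request object and does not send it.

```python
import urllib.request

API_BASE = "https://api.digitalocean.com"  # HTTPS is required by the API


def build_request(method: str, path: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a request carrying a Bearer Authorization header."""
    return urllib.request.Request(
        url=f"{API_BASE}{path}",
        method=method,
        headers={"Authorization": f"Bearer {token}"},
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the call; reading the token from an environment variable such as `DIGITALOCEAN_TOKEN` keeps it out of source code.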

List all Dedicated Inference instances for your team. Send a GET request to /v2/dedicated-inferences. You may filter by region and use page and per_page for pagination.

Query Parameters

per_page integer 1 – 200 optional
Example: 2

Number of items returned per page.

Default: 20

page integer >= 1 optional
Example: 1

Which page of paginated results to return.

Default: 1

region string, one of: nyc2, tor1, atl1 optional
Example: atl1

Filter by region. Dedicated Inference is only available in nyc2, tor1, and atl1.

curl -i -X GET "https://api.digitalocean.com/v2/dedicated-inferences?region=nyc2&page=1&per_page=20" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DO_TOKEN"
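
The query parameter constraints above can be checked on the client before a request is sent. The sketch below builds the list URL and rejects out-of-range values; the function name is hypothetical, and the server remains the authority on validation.

```python
from typing import Optional
from urllib.parse import urlencode

LIST_URL = "https://api.digitalocean.com/v2/dedicated-inferences"
ALLOWED_REGIONS = {"nyc2", "tor1", "atl1"}  # regions listed in this reference


def list_url(region: Optional[str] = None, page: int = 1, per_page: int = 20) -> str:
    """Build the list URL, enforcing the documented parameter ranges."""
    if not 1 <= per_page <= 200:
        raise ValueError("per_page must be between 1 and 200")
    if page < 1:
        raise ValueError("page must be >= 1")
    params = {}
    if region is not None:
        if region not in ALLOWED_REGIONS:
            raise ValueError(f"unsupported region: {region}")
        params["region"] = region
    params["page"] = page
    params["per_page"] = per_page
    return f"{LIST_URL}?{urlencode(params)}"
```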

Responses

200

The response will be a JSON object with a key called dedicated_inferences. This will be set to an array of objects, each of which will contain the standard attributes associated with a Dedicated Inference. Pagination uses the same links and meta structure as other list endpoints.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.
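
A client can read these three headers to decide whether to pause before the next call. The sketch below assumes the headers have already been collected into a dictionary with lower-case keys, which is an assumption about the HTTP client in use; the function name is hypothetical.

```python
from datetime import datetime, timezone
from typing import Mapping, Optional


def parse_rate_limit(headers: Mapping[str, str], now: Optional[float] = None) -> dict:
    """Parse the ratelimit-* headers into numbers and a reset time."""
    reset_epoch = int(headers["ratelimit-reset"])  # Unix epoch seconds
    if now is None:
        now = datetime.now(timezone.utc).timestamp()
    return {
        "limit": int(headers["ratelimit-limit"]),
        "remaining": int(headers["ratelimit-remaining"]),
        "reset_at": datetime.fromtimestamp(reset_epoch, tz=timezone.utc),
        # Never negative, even if the reset time has already passed.
        "seconds_until_reset": max(0.0, reset_epoch - now),
    }
```

When `remaining` reaches 0, sleeping for `seconds_until_reset` before retrying is one simple way to avoid a 429 response.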

dedicated_inferences array of object required

Array of Dedicated Inference instances.

created_at string (date-time) optional read-only
Example: 2024-01-09T20:44:32Z

When the Dedicated Inference was created.

endpoints object optional read-only
private_endpoint_fqdn string (uri) optional
Example: https://b4bfug4jc41kts2ro54if91eo-private-dedicated-inference.do-infra.ai

Private VPC FQDN of the Dedicated Inference instance.

public_endpoint_fqdn string (uri) optional
Example: https://b4bfug4jc41kts2ro54if91eo-public-dedicated-inference.do-infra.ai

Public FQDN of the Dedicated Inference instance.

id string (uuid) optional read-only
Example: 6b5c619c-359c-44ca-87e2-47e98170c01d

Unique ID of the Dedicated Inference.

pending_deployment_spec object optional

Pending deployment when status is provisioning or updating.

created_at string (date-time) optional
Example: 2024-01-09T20:44:32Z
enable_public_endpoint boolean optional

Whether to expose a public LLM endpoint.

id string (uuid) optional
Example: 7c6d729d-360d-44db-88e3-58e98281d12e

Deployment UUID.

model_deployments array of object optional

At least one model deployment is required.

accelerators array of object optional

Accelerator configuration for this deployment.

Additional nested properties not shown. Refer to the full API spec for details.
model_id string optional

Used to identify an existing deployment when updating; empty means create new.

model_provider string, one of: hugging_face optional
Example: hugging_face

Model provider.

model_slug string optional
Example: mistral/mistral-7b-instruct-v3

Model identifier (e.g. Hugging Face slug).

workload_config object optional

Workload-specific configuration (e.g. input/output sequence length, ISL/OSL, in the future).

Additional nested properties not shown. Refer to the full API spec for details.
name string optional
Example: new-dedicated-inference

Name of the Dedicated Inference. Must be unique within the team.

status string, one of: provisioning, updating optional
Example: provisioning
updated_at string (date-time) optional
Example: 2024-01-09T20:44:32Z
version integer optional
Example: 1

Spec version.

vpc object optional
uuid string (uuid) required
Example: 997615ce-132d-4bae-9270-9ee21b395e5d

VPC UUID for the Dedicated Inference.

region string optional read-only
Example: atl1

DigitalOcean region where the Dedicated Inference is hosted.

spec object optional

Structured configuration for a Dedicated Inference deployment.

enable_public_endpoint boolean required

Whether to expose a public LLM endpoint.

model_deployments array of object required

At least one model deployment is required.

accelerators array of object optional

Accelerator configuration for this deployment.

Additional nested properties not shown. Refer to the full API spec for details.
model_id string optional

Used to identify an existing deployment when updating; empty means create new.

model_provider string, one of: hugging_face optional
Example: hugging_face

Model provider.

model_slug string optional
Example: mistral/mistral-7b-instruct-v3

Model identifier (e.g. Hugging Face slug).

workload_config object optional

Workload-specific configuration (e.g. ISL/OSL in future).

Additional nested properties not shown. Refer to the full API spec for details.
name string required
Example: new-dedicated-inference

Name of the Dedicated Inference. Must be unique within the team.

region string, one of: atl1, nyc2, tor1 required
Example: atl1

DigitalOcean region where the Dedicated Inference is hosted.

version integer required
Example: 1

Spec version.

vpc object required
uuid string (uuid) required
Example: 997615ce-132d-4bae-9270-9ee21b395e5d

VPC UUID for the Dedicated Inference.

status string (enum) optional read-only
Example: active

Current state of the Dedicated Inference.

updated_at string (date-time) optional read-only
Example: 2024-01-09T20:44:32Z

When the Dedicated Inference was last updated.

vpc_uuid string (uuid) optional read-only
Example: 997615ce-132d-4bae-9270-9ee21b395e5d

VPC UUID of the Dedicated Inference.

links object required
pages object optional

Pagination links (first, prev, next, last).

(additional properties) string (uri) optional

Additional properties are allowed.

meta object required
total integer required
Example: 1

Total number of results.
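
The links and meta objects are enough to drive pagination on the client. The sketch below reads the next-page link and estimates the page count from meta.total; both helper names are hypothetical, and it assumes a missing next key means the last page has been reached.

```python
import math
from typing import Optional


def next_page_url(body: dict) -> Optional[str]:
    """Return links.pages.next if present, else None (assumed to mean last page)."""
    return body.get("links", {}).get("pages", {}).get("next")


def total_pages(body: dict, per_page: int = 20) -> int:
    """Estimate the number of pages from meta.total and the page size."""
    return max(1, math.ceil(body["meta"]["total"] / per_page))
```

A typical loop keeps requesting `next_page_url(body)` until it returns `None`, collecting `body["dedicated_inferences"]` from each page.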

401

Authentication failed due to invalid credentials.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

429

The API rate limit has been exceeded.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

500

There was a server error.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

default

There was an unexpected error.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

Example response (200):
{
  "dedicated_inferences": [
    {
      "created_at": "2024-01-09T20:44:32Z",
      "endpoints": {
        "private_endpoint_fqdn": "https://b4bfug4jc41kts2ro54if91eo-private-dedicated-inference.do-infra.ai",
        "public_endpoint_fqdn": "https://b4bfug4jc41kts2ro54if91eo-public-dedicated-inference.do-infra.ai"
      },
      "id": "6b5c619c-359c-44ca-87e2-47e98170c01d",
      "region": "atl1",
      "status": "active",
      "updated_at": "2024-01-09T20:44:32Z",
      "vpc_uuid": "997615ce-132d-4bae-9270-9ee21b395e5d"
    }
  ],
  "links": {
    "pages": {
      "first": "https://api.digitalocean.com/v2/dedicated-inferences?page=1&per_page=20",
      "last": "https://api.digitalocean.com/v2/dedicated-inferences?page=1&per_page=20"
    }
  },
  "meta": {
    "total": 1
  }
}

Example response (401):
{
  "id": "unauthorized",
  "message": "Unable to authenticate you."
}

Example response (429):
{
  "id": "too_many_requests",
  "message": "API rate limit exceeded."
}

Example response (500):
{
  "id": "server_error",
  "message": "Unexpected server-side error"
}

Example response (default):
{
  "id": "example_error",
  "message": "some error message"
}
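
Because every error body shares the id, message, and optional request_id fields, a client can turn them into a single exception type. The following is a sketch under that assumption; the class and function names are hypothetical.

```python
import json
from typing import Optional


class APIError(Exception):
    """Carries the fields of an API error body plus the HTTP status code."""

    def __init__(self, status: int, error_id: str, message: str,
                 request_id: Optional[str] = None) -> None:
        super().__init__(f"{status} {error_id}: {message}")
        self.status = status
        self.error_id = error_id
        self.message = message
        self.request_id = request_id  # include this when opening support tickets


def parse_error(status: int, body_text: str) -> APIError:
    """Build an APIError from a JSON error body."""
    data = json.loads(body_text)
    return APIError(status, data["id"], data["message"], data.get("request_id"))
```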

POST Create a Dedicated Inference

/v2/dedicated-inferences
Authorizations: bearer_auth (1 scope)
HTTP: Bearer
Required scopes: dedicated_inference:create

Create a new Dedicated Inference for your team. Send a POST request to /v2/dedicated-inferences with a spec object (version, name, region, vpc, enable_public_endpoint, model_deployments) and, optionally, access_tokens (e.g. hugging_face_token for gated models). A 202 Accepted response means the request was accepted for processing; it does not indicate whether provisioning succeeded or failed. The response also includes a token object whose value is returned only on create; store it securely.

Request Body: application/json

access_tokens object optional
Example: {"hugging_face_token":"$HF_TOKEN"}

Key-value pairs for provider tokens (e.g. Hugging Face).

(additional properties) string optional

Additional properties are allowed.

spec object required

Structured configuration for a Dedicated Inference deployment.

enable_public_endpoint boolean required

Whether to expose a public LLM endpoint.

model_deployments array of object required

At least one model deployment is required.

accelerators array of object optional

Accelerator configuration for this deployment.

accelerator_slug string required
Example: gpu-mi300x1-192gb

DigitalOcean GPU slug.

scale integer required
Example: 1

Number of accelerator instances.

status string, one of: new, provisioning, active optional read-only
Example: active

Current state of the Accelerator.

type string required
Example: prefill_decode

Accelerator type (e.g. prefill_decode).

model_id string optional

Used to identify an existing deployment when updating; empty means create new.

model_provider string, one of: hugging_face optional
Example: hugging_face

Model provider.

model_slug string optional
Example: mistral/mistral-7b-instruct-v3

Model identifier (e.g. Hugging Face slug).

workload_config object optional

Workload-specific configuration (e.g. ISL/OSL in future).

name string required
Example: new-dedicated-inference

Name of the Dedicated Inference. Must be unique within the team.

region string, one of: atl1, nyc2, tor1 required
Example: atl1

DigitalOcean region where the Dedicated Inference is hosted.

version integer required
Example: 1

Spec version.

vpc object required
uuid string (uuid) required
Example: 997615ce-132d-4bae-9270-9ee21b395e5d

VPC UUID for the Dedicated Inference.

Content type application/json
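{
  "access_tokens": {
    "hugging_face_token": "$HF_TOKEN"
  },
  "spec": {
    "enable_public_endpoint": true,
    "model_deployments": [
      {
        "accelerators": [
          {
            "accelerator_slug": "gpu-mi300x1-192gb",
            "scale": 1,
            "type": "prefill_decode"
          }
        ],
        "model_provider": "hugging_face",
        "model_slug": "mistral/mistral-7b-instruct-v3"
      }
    ],
    "name": "new-dedicated-inference",
    "region": "atl1",
    "version": 1,
    "vpc": {
      "uuid": "997615ce-132d-4bae-9270-9ee21b395e5d"
    }
  }
}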
curl -i -X POST "https://api.digitalocean.com/v2/dedicated-inferences" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DO_TOKEN" \
  -d '{
    "spec": {
      "version": 1,
      "name": "new-dedicated-inference",
      "region": "atl1",
      "vpc": { "uuid": "7e5c619c-359c-44ca-87e2-47e98170c012" },
      "enable_public_endpoint": true,
      "model_deployments": [{
        "model_slug": "mistral/mistral-7b-instruct-v3",
        "model_provider": "hugging_face",
        "workload_config": {},
        "accelerators": [{
          "scale": 2,
          "type": "prefill_decode",
          "accelerator_slug": "gpu-mi300x1-192gb"
        }]
      }]
    },
    "access_tokens": { "hugging_face_token": "'"$HF_TOKEN"'" }
  }'
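
The same request body can be assembled programmatically, which makes it easier to enforce the required fields before sending. This is a minimal sketch: the function name and its parameters are hypothetical, and the checks mirror only the constraints stated in this reference (allowed regions, at least one model deployment).

```python
from typing import List, Optional

ALLOWED_REGIONS = {"atl1", "nyc2", "tor1"}  # regions listed in this reference


def build_create_payload(name: str, region: str, vpc_uuid: str,
                         model_deployments: List[dict],
                         enable_public_endpoint: bool,
                         hugging_face_token: Optional[str] = None,
                         version: int = 1) -> dict:
    """Assemble the JSON body for POST /v2/dedicated-inferences."""
    if region not in ALLOWED_REGIONS:
        raise ValueError(f"unsupported region: {region}")
    if not model_deployments:
        raise ValueError("at least one model deployment is required")
    payload = {
        "spec": {
            "version": version,
            "name": name,
            "region": region,
            "vpc": {"uuid": vpc_uuid},
            "enable_public_endpoint": enable_public_endpoint,
            "model_deployments": model_deployments,
        }
    }
    # access_tokens is optional; include it only when a token is supplied.
    if hugging_face_token:
        payload["access_tokens"] = {"hugging_face_token": hugging_face_token}
    return payload
```

Serializing the result with `json.dumps` gives a body equivalent to the one passed to curl with `-d`.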

Responses

202

Dedicated Inference create/update accepted for processing (202). Success or failure is not indicated by the response code.
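
Since 202 only acknowledges the request, a client usually needs to poll until the instance reports an active status. The sketch below shows one way to do that. It is deliberately independent of any endpoint: `fetch_status` is a hypothetical callable supplied by the caller that returns the current status string, and the timeout and interval defaults are arbitrary assumptions rather than documented values.

```python
import time
from typing import Callable


def wait_until_active(fetch_status: Callable[[], str],
                      timeout_s: float = 1800.0,
                      interval_s: float = 15.0,
                      sleep: Callable[[float], None] = time.sleep,
                      clock: Callable[[], float] = time.monotonic) -> str:
    """Poll fetch_status() until it returns 'active' or the timeout elapses."""
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        if status == "active":
            return status
        # Any other status (e.g. provisioning, updating) is treated as
        # "not ready yet"; no failure states are assumed here.
        if clock() >= deadline:
            raise TimeoutError(f"still '{status}' after {timeout_s} seconds")
        sleep(interval_s)
```

The `sleep` and `clock` parameters exist so the loop can be exercised in tests without real waiting.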

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

dedicated_inference object optional

A Dedicated Inference instance.

created_at string (date-time) optional read-only
Example: 2024-01-09T20:44:32Z

When the Dedicated Inference was created.

endpoints object optional read-only
private_endpoint_fqdn string (uri) optional
Example: https://b4bfug4jc41kts2ro54if91eo-private-dedicated-inference.do-infra.ai

Private VPC FQDN of the Dedicated Inference instance.

public_endpoint_fqdn string (uri) optional
Example: https://b4bfug4jc41kts2ro54if91eo-public-dedicated-inference.do-infra.ai

Public FQDN of the Dedicated Inference instance.

id string (uuid) optional read-only
Example: 6b5c619c-359c-44ca-87e2-47e98170c01d

Unique ID of the Dedicated Inference.

pending_deployment_spec object optional

Pending deployment when status is provisioning or updating.

created_at string (date-time) optional
Example: 2024-01-09T20:44:32Z
enable_public_endpoint boolean optional

Whether to expose a public LLM endpoint.

id string (uuid) optional
Example: 7c6d729d-360d-44db-88e3-58e98281d12e

Deployment UUID.

model_deployments array of object optional

At least one model deployment is required.

accelerators array of object optional

Accelerator configuration for this deployment.

Additional nested properties not shown. Refer to the full API spec for details.
model_id string optional

Used to identify an existing deployment when updating; empty means create new.

model_provider string, one of: hugging_face optional
Example: hugging_face

Model provider.

model_slug string optional
Example: mistral/mistral-7b-instruct-v3

Model identifier (e.g. Hugging Face slug).

workload_config object optional

Workload-specific configuration (e.g. ISL/OSL in future).

Additional nested properties not shown. Refer to the full API spec for details.
name string optional
Example: new-dedicated-inference

Name of the Dedicated Inference. Must be unique within the team.

status string, one of: provisioning, updating optional
Example: provisioning
updated_at string (date-time) optional
Example: 2024-01-09T20:44:32Z
version integer optional
Example: 1

Spec version.

vpc object optional
uuid string (uuid) required
Example: 997615ce-132d-4bae-9270-9ee21b395e5d

VPC UUID for the Dedicated Inference.

region string optional read-only
Example: atl1

DigitalOcean region where the Dedicated Inference is hosted.

spec object optional

Structured configuration for a Dedicated Inference deployment.

enable_public_endpoint boolean required

Whether to expose a public LLM endpoint.

model_deployments array of object required

At least one model deployment is required.

accelerators array of object optional

Accelerator configuration for this deployment.

Additional nested properties not shown. Refer to the full API spec for details.
model_id string optional

Used to identify an existing deployment when updating; empty means create new.

model_provider string, one of: hugging_face optional
Example: hugging_face

Model provider.

model_slug string optional
Example: mistral/mistral-7b-instruct-v3

Model identifier (e.g. Hugging Face slug).

workload_config object optional

Workload-specific configuration (e.g. ISL/OSL in future).

Additional nested properties not shown. Refer to the full API spec for details.
name string required
Example: new-dedicated-inference

Name of the Dedicated Inference. Must be unique within the team.

region string, one of: atl1, nyc2, tor1 required
Example: atl1

DigitalOcean region where the Dedicated Inference is hosted.

version integer required
Example: 1

Spec version.

vpc object required
uuid string (uuid) required
Example: 997615ce-132d-4bae-9270-9ee21b395e5d

VPC UUID for the Dedicated Inference.

status string (enum) optional read-only
Example: active

Current state of the Dedicated Inference.

updated_at string (date-time) optional read-only
Example: 2024-01-09T20:44:32Z

When the Dedicated Inference was last updated.

vpc_uuid string (uuid) optional read-only
Example: 997615ce-132d-4bae-9270-9ee21b395e5d

VPC UUID of the Dedicated Inference.

token object optional

Access token for authenticating to Dedicated Inference endpoints.

created_at string (date-time) optional read-only
Example: 2024-01-09T20:44:32Z
id string (uuid) optional read-only
Example: 01333f14-a903-4b8e-92b3-363a767aa052

Unique ID of the token.

name string optional read-only
Example: first-token

Name of the token.

value string optional read-only
Example: di_xxxxxxxxxxxxxxxxxxxxxxxx

Token value; only returned once on create. Store securely.
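
Because the token value cannot be retrieved again, it is worth persisting it as soon as the create response arrives. The sketch below writes it to a file created with owner-only permissions. The function name and the file-based storage are assumptions for illustration; a secrets manager is usually a better home for production credentials.

```python
import os


def save_endpoint_token(create_response: dict, path: str) -> str:
    """Write token.value from a create response to `path`; return the token ID."""
    token = create_response.get("token") or {}
    value = token.get("value")
    if not value:
        raise ValueError("response has no token.value; it is only returned on create")
    # Mode 0o600 applies when the file is newly created (owner read/write only).
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as fh:
        fh.write(value)
    return token.get("id", "")
```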

401

Authentication failed due to invalid credentials.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

429

The API rate limit has been exceeded.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

500

There was a server error.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

default

There was an unexpected error.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

{
  "dedicated_inference": {
    "created_at": "2024-01-09T20:44:32Z",
    "endpoints": {
      "private_endpoint_fqdn": "https://b4bfug4jc41kts2ro54if91eo-private-dedicated-inference.do-infra.ai",
      "public_endpoint_fqdn": "https://b4bfug4jc41kts2ro54if91eo-public-dedicated-inference.do-infra.ai"
    },
    "id": "6b5c619c-359c-44ca-87e2-47e98170c01d",
    "pending_deployment_spec": {
      "created_at": "2024-01-09T20:44:32Z",
      "enable_public_endpoint": true,
      "id": "7c6d729d-360d-44db-88e3-58e98281d12e",
      "model_deployments": [],
      "name": "new-dedicated-inference",
      "status": "provisioning",
      "updated_at": "2024-01-09T20:44:32Z",
      "version": 1
    },
    "region": "atl1",
    "spec": {
      "enable_public_endpoint": true,
      "model_deployments": [],
      "name": "new-dedicated-inference",
      "region": "atl1",
      "version": 1
    },
    "status": "active",
    "updated_at": "2024-01-09T20:44:32Z",
    "vpc_uuid": "997615ce-132d-4bae-9270-9ee21b395e5d"
  },
  "token": {
    "created_at": "2024-01-09T20:44:32Z",
    "id": "01333f14-a903-4b8e-92b3-363a767aa052",
    "name": "first-token",
    "value": "di_xxxxxxxxxxxxxxxxxxxxxxxx"
  }
}

Example 401 response:

{
  "id": "unauthorized",
  "message": "Unable to authenticate you."
}

Example 429 response:

{
  "id": "too_many_requests",
  "message": "API rate limit exceeded."
}

Example 500 response:

{
  "id": "server_error",
  "message": "Unexpected server-side error"
}

Example default response:

{
  "id": "example_error",
  "message": "some error message"
}

GET Get Dedicated Inference GPU Model Config

/v2/dedicated-inferences/gpu-model-config
Authorizations: bearer_auth (1 scope)
Http: Bearer
Required scopes: dedicated_inference:read

Get the supported GPU and model configurations for Dedicated Inference. Use this endpoint to discover the supported GPU slugs and model slugs (for example, Hugging Face model identifiers). Send a GET request to /v2/dedicated-inferences/gpu-model-config.

curl -i -X GET "https://api.digitalocean.com/v2/dedicated-inferences/gpu-model-config" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DO_TOKEN"
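For scripts that prefer Python over curl, the same request can be sketched with the standard library. This is a minimal sketch, not an official client: the helper names build_request and fetch_json are hypothetical, and the token is assumed to be supplied by the caller (for example, read from the DO_TOKEN environment variable used in the curl example).

```python
import json
import urllib.request

API_BASE = "https://api.digitalocean.com"


def build_request(path: str, token: str) -> urllib.request.Request:
    """Build an authenticated GET request for an API path (hypothetical helper)."""
    return urllib.request.Request(
        API_BASE + path,
        method="GET",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def fetch_json(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON body. This performs a network call."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling fetch_json(build_request("/v2/dedicated-inferences/gpu-model-config", token)) would send the request. Building the request separately keeps the URL and header logic easy to inspect without network access.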

Responses

200

The supported GPU model configs. Each entry contains gpu_slugs, model_slug, model_name, and is_gated_model.

ratelimit-limit integer

The default limit on the number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

gpu_model_configs array of object optional
Child properties:
gpu_slugs array of string optional
Example: ["gpu-mi300x1-192gb"]

is_gated_model boolean optional

Whether the model requires gated access (for example, a Hugging Face token).

model_name string optional
Example: Mistral-7B-Instruct-v0.3

model_slug string optional
Example: mistral/mistral-7b-instruct-v3

401

Authentication failed due to invalid credentials.

ratelimit-limit integer

The default limit on the number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.
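Error responses share the id, message, and optional request_id fields described above. A minimal sketch of turning such a body into a Python exception follows; ApiError and parse_error are hypothetical names for this example and are not part of any DigitalOcean library.

```python
import json
from typing import Optional


class ApiError(Exception):
    """Carries the fields of an error body: id, message, optional request_id."""

    def __init__(self, status: int, error_id: str, message: str,
                 request_id: Optional[str] = None) -> None:
        super().__init__(f"{status} {error_id}: {message}")
        self.status = status
        self.error_id = error_id
        self.message = message
        self.request_id = request_id


def parse_error(status: int, body: str) -> ApiError:
    """Decode a JSON error body; id and message are required, request_id is not."""
    data = json.loads(body)
    return ApiError(status, data["id"], data["message"], data.get("request_id"))
```

Keeping request_id on the exception makes it easy to include in a bug report or support ticket when the API provides one.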

429

The API rate limit has been exceeded.
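When a request is rejected with a 429, the ratelimit-reset header gives the Unix epoch time at which the oldest request expires, which can be used to decide how long to wait before retrying. The sketch below assumes the three rate-limit headers arrive as strings in a header mapping; the RateLimit class and its method names are hypothetical.

```python
import time
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RateLimit:
    limit: int      # ratelimit-limit
    remaining: int  # ratelimit-remaining
    reset: int      # ratelimit-reset, Unix epoch seconds

    @classmethod
    def from_headers(cls, headers: Mapping[str, str]) -> "RateLimit":
        # HTTP header names are case-insensitive, so normalize before lookup.
        lowered = {name.lower(): value for name, value in headers.items()}
        return cls(
            limit=int(lowered["ratelimit-limit"]),
            remaining=int(lowered["ratelimit-remaining"]),
            reset=int(lowered["ratelimit-reset"]),
        )

    def seconds_until_reset(self, now: Optional[float] = None) -> float:
        """Seconds until the oldest request expires; never negative."""
        current = time.time() if now is None else now
        return max(0.0, float(self.reset) - current)
```

A caller could sleep for seconds_until_reset() before retrying. Passing now explicitly is only there to make the calculation testable.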

Response headers (ratelimit-limit, ratelimit-remaining, ratelimit-reset) and body fields (id, message, request_id) are the same as for the 401 response above.

500

There was a server error.

Response headers (ratelimit-limit, ratelimit-remaining, ratelimit-reset) and body fields (id, message, request_id) are the same as for the 401 response above.

default

There was an unexpected error.

Response headers (ratelimit-limit, ratelimit-remaining, ratelimit-reset) and body fields (id, message, request_id) are the same as for the 401 response above.

Example 200 response:

{
  "gpu_model_configs": [
    {
      "gpu_slugs": [
        "gpu-mi300x1-192gb"
      ],
      "is_gated_model": true,
      "model_name": "Mistral-7B-Instruct-v0.3",
      "model_slug": "mistral/mistral-7b-instruct-v3"
    }
  ]
}
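As an illustration of consuming this response, the sketch below filters the gpu_model_configs array by GPU slug and lists gated models. It assumes a payload shaped like the example above and treats every field as optional, as in the schema; the function names are hypothetical.

```python
from typing import List

# Sample payload copied from the example response in this reference.
SAMPLE_CONFIGS = {
    "gpu_model_configs": [
        {
            "gpu_slugs": ["gpu-mi300x1-192gb"],
            "is_gated_model": True,
            "model_name": "Mistral-7B-Instruct-v0.3",
            "model_slug": "mistral/mistral-7b-instruct-v3",
        }
    ]
}


def models_for_gpu(payload: dict, gpu_slug: str) -> List[dict]:
    """Return the configs that list gpu_slug among their gpu_slugs."""
    return [
        config
        for config in payload.get("gpu_model_configs", [])
        if gpu_slug in config.get("gpu_slugs", [])
    ]


def gated_model_slugs(payload: dict) -> List[str]:
    """Return the slugs of models flagged as requiring gated access."""
    return [
        config["model_slug"]
        for config in payload.get("gpu_model_configs", [])
        if config.get("is_gated_model") and "model_slug" in config
    ]
```

For the sample payload, models_for_gpu(SAMPLE_CONFIGS, "gpu-mi300x1-192gb") returns the single Mistral entry, and an unknown GPU slug returns an empty list.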

Example 401 response:

{
  "id": "unauthorized",
  "message": "Unable to authenticate you."
}

Example 429 response:

{
  "id": "too_many_requests",
  "message": "API rate limit exceeded."
}

Example 500 response:

{
  "id": "server_error",
  "message": "Unexpected server-side error"
}

Example default response:

{
  "id": "example_error",
  "message": "some error message"
}

GET List Dedicated Inference Sizes

/v2/dedicated-inferences/sizes
Authorizations: bearer_auth (1 scope)
Http: Bearer
Required scopes: dedicated_inference:read

Get available Dedicated Inference sizes and pricing for supported GPUs. Send a GET request to /v2/dedicated-inferences/sizes.

curl -i -X GET "https://api.digitalocean.com/v2/dedicated-inferences/sizes" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DO_TOKEN"

Responses

200

The regions where Dedicated Inference is enabled, and the available sizes with their hourly pricing.

ratelimit-limit integer

The default limit on the number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

enabled_regions array of string optional
Example: ["nyc2","atl1"]

Regions where Dedicated Inference is available.

sizes array of object optional
Child properties:
currency string optional
Example: USD

gpu_slug string optional
Example: gpu-mi300x1-192gb

price_per_hour string optional
Example: 2

region string optional
Example: nyc2

401

Authentication failed due to invalid credentials.

ratelimit-limit integer

The default limit on the number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

429

The API rate limit has been exceeded.

Response headers (ratelimit-limit, ratelimit-remaining, ratelimit-reset) and body fields (id, message, request_id) are the same as for the 401 response above.

500

There was a server error.

Response headers (ratelimit-limit, ratelimit-remaining, ratelimit-reset) and body fields (id, message, request_id) are the same as for the 401 response above.

default

There was an unexpected error.

Response headers (ratelimit-limit, ratelimit-remaining, ratelimit-reset) and body fields (id, message, request_id) are the same as for the 401 response above.

Example 200 response:

{
  "enabled_regions": [
    "nyc2",
    "atl1"
  ],
  "sizes": [
    {
      "currency": "USD",
      "gpu_slug": "gpu-mi300x1-192gb",
      "price_per_hour": "2",
      "region": "nyc2"
    }
  ]
}
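Because price_per_hour is returned as a string, converting it with Decimal avoids floating-point rounding when prices are compared or multiplied. The sketch below assumes a payload shaped like the example above; the function names are hypothetical, and the number of hours is supplied by the caller rather than taken from any billing rule.

```python
from decimal import Decimal
from typing import Optional

# Sample payload copied from the example response in this reference.
SAMPLE_SIZES = {
    "enabled_regions": ["nyc2", "atl1"],
    "sizes": [
        {
            "currency": "USD",
            "gpu_slug": "gpu-mi300x1-192gb",
            "price_per_hour": "2",
            "region": "nyc2",
        }
    ],
}


def cheapest_size(payload: dict, region: str) -> Optional[dict]:
    """Return the lowest-priced size listed for a region, or None if there is none."""
    candidates = [
        size
        for size in payload.get("sizes", [])
        if size.get("region") == region and "price_per_hour" in size
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda size: Decimal(size["price_per_hour"]))


def estimated_cost(size: dict, hours: int) -> Decimal:
    """Multiply the hourly price by a caller-supplied number of hours."""
    return Decimal(size["price_per_hour"]) * hours
```

Note that a region can appear in enabled_regions without a matching entry in sizes, as atl1 does in the sample, so callers should handle a None result.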

Example 401 response:

{
  "id": "unauthorized",
  "message": "Unable to authenticate you."
}

Example 429 response:

{
  "id": "too_many_requests",
  "message": "API rate limit exceeded."
}

Example 500 response:

{
  "id": "server_error",
  "message": "Unexpected server-side error"
}

Example default response:

{
  "id": "example_error",
  "message": "some error message"
}

GET Get a Dedicated Inference

/v2/dedicated-inferences/{dedicated_inference_id}
Authorizations: bearer_auth (1 scope)
Http: Bearer
Required scopes: dedicated_inference:read

Retrieve an existing Dedicated Inference by ID. Send a GET request to /v2/dedicated-inferences/{dedicated_inference_id}. The status in the response is one of active, new, provisioning, updating, deleting, or error.
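A client that has just created or updated an instance will typically poll this endpoint until the status settles. The sketch below is one way to structure that loop. It assumes, which this reference does not state, that active and error are the states worth stopping on, and it takes the status lookup as a callable so the loop can be exercised without network access. The function name, interval, and attempt limit are illustrative.

```python
import time
from typing import Callable

# The six status values listed in this reference.
KNOWN_STATUSES = {"active", "new", "provisioning", "updating", "deleting", "error"}
# Assumption for this sketch: stop polling once one of these is reached.
SETTLED_STATUSES = {"active", "error"}


def wait_until_settled(
    get_status: Callable[[], str],
    interval: float = 10.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status until it returns a settled status or attempts run out."""
    for attempt in range(max_attempts):
        status = get_status()
        if status not in KNOWN_STATUSES:
            raise ValueError(f"unexpected status: {status!r}")
        if status in SETTLED_STATUSES:
            return status
        if attempt < max_attempts - 1:
            sleep(interval)  # wait before the next poll
    raise TimeoutError(f"status did not settle after {max_attempts} attempts")
```

In real use, get_status would send the GET request and return the status field of dedicated_inference from the response. Injecting sleep keeps the function testable without real delays.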

Path Parameters

dedicated_inference_id string (uuid) required
Example: 6b5c619c-359c-44ca-87e2-47e98170c01d

A unique identifier for a Dedicated Inference instance.

curl -i -X GET "https://api.digitalocean.com/v2/dedicated-inferences/6b5c619c-359c-44ca-87e2-47e98170c01d" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DO_TOKEN"
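Because the ID is interpolated into the URL path, validating it as a UUID before building the URL guards against malformed input. A minimal sketch, with a hypothetical helper name:

```python
import uuid

API_BASE = "https://api.digitalocean.com"


def dedicated_inference_url(dedicated_inference_id: str) -> str:
    """Build the GET URL for one instance, rejecting IDs that are not UUIDs."""
    # uuid.UUID raises ValueError for strings that are not valid UUIDs.
    canonical = str(uuid.UUID(dedicated_inference_id))
    return f"{API_BASE}/v2/dedicated-inferences/{canonical}"
```

The returned URL uses the canonical lowercase, hyphenated form of the UUID, so differently formatted but equivalent IDs map to the same path.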

Responses

200

Response containing a single Dedicated Inference instance.

ratelimit-limit integer

The default limit on the number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

dedicated_inference object optional

A Dedicated Inference instance.

Child properties:
created_at string (date-time) optional read-only
Example: 2024-01-09T20:44:32Z

When the Dedicated Inference was created.

endpoints object optional read-only
Child properties:
private_endpoint_fqdn string (uri) optional
Example: https://b4bfug4jc41kts2ro54if91eo-private-dedicated-inference.do-infra.ai

Private VPC FQDN of the Dedicated Inference instance.

public_endpoint_fqdn string (uri) optional
Example: https://b4bfug4jc41kts2ro54if91eo-public-dedicated-inference.do-infra.ai

Public FQDN of the Dedicated Inference instance.

id string (uuid) optional read-only
Example: 6b5c619c-359c-44ca-87e2-47e98170c01d

Unique ID of the Dedicated Inference.

pending_deployment_spec object optional

The pending deployment spec, present when the status is provisioning or updating.

Child properties:
created_at string (date-time) optional
Example: 2024-01-09T20:44:32Z
enable_public_endpoint boolean optional

Whether to expose a public LLM endpoint.

id string (uuid) optional
Example: 7c6d729d-360d-44db-88e3-58e98281d12e

Deployment UUID.

model_deployments array of object optional

At least one model deployment is required.

Child properties:
accelerators array of object optional

Accelerator configuration for this deployment.

Additional nested properties not shown. Refer to the full API spec for details.

model_id string optional

Identifies an existing deployment when updating; leave empty to create a new deployment.

model_provider string, one of: hugging_face optional
Example: hugging_face

Model provider.

model_slug string optional
Example: mistral/mistral-7b-instruct-v3

Model identifier (e.g. Hugging Face slug).

workload_config object optional

Workload-specific configuration (for example, input and output sequence lengths, ISL/OSL, in the future).

Additional nested properties not shown. Refer to the full API spec for details.

name string optional
Example: new-dedicated-inference

Name of the Dedicated Inference. Must be unique within the team.

status string, one of: provisioning, updating optional
Example: provisioning
updated_at string (date-time) optional
Example: 2024-01-09T20:44:32Z
version integer optional
Example: 1

Spec version.

vpc object optional
Child properties:
uuid string (uuid) required
Example: 997615ce-132d-4bae-9270-9ee21b395e5d

VPC UUID for the Dedicated Inference.

region string optional read-only
Example: atl1

DigitalOcean region where the Dedicated Inference is hosted.

spec object optional

Structured configuration for a Dedicated Inference deployment.

Child properties:
enable_public_endpoint boolean required

Whether to expose a public LLM endpoint.

model_deployments array of object required

At least one model deployment is required.

Child properties:
accelerators array of object optional

Accelerator configuration for this deployment.

Additional nested properties not shown. Refer to the full API spec for details.

model_id string optional

Identifies an existing deployment when updating; leave empty to create a new deployment.

model_provider string, one of: hugging_face optional
Example: hugging_face

Model provider.

model_slug string optional
Example: mistral/mistral-7b-instruct-v3

Model identifier (e.g. Hugging Face slug).

workload_config object optional

Workload-specific configuration (for example, input and output sequence lengths, ISL/OSL, in the future).

Additional nested properties not shown. Refer to the full API spec for details.

name string required
Example: new-dedicated-inference

Name of the Dedicated Inference. Must be unique within the team.

region string, one of: atl1, nyc2, tor1 required
Example: atl1

DigitalOcean region where the Dedicated Inference is hosted.

version integer required
Example: 1

Spec version.

vpc object required
Child properties:
uuid string (uuid) required
Example: 997615ce-132d-4bae-9270-9ee21b395e5d

VPC UUID for the Dedicated Inference.

status string (enum) optional read-only
Example: active

Current state of the Dedicated Inference.

updated_at string (date-time) optional read-only
Example: 2024-01-09T20:44:32Z

When the Dedicated Inference was last updated.

vpc_uuid string (uuid) optional read-only
Example: 997615ce-132d-4bae-9270-9ee21b395e5d

VPC UUID of the Dedicated Inference.
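To illustrate how the fields of this response fit together, the sketch below picks an endpoint URL and reports whether a deployment is still pending. Preferring the public endpoint when spec.enable_public_endpoint is true is a choice made for this example, not documented API behavior; the function names are hypothetical, and the sample is a trimmed copy of the example response in this reference.

```python
from typing import Optional

# Trimmed copy of the example response body.
SAMPLE_INSTANCE = {
    "dedicated_inference": {
        "endpoints": {
            "private_endpoint_fqdn": "https://b4bfug4jc41kts2ro54if91eo-private-dedicated-inference.do-infra.ai",
            "public_endpoint_fqdn": "https://b4bfug4jc41kts2ro54if91eo-public-dedicated-inference.do-infra.ai",
        },
        "pending_deployment_spec": {"status": "provisioning", "version": 1},
        "spec": {"enable_public_endpoint": True, "name": "new-dedicated-inference"},
        "status": "active",
    }
}


def preferred_endpoint(instance: dict) -> Optional[str]:
    """Return the public FQDN if enabled and present, else the private FQDN."""
    endpoints = instance.get("endpoints") or {}
    spec = instance.get("spec") or {}
    if spec.get("enable_public_endpoint") and endpoints.get("public_endpoint_fqdn"):
        return endpoints["public_endpoint_fqdn"]
    return endpoints.get("private_endpoint_fqdn")


def has_pending_deployment(instance: dict) -> bool:
    """True when the response carries a non-empty pending_deployment_spec."""
    return bool(instance.get("pending_deployment_spec"))
```

A caller inside the VPC may prefer the private FQDN regardless of the public setting; the selection rule here is only one reasonable default.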

401

Authentication failed due to invalid credentials.

ratelimit-limit integer

The default limit on the number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

404

The resource was not found.

Response headers (ratelimit-limit, ratelimit-remaining, ratelimit-reset) and body fields (id, message, request_id) are the same as for the 401 response above.

429

The API rate limit has been exceeded.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.
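
As a rough illustration of how the rate limit headers above might be used after a 429 response, the following Python sketch computes how long a client could wait before retrying. It assumes the response headers are available as a mapping of names to string values and that waiting until ratelimit-reset is an acceptable policy. The helper name seconds_until_reset is hypothetical and not part of any DigitalOcean library.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how many seconds remain until the ratelimit-reset time.

    `headers` is a mapping of response header names to string values.
    Returns 0.0 when the header is missing or the reset time has passed.
    """
    # Header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    reset = normalized.get("ratelimit-reset")
    if reset is None:
        return 0.0
    current = time.time() if now is None else now
    # ratelimit-reset is given in Unix epoch time (seconds).
    return max(0.0, float(reset) - current)
```

Because ratelimit-reset marks when the oldest request expires, waiting this long should free at least one request in the window. A client that sends bursts may still want to add its own spacing between calls.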

500

There was a server error.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

default

There was an unexpected error.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

{
  "dedicated_inference": {
    "created_at": "2024-01-09T20:44:32Z",
    "endpoints": {
      "private_endpoint_fqdn": "https://b4bfug4jc41kts2ro54if91eo-private-dedicated-inference.do-infra.ai",
      "public_endpoint_fqdn": "https://b4bfug4jc41kts2ro54if91eo-public-dedicated-inference.do-infra.ai"
    },
    "id": "6b5c619c-359c-44ca-87e2-47e98170c01d",
    "region": "atl1",
    "status": "active",
    "updated_at": "2024-01-09T20:44:32Z",
    "vpc_uuid": "997615ce-132d-4bae-9270-9ee21b395e5d"
  }
}
{
  "id": "unauthorized",
  "message": "Unable to authenticate you."
}
{
  "id": "not_found",
  "message": "The resource you requested could not be found."
}
{
  "id": "too_many_requests",
  "message": "API rate limit exceeded."
}
{
  "id": "server_error",
  "message": "Unexpected server-side error"
}
{
  "id": "example_error",
  "message": "some error message"
}
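
The error bodies above share the same three fields (id, message, and the optional request_id). A minimal Python sketch for turning such a body into a one-line summary, for example for logs or a support ticket, could look like this. The function name describe_error is an assumption made for illustration.

```python
import json


def describe_error(body):
    """Turn a JSON error body into a one-line summary for logs or tickets."""
    payload = json.loads(body)
    # `id` and `message` are required; `request_id` is optional.
    summary = f"{payload['id']}: {payload['message']}"
    request_id = payload.get("request_id")
    if request_id:
        summary += f" (request_id={request_id})"
    return summary
```

Including request_id when it is present matches the guidance above to provide it when reporting bugs or opening support tickets.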

PATCH Update a Dedicated Inference

/v2/dedicated-inferences/{dedicated_inference_id}
Authorizations: bearer_auth (1 scope)
Http: Bearer
Required scopes: dedicated_inference:update

Update an existing Dedicated Inference. Send a PATCH request to /v2/dedicated-inferences/{dedicated_inference_id} with an updated spec, new access_tokens, or both. The instance status changes to updating and returns to active once the update completes.

Path Parameters

dedicated_inference_id string (uuid) required
Example: 6b5c619c-359c-44ca-87e2-47e98170c01d

A unique identifier for a Dedicated Inference instance.
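
Since the path parameter must be a UUID, a client may want to reject malformed IDs before sending a request. The sketch below is one way to do that in Python. The helper name dedicated_inference_url is hypothetical. The base URL and path come from this page.

```python
import uuid

BASE_URL = "https://api.digitalocean.com"


def dedicated_inference_url(dedicated_inference_id):
    """Build the instance URL, rejecting IDs that are not valid UUIDs."""
    # uuid.UUID raises ValueError for malformed input.
    parsed = uuid.UUID(dedicated_inference_id)
    return f"{BASE_URL}/v2/dedicated-inferences/{parsed}"
```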

Request Body: application/json

access_tokens object optional

Provider tokens for model access (e.g. gated Hugging Face models).

hugging_face_token string optional
Example: $HF_TOKEN

Hugging Face token required for gated models.

spec object optional

Structured configuration for a Dedicated Inference deployment.

enable_public_endpoint boolean required

Whether to expose a public LLM endpoint.

model_deployments array of object required

At least one model deployment is required.

accelerators array of object optional

Accelerator configuration for this deployment.

accelerator_slug string required
Example: gpu-mi300x1-192gb

DigitalOcean GPU slug.

scale integer required
Example: 1

Number of accelerator instances.

status string, one of: new, provisioning, active optional read-only
Example: active

Current state of the Accelerator.

type string required
Example: prefill_decode

Accelerator type (e.g. prefill_decode).

model_id string optional

Identifies an existing deployment when updating. Leave empty to create a new deployment.

model_provider string, one of: hugging_face optional
Example: hugging_face

Model provider.

model_slug string optional
Example: mistral/mistral-7b-instruct-v3

Model identifier (e.g. Hugging Face slug).

workload_config object optional

Workload-specific configuration (e.g. ISL/OSL in future).

name string required
Example: new-dedicated-inference

Name of the Dedicated Inference. Must be unique within the team.

region string, one of: atl1, nyc2, tor1 required
Example: atl1

DigitalOcean region where the Dedicated Inference is hosted.

version integer required
Example: 1

Spec version.

vpc object required
uuid string (uuid) required
Example: 997615ce-132d-4bae-9270-9ee21b395e5d

VPC UUID for the Dedicated Inference.
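
The required fields listed above can be checked on the client before sending a request. The following Python sketch is one possible pre-flight check, not the API's own validation. The names validate_spec, ALLOWED_REGIONS, and REQUIRED_SPEC_FIELDS are assumptions, and the rules are taken only from the field descriptions on this page. The server may apply additional rules.

```python
ALLOWED_REGIONS = {"atl1", "nyc2", "tor1"}
REQUIRED_SPEC_FIELDS = (
    "enable_public_endpoint", "model_deployments", "name",
    "region", "version", "vpc",
)


def validate_spec(spec):
    """Return a list of problems found in a spec dict (empty if none)."""
    problems = [f"missing required field: {field}"
                for field in REQUIRED_SPEC_FIELDS if field not in spec]
    if "region" in spec and spec["region"] not in ALLOWED_REGIONS:
        problems.append(f"unsupported region: {spec['region']}")
    if "model_deployments" in spec and not spec["model_deployments"]:
        problems.append("at least one model deployment is required")
    if "vpc" in spec and "uuid" not in spec["vpc"]:
        problems.append("missing required field: vpc.uuid")
    # Each accelerator entry needs its three required fields.
    for deployment in spec.get("model_deployments", []):
        for accelerator in deployment.get("accelerators", []):
            for field in ("accelerator_slug", "scale", "type"):
                if field not in accelerator:
                    problems.append(f"missing accelerator field: {field}")
    return problems
```

The check treats an empty model_deployments list as a problem because the field description says at least one deployment is required.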

Content type application/json
{
  "access_tokens": {
    "hugging_face_token": "$HF_TOKEN"
  },
  "spec": {
    "enable_public_endpoint": true,
    "model_deployments": [],
    "name": "new-dedicated-inference",
    "region": "atl1",
    "version": 1,
    "vpc": {
      "uuid": "997615ce-132d-4bae-9270-9ee21b395e5d"
    }
  }
}
curl -i -X PATCH "https://api.digitalocean.com/v2/dedicated-inferences/6b5c619c-359c-44ca-87e2-47e98170c01d" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DO_TOKEN" \
  -d '{
    "spec": {
      "version": 1,
      "name": "renamed-dedicated-inference",
      "region": "atl1",
      "enable_public_endpoint": true,
      "vpc": { "uuid": "997615ce-132d-4bae-9270-9ee21b395e5d" },
      "model_deployments": [{
        "model_provider": "hugging_face",
        "model_slug": "mistral/mistral-7b-instruct-v3",
        "accelerators": [{
          "accelerator_slug": "gpu-mi300x1-192gb",
          "scale": 3,
          "type": "prefill_decode"
        }]
      }]
    },
    "access_tokens": { "hugging_face_token": "'"$HF_TOKEN"'" }
  }'
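
For clients that do not use curl, the same request can be assembled with Python's standard library. This is a minimal sketch under a few assumptions: the helper name build_update_request is hypothetical, the API token is read from a DO_TOKEN environment variable to mirror the curl example, and the function only builds the request without sending it.

```python
import json
import os
import urllib.request


def build_update_request(dedicated_inference_id, spec=None, hugging_face_token=None):
    """Build (but do not send) a PATCH request for a Dedicated Inference."""
    body = {}
    if spec is not None:
        body["spec"] = spec
    if hugging_face_token is not None:
        body["access_tokens"] = {"hugging_face_token": hugging_face_token}
    return urllib.request.Request(
        f"https://api.digitalocean.com/v2/dedicated-inferences/{dedicated_inference_id}",
        data=json.dumps(body).encode("utf-8"),
        method="PATCH",
        headers={
            "Content-Type": "application/json",
            # DO_TOKEN mirrors the variable used in the curl example.
            "Authorization": f"Bearer {os.environ.get('DO_TOKEN', '')}",
        },
    )
```

Passing the returned object to urllib.request.urlopen would send it. Keeping construction separate from sending makes the body easy to inspect or log, with the token redacted, before anything reaches the API.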

Responses

202

The update request was accepted for processing. The response contains only the dedicated_inference object; no token is returned.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

dedicated_inference object optional

A Dedicated Inference instance.

created_at string (date-time) optional read-only
Example: 2024-01-09T20:44:32Z

When the Dedicated Inference was created.

endpoints object optional read-only
private_endpoint_fqdn string (uri) optional
Example: https://b4bfug4jc41kts2ro54if91eo-private-dedicated-inference.do-infra.ai

Private VPC FQDN of the Dedicated Inference instance.

public_endpoint_fqdn string (uri) optional
Example: https://b4bfug4jc41kts2ro54if91eo-public-dedicated-inference.do-infra.ai

Public FQDN of the Dedicated Inference instance.

id string (uuid) optional read-only
Example: 6b5c619c-359c-44ca-87e2-47e98170c01d

Unique ID of the Dedicated Inference.

pending_deployment_spec object optional

The pending deployment spec, present while the status is provisioning or updating.

created_at string (date-time) optional
Example: 2024-01-09T20:44:32Z
enable_public_endpoint boolean optional

Whether to expose a public LLM endpoint.

id string (uuid) optional
Example: 7c6d729d-360d-44db-88e3-58e98281d12e

Deployment UUID.

model_deployments array of object optional

At least one model deployment is required.

accelerators array of object optional

Accelerator configuration for this deployment.

Additional nested properties not shown. Refer to the full API spec for details.
model_id string optional

Identifies an existing deployment when updating. Leave empty to create a new deployment.

model_provider string, one of: hugging_face optional
Example: hugging_face

Model provider.

model_slug string optional
Example: mistral/mistral-7b-instruct-v3

Model identifier (e.g. Hugging Face slug).

workload_config object optional

Workload-specific configuration (e.g. ISL/OSL in future).

Additional nested properties not shown. Refer to the full API spec for details.
name string optional
Example: new-dedicated-inference

Name of the Dedicated Inference. Must be unique within the team.

status string, one of: provisioning, updating optional
Example: provisioning
updated_at string (date-time) optional
Example: 2024-01-09T20:44:32Z
version integer optional
Example: 1

Spec version.

vpc object optional
uuid string (uuid) required
Example: 997615ce-132d-4bae-9270-9ee21b395e5d

VPC UUID for the Dedicated Inference.

region string optional read-only
Example: atl1

DigitalOcean region where the Dedicated Inference is hosted.

spec object optional

Structured configuration for a Dedicated Inference deployment.

enable_public_endpoint boolean required

Whether to expose a public LLM endpoint.

model_deployments array of object required

At least one model deployment is required.

accelerators array of object optional

Accelerator configuration for this deployment.

Additional nested properties not shown. Refer to the full API spec for details.
model_id string optional

Identifies an existing deployment when updating. Leave empty to create a new deployment.

model_provider string, one of: hugging_face optional
Example: hugging_face

Model provider.

model_slug string optional
Example: mistral/mistral-7b-instruct-v3

Model identifier (e.g. Hugging Face slug).

workload_config object optional

Workload-specific configuration (e.g. ISL/OSL in future).

Additional nested properties not shown. Refer to the full API spec for details.
name string required
Example: new-dedicated-inference

Name of the Dedicated Inference. Must be unique within the team.

region string, one of: atl1, nyc2, tor1 required
Example: atl1

DigitalOcean region where the Dedicated Inference is hosted.

version integer required
Example: 1

Spec version.

vpc object required
uuid string (uuid) required
Example: 997615ce-132d-4bae-9270-9ee21b395e5d

VPC UUID for the Dedicated Inference.

status string (enum) optional read-only
Example: active

Current state of the Dedicated Inference.

updated_at string (date-time) optional read-only
Example: 2024-01-09T20:44:32Z

When the Dedicated Inference was last updated.

vpc_uuid string (uuid) optional read-only
Example: 997615ce-132d-4bae-9270-9ee21b395e5d

VPC UUID of the Dedicated Inference.
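
Because a 202 only means the update was accepted, a client usually has to poll the instance until its status returns to active. The Python sketch below shows one way to structure that loop. The function name wait_until_active, the default timeout and interval, and the caller-supplied fetch_status callable are all assumptions. This page does not list the full set of status values, so the loop relies on a timeout instead of trying to detect failure states.

```python
import time


def wait_until_active(fetch_status, timeout=600.0, interval=10.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll `fetch_status()` until it returns "active" or time runs out.

    `fetch_status` is a caller-supplied callable that returns the current
    `status` string, for example by sending a GET request for the instance.
    Returns True if "active" was seen, False on timeout.
    """
    deadline = clock() + timeout
    while True:
        if fetch_status() == "active":
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

The sleep and clock parameters exist only so the loop can be exercised without real delays. In normal use the defaults apply.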

401

Authentication failed due to invalid credentials.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

404

The resource was not found.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

429

The API rate limit has been exceeded.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

500

There was a server error.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

default

There was an unexpected error.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

{
  "dedicated_inference": {
    "created_at": "2024-01-09T20:44:32Z",
    "endpoints": {
      "private_endpoint_fqdn": "https://b4bfug4jc41kts2ro54if91eo-private-dedicated-inference.do-infra.ai",
      "public_endpoint_fqdn": "https://b4bfug4jc41kts2ro54if91eo-public-dedicated-inference.do-infra.ai"
    },
    "id": "6b5c619c-359c-44ca-87e2-47e98170c01d",
    "pending_deployment_spec": {
      "created_at": "2024-01-09T20:44:32Z",
      "enable_public_endpoint": true,
      "id": "7c6d729d-360d-44db-88e3-58e98281d12e",
      "model_deployments": [],
      "name": "new-dedicated-inference",
      "status": "provisioning",
      "updated_at": "2024-01-09T20:44:32Z",
      "version": 1
    },
    "region": "atl1",
    "spec": {
      "enable_public_endpoint": true,
      "model_deployments": [],
      "name": "new-dedicated-inference",
      "region": "atl1",
      "version": 1
    },
    "status": "active",
    "updated_at": "2024-01-09T20:44:32Z",
    "vpc_uuid": "997615ce-132d-4bae-9270-9ee21b395e5d"
  }
}
{
  "id": "unauthorized",
  "message": "Unable to authenticate you."
}
{
  "id": "not_found",
  "message": "The resource you requested could not be found."
}
{
  "id": "too_many_requests",
  "message": "API rate limit exceeded."
}
{
  "id": "server_error",
  "message": "Unexpected server-side error"
}
{
  "id": "example_error",
  "message": "some error message"
}
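
To illustrate how the 202 body above might be consumed, the following Python sketch pulls out the fields a client is most likely to act on. The function name summarize is hypothetical, and preferring the private endpoint over the public one is an assumption made for the example, not documented API behavior.

```python
def summarize(response):
    """Extract the fields a client typically acts on from a response dict."""
    instance = response["dedicated_inference"]
    endpoints = instance.get("endpoints", {})
    pending = instance.get("pending_deployment_spec")
    # Assumption: prefer the private VPC endpoint when both are present.
    endpoint = (endpoints.get("private_endpoint_fqdn")
                or endpoints.get("public_endpoint_fqdn"))
    return {
        "id": instance.get("id"),
        "status": instance.get("status"),
        "endpoint": endpoint,
        # A pending spec is described as present while provisioning or updating.
        "pending_status": pending.get("status") if pending else None,
    }
```

Every field is read with .get because the schema marks all properties of dedicated_inference as optional.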

DELETE Delete a Dedicated Inference

/v2/dedicated-inferences/{dedicated_inference_id}
Authorizations: bearer_auth (1 scope)
Http: Bearer
Required scopes: dedicated_inference:delete

Delete an existing Dedicated Inference. Send a DELETE request to /v2/dedicated-inferences/{dedicated_inference_id}. A 202 Accepted response indicates that the request was accepted for processing.

Path Parameters

dedicated_inference_id string (uuid) required
Example: 6b5c619c-359c-44ca-87e2-47e98170c01d

A unique identifier for a Dedicated Inference instance.

curl -i -X DELETE "https://api.digitalocean.com/v2/dedicated-inferences/6b5c619c-359c-44ca-87e2-47e98170c01d" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DO_TOKEN"

Responses

202

The delete request was accepted for processing. A 202 status does not indicate the success or failure of the operation itself.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

401

Authentication failed due to invalid credentials.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

404

The resource was not found.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

429

The API rate limit has been exceeded.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

500

There was a server error.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

default

There was an unexpected error.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

Example 401 response:

{
  "id": "unauthorized",
  "message": "Unable to authenticate you."
}

Example 404 response:

{
  "id": "not_found",
  "message": "The resource you requested could not be found."
}

Example 429 response:

{
  "id": "too_many_requests",
  "message": "API rate limit exceeded."
}

Example 500 response:

{
  "id": "server_error",
  "message": "Unexpected server-side error"
}

Example default (unexpected error) response:

{
  "id": "example_error",
  "message": "some error message"
}

GET List Dedicated Inference Accelerators

/v2/dedicated-inferences/{dedicated_inference_id}/accelerators
Authorizations: bearer_auth (1 scope)
Http: Bearer
Required scopes: dedicated_inference:read

List all accelerators (GPUs) in use by a Dedicated Inference instance. Send a GET request to /v2/dedicated-inferences/{dedicated_inference_id}/accelerators. Optionally filter by slug and use page/per_page for pagination.

Path Parameters

dedicated_inference_id string (uuid) required
Example: 6b5c619c-359c-44ca-87e2-47e98170c01d

A unique identifier for a Dedicated Inference instance.

Query Parameters

per_page integer 1 – 200 optional
Example: 20
Default: 20

Number of items returned per page.

page integer >= 1 optional
Example: 1
Default: 1

Which 'page' of paginated results to return.

slug string optional
Example: gpu-mi300x1-192gb

Filter accelerators by GPU slug.

curl -i -X GET "https://api.digitalocean.com/v2/dedicated-inferences/6b5c619c-359c-44ca-87e2-47e98170c01d/accelerators" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DO_TOKEN"
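
The same request URL can be assembled programmatically. The following is a minimal Python sketch, using only the standard library, that builds the list URL with the optional per_page, page, and slug query parameters described above. The helper name build_accelerators_url is hypothetical and is not part of any DigitalOcean client library; its range checks simply mirror the documented limits (per_page from 1 to 200, page at least 1).

```python
from typing import Optional
from urllib.parse import quote, urlencode

BASE_URL = "https://api.digitalocean.com"


def build_accelerators_url(
    dedicated_inference_id: str,
    page: int = 1,
    per_page: int = 20,
    slug: Optional[str] = None,
) -> str:
    """Build the URL for listing accelerators (hypothetical helper)."""
    # Mirror the documented parameter ranges before sending anything.
    if not 1 <= per_page <= 200:
        raise ValueError("per_page must be between 1 and 200")
    if page < 1:
        raise ValueError("page must be >= 1")

    params = {"page": page, "per_page": per_page}
    if slug is not None:
        params["slug"] = slug

    instance = quote(dedicated_inference_id, safe="")
    path = f"/v2/dedicated-inferences/{instance}/accelerators"
    return f"{BASE_URL}{path}?{urlencode(params)}"


url = build_accelerators_url(
    "6b5c619c-359c-44ca-87e2-47e98170c01d", slug="gpu-mi300x1-192gb"
)
print(url)
```

Validating the ranges client-side is a design choice in this sketch; the API itself remains the authority on what it accepts.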

Responses

200

The response is a JSON object with an accelerators key whose value is an array of accelerator objects. Pagination uses the same links and meta structure as other list endpoints.
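
One way to consume a paginated list is to keep requesting the next link until it is absent. The sketch below assumes that the last page omits links.pages.next, which matches the optional next field in the schema but is not stated explicitly here. The fetch_json callable is a hypothetical stand-in for whatever HTTP client you use; the example feeds it canned pages instead of making real requests.

```python
from typing import Callable, Dict, Iterator, Optional


def iter_accelerators(
    first_url: str,
    fetch_json: Callable[[str], Dict],
) -> Iterator[Dict]:
    """Yield accelerator objects from every page, following links.pages.next."""
    url: Optional[str] = first_url
    while url:
        body = fetch_json(url)
        yield from body.get("accelerators", [])
        # Assumption: a missing "next" link means this was the last page.
        url = body.get("links", {}).get("pages", {}).get("next")


# Stand-in for real HTTP calls: two canned pages keyed by a fake URL.
_pages = {
    "page1": {
        "accelerators": [{"name": "a"}],
        "links": {"pages": {"next": "page2", "last": "page2"}},
        "meta": {"total": 2},
    },
    "page2": {
        "accelerators": [{"name": "b"}],
        "links": {"pages": {"first": "page1", "prev": "page1"}},
        "meta": {"total": 2},
    },
}

names = [acc["name"] for acc in iter_accelerators("page1", _pages.__getitem__)]
print(names)
```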

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

accelerators array of object optional
created_at string (date-time) optional read-only
Example: 2024-01-09T20:44:32Z

id string (uuid) optional read-only
Example: 5b5c619c-359c-44ca-87e2-47e98170c02f

Unique ID of the accelerator.

name string optional read-only
Example: mi300x1-ghfpsf

Name of the accelerator.

role string optional read-only
Example: prefill_decode

Role of the accelerator (e.g. prefill_decode).

slug string optional read-only
Example: gpu-mi300x1-192gb

DigitalOcean GPU slug.

status string optional read-only
Example: active

Status of the accelerator.

links object optional
pages anyOf optional
One of:
Forward Links
last string optional
Example: https://api.digitalocean.com/v2/images?page=2

URI of the last page of the results.

next string optional
Example: https://api.digitalocean.com/v2/images?page=2

URI of the next page of the results.

Backward Links
first string optional
Example: https://api.digitalocean.com/v2/images?page=1

URI of the first page of the results.

prev string optional
Example: https://api.digitalocean.com/v2/images?page=1

URI of the previous page of the results.

meta object required

401

Authentication failed due to invalid credentials.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

404

The resource was not found.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

429

The API rate limit has been exceeded.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.
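
After a 429 response, a client can use the ratelimit-reset header to decide how long to pause. The following is a minimal sketch under two assumptions: the headers are available as a mapping with lowercase names, and waiting until the reset time is an acceptable retry policy for your application. The function name and the header values in the example are illustrative only.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(
    headers: Mapping[str, str], now: Optional[float] = None
) -> float:
    """Return how long to wait before retrying, based on ratelimit-reset."""
    current = time.time() if now is None else now
    # ratelimit-reset is documented as a Unix epoch timestamp.
    reset_at = float(headers["ratelimit-reset"])
    # Never return a negative wait if the reset time has already passed.
    return max(0.0, reset_at - current)


# Illustrative header values, not captured from a real response.
example_headers = {
    "ratelimit-limit": "5000",
    "ratelimit-remaining": "0",
    "ratelimit-reset": "1700000060",
}
wait = seconds_until_reset(example_headers, now=1700000000)
print(wait)
```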

500

There was a server error.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

default

There was an unexpected error.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

Example 200 response:

{
  "accelerators": [
    {
      "created_at": "2024-01-09T20:44:32Z",
      "id": "5b5c619c-359c-44ca-87e2-47e98170c02f",
      "name": "mi300x1-ghfpsf",
      "role": "prefill_decode",
      "slug": "gpu-mi300x1-192gb",
      "status": "active"
    }
  ],
  "links": {
    "pages": {
      "first": "https://api.digitalocean.com/v2/dedicated-inferences/6b5c619c-359c-44ca-87e2-47e98170c01d/accelerators?page=1&per_page=20",
      "last": "https://api.digitalocean.com/v2/dedicated-inferences/6b5c619c-359c-44ca-87e2-47e98170c01d/accelerators?page=1&per_page=20"
    }
  },
  "meta": {
    "total": 1
  }
}

Example 401 response:

{
  "id": "unauthorized",
  "message": "Unable to authenticate you."
}

Example 404 response:

{
  "id": "not_found",
  "message": "The resource you requested could not be found."
}

Example 429 response:

{
  "id": "too_many_requests",
  "message": "API rate limit exceeded."
}

Example 500 response:

{
  "id": "server_error",
  "message": "Unexpected server-side error"
}

Example default (unexpected error) response:

{
  "id": "example_error",
  "message": "some error message"
}
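
Because every error response shares the id, message, and optional request_id fields, a client can funnel them into a single exception type. The sketch below is one possible shape, not an official client: ApiError and parse_response are hypothetical names, and the function assumes the body is always valid JSON.

```python
import json
from typing import Optional


class ApiError(Exception):
    """Hypothetical exception carrying the documented error fields."""

    def __init__(
        self,
        status: int,
        error_id: str,
        message: str,
        request_id: Optional[str] = None,
    ) -> None:
        super().__init__(f"{status} {error_id}: {message}")
        self.status = status
        self.error_id = error_id
        self.message = message
        self.request_id = request_id


def parse_response(status: int, body: str) -> dict:
    """Return the decoded body for 2xx responses, raise ApiError otherwise."""
    data = json.loads(body)
    if 200 <= status < 300:
        return data
    # request_id is optional, so fall back to None when it is absent.
    raise ApiError(status, data["id"], data["message"], data.get("request_id"))


try:
    parse_response(
        404,
        '{"id": "not_found", "message": "The resource you requested could not be found."}',
    )
except ApiError as err:
    print(err.error_id, err.request_id)
```

Keeping request_id on the exception makes it easy to include in bug reports or support tickets.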

GET Get a Dedicated Inference Accelerator

/v2/dedicated-inferences/{dedicated_inference_id}/accelerators/{accelerator_id}
Authorizations: bearer_auth (1 scope)
Http: Bearer
Required scopes: dedicated_inference:read

Retrieve a single accelerator by ID for a Dedicated Inference instance. Send a GET request to /v2/dedicated-inferences/{dedicated_inference_id}/accelerators/{accelerator_id}.

Path Parameters

dedicated_inference_id string (uuid) required
Example: 6b5c619c-359c-44ca-87e2-47e98170c01d

A unique identifier for a Dedicated Inference instance.

accelerator_id string (uuid) required
Example: 5b5c619c-359c-44ca-87e2-47e98170c02f

A unique identifier for a Dedicated Inference accelerator.

curl -i -X GET "https://api.digitalocean.com/v2/dedicated-inferences/6b5c619c-359c-44ca-87e2-47e98170c01d/accelerators/5b5c619c-359c-44ca-87e2-47e98170c02f" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DO_TOKEN"
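
For reference, the curl call above can be mirrored with Python's standard urllib.request module. This is a sketch only: accelerator_request is a hypothetical helper, the fallback token is a placeholder rather than a working credential, and the request is built but not sent so the example runs without network access.

```python
import os
import urllib.request

BASE_URL = "https://api.digitalocean.com"


def accelerator_request(
    dedicated_inference_id: str, accelerator_id: str, token: str
) -> urllib.request.Request:
    """Build (but do not send) the GET request for a single accelerator."""
    url = (
        f"{BASE_URL}/v2/dedicated-inferences/{dedicated_inference_id}"
        f"/accelerators/{accelerator_id}"
    )
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


req = accelerator_request(
    "6b5c619c-359c-44ca-87e2-47e98170c01d",
    "5b5c619c-359c-44ca-87e2-47e98170c02f",
    # Placeholder fallback; export DO_TOKEN with a real token to use it.
    os.environ.get("DO_TOKEN", "dop_v1_placeholder"),
)
print(req.get_method(), req.full_url)

# To send it (requires a valid token and network access):
# with urllib.request.urlopen(req) as resp:
#     body = resp.read().decode("utf-8")
```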

Responses

200

The response is a JSON object describing a single accelerator.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

created_at string (date-time) optional read-only
Example: 2024-01-09T20:44:32Z

id string (uuid) optional read-only
Example: 5b5c619c-359c-44ca-87e2-47e98170c02f

Unique ID of the accelerator.

name string optional read-only
Example: mi300x1-ghfpsf

Name of the accelerator.

role string optional read-only
Example: prefill_decode

Role of the accelerator (e.g. prefill_decode).

slug string optional read-only
Example: gpu-mi300x1-192gb

DigitalOcean GPU slug.

status string optional read-only
Example: active

Status of the accelerator.

401

Authentication failed due to invalid credentials.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

404

The resource was not found.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

429

The API rate limit has been exceeded.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

500

There was a server error.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

default

There was an unexpected error.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

Example 200 response:

{
  "created_at": "2024-01-09T20:44:32Z",
  "id": "5b5c619c-359c-44ca-87e2-47e98170c02f",
  "name": "mi300x1-ghfpsf",
  "role": "prefill_decode",
  "slug": "gpu-mi300x1-192gb",
  "status": "active"
}

Example 401 response:

{
  "id": "unauthorized",
  "message": "Unable to authenticate you."
}

Example 404 response:

{
  "id": "not_found",
  "message": "The resource you requested could not be found."
}

Example 429 response:

{
  "id": "too_many_requests",
  "message": "API rate limit exceeded."
}

Example 500 response:

{
  "id": "server_error",
  "message": "Unexpected server-side error"
}

Example default (unexpected error) response:

{
  "id": "example_error",
  "message": "some error message"
}
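
A successful response can be mapped onto a small typed record. The sketch below assumes every field shown in the 200 example is present (the schema marks them all as optional) and that created_at always uses the second-precision UTC format from the example; the Accelerator class is a hypothetical illustration.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Accelerator:
    """Hypothetical typed view of an accelerator object."""

    id: str
    name: str
    role: str
    slug: str
    status: str
    created_at: datetime

    @classmethod
    def from_json(cls, body: str) -> "Accelerator":
        data = json.loads(body)
        # Assumes a timestamp like 2024-01-09T20:44:32Z (UTC, no fractions).
        created = datetime.strptime(data["created_at"], "%Y-%m-%dT%H:%M:%SZ")
        return cls(
            id=data["id"],
            name=data["name"],
            role=data["role"],
            slug=data["slug"],
            status=data["status"],
            created_at=created.replace(tzinfo=timezone.utc),
        )


sample = """{
  "created_at": "2024-01-09T20:44:32Z",
  "id": "5b5c619c-359c-44ca-87e2-47e98170c02f",
  "name": "mi300x1-ghfpsf",
  "role": "prefill_decode",
  "slug": "gpu-mi300x1-192gb",
  "status": "active"
}"""

acc = Accelerator.from_json(sample)
print(acc.name, acc.status, acc.created_at.isoformat())
```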

GET Get Dedicated Inference CA Certificate

/v2/dedicated-inferences/{dedicated_inference_id}/ca
Authorizations: bearer_auth (1 scope)
Http: Bearer
Required scopes: dedicated_inference:read

Retrieve the base64-encoded CA certificate for a Dedicated Inference instance. The certificate is required for private endpoint connectivity. Send a GET request to /v2/dedicated-inferences/{dedicated_inference_id}/ca.

Path Parameters

dedicated_inference_id string (uuid) required
Example: 6b5c619c-359c-44ca-87e2-47e98170c01d

A unique identifier for a Dedicated Inference instance.

curl -i -X GET "https://api.digitalocean.com/v2/dedicated-inferences/6b5c619c-359c-44ca-87e2-47e98170c01d/ca" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DO_TOKEN"

Responses

200

The base64-encoded CA certificate for the Dedicated Inference instance. Required for private endpoint connectivity.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

cert string required

Base64-encoded CA certificate.
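
Since cert is base64-encoded, it has to be decoded before a TLS client can use it. The following sketch assumes the decoded bytes are PEM text in UTF-8, which this reference does not state; decode_ca_cert is a hypothetical helper, and the example uses placeholder content rather than a real certificate.

```python
import base64
import json


def decode_ca_cert(body: str) -> str:
    """Return the decoded certificate text from a /ca response body."""
    data = json.loads(body)
    # validate=True rejects characters outside the base64 alphabet.
    raw = base64.b64decode(data["cert"], validate=True)
    # Assumption: the decoded certificate is PEM text.
    return raw.decode("utf-8")


# Placeholder content standing in for a real certificate.
fake_pem = "-----BEGIN CERTIFICATE-----\nPLACEHOLDER\n-----END CERTIFICATE-----\n"
fake_body = json.dumps(
    {"cert": base64.b64encode(fake_pem.encode("utf-8")).decode("ascii")}
)

pem = decode_ca_cert(fake_body)
print(pem.splitlines()[0])
```

If the decoded value is indeed PEM, it could be written to a file and supplied to a TLS client as a trusted CA, for example through the cafile argument of ssl.create_default_context.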

401

Authentication failed due to invalid credentials.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

404

The resource was not found.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

429

The API rate limit has been exceeded.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

500

There was a server error.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

default

There was an unexpected error.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

{
  "cert": "LS0tLS1CRUdJTi..."
}
{
  "id": "unauthorized",
  "message": "Unable to authenticate you."
}
{
  "id": "not_found",
  "message": "The resource you requested could not be found."
}
{
  "id": "too_many_requests",
  "message": "API rate limit exceeded."
}
{
  "id": "server_error",
  "message": "Unexpected server-side error"
}
{
  "id": "example_error",
  "message": "some error message"
}

GET List Dedicated Inference Tokens

/v2/dedicated-inferences/{dedicated_inference_id}/tokens
Authorizations: bearer_auth (1 scope)
Http: Bearer
Required scopes: dedicated_inference_tokens:read

List all access tokens for a Dedicated Inference instance. Token values are not returned; each entry includes only id, name, and created_at. Send a GET request to /v2/dedicated-inferences/{dedicated_inference_id}/tokens.

Path Parameters

dedicated_inference_id string (uuid) required
Example: 6b5c619c-359c-44ca-87e2-47e98170c01d

A unique identifier for a Dedicated Inference instance.

Query Parameters

per_page integer 1 – 200 optional
Example: 2

Number of items returned per page

Default: 20

page integer >= 1 optional
Example: 1

Which 'page' of paginated results to return.

Default: 1

curl -i -X GET "https://api.digitalocean.com/v2/dedicated-inferences/6b5c619c-359c-44ca-87e2-47e98170c01d/tokens?page=1&per_page=20" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DO_TOKEN"
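
The same URL can be assembled programmatically. The following is a minimal Python sketch rather than an official client: build_list_tokens_url is a hypothetical helper name, and the only checks it applies are the bounds documented above (per_page from 1 to 200, page at least 1) plus a UUID format check on the instance ID.

```python
import uuid
from urllib.parse import urlencode

BASE_URL = "https://api.digitalocean.com"


def build_list_tokens_url(dedicated_inference_id: str, page: int = 1, per_page: int = 20) -> str:
    """Return the list-tokens URL for one page (hypothetical helper)."""
    # uuid.UUID raises ValueError if the ID is not a well-formed UUID.
    uuid.UUID(dedicated_inference_id)
    if not 1 <= per_page <= 200:
        raise ValueError("per_page must be between 1 and 200")
    if page < 1:
        raise ValueError("page must be >= 1")
    query = urlencode({"page": page, "per_page": per_page})
    return f"{BASE_URL}/v2/dedicated-inferences/{dedicated_inference_id}/tokens?{query}"


url = build_list_tokens_url("6b5c619c-359c-44ca-87e2-47e98170c01d")
print(url)
```

Validating on the client side is a design choice of this sketch; the API performs its own validation regardless.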

Responses

200

The response is a JSON object with a key called tokens, set to an array of token objects. Each object contains only id, name, and created_at; value is not returned. Pagination uses the same links and meta structure as other list endpoints (e.g. VPC peerings).

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.
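
As an illustration of how a client might read these three headers, the sketch below parses them into a small structure and converts the Unix epoch reset time into a wait in seconds. The names RateLimit and parse_rate_limit are hypothetical, and the sample header values are invented for the example rather than captured from a real response.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass
class RateLimit:
    limit: int
    remaining: int
    reset_epoch: int

    def seconds_until_reset(self, now_epoch: float) -> float:
        """Seconds until the oldest request expires; never negative."""
        return max(0.0, float(self.reset_epoch - now_epoch))


def parse_rate_limit(headers: Mapping[str, str]) -> RateLimit:
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return RateLimit(
        limit=int(lowered["ratelimit-limit"]),
        remaining=int(lowered["ratelimit-remaining"]),
        reset_epoch=int(lowered["ratelimit-reset"]),
    )


# Illustrative values only.
sample = {
    "RateLimit-Limit": "5000",
    "RateLimit-Remaining": "4816",
    "RateLimit-Reset": "1444931833",
}
rl = parse_rate_limit(sample)
print(rl.remaining, rl.seconds_until_reset(now_epoch=1444931800))
```

Passing the current time in as an argument, instead of calling time.time() inside the method, keeps the calculation easy to check.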

tokens array of object optional
created_at string (date-time) optional read-only
Example: 2024-01-09T20:44:32Z
id string (uuid) optional read-only
Example: 01333f14-a903-4b8e-92b3-363a767aa052

Unique ID of the token.

name string optional read-only
Example: first-token

Name of the token.

value string optional read-only
Example: di_xxxxxxxxxxxxxxxxxxxxxxxx

Token value. Returned only once, in the response that creates the token; list responses omit it. Store it securely.

links object optional
pages anyOf optional
One of:
Forward Links
last string optional
Example: https://api.digitalocean.com/v2/images?page=2

URI of the last page of the results.

next string optional
Example: https://api.digitalocean.com/v2/images?page=2

URI of the next page of the results.

Backward Links
first string optional
Example: https://api.digitalocean.com/v2/images?page=1

URI of the first page of the results.

prev string optional
Example: https://api.digitalocean.com/v2/images?page=1

URI of the previous page of the results.

meta object required
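
One way to consume the links structure is to keep following links.pages.next until it is absent. The sketch below assumes that approach; next_page_url and collect_tokens are hypothetical names, and fetch stands in for whatever function performs the HTTP GET and decodes the JSON body. Canned pages replace real network calls so the example runs offline.

```python
from typing import Callable, Dict, List, Optional


def next_page_url(body: Dict) -> Optional[str]:
    """Return links.pages.next if present, else None."""
    return body.get("links", {}).get("pages", {}).get("next")


def collect_tokens(first_url: str, fetch: Callable[[str], Dict]) -> List[Dict]:
    """Follow next links until none remains, gathering every token entry."""
    tokens: List[Dict] = []
    url: Optional[str] = first_url
    while url:
        body = fetch(url)
        tokens.extend(body.get("tokens", []))
        url = next_page_url(body)
    return tokens


# Stand-in for HTTP calls: two canned pages keyed by URL (illustrative data).
pages = {
    "page1": {
        "tokens": [{"id": "a", "name": "first-token"}],
        "links": {"pages": {"next": "page2", "last": "page2"}},
        "meta": {"total": 2},
    },
    "page2": {
        "tokens": [{"id": "b", "name": "second-token"}],
        "links": {"pages": {"first": "page1", "prev": "page1"}},
        "meta": {"total": 2},
    },
}
all_tokens = collect_tokens("page1", pages.__getitem__)
print([t["name"] for t in all_tokens])
```
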
401

Authentication failed due to invalid credentials.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.
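
A client could map this error body onto an exception type so that id, message, and the optional request_id stay available to callers. The following is a sketch under that assumption; ApiError and error_from_response are hypothetical names, not part of any DigitalOcean library.

```python
import json
from typing import Optional


class ApiError(Exception):
    """Carries the documented error fields alongside the HTTP status."""

    def __init__(self, status: int, error_id: str, message: str, request_id: Optional[str] = None):
        super().__init__(f"{status} {error_id}: {message}")
        self.status = status
        self.error_id = error_id
        self.message = message
        self.request_id = request_id


def error_from_response(status: int, raw_body: str) -> ApiError:
    body = json.loads(raw_body)
    # id and message are required fields; request_id is optional.
    return ApiError(status, body["id"], body["message"], body.get("request_id"))


err = error_from_response(401, '{"id": "unauthorized", "message": "Unable to authenticate you."}')
print(err, err.request_id)
```

Keeping request_id on the exception makes it straightforward to include in a bug report or support ticket when the API supplies one.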

404

The resource was not found.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

429

The API rate limit has been exceeded.
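
A client that receives a 429 may choose to pause until the time given in ratelimit-reset before retrying. The sketch below shows one such policy; retry_delay is a hypothetical name, and the 60-second fallback used when the header is missing is an arbitrary assumption of this example, not a documented value.

```python
from typing import Mapping, Optional


def retry_delay(status: int, headers: Mapping[str, str], now_epoch: float) -> Optional[float]:
    """Return seconds to wait before retrying a 429, or None for any other status."""
    if status != 429:
        return None
    lowered = {name.lower(): value for name, value in headers.items()}
    reset = lowered.get("ratelimit-reset")
    if reset is None:
        # Header missing: fall back to a fixed pause (arbitrary choice for this sketch).
        return 60.0
    return max(0.0, float(int(reset) - now_epoch))


print(retry_delay(429, {"ratelimit-reset": "1444931833"}, now_epoch=1444931800))
print(retry_delay(200, {}, now_epoch=1444931800))
```
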

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

500

There was a server error.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

default

There was an unexpected error.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

{
  "links": {
    "pages": {
      "first": "https://api.digitalocean.com/v2/dedicated-inferences/6b5c619c-359c-44ca-87e2-47e98170c01d/tokens?page=1&per_page=20",
      "last": "https://api.digitalocean.com/v2/dedicated-inferences/6b5c619c-359c-44ca-87e2-47e98170c01d/tokens?page=1&per_page=20"
    }
  },
  "meta": {
    "total": 0
  },
  "tokens": []
}
{
  "id": "unauthorized",
  "message": "Unable to authenticate you."
}
{
  "id": "not_found",
  "message": "The resource you requested could not be found."
}
{
  "id": "too_many_requests",
  "message": "API rate limit exceeded."
}
{
  "id": "server_error",
  "message": "Unexpected server-side error"
}
{
  "id": "example_error",
  "message": "some error message"
}

POST Create a Dedicated Inference Token

/v2/dedicated-inferences/{dedicated_inference_id}/tokens
Authorizations: bearer_auth (1 scope)
Http: Bearer
Required scopes: dedicated_inference_tokens:create

Create a new access token for a Dedicated Inference instance. Send a POST request to /v2/dedicated-inferences/{dedicated_inference_id}/tokens with a name in the JSON request body. The token value is returned only once, in the response to this request; store it securely.

Path Parameters

dedicated_inference_id string (uuid) required
Example: 6b5c619c-359c-44ca-87e2-47e98170c01d

A unique identifier for a Dedicated Inference instance.

Request Body: application/json

name string required
Example: new-inference-token

Name for the new token.

Content type application/json
{
  "name": "new-inference-token"
}

curl -i -X POST "https://api.digitalocean.com/v2/dedicated-inferences/6b5c619c-359c-44ca-87e2-47e98170c01d/tokens" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DO_TOKEN" \
  -d '{"name": "new-inference-token"}'
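
For comparison with the curl command, the sketch below builds the equivalent request object with Python's standard library without sending it. build_create_token_request is a hypothetical helper name, and dop_v1_example is a placeholder rather than a working credential.

```python
import json
import urllib.request

BASE_URL = "https://api.digitalocean.com"


def build_create_token_request(dedicated_inference_id: str, name: str, api_token: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a token."""
    if not name:
        raise ValueError("name is required")
    url = f"{BASE_URL}/v2/dedicated-inferences/{dedicated_inference_id}/tokens"
    payload = json.dumps({"name": name}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_token}",
        },
    )


req = build_create_token_request(
    "6b5c619c-359c-44ca-87e2-47e98170c01d", "new-inference-token", "dop_v1_example"
)
print(req.get_method(), req.full_url)
# Sending is left out of the sketch; urllib.request.urlopen(req) would perform the call.
```
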

Responses

202

Token created; value is returned only once. Store securely.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

token object optional

Access token for authenticating to Dedicated Inference endpoints.

created_at string (date-time) optional read-only
Example: 2024-01-09T20:44:32Z
id string (uuid) optional read-only
Example: 01333f14-a903-4b8e-92b3-363a767aa052

Unique ID of the token.

name string optional read-only
Example: first-token

Name of the token.

value string optional read-only
Example: di_xxxxxxxxxxxxxxxxxxxxxxxx

Token value; only returned once on create. Store securely.
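
Because the value appears only in this response, a client may want to separate it from the rest of the token object immediately, so the metadata can be logged while the secret goes to secure storage. The sketch below illustrates that idea; split_created_token and redact are hypothetical names, and the sample body reuses the example values from this page.

```python
from typing import Dict, Tuple


def split_created_token(body: Dict) -> Tuple[Dict, str]:
    """Separate the one-time secret from the metadata that is safe to log."""
    token = dict(body["token"])  # copy so the caller's dict is left unchanged
    value = token.pop("value")
    return token, value


def redact(value: str, visible: int = 3) -> str:
    """Keep a short prefix so the token type stays recognizable in logs."""
    return value[:visible] + "*" * max(0, len(value) - visible)


created = {
    "token": {
        "created_at": "2024-01-09T20:44:32Z",
        "id": "01333f14-a903-4b8e-92b3-363a767aa052",
        "name": "first-token",
        "value": "di_xxxxxxxxxxxxxxxxxxxxxxxx",
    }
}
metadata, secret = split_created_token(created)
print(metadata["name"], redact(secret))
```
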

401

Authentication failed due to invalid credentials.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

404

The resource was not found.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

429

The API rate limit has been exceeded.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

500

There was a server error.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

default

There was an unexpected error.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

{
  "token": {
    "created_at": "2024-01-09T20:44:32Z",
    "id": "01333f14-a903-4b8e-92b3-363a767aa052",
    "name": "first-token",
    "value": "di_xxxxxxxxxxxxxxxxxxxxxxxx"
  }
}
{
  "id": "unauthorized",
  "message": "Unable to authenticate you."
}
{
  "id": "not_found",
  "message": "The resource you requested could not be found."
}
{
  "id": "too_many_requests",
  "message": "API rate limit exceeded."
}
{
  "id": "server_error",
  "message": "Unexpected server-side error"
}
{
  "id": "example_error",
  "message": "some error message"
}

DELETE Revoke a Dedicated Inference Token

/v2/dedicated-inferences/{dedicated_inference_id}/tokens/{token_id}
Authorizations: bearer_auth (1 scope)
Http: Bearer
Required scopes: dedicated_inference_tokens:delete

Revoke (delete) an access token for a Dedicated Inference instance. Send a DELETE request to /v2/dedicated-inferences/{dedicated_inference_id}/tokens/{token_id}.

Path Parameters

dedicated_inference_id string (uuid) required
Example: 6b5c619c-359c-44ca-87e2-47e98170c01d

A unique identifier for a Dedicated Inference instance.

token_id string (uuid) required
Example: f11d4795-c1db-4ac3-9aa6-a0ea3c58877e

A unique identifier for a Dedicated Inference access token.

curl -i -X DELETE "https://api.digitalocean.com/v2/dedicated-inferences/6b5c619c-359c-44ca-87e2-47e98170c01d/tokens/f11d4795-c1db-4ac3-9aa6-a0ea3c58877e" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DO_TOKEN"
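
The revoke call can be sketched the same way in Python. build_revoke_token_request and is_revoked are hypothetical names; the request is only constructed here, not sent, and dop_v1_example is a placeholder credential. Both path parameters are checked as UUIDs before the URL is built.

```python
import urllib.request
import uuid

BASE_URL = "https://api.digitalocean.com"


def build_revoke_token_request(dedicated_inference_id: str, token_id: str, api_token: str) -> urllib.request.Request:
    """Build (but do not send) the DELETE request that revokes a token."""
    # uuid.UUID raises ValueError on malformed input.
    instance = uuid.UUID(dedicated_inference_id)
    token = uuid.UUID(token_id)
    url = f"{BASE_URL}/v2/dedicated-inferences/{instance}/tokens/{token}"
    return urllib.request.Request(
        url, method="DELETE", headers={"Authorization": f"Bearer {api_token}"}
    )


def is_revoked(status: int) -> bool:
    """The documented success status for this endpoint is 204."""
    return status == 204


req = build_revoke_token_request(
    "6b5c619c-359c-44ca-87e2-47e98170c01d",
    "f11d4795-c1db-4ac3-9aa6-a0ea3c58877e",
    "dop_v1_example",
)
print(req.get_method(), req.full_url)
```
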

Responses

204

Token revoked.

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.
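The three rate-limit headers can be read into a small structure, as in the sketch below. The RateLimit class and parse_rate_limit function are illustrative names. The header lookup is made case-insensitive because HTTP header names are not case-sensitive.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass
class RateLimit:
    limit: int        # ratelimit-limit
    remaining: int    # ratelimit-remaining
    reset_epoch: int  # ratelimit-reset, in Unix epoch seconds


def parse_rate_limit(headers: Mapping[str, str]) -> RateLimit:
    """Extract the rate-limit headers from a response's header mapping."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return RateLimit(
        limit=int(lowered["ratelimit-limit"]),
        remaining=int(lowered["ratelimit-remaining"]),
        reset_epoch=int(lowered["ratelimit-reset"]),
    )
```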

401

Authentication failed due to invalid credentials.

ratelimit-limit integer

The default limit on the number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

id string required
Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found".

message string required
Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional
Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Some endpoints include a request ID. Provide it when reporting bugs or opening support tickets to help identify the issue.

404

The resource was not found.

The response headers (ratelimit-limit, ratelimit-remaining, ratelimit-reset) and body fields (id, message, request_id) are identical to those documented for the 401 response above.

429

The API rate limit has been exceeded.

The response headers (ratelimit-limit, ratelimit-remaining, ratelimit-reset) and body fields (id, message, request_id) are identical to those documented for the 401 response above.
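After a 429 response, one plausible client strategy is to pause until the time given in ratelimit-reset, when the oldest request expires and frees capacity. The sketch below only computes the delay. The function name and the injectable now parameter are assumptions for illustration, and whether a single expired request is enough headroom depends on your traffic.

```python
import time
from typing import Optional


def seconds_until_reset(reset_epoch: int, now: Optional[float] = None) -> float:
    """Return how long to wait before retrying, never a negative value.

    reset_epoch is the ratelimit-reset header value (Unix epoch seconds).
    now defaults to the current time and is injectable for testing.
    """
    current = time.time() if now is None else now
    return max(0.0, reset_epoch - current)
```

A caller could pass the result to time.sleep before reissuing the request.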

500

There was a server error.

The response headers (ratelimit-limit, ratelimit-remaining, ratelimit-reset) and body fields (id, message, request_id) are identical to those documented for the 401 response above.

default

There was an unexpected error.

The response headers (ratelimit-limit, ratelimit-remaining, ratelimit-reset) and body fields (id, message, request_id) are identical to those documented for the 401 response above.

Example 401 response body:

{
  "id": "unauthorized",
  "message": "Unable to authenticate you."
}

Example 404 response body:

{
  "id": "not_found",
  "message": "The resource you requested could not be found."
}

Example 429 response body:

{
  "id": "too_many_requests",
  "message": "API rate limit exceeded."
}

Example 500 response body:

{
  "id": "server_error",
  "message": "Unexpected server-side error"
}

Example default (unexpected error) response body:

{
  "id": "example_error",
  "message": "some error message"
}
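Because every error body above shares the id and message fields, with request_id optional, a client can decode them uniformly. The following is a minimal sketch under that assumption; ApiError and parse_error are hypothetical names, not part of an official SDK.

```python
import json
from typing import Optional


class ApiError(Exception):
    """Carries the fields of an error response body."""

    def __init__(self, status: int, error_id: str, message: str,
                 request_id: Optional[str] = None) -> None:
        super().__init__(f"{status} {error_id}: {message}")
        self.status = status
        self.error_id = error_id
        self.message = message
        self.request_id = request_id  # present only on some endpoints


def parse_error(status: int, body: str) -> ApiError:
    """Decode a JSON error body; id and message are required fields."""
    data = json.loads(body)
    return ApiError(status, data["id"], data["message"], data.get("request_id"))
```

When request_id is present, including it in logs makes it easy to quote in a support ticket.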
