Dedicated Inference

Validated on 9 Oct 2024 • Last edited on 9 Apr 2026

Dedicated Inference delivers scalable production-grade LLM hosting on DigitalOcean. Create, list, get, update, and delete Dedicated Inference instances; manage accelerators, CA certificate, sizes, GPU model config, and access tokens.

Base URL https://api.digitalocean.com

Endpoints

GET List Dedicated Inferences
POST Create a Dedicated Inference
GET Get Dedicated Inference GPU Model Config
GET List Dedicated Inference Sizes
GET Get a Dedicated Inference
PATCH Update a Dedicated Inference
DELETE Delete a Dedicated Inference
GET List Dedicated Inference Accelerators
GET Get a Dedicated Inference Accelerator
GET Get Dedicated Inference CA Certificate
GET List Dedicated Inference Tokens
POST Create a Dedicated Inference Token
DELETE Revoke a Dedicated Inference Token

GET List Dedicated Inferences

/v2/dedicated-inferences

Authorizations: bearer_auth (1 scope)

Http: Bearer

Required scopes: dedicated_inference:read

OAuth Authentication

In order to interact with the DigitalOcean API, you or your application must authenticate.

The DigitalOcean API handles this through OAuth, an open standard for authorization. OAuth allows you to delegate access to your account. Scopes can be used to grant full access, read-only access, or access to a specific set of endpoints.

You can generate an OAuth token by visiting the Apps & API section of the DigitalOcean control panel for your account.

An OAuth token functions as a complete authentication request. In effect, it acts as a substitute for a username and password pair.

Because of this, it is absolutely essential that you keep your OAuth tokens secure. In fact, upon generation, the web interface will only display each token a single time in order to prevent the token from being compromised.

DigitalOcean access tokens begin with an identifiable prefix in order to distinguish them from other similar tokens.

dop_v1_ for personal access tokens generated in the control panel
doo_v1_ for tokens generated by applications using the OAuth flow
dor_v1_ for OAuth refresh tokens
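
The prefixes above make it possible to tell token kinds apart before sending a request. The following is a minimal sketch of such a check; the function name `classify_token` and the label strings are illustrative assumptions, not part of the DigitalOcean API.

```python
# Prefixes come from the list above; the labels are hypothetical names
# chosen for this sketch.
TOKEN_PREFIXES = {
    "dop_v1_": "personal access token",
    "doo_v1_": "OAuth application token",
    "dor_v1_": "OAuth refresh token",
}


def classify_token(token: str) -> str:
    """Return a label for a token based on its documented prefix."""
    for prefix, label in TOKEN_PREFIXES.items():
        if token.startswith(prefix):
            return label
    # Anything else is not one of the three documented kinds.
    return "unknown"
```

A check like this can catch the mistake of passing a refresh token where an access token is expected, without ever logging the token itself.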

Scopes

Scopes act like permissions assigned to an API token. These permissions determine what actions the token can perform. You can create API tokens that grant read-only access, full access, or limited access to specific endpoints by using custom scopes.

Generally, scopes are designed to match HTTP verbs and common CRUD operations (Create, Read, Update, Delete).

HTTP Verb	CRUD Operation	Scope
GET	Read	`<resource>:read`
POST	Create	`<resource>:create`
PUT/PATCH	Update	`<resource>:update`
DELETE	Delete	`<resource>:delete`

For example, creating a new Droplet by making a POST request to the /v2/droplets endpoint requires the droplet:create scope while listing Droplets by making a GET request to the /v2/droplets endpoint requires the droplet:read scope.
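
The verb-to-scope convention in the table can be sketched as a small lookup. This is only an illustration of the general pattern; the helper name `required_scope` is an assumption, and because scopes only "generally" follow this mapping, the scope listed on each endpoint remains the authoritative value.

```python
# Mapping taken from the table above (PUT and PATCH both map to update).
VERB_TO_ACTION = {
    "GET": "read",
    "POST": "create",
    "PUT": "update",
    "PATCH": "update",
    "DELETE": "delete",
}


def required_scope(http_verb: str, resource: str) -> str:
    """Return the conventional scope name for a verb and resource."""
    action = VERB_TO_ACTION[http_verb.upper()]
    return f"{resource}:{action}"
```

For instance, `required_scope("POST", "droplet")` yields the `droplet:create` scope mentioned in the example above.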

Each endpoint below specifies which scope is required to access it when using custom scopes.

How to Authenticate with OAuth

In order to make an authenticated request, include a bearer-type Authorization header containing your OAuth token. All requests must be made over HTTPS.

Authenticate with a Bearer Authorization Header

curl -X $HTTP_METHOD -H "Authorization: Bearer $DIGITALOCEAN_TOKEN" "https://api.digitalocean.com/v2/$OBJECT"
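
The same bearer header can be set from Python's standard library. This is a sketch under assumptions: the helper name `build_request` is hypothetical, and the function only constructs the request object without sending it.

```python
import urllib.request

API_BASE = "https://api.digitalocean.com"


def build_request(method: str, path: str, token: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated HTTPS request."""
    return urllib.request.Request(
        url=f"{API_BASE}{path}",
        method=method,
        # Bearer-type Authorization header, as in the cURL example above.
        headers={"Authorization": f"Bearer {token}"},
    )
```

Passing the result to `urllib.request.urlopen` would perform the call; reading the token from an environment variable such as `DIGITALOCEAN_TOKEN` keeps it out of source code.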

List all Dedicated Inference instances for your team. Send a GET request to /v2/dedicated-inferences. You may filter by region and use page and per_page for pagination.

Query Parameters

per_page integer 1 – 200 optional

Example: 2

Number of items returned per page

Default: 20

page integer >= 1 optional

Example: 1

Which 'page' of paginated results to return.

Default: 1

region string, one of: nyc2, tor1, atl1 optional

Example: atl1

Filter by region. Dedicated Inference is only available in nyc2, tor1, and atl1.
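
The documented ranges for these query parameters can be checked client-side before a request is sent. The sketch below assumes a hypothetical helper named `list_url`; the limits and region names are the ones stated above.

```python
from urllib.parse import urlencode

# Regions listed in the parameter description above.
ALLOWED_REGIONS = {"nyc2", "tor1", "atl1"}
LIST_ENDPOINT = "https://api.digitalocean.com/v2/dedicated-inferences"


def list_url(region=None, page=1, per_page=20):
    """Return the list URL, validating parameters against the documented ranges."""
    if not 1 <= per_page <= 200:
        raise ValueError("per_page must be between 1 and 200")
    if page < 1:
        raise ValueError("page must be >= 1")
    params = {"page": page, "per_page": per_page}
    if region is not None:
        if region not in ALLOWED_REGIONS:
            raise ValueError(f"unsupported region: {region}")
        params["region"] = region
    return f"{LIST_ENDPOINT}?{urlencode(params)}"
```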

Request: `/v2/dedicated-inferences`

cURL

curl -i -X GET "https://api.digitalocean.com/v2/dedicated-inferences?region=nyc2&page=1&per_page=20" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DO_TOKEN"

Responses

200

The response will be a JSON object with a key called dedicated_inferences. This will be set to an array of objects, each of which will contain the standard attributes associated with a Dedicated Inference. Pagination uses the same links and meta structure as other list endpoints.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.
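
A client can read these three headers to decide whether to pause before the next call. The sketch below is an assumption-laden illustration: `rate_limit_status` is a hypothetical helper, and it expects the headers as a plain dictionary with lowercase keys.

```python
import time


def rate_limit_status(headers, now=None):
    """Summarize the rate-limit headers of a response.

    `headers` is assumed to be a dict with lowercase header names.
    `now` is a Unix timestamp; it defaults to the current time.
    """
    now = time.time() if now is None else now
    reset_at = int(headers["ratelimit-reset"])  # Unix epoch seconds
    return {
        "limit": int(headers["ratelimit-limit"]),
        "remaining": int(headers["ratelimit-remaining"]),
        # Never report a negative wait if the reset time has already passed.
        "seconds_until_reset": max(0, reset_at - int(now)),
    }
```

When `remaining` reaches zero, sleeping for `seconds_until_reset` before retrying is one simple way to avoid a 429 response.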

Response Schema: application/json

dedicated_inferences array of object required

Array of Dedicated Inference instances.

Show child properties

created_at string (date-time) optional read-only

Example: 2024-01-09T20:44:32Z

When the Dedicated Inference was created.

endpoints object optional read-only

Show child properties

private_endpoint_fqdn string (uri) optional

Example: https://b4bfug4jc41kts2ro54if91eo-private-dedicated-inference.do-infra.ai

Private VPC FQDN of the Dedicated Inference instance.

public_endpoint_fqdn string (uri) optional

Example: https://b4bfug4jc41kts2ro54if91eo-public-dedicated-inference.do-infra.ai

Public FQDN of the Dedicated Inference instance.

id string (uuid) optional read-only

Example: 6b5c619c-359c-44ca-87e2-47e98170c01d

Unique ID of the Dedicated Inference.

pending_deployment_spec object optional

Pending deployment when status is provisioning or updating.

Show child properties

created_at string (date-time) optional

Example: 2024-01-09T20:44:32Z

enable_public_endpoint boolean optional

Whether to expose a public LLM endpoint.

id string (uuid) optional

Example: 7c6d729d-360d-44db-88e3-58e98281d12e

Deployment UUID.

model_deployments array of object optional

At least one model deployment is required.

Show child properties

accelerators array of object optional

Accelerator configuration for this deployment.

Additional nested properties not shown. Refer to the full API spec for details.

model_id string optional

Used to identify an existing deployment when updating; empty means create new.

model_provider string, one of: hugging_face optional

Example: hugging_face

Model provider.

model_slug string optional

Example: mistral/mistral-7b-instruct-v3

Model identifier (e.g. Hugging Face slug).

workload_config object optional

Workload-specific configuration (for example, input and output sequence lengths, ISL/OSL, in the future).

Additional nested properties not shown. Refer to the full API spec for details.

name string optional

Example: new-dedicated-inference

Name of the Dedicated Inference. Must be unique within the team.

status string, one of: provisioning, updating optional

Example: provisioning

updated_at string (date-time) optional

Example: 2024-01-09T20:44:32Z

version integer optional

Example: 1

Spec version.

vpc object optional

Show child properties

uuid string (uuid) required

Example: 997615ce-132d-4bae-9270-9ee21b395e5d

VPC UUID for the Dedicated Inference.

region string optional read-only

Example: atl1

DigitalOcean region where the Dedicated Inference is hosted.

spec object optional

Structured configuration for a Dedicated Inference deployment.

Show child properties

enable_public_endpoint boolean required

Whether to expose a public LLM endpoint.

model_deployments array of object required

At least one model deployment is required.

Show child properties

accelerators array of object optional

Accelerator configuration for this deployment.

Additional nested properties not shown. Refer to the full API spec for details.

model_id string optional

Used to identify an existing deployment when updating; empty means create new.

model_provider string, one of: hugging_face optional

Example: hugging_face

Model provider.

model_slug string optional

Example: mistral/mistral-7b-instruct-v3

Model identifier (e.g. Hugging Face slug).

workload_config object optional

Workload-specific configuration (e.g. ISL/OSL in future).

Additional nested properties not shown. Refer to the full API spec for details.

name string required

Example: new-dedicated-inference

Name of the Dedicated Inference. Must be unique within the team.

region string, one of: atl1, nyc2, tor1 required

Example: atl1

DigitalOcean region where the Dedicated Inference is hosted.

version integer required

Example: 1

Spec version.

vpc object required

Show child properties

uuid string (uuid) required

Example: 997615ce-132d-4bae-9270-9ee21b395e5d

VPC UUID for the Dedicated Inference.

status string (enum) optional read-only

Example: active

Current state of the Dedicated Inference.

updated_at string (date-time) optional read-only

Example: 2024-01-09T20:44:32Z

When the Dedicated Inference was last updated.

vpc_uuid string (uuid) optional read-only

Example: 997615ce-132d-4bae-9270-9ee21b395e5d

VPC UUID of the Dedicated Inference.

links object required

Show child properties

pages object optional

Pagination links (first, prev, next, last).

meta object required

Show child properties

total integer required

Example: 1

Total number of results.
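
The `links` and `meta` objects allow a client to walk through every page of results. The following sketch assumes that `links.pages.next`, when present, holds the full URL of the next page; `fetch_page` is a hypothetical callable that takes a URL and returns the decoded JSON body.

```python
def iter_instances(fetch_page, first_url):
    """Yield every Dedicated Inference across all pages.

    `fetch_page` is a caller-supplied function (hypothetical here) that
    performs the authenticated GET and returns the parsed JSON object.
    """
    url = first_url
    while url:
        body = fetch_page(url)
        yield from body.get("dedicated_inferences", [])
        # Stop when the response carries no `next` link.
        url = body.get("links", {}).get("pages", {}).get("next")
```

Injecting `fetch_page` keeps the pagination logic independent of any particular HTTP client and makes it easy to exercise with canned responses.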

401

Authentication failed due to invalid credentials.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

429

The API rate limit has been exceeded.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

500

There was a server error.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

default

There was an unexpected error.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

Response

200

{
  "dedicated_inferences": [
    {
      "created_at": "2024-01-09T20:44:32Z",
      "endpoints": {
        "private_endpoint_fqdn": "https://b4bfug4jc41kts2ro54if91eo-private-dedicated-inference.do-infra.ai",
        "public_endpoint_fqdn": "https://b4bfug4jc41kts2ro54if91eo-public-dedicated-inference.do-infra.ai"
      },
      "id": "6b5c619c-359c-44ca-87e2-47e98170c01d",
      "region": "atl1",
      "status": "active",
      "updated_at": "2024-01-09T20:44:32Z",
      "vpc_uuid": "997615ce-132d-4bae-9270-9ee21b395e5d"
    }
  ],
  "links": {
    "pages": {
      "first": "https://api.digitalocean.com/v2/dedicated-inferences?page=1\u0026per_page=20",
      "last": "https://api.digitalocean.com/v2/dedicated-inferences?page=1\u0026per_page=20"
    }
  },
  "meta": {
    "total": 1
  }
}
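
A response shaped like the example above can be reduced to the fields most callers need. This is a minimal sketch; `summarize` is a hypothetical helper, and it uses `.get` throughout because the schema marks these fields as optional.

```python
def summarize(body):
    """Return id, status, and public endpoint for each instance in a list response."""
    return [
        {
            "id": item.get("id"),
            "status": item.get("status"),
            "public_endpoint": item.get("endpoints", {}).get("public_endpoint_fqdn"),
        }
        for item in body.get("dedicated_inferences", [])
    ]
```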

401

{
  "id": "unauthorized",
  "message": "Unable to authenticate you."
}

429

{
  "id": "too_many_requests",
  "message": "API rate limit exceeded."
}

500

{
  "id": "server_error",
  "message": "Unexpected server-side error"
}

default

{
  "id": "example_error",
  "message": "some error message"
}
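
Because every error response shares the same `id`, `message`, and optional `request_id` fields, a client can map them onto a single exception type. The sketch below is one possible shape; the names `ApiError` and `parse_error` are assumptions made for illustration.

```python
import json


class ApiError(Exception):
    """Error built from the documented error body fields."""

    def __init__(self, status, error_id, message, request_id=None):
        super().__init__(f"{status} {error_id}: {message}")
        self.status = status
        self.error_id = error_id
        self.message = message
        self.request_id = request_id  # optional; useful in support tickets


def parse_error(status, raw_body):
    """Turn an error response body (JSON text) into an ApiError."""
    data = json.loads(raw_body)
    return ApiError(status, data["id"], data["message"], data.get("request_id"))
```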

POST Create a Dedicated Inference

/v2/dedicated-inferences

Authorizations: bearer_auth (1 scope)

Http: Bearer

Required scopes: dedicated_inference:create

Create a new Dedicated Inference for your team. Send a POST request to /v2/dedicated-inferences with a spec object (version, name, region, vpc, enable_public_endpoint, model_deployments) and optional access_tokens (e.g. hugging_face_token for gated models). The response code 202 Accepted indicates the request was accepted for processing; it does not indicate success or failure. The token value is returned only on create; store it securely.

Request Body: `application/json`

access_tokens object optional

Example: {"hugging_face_token":"$HF_TOKEN"}

Key-value pairs for provider tokens (e.g. Hugging Face).

spec object required

Structured configuration for a Dedicated Inference deployment.

Show child properties

enable_public_endpoint boolean required

Whether to expose a public LLM endpoint.

model_deployments array of object required

At least one model deployment is required.

Show child properties

accelerators array of object optional

Accelerator configuration for this deployment.

Show child properties

accelerator_slug string required

Example: gpu-mi300x1-192gb

DigitalOcean GPU slug.

scale integer required

Example: 1

Number of accelerator instances.

status string, one of: new, provisioning, active optional read-only

Example: active

Current state of the Accelerator.

type string required

Example: prefill_decode

Accelerator type (e.g. prefill_decode).

model_id string optional

Used to identify an existing deployment when updating; empty means create new.

model_provider string, one of: hugging_face optional

Example: hugging_face

Model provider.

model_slug string optional

Example: mistral/mistral-7b-instruct-v3

Model identifier (e.g. Hugging Face slug).

workload_config object optional

Workload-specific configuration (e.g. ISL/OSL in future).

name string required

Example: new-dedicated-inference

Name of the Dedicated Inference. Must be unique within the team.

region string, one of: atl1, nyc2, tor1 required

Example: atl1

DigitalOcean region where the Dedicated Inference is hosted.

version integer required

Example: 1

Spec version.

vpc object required

Show child properties

uuid string (uuid) required

Example: 997615ce-132d-4bae-9270-9ee21b395e5d

VPC UUID for the Dedicated Inference.
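
The request body described above can be assembled programmatically. This is a sketch under stated assumptions: `build_create_body` is a hypothetical helper, it builds a single model deployment with one accelerator entry, and the fixed values (`version` 1, `hugging_face`, `prefill_decode`) are taken from the examples in this section rather than from an exhaustive list of valid values.

```python
REGIONS = {"atl1", "nyc2", "tor1"}  # regions listed in the schema above


def build_create_body(name, region, vpc_uuid, model_slug, accelerator_slug,
                      scale=1, enable_public_endpoint=False,
                      hugging_face_token=None):
    """Return a JSON-serializable body for creating a Dedicated Inference."""
    if region not in REGIONS:
        raise ValueError(f"unsupported region: {region}")
    body = {
        "spec": {
            "version": 1,
            "name": name,
            "region": region,
            "vpc": {"uuid": vpc_uuid},
            "enable_public_endpoint": enable_public_endpoint,
            # At least one model deployment is required.
            "model_deployments": [{
                "model_slug": model_slug,
                "model_provider": "hugging_face",
                "workload_config": {},
                "accelerators": [{
                    "accelerator_slug": accelerator_slug,
                    "scale": scale,
                    "type": "prefill_decode",
                }],
            }],
        }
    }
    # access_tokens is optional; include it only when a token is supplied.
    if hugging_face_token:
        body["access_tokens"] = {"hugging_face_token": hugging_face_token}
    return body
```

Serializing the result with `json.dumps` gives a payload suitable for the POST request.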

Request: `/v2/dedicated-inferences`

Payload

Content type application/json

{
  "access_tokens": {
    "hugging_face_token": "$HF_TOKEN"
  },
  "spec": {
    "enable_public_endpoint": true,
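    "model_deployments": [
      {
        "model_slug": "mistral/mistral-7b-instruct-v3",
        "model_provider": "hugging_face",
        "workload_config": {},
        "accelerators": [
          {
            "scale": 2,
            "type": "prefill_decode",
            "accelerator_slug": "gpu-mi300x1-192gb"
          }
        ]
      }
    ],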
    "name": "new-dedicated-inference",
    "region": "atl1",
    "version": 1,
    "vpc": {
      "uuid": "997615ce-132d-4bae-9270-9ee21b395e5d"
    }
  }
}

cURL

curl -i -X POST "https://api.digitalocean.com/v2/dedicated-inferences" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DO_TOKEN" \
  -d '{
    "spec": {
      "version": 1,
      "name": "new-dedicated-inference",
      "region": "atl1",
      "vpc": { "uuid": "7e5c619c-359c-44ca-87e2-47e98170c012" },
      "enable_public_endpoint": true,
      "model_deployments": [{
        "model_slug": "mistral/mistral-7b-instruct-v3",
        "model_provider": "hugging_face",
        "workload_config": {},
        "accelerators": [{
          "scale": 2,
          "type": "prefill_decode",
          "accelerator_slug": "gpu-mi300x1-192gb"
        }]
      }]
    },
    "access_tokens": { "hugging_face_token": "'"$HF_TOKEN"'" }
  }'

Responses

202

Dedicated Inference create/update accepted for processing (202). Success or failure is not indicated by the response code.
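
Since a 202 response only confirms that the request was accepted, a client typically polls the instance until it reaches a usable state. The sketch below assumes that `active` is the status to wait for (it is the example value in the schema) and that `get_status` is a caller-supplied function, hypothetical here, that fetches the instance and returns its `status` string.

```python
import time


def wait_until_active(get_status, timeout_s=1800.0, interval_s=15.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll until the status is "active" or the timeout elapses.

    `get_status` is a hypothetical callable returning the current status.
    `sleep` and `clock` are injectable so the loop can be tested quickly.
    """
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status == "active":
            return status
        if clock() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout_s} seconds")
        sleep(interval_s)
```

The timeout and interval defaults are arbitrary placeholders; suitable values depend on how long provisioning takes for the chosen model and accelerators.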

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

dedicated_inference object optional

A Dedicated Inference instance.

Show child properties

created_at string (date-time) optional read-only

Example: 2024-01-09T20:44:32Z

When the Dedicated Inference was created.

endpoints object optional read-only

Show child properties

private_endpoint_fqdn string (uri) optional

Example: https://b4bfug4jc41kts2ro54if91eo-private-dedicated-inference.do-infra.ai

Private VPC FQDN of the Dedicated Inference instance.

public_endpoint_fqdn string (uri) optional

Example: https://b4bfug4jc41kts2ro54if91eo-public-dedicated-inference.do-infra.ai

Public FQDN of the Dedicated Inference instance.

id string (uuid) optional read-only

Example: 6b5c619c-359c-44ca-87e2-47e98170c01d

Unique ID of the Dedicated Inference.

pending_deployment_spec object optional

Pending deployment when status is provisioning or updating.

Show child properties

created_at string (date-time) optional

Example: 2024-01-09T20:44:32Z

enable_public_endpoint boolean optional

Whether to expose a public LLM endpoint.

id string (uuid) optional

Example: 7c6d729d-360d-44db-88e3-58e98281d12e

Deployment UUID.

model_deployments array of object optional

At least one model deployment is required.

Show child properties

accelerators array of object optional

Accelerator configuration for this deployment.

Additional nested properties not shown. Refer to the full API spec for details.

model_id string optional

Used to identify an existing deployment when updating; empty means create new.

model_provider string, one of: hugging_face optional

Example: hugging_face

Model provider.

model_slug string optional

Example: mistral/mistral-7b-instruct-v3

Model identifier (e.g. Hugging Face slug).

workload_config object optional

Workload-specific configuration (e.g. ISL/OSL in future).

Additional nested properties not shown. Refer to the full API spec for details.

name string optional

Example: new-dedicated-inference

Name of the Dedicated Inference. Must be unique within the team.

status string, one of: provisioning, updating optional

Example: provisioning

updated_at string (date-time) optional

Example: 2024-01-09T20:44:32Z

version integer optional

Example: 1

Spec version.

vpc object optional

Show child properties

uuid string (uuid) required

Example: 997615ce-132d-4bae-9270-9ee21b395e5d

VPC UUID for the Dedicated Inference.

region string optional read-only

Example: atl1

DigitalOcean region where the Dedicated Inference is hosted.

spec object optional

Structured configuration for a Dedicated Inference deployment.

Show child properties

enable_public_endpoint boolean required

Whether to expose a public LLM endpoint.

model_deployments array of object required

At least one model deployment is required.

Show child properties

accelerators array of object optional

Accelerator configuration for this deployment.

Additional nested properties not shown. Refer to the full API spec for details.

model_id string optional

Used to identify an existing deployment when updating; empty means create new.

model_provider string, one of: hugging_face optional

Example: hugging_face

Model provider.

model_slug string optional

Example: mistral/mistral-7b-instruct-v3

Model identifier (e.g. Hugging Face slug).

workload_config object optional

Workload-specific configuration (e.g. ISL/OSL in future).

Additional nested properties not shown. Refer to the full API spec for details.

name string required

Example: new-dedicated-inference

Name of the Dedicated Inference. Must be unique within the team.

region string, one of: atl1, nyc2, tor1 required

Example: atl1

DigitalOcean region where the Dedicated Inference is hosted.

version integer required

Example: 1

Spec version.

vpc object required

Show child properties

uuid string (uuid) required

Example: 997615ce-132d-4bae-9270-9ee21b395e5d

VPC UUID for the Dedicated Inference.

status string (enum) optional read-only

Example: active

Current state of the Dedicated Inference.

updated_at string (date-time) optional read-only

Example: 2024-01-09T20:44:32Z

When the Dedicated Inference was last updated.

vpc_uuid string (uuid) optional read-only

Example: 997615ce-132d-4bae-9270-9ee21b395e5d

VPC UUID of the Dedicated Inference.

token object optional

Access token for authenticating to Dedicated Inference endpoints.

Show child properties

created_at string (date-time) optional read-only

Example: 2024-01-09T20:44:32Z

id string (uuid) optional read-only

Example: 01333f14-a903-4b8e-92b3-363a767aa052

Unique ID of the token.

is_managed boolean optional read-only

Example: false

When true, the token is managed by DigitalOcean (for example, system-provisioned). When false, the token was created by the user.

name string optional read-only

Example: first-token

Name of the token.

value string optional read-only

Example: di_xxxxxxxxxxxxxxxxxxxxxxxx

Token value; only returned once on create. Store securely.
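
Because the token value is returned only once, it is worth persisting it immediately with restrictive permissions. The following is a minimal sketch for POSIX systems; `save_token_value` and the file path are hypothetical, and a secrets manager may be a better fit in production.

```python
import os


def save_token_value(create_response, path):
    """Write token.value from a create response to a file only the owner can access."""
    value = create_response["token"]["value"]
    # 0o600 applies when the file is created; an existing file keeps its mode.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(value)
    return path
```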

401

Authentication failed due to invalid credentials.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

429

The API rate limit has been exceeded.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

500

There was a server error.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

default

There was an unexpected error.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

Response

202

{
  "dedicated_inference": {
    "created_at": "2024-01-09T20:44:32Z",
    "endpoints": {
      "private_endpoint_fqdn": "https://b4bfug4jc41kts2ro54if91eo-private-dedicated-inference.do-infra.ai",
      "public_endpoint_fqdn": "https://b4bfug4jc41kts2ro54if91eo-public-dedicated-inference.do-infra.ai"
    },
    "id": "6b5c619c-359c-44ca-87e2-47e98170c01d",
    "pending_deployment_spec": {
      "created_at": "2024-01-09T20:44:32Z",
      "enable_public_endpoint": true,
      "id": "7c6d729d-360d-44db-88e3-58e98281d12e",
      "model_deployments": [],
      "name": "new-dedicated-inference",
      "status": "provisioning",
      "updated_at": "2024-01-09T20:44:32Z",
      "version": 1
    },
    "region": "atl1",
    "spec": {
      "enable_public_endpoint": true,
      "model_deployments": [],
      "name": "new-dedicated-inference",
      "region": "atl1",
      "version": 1
    },
    "status": "active",
    "updated_at": "2024-01-09T20:44:32Z",
    "vpc_uuid": "997615ce-132d-4bae-9270-9ee21b395e5d"
  },
  "token": {
    "created_at": "2024-01-09T20:44:32Z",
    "id": "01333f14-a903-4b8e-92b3-363a767aa052",
    "is_managed": false,
    "name": "first-token",
    "value": "di_xxxxxxxxxxxxxxxxxxxxxxxx"
  }
}

401

{
  "id": "unauthorized",
  "message": "Unable to authenticate you."
}

429

{
  "id": "too_many_requests",
  "message": "API rate limit exceeded."
}

500

{
  "id": "server_error",
  "message": "Unexpected server-side error"
}

default

{
  "id": "example_error",
  "message": "some error message"
}
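The 202 example above carries both the new instance and an access token in one body. As a minimal sketch, assuming the response has already been decoded into a Python dict, the fields a client typically needs can be pulled out like this. The helper name `summarize_created` is hypothetical and not part of any DigitalOcean SDK.

```python
def summarize_created(response: dict) -> dict:
    """Collect the ID, status, endpoints, and token value from a create response."""
    instance = response.get("dedicated_inference", {})
    endpoints = instance.get("endpoints", {})
    token = response.get("token", {})
    return {
        "id": instance.get("id"),
        "status": instance.get("status"),
        "public_endpoint": endpoints.get("public_endpoint_fqdn"),
        "private_endpoint": endpoints.get("private_endpoint_fqdn"),
        "token_value": token.get("value"),
    }


# Offline demonstration using a trimmed copy of the documented 202 example.
created = {
    "dedicated_inference": {
        "endpoints": {
            "private_endpoint_fqdn": "https://b4bfug4jc41kts2ro54if91eo-private-dedicated-inference.do-infra.ai",
            "public_endpoint_fqdn": "https://b4bfug4jc41kts2ro54if91eo-public-dedicated-inference.do-infra.ai",
        },
        "id": "6b5c619c-359c-44ca-87e2-47e98170c01d",
        "status": "active",
    },
    "token": {"name": "first-token", "value": "di_xxxxxxxxxxxxxxxxxxxxxxxx"},
}
summary = summarize_created(created)
print(summary["id"], summary["status"])
```

Every lookup uses `.get` because the schema marks these fields as optional, so a missing field yields `None` rather than an exception.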

GET Get Dedicated Inference GPU Model Config

/v2/dedicated-inferences/gpu-model-config

Authorizations: bearer_auth (1 scope)

Http: Bearer

Required scopes: dedicated_inference:read

Get the supported GPU and model configurations for Dedicated Inference. Use this endpoint to discover which GPU slugs are supported for each model slug (for example, a Hugging Face model slug). Send a GET request to /v2/dedicated-inferences/gpu-model-config.

Request: `/v2/dedicated-inferences/gpu-model-config`

cURL

curl -i -X GET "https://api.digitalocean.com/v2/dedicated-inferences/gpu-model-config" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DO_TOKEN"
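The same request can be sketched in Python using only the standard library. This is an illustration under assumptions, not an official client: the function names `fetch_gpu_model_configs` and `models_for_gpu` are hypothetical, and the sample data mirrors the documented 200 response example for this endpoint.

```python
import json
import urllib.request

API_BASE = "https://api.digitalocean.com"


def fetch_gpu_model_configs(token: str) -> list:
    """GET /v2/dedicated-inferences/gpu-model-config and return the config list."""
    request = urllib.request.Request(
        f"{API_BASE}/v2/dedicated-inferences/gpu-model-config",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # gpu_model_configs is optional in the schema, so default to an empty list.
    return body.get("gpu_model_configs", [])


def models_for_gpu(configs: list, gpu_slug: str) -> list:
    """Return the model slugs that list gpu_slug among their supported GPUs."""
    return [
        config["model_slug"]
        for config in configs
        if gpu_slug in config.get("gpu_slugs", []) and "model_slug" in config
    ]


# Offline demonstration; no network request is made here.
sample = [
    {
        "gpu_slugs": ["gpu-mi300x1-192gb"],
        "is_gated_model": True,
        "model_name": "Mistral-7B-Instruct-v0.3",
        "model_slug": "mistral/mistral-7b-instruct-v3",
    }
]
print(models_for_gpu(sample, "gpu-mi300x1-192gb"))
```

To query the live API, pass a token with the dedicated_inference:read scope to `fetch_gpu_model_configs` and feed its result to `models_for_gpu`.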

Responses

200

A list of GPU model configurations, each with gpu_slugs, model_slug, model_name, and is_gated_model.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

gpu_model_configs array of object optional

Child properties:

gpu_slugs array of string optional

Example: ["gpu-mi300x1-192gb"]

is_gated_model boolean optional

Whether the model requires gated access (e.g. Hugging Face token).

model_name string optional

Example: Mistral-7B-Instruct-v0.3

model_slug string optional

Example: mistral/mistral-7b-instruct-v3

401

Authentication failed due to invalid credentials.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

429

The API rate limit has been exceeded.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

500

There was a server error.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

default

There was an unexpected error.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

Response

200

{
  "gpu_model_configs": [
    {
      "gpu_slugs": [
        "gpu-mi300x1-192gb"
      ],
      "is_gated_model": true,
      "model_name": "Mistral-7B-Instruct-v0.3",
      "model_slug": "mistral/mistral-7b-instruct-v3"
    }
  ]
}

401

{
  "id": "unauthorized",
  "message": "Unable to authenticate you."
}

429

{
  "id": "too_many_requests",
  "message": "API rate limit exceeded."
}

500

{
  "id": "server_error",
  "message": "Unexpected server-side error"
}

default

{
  "id": "example_error",
  "message": "some error message"
}
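When a request is rejected with 429, the ratelimit-reset header documented above gives a Unix epoch time that a client can use to decide how long to pause. The following is a minimal sketch, assuming the headers are available as a plain dict. The helper name `seconds_until_reset` is hypothetical, and the lookup lowercases the keys as a precaution because HTTP header names are case-insensitive.

```python
import time
from typing import Optional


def seconds_until_reset(headers: dict, now: Optional[float] = None) -> float:
    """Return how many seconds remain until the ratelimit-reset time."""
    lowered = {key.lower(): value for key, value in headers.items()}
    reset_at = int(lowered["ratelimit-reset"])
    current = time.time() if now is None else now
    # Never return a negative wait if the reset time has already passed.
    return max(0.0, reset_at - current)


# Offline demonstration with a fixed clock value.
print(seconds_until_reset({"ratelimit-reset": "1000"}, now=940.0))
```

Passing `now` explicitly keeps the function deterministic for testing. In normal use it can be omitted so the current time is read from the system clock.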

GET List Dedicated Inference Sizes

/v2/dedicated-inferences/sizes

Authorizations: bearer_auth (1 scope)

Http: Bearer

Required scopes: dedicated_inference:read

Get available Dedicated Inference sizes and pricing for supported GPUs. Send a GET request to /v2/dedicated-inferences/sizes.

Request: `/v2/dedicated-inferences/sizes`

cURL

curl -i -X GET "https://api.digitalocean.com/v2/dedicated-inferences/sizes" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DO_TOKEN"
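Because price_per_hour is returned as a string, comparing sizes needs an explicit numeric conversion. The sketch below picks the lowest-priced entry for a GPU slug from an already-decoded response. The function name `cheapest_size` is hypothetical, and the second entry in the sample is invented purely to exercise the comparison.

```python
from decimal import Decimal


def cheapest_size(sizes_response: dict, gpu_slug: str):
    """Return the lowest-priced size entry for gpu_slug, or None if there is none."""
    candidates = [
        size
        for size in sizes_response.get("sizes", [])
        if size.get("gpu_slug") == gpu_slug and "price_per_hour" in size
    ]
    if not candidates:
        return None
    # price_per_hour is a string in the schema; Decimal avoids float rounding.
    return min(candidates, key=lambda size: Decimal(size["price_per_hour"]))


sizes_sample = {
    "enabled_regions": ["nyc2", "atl1"],
    "sizes": [
        {"currency": "USD", "gpu_slug": "gpu-mi300x1-192gb",
         "price_per_hour": "2", "region": "nyc2"},
        # Hypothetical entry, not taken from the API documentation.
        {"currency": "USD", "gpu_slug": "gpu-mi300x1-192gb",
         "price_per_hour": "1.75", "region": "atl1"},
    ],
}
best = cheapest_size(sizes_sample, "gpu-mi300x1-192gb")
print(best["region"], best["price_per_hour"])
```

The sketch assumes all entries share one currency. If the response mixed currencies, the comparison would need to group by the currency field first.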

Responses

200

Enabled regions and sizes with pricing.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

enabled_regions array of string optional

Example: ["nyc2","atl1"]

Regions where Dedicated Inference is available.

sizes array of object optional

Child properties:

currency string optional

Example: USD

gpu_slug string optional

Example: gpu-mi300x1-192gb

price_per_hour string optional

Example: 2

region string optional

Example: nyc2

401

Authentication failed due to invalid credentials.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

429

The API rate limit has been exceeded.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

500

There was a server error.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

default

There was an unexpected error.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

Response

200

{
  "enabled_regions": [
    "nyc2",
    "atl1"
  ],
  "sizes": [
    {
      "currency": "USD",
      "gpu_slug": "gpu-mi300x1-192gb",
      "price_per_hour": "2",
      "region": "nyc2"
    }
  ]
}

401

{
  "id": "unauthorized",
  "message": "Unable to authenticate you."
}

429

{
  "id": "too_many_requests",
  "message": "API rate limit exceeded."
}

500

{
  "id": "server_error",
  "message": "Unexpected server-side error"
}

default

{
  "id": "example_error",
  "message": "some error message"
}
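All of the error responses above share the same body shape: a required id and message plus an optional request_id. One way to turn such a body into a single log line is sketched below. The helper name `describe_error` is hypothetical.

```python
import json


def describe_error(status_code: int, body: str) -> str:
    """Format an error response body as a one-line description."""
    payload = json.loads(body)
    text = f"HTTP {status_code} {payload['id']}: {payload['message']}"
    # request_id is optional, so only append it when the API supplied one.
    request_id = payload.get("request_id")
    if request_id:
        text += f" (request_id={request_id})"
    return text


# Offline demonstration using the documented 401 example body.
print(describe_error(401, '{"id": "unauthorized", "message": "Unable to authenticate you."}'))
```

Including the request_id when present makes the line directly usable in a support ticket, as the schema description suggests.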

GET Get a Dedicated Inference

/v2/dedicated-inferences/{dedicated_inference_id}

Authorizations: bearer_auth (1 scope)

Http: Bearer

Required scopes: dedicated_inference:read

Retrieve an existing Dedicated Inference by ID. Send a GET request to /v2/dedicated-inferences/{dedicated_inference_id}. The status in the response is one of active, new, provisioning, updating, deleting, or error.

Path Parameters

dedicated_inference_id string (uuid) required

Example: 6b5c619c-359c-44ca-87e2-47e98170c01d

A unique identifier for a Dedicated Inference instance.

Request: `/v2/dedicated-inferences/{dedicated_inference_id}`

cURL

curl -i -X GET "https://api.digitalocean.com/v2/dedicated-inferences/6b5c619c-359c-44ca-87e2-47e98170c01d" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DO_TOKEN"
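Since the status moves through several values before settling, a client will usually poll this endpoint. The following is a minimal polling sketch under stated assumptions: `fetch_status` is a caller-supplied function that performs the GET request and returns the status string, and treating only active and error as settled is an assumption of this example rather than documented behavior. The name `wait_until_settled` and the default interval and attempt count are hypothetical.

```python
import time
from typing import Callable

# Assumption: "active" and "error" are treated as settled; every other
# documented status (new, provisioning, updating, deleting) keeps polling.
SETTLED = {"active", "error"}


def wait_until_settled(
    fetch_status: Callable[[], str],
    interval_seconds: float = 15.0,
    max_attempts: int = 40,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status until it returns a settled status, then return it."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    status = ""
    for attempt in range(max_attempts):
        status = fetch_status()
        if status in SETTLED:
            return status
        # Skip the pause after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"status still {status!r} after {max_attempts} attempts")


# Offline demonstration with a scripted sequence of statuses and no real sleeping.
statuses = iter(["provisioning", "provisioning", "active"])
final_status = wait_until_settled(lambda: next(statuses), sleep=lambda seconds: None)
print(final_status)
```

Injecting `sleep` and `fetch_status` keeps the loop testable without network access. In real use, `fetch_status` would call this endpoint and read the status field of the dedicated_inference object.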

Responses

200

Response containing a single Dedicated Inference instance.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

dedicated_inference object optional

A Dedicated Inference instance.

Child properties:

created_at string (date-time) optional read-only

Example: 2024-01-09T20:44:32Z

When the Dedicated Inference was created.

endpoints object optional read-only

Child properties:

private_endpoint_fqdn string (uri) optional

Example: https://b4bfug4jc41kts2ro54if91eo-private-dedicated-inference.do-infra.ai

Private VPC FQDN of the Dedicated Inference instance.

public_endpoint_fqdn string (uri) optional

Example: https://b4bfug4jc41kts2ro54if91eo-public-dedicated-inference.do-infra.ai

Public FQDN of the Dedicated Inference instance.

id string (uuid) optional read-only

Example: 6b5c619c-359c-44ca-87e2-47e98170c01d

Unique ID of the Dedicated Inference.

pending_deployment_spec object optional

The pending deployment spec, present while the status is provisioning or updating.

Child properties:

created_at string (date-time) optional

Example: 2024-01-09T20:44:32Z

enable_public_endpoint boolean optional

Whether to expose a public LLM endpoint.

id string (uuid) optional

Example: 7c6d729d-360d-44db-88e3-58e98281d12e

Deployment UUID.

model_deployments array of object optional

At least one model deployment is required.

Child properties:

accelerators array of object optional

Accelerator configuration for this deployment.

Additional nested properties not shown. Refer to the full API spec for details.

model_id string optional

Used to identify an existing deployment when updating; empty means create new.

model_provider string, one of: hugging_face optional

Example: hugging_face

Model provider.

model_slug string optional

Example: mistral/mistral-7b-instruct-v3

Model identifier (e.g. Hugging Face slug).

workload_config object optional

Workload-specific configuration (for example, input and output sequence lengths, ISL/OSL, in the future).

Additional nested properties not shown. Refer to the full API spec for details.

name string optional

Example: new-dedicated-inference

Name of the Dedicated Inference. Must be unique within the team.

status string, one of: provisioning, updating optional

Example: provisioning

updated_at string (date-time) optional

Example: 2024-01-09T20:44:32Z

version integer optional

Example: 1

Spec version.

vpc object optional

Child properties:

uuid string (uuid) required

Example: 997615ce-132d-4bae-9270-9ee21b395e5d

VPC UUID for the Dedicated Inference.

region string optional read-only

Example: atl1

DigitalOcean region where the Dedicated Inference is hosted.

spec object optional

Structured configuration for a Dedicated Inference deployment.

Child properties:

enable_public_endpoint boolean required

Whether to expose a public LLM endpoint.

model_deployments array of object required

At least one model deployment is required.

Child properties:

accelerators array of object optional

Accelerator configuration for this deployment.

Additional nested properties not shown. Refer to the full API spec for details.

model_id string optional

Used to identify an existing deployment when updating; empty means create new.

model_provider string, one of: hugging_face optional

Example: hugging_face

Model provider.

model_slug string optional

Example: mistral/mistral-7b-instruct-v3

Model identifier (e.g. Hugging Face slug).

workload_config object optional

Workload-specific configuration (for example, input and output sequence lengths, ISL/OSL, in the future).

Additional nested properties not shown. Refer to the full API spec for details.

name string required

Example: new-dedicated-inference

Name of the Dedicated Inference. Must be unique within the team.

region string, one of: atl1, nyc2, tor1 required

Example: atl1

DigitalOcean region where the Dedicated Inference is hosted.

version integer required

Example: 1

Spec version.

vpc object required

Child properties:

uuid string (uuid) required

Example: 997615ce-132d-4bae-9270-9ee21b395e5d

VPC UUID for the Dedicated Inference.

status string, one of: active, new, provisioning, updating, deleting, error optional read-only

Example: active

Current state of the Dedicated Inference.

updated_at string (date-time) optional read-only

Example: 2024-01-09T20:44:32Z

When the Dedicated Inference was last updated.

vpc_uuid string (uuid) optional read-only

Example: 997615ce-132d-4bae-9270-9ee21b395e5d

VPC UUID of the Dedicated Inference.

401

Authentication failed due to invalid credentials.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

404

The resource was not found.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

429

The API rate limit has been exceeded.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

500

There was a server error.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

default

There was an unexpected error.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

Response

200

{
  "dedicated_inference": {
    "created_at": "2024-01-09T20:44:32Z",
    "endpoints": {
      "private_endpoint_fqdn": "https://b4bfug4jc41kts2ro54if91eo-private-dedicated-inference.do-infra.ai",
      "public_endpoint_fqdn": "https://b4bfug4jc41kts2ro54if91eo-public-dedicated-inference.do-infra.ai"
    },
    "id": "6b5c619c-359c-44ca-87e2-47e98170c01d",
    "region": "atl1",
    "status": "active",
    "updated_at": "2024-01-09T20:44:32Z",
    "vpc_uuid": "997615ce-132d-4bae-9270-9ee21b395e5d"
  }
}

401

{
  "id": "unauthorized",
  "message": "Unable to authenticate you."
}

404

{
  "id": "not_found",
  "message": "The resource you requested could not be found."
}

429

{
  "id": "too_many_requests",
  "message": "API rate limit exceeded."
}

500

{
  "id": "server_error",
  "message": "Unexpected server-side error"
}

default

{
  "id": "example_error",
  "message": "some error message"
}

PATCH Update a Dedicated Inference

/v2/dedicated-inferences/{dedicated_inference_id}

Authorizations: bearer_auth (1 scope)

Http: Bearer

Required scopes: dedicated_inference:update

Update an existing Dedicated Inference. Send a PATCH request to /v2/dedicated-inferences/{dedicated_inference_id} with an updated spec, access_tokens, or both. The status changes to updating and returns to active once the update completes.

Path Parameters

dedicated_inference_id string (uuid) required

Example: 6b5c619c-359c-44ca-87e2-47e98170c01d

A unique identifier for a Dedicated Inference instance.

Request Body: `application/json`

access_tokens object optional

Provider tokens for model access (e.g. gated Hugging Face models).

Child properties:

hugging_face_token string optional

Example: $HF_TOKEN

Hugging Face token required for gated models.

spec object optional

Structured configuration for a Dedicated Inference deployment.

Child properties of spec:

enable_public_endpoint boolean required

Whether to expose a public LLM endpoint.

model_deployments array of object required

At least one model deployment is required.

Child properties of model_deployments:

accelerators array of object optional

Accelerator configuration for this deployment.

Child properties of accelerators:

accelerator_slug string required

Example: gpu-mi300x1-192gb

DigitalOcean GPU slug.

scale integer required

Example: 1

Number of accelerator instances.

status string, one of: new, provisioning, active optional read-only

Example: active

Current state of the Accelerator.

type string required

Example: prefill_decode

Accelerator type (e.g. prefill_decode).

model_id string optional

Used to identify an existing deployment when updating; empty means create new.

model_provider string, one of: hugging_face optional

Example: hugging_face

Model provider.

model_slug string optional

Example: mistral/mistral-7b-instruct-v3

Model identifier (e.g. Hugging Face slug).

workload_config object optional

Workload-specific configuration (e.g. input and output sequence lengths, ISL/OSL, in the future).

name string required

Example: new-dedicated-inference

Name of the Dedicated Inference. Must be unique within the team.

region string, one of: atl1, nyc2, tor1 required

Example: atl1

DigitalOcean region where the Dedicated Inference is hosted.

version integer required

Example: 1

Spec version.

vpc object required

Child properties of vpc:

uuid string (uuid) required

Example: 997615ce-132d-4bae-9270-9ee21b395e5d

VPC UUID for the Dedicated Inference.

Request: `/v2/dedicated-inferences/{dedicated_inference_id}`

Payload

Content type application/json

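{
  "access_tokens": {
    "hugging_face_token": "$HF_TOKEN"
  },
  "spec": {
    "enable_public_endpoint": true,
    "model_deployments": [
      {
        "accelerators": [
          {
            "accelerator_slug": "gpu-mi300x1-192gb",
            "scale": 1,
            "type": "prefill_decode"
          }
        ],
        "model_provider": "hugging_face",
        "model_slug": "mistral/mistral-7b-instruct-v3"
      }
    ],
    "name": "new-dedicated-inference",
    "region": "atl1",
    "version": 1,
    "vpc": {
      "uuid": "997615ce-132d-4bae-9270-9ee21b395e5d"
    }
  }
}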

cURL

curl -i -X PATCH "https://api.digitalocean.com/v2/dedicated-inferences/6b5c619c-359c-44ca-87e2-47e98170c01d" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DIGITALOCEAN_TOKEN" \
  -d '{
    "spec": {
      "version": 1,
      "name": "renamed-dedicated-inference",
      "region": "atl1",
      "enable_public_endpoint": true,
      "vpc": { "uuid": "997615ce-132d-4bae-9270-9ee21b395e5d" },
      "model_deployments": [{
        "model_provider": "hugging_face",
        "model_slug": "mistral/mistral-7b-instruct-v3",
        "accelerators": [{
          "accelerator_slug": "gpu-mi300x1-192gb",
          "scale": 3,
          "type": "prefill_decode"
        }]
      }]
    },
    "access_tokens": { "hugging_face_token": "'"$HF_TOKEN"'" }
  }'
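
The request body can also be assembled programmatically so that the constraints stated in the schema are checked before anything is sent. This is a sketch under assumptions: `build_update_body` is a hypothetical helper, and the checks cover only the rules written out in the schema above (allowed regions, at least one model deployment, and the required accelerator fields).

```python
from typing import Any, Optional

# Region values and required accelerator fields as listed in the schema above.
REGIONS = {"atl1", "nyc2", "tor1"}
ACCELERATOR_REQUIRED = {"accelerator_slug", "scale", "type"}


def build_update_body(
    name: str,
    region: str,
    vpc_uuid: str,
    model_deployments: list[dict[str, Any]],
    enable_public_endpoint: bool,
    version: int = 1,
    hugging_face_token: Optional[str] = None,
) -> dict[str, Any]:
    """Assemble a PATCH body and check the constraints stated in the schema."""
    if region not in REGIONS:
        raise ValueError(f"unsupported region: {region!r}")
    if not model_deployments:
        raise ValueError("at least one model deployment is required")
    for deployment in model_deployments:
        for accelerator in deployment.get("accelerators", []):
            missing = ACCELERATOR_REQUIRED - set(accelerator)
            if missing:
                raise ValueError(f"accelerator is missing {sorted(missing)}")
    body: dict[str, Any] = {
        "spec": {
            "version": version,
            "name": name,
            "region": region,
            "enable_public_endpoint": enable_public_endpoint,
            "vpc": {"uuid": vpc_uuid},
            "model_deployments": model_deployments,
        }
    }
    if hugging_face_token:
        # Only needed for gated models, so it is omitted when not supplied.
        body["access_tokens"] = {"hugging_face_token": hugging_face_token}
    return body
```

Serializing the result with `json.dumps` yields a payload suitable for the `-d` argument or an HTTP client's request body.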

Responses

202

The Dedicated Inference update was accepted for processing. The response contains only the dedicated_inference object; no token is returned.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.
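
A client can read these headers to decide whether to pause before the next call. The sketch below is one possible approach: `parse_rate_limit` is a hypothetical helper, header names are matched case-insensitively, and `ratelimit-remaining` is interpreted as the number of requests still available, which its name suggests but the text above does not spell out.

```python
from typing import Mapping, Union


def parse_rate_limit(headers: Mapping[str, str], now: float) -> dict[str, Union[int, float]]:
    """Extract the rate limit headers and the seconds left until the reset time."""
    lowered = {key.lower(): value for key, value in headers.items()}
    reset_epoch = int(lowered["ratelimit-reset"])  # Unix epoch seconds
    return {
        "limit": int(lowered["ratelimit-limit"]),
        "remaining": int(lowered["ratelimit-remaining"]),
        "seconds_until_reset": max(0, reset_epoch - now),
    }
```

Calling it with `now=time.time()` gives the wait time relative to the current clock.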

Response Schema: application/json

dedicated_inference object optional

A Dedicated Inference instance.

Child properties of dedicated_inference:

created_at string (date-time) optional read-only

Example: 2024-01-09T20:44:32Z

When the Dedicated Inference was created.

endpoints object optional read-only

Child properties of endpoints:

private_endpoint_fqdn string (uri) optional

Example: https://b4bfug4jc41kts2ro54if91eo-private-dedicated-inference.do-infra.ai

Private VPC FQDN of the Dedicated Inference instance.

public_endpoint_fqdn string (uri) optional

Example: https://b4bfug4jc41kts2ro54if91eo-public-dedicated-inference.do-infra.ai

Public FQDN of the Dedicated Inference instance.

id string (uuid) optional read-only

Example: 6b5c619c-359c-44ca-87e2-47e98170c01d

Unique ID of the Dedicated Inference.

pending_deployment_spec object optional

Pending deployment when status is provisioning or updating.

Child properties of pending_deployment_spec:

created_at string (date-time) optional

Example: 2024-01-09T20:44:32Z

enable_public_endpoint boolean optional

Whether to expose a public LLM endpoint.

id string (uuid) optional

Example: 7c6d729d-360d-44db-88e3-58e98281d12e

Deployment UUID.

model_deployments array of object optional

At least one model deployment is required.

Child properties of model_deployments:

accelerators array of object optional

Accelerator configuration for this deployment.

Additional nested properties not shown. Refer to the full API spec for details.

model_id string optional

Used to identify an existing deployment when updating; empty means create new.

model_provider string, one of: hugging_face optional

Example: hugging_face

Model provider.

model_slug string optional

Example: mistral/mistral-7b-instruct-v3

Model identifier (e.g. Hugging Face slug).

workload_config object optional

Workload-specific configuration (e.g. ISL/OSL in future).

Additional nested properties not shown. Refer to the full API spec for details.

name string optional

Example: new-dedicated-inference

Name of the Dedicated Inference. Must be unique within the team.

status string, one of: provisioning, updating optional

Example: provisioning

updated_at string (date-time) optional

Example: 2024-01-09T20:44:32Z

version integer optional

Example: 1

Spec version.

vpc object optional

Child properties of vpc:

uuid string (uuid) required

Example: 997615ce-132d-4bae-9270-9ee21b395e5d

VPC UUID for the Dedicated Inference.

region string optional read-only

Example: atl1

DigitalOcean region where the Dedicated Inference is hosted.

spec object optional

Structured configuration for a Dedicated Inference deployment.

Child properties of spec:

enable_public_endpoint boolean required

Whether to expose a public LLM endpoint.

model_deployments array of object required

At least one model deployment is required.

Child properties of model_deployments:

accelerators array of object optional

Accelerator configuration for this deployment.

Additional nested properties not shown. Refer to the full API spec for details.

model_id string optional

Used to identify an existing deployment when updating; empty means create new.

model_provider string, one of: hugging_face optional

Example: hugging_face

Model provider.

model_slug string optional

Example: mistral/mistral-7b-instruct-v3

Model identifier (e.g. Hugging Face slug).

workload_config object optional

Workload-specific configuration (e.g. ISL/OSL in future).

Additional nested properties not shown. Refer to the full API spec for details.

name string required

Example: new-dedicated-inference

Name of the Dedicated Inference. Must be unique within the team.

region string, one of: atl1, nyc2, tor1 required

Example: atl1

DigitalOcean region where the Dedicated Inference is hosted.

version integer required

Example: 1

Spec version.

vpc object required

Child properties of vpc:

uuid string (uuid) required

Example: 997615ce-132d-4bae-9270-9ee21b395e5d

VPC UUID for the Dedicated Inference.

status string (enum) optional read-only

Example: active

Current state of the Dedicated Inference.

updated_at string (date-time) optional read-only

Example: 2024-01-09T20:44:32Z

When the Dedicated Inference was last updated.

vpc_uuid string (uuid) optional read-only

Example: 997615ce-132d-4bae-9270-9ee21b395e5d

VPC UUID of the Dedicated Inference.

401

Authentication failed due to invalid credentials.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

404

The resource was not found.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

429

The API rate limit has been exceeded.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

500

There was a server error.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

default

There was an unexpected error.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

Response

202

{
  "dedicated_inference": {
    "created_at": "2024-01-09T20:44:32Z",
    "endpoints": {
      "private_endpoint_fqdn": "https://b4bfug4jc41kts2ro54if91eo-private-dedicated-inference.do-infra.ai",
      "public_endpoint_fqdn": "https://b4bfug4jc41kts2ro54if91eo-public-dedicated-inference.do-infra.ai"
    },
    "id": "6b5c619c-359c-44ca-87e2-47e98170c01d",
    "pending_deployment_spec": {
      "created_at": "2024-01-09T20:44:32Z",
      "enable_public_endpoint": true,
      "id": "7c6d729d-360d-44db-88e3-58e98281d12e",
      "model_deployments": [],
      "name": "new-dedicated-inference",
      "status": "provisioning",
      "updated_at": "2024-01-09T20:44:32Z",
      "version": 1
    },
    "region": "atl1",
    "spec": {
      "enable_public_endpoint": true,
      "model_deployments": [],
      "name": "new-dedicated-inference",
      "region": "atl1",
      "version": 1
    },
    "status": "active",
    "updated_at": "2024-01-09T20:44:32Z",
    "vpc_uuid": "997615ce-132d-4bae-9270-9ee21b395e5d"
  }
}

401

{
  "id": "unauthorized",
  "message": "Unable to authenticate you."
}

404

{
  "id": "not_found",
  "message": "The resource you requested could not be found."
}

429

{
  "id": "too_many_requests",
  "message": "API rate limit exceeded."
}

500

{
  "id": "server_error",
  "message": "Unexpected server-side error"
}

default

{
  "id": "example_error",
  "message": "some error message"
}
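
The error bodies above share one shape: a required id and message, plus an optional request_id. A minimal sketch of turning such a body into a Python exception follows; `ApiError` and `parse_error` are illustrative names, not part of any DigitalOcean client library.

```python
import json
from typing import Optional


class ApiError(Exception):
    """Carries the fields of an error response body."""

    def __init__(self, status: int, error_id: str, message: str,
                 request_id: Optional[str] = None) -> None:
        super().__init__(f"{status} {error_id}: {message}")
        self.status = status
        self.error_id = error_id
        self.message = message
        self.request_id = request_id


def parse_error(status: int, raw_body: str) -> ApiError:
    """Decode an error body; id and message are required, request_id is optional."""
    data = json.loads(raw_body)
    return ApiError(status, data["id"], data["message"], data.get("request_id"))
```

Keeping request_id on the exception makes it easy to include in a support ticket when the API provides one.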

DELETE Delete a Dedicated Inference

/v2/dedicated-inferences/{dedicated_inference_id}

Authorizations: bearer_auth (1 scope)

Http: Bearer

Required scopes: dedicated_inference:delete

Delete an existing Dedicated Inference. Send a DELETE request to /v2/dedicated-inferences/{dedicated_inference_id}. A 202 Accepted response indicates that the request was accepted for processing.

Path Parameters

dedicated_inference_id string (uuid) required

Example: 6b5c619c-359c-44ca-87e2-47e98170c01d

A unique identifier for a Dedicated Inference instance.

Request: `/v2/dedicated-inferences/{dedicated_inference_id}`

cURL

curl -i -X DELETE "https://api.digitalocean.com/v2/dedicated-inferences/6b5c619c-359c-44ca-87e2-47e98170c01d" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DIGITALOCEAN_TOKEN"
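
A deletion call can be wrapped so that the status codes documented for this endpoint map to simple outcomes. This is a sketch with assumptions: `send` is a caller-supplied function taking a method and path and returning a status code and body text, and treating 404 as "nothing to delete" rather than an error is a design choice of this example.

```python
import uuid
from typing import Callable, Tuple


def delete_dedicated_inference(
    dedicated_inference_id: str,
    send: Callable[[str, str], Tuple[int, str]],
) -> bool:
    """Return True if deletion was accepted (202), False if the ID was not found."""
    # Reject malformed IDs locally; uuid.UUID raises ValueError on bad input.
    canonical_id = str(uuid.UUID(dedicated_inference_id))
    status, body = send("DELETE", f"/v2/dedicated-inferences/{canonical_id}")
    if status == 202:
        # Accepted only: the deletion itself continues asynchronously.
        return True
    if status == 404:
        return False
    raise RuntimeError(f"delete failed with HTTP {status}: {body}")
```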

Responses

202

The request was accepted for processing. A 202 response does not indicate the success or failure of the deletion itself.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

401

Authentication failed due to invalid credentials.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

404

The resource was not found.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

429

The API rate limit has been exceeded.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

500

There was a server error.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

default

There was an unexpected error.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

Response

401

{
  "id": "unauthorized",
  "message": "Unable to authenticate you."
}

404

{
  "id": "not_found",
  "message": "The resource you requested could not be found."
}

429

{
  "id": "too_many_requests",
  "message": "API rate limit exceeded."
}

500

{
  "id": "server_error",
  "message": "Unexpected server-side error"
}

default

{
  "id": "example_error",
  "message": "some error message"
}

GET List Dedicated Inference Accelerators

/v2/dedicated-inferences/{dedicated_inference_id}/accelerators

Authorizations: bearer_auth (1 scope)

Http: Bearer

Required scopes: dedicated_inference:read

List all accelerators (GPUs) in use by a Dedicated Inference instance. Send a GET request to /v2/dedicated-inferences/{dedicated_inference_id}/accelerators. Optionally filter by slug and use page/per_page for pagination.

Path Parameters

dedicated_inference_id string (uuid) required

Example: 6b5c619c-359c-44ca-87e2-47e98170c01d

A unique identifier for a Dedicated Inference instance.

Query Parameters

per_page integer 1 – 200 optional

Example: 20

Number of items returned per page.

Default: 20

page integer >= 1 optional

Example: 1

Which 'page' of paginated results to return.

Default: 1

slug string optional

Example: gpu-mi300x1-192gb

Filter accelerators by GPU slug.
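
The query parameters above can be validated and encoded before the request is made. The following sketch uses only the documented ranges and defaults (per_page from 1 to 200, page at least 1, optional slug); `accelerators_url` is a hypothetical helper name.

```python
from typing import Optional
from urllib.parse import urlencode

API_BASE = "https://api.digitalocean.com"


def accelerators_url(
    dedicated_inference_id: str,
    per_page: int = 20,
    page: int = 1,
    slug: Optional[str] = None,
) -> str:
    """Build the list-accelerators URL, enforcing the documented parameter ranges."""
    if not 1 <= per_page <= 200:
        raise ValueError("per_page must be between 1 and 200")
    if page < 1:
        raise ValueError("page must be 1 or greater")
    params = {"per_page": per_page, "page": page}
    if slug:
        params["slug"] = slug  # optional filter by GPU slug
    path = f"/v2/dedicated-inferences/{dedicated_inference_id}/accelerators"
    return f"{API_BASE}{path}?{urlencode(params)}"
```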

Request: `/v2/dedicated-inferences/{dedicated_inference_id}/accelerators`

cURL

curl -i -X GET "https://api.digitalocean.com/v2/dedicated-inferences/6b5c619c-359c-44ca-87e2-47e98170c01d/accelerators" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DIGITALOCEAN_TOKEN"

Responses

200

The response is a JSON object with an accelerators key whose value is an array of accelerator objects. Pagination uses the same links and meta structure as other list endpoints.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

accelerators array of object optional

Child properties of accelerators:

created_at string (date-time) optional read-only

Example: 2024-01-09T20:44:32Z

id string (uuid) optional read-only

Example: 5b5c619c-359c-44ca-87e2-47e98170c02f

Unique ID of the accelerator.

name string optional read-only

Example: mi300x1-ghfpsf

Name of the accelerator.

role string optional read-only

Example: prefill_decode

Role of the accelerator (e.g. prefill_decode).

slug string optional read-only

Example: gpu-mi300x1-192gb

DigitalOcean GPU slug.

status string optional read-only

Example: active

Status of the accelerator.

links object optional

Child properties of links:

pages anyOf optional

One of:

Forward Links

last string optional

Example: https://api.digitalocean.com/v2/images?page=2

URI of the last page of the results.

next string optional

Example: https://api.digitalocean.com/v2/images?page=2

URI of the next page of the results.

Backward Links

first string optional

Example: https://api.digitalocean.com/v2/images?page=1

URI of the first page of the results.

prev string optional

Example: https://api.digitalocean.com/v2/images?page=1

URI of the previous page of the results.

meta object required
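
To collect every accelerator across pages, a client can follow links.pages.next until it is absent. The sketch below assumes a caller-supplied `fetch_json` function that takes a URL and returns the decoded response body as a dict; both helper names are illustrative.

```python
from typing import Any, Callable, Iterator, Optional


def next_page_url(body: dict[str, Any]) -> Optional[str]:
    """Return links.pages.next if present, else None."""
    links = body.get("links") or {}
    pages = links.get("pages") or {}
    return pages.get("next")


def iter_accelerators(
    first_url: str,
    fetch_json: Callable[[str], dict[str, Any]],
) -> Iterator[dict[str, Any]]:
    """Yield accelerator objects from every page, following the next links."""
    url: Optional[str] = first_url
    while url:
        body = fetch_json(url)
        yield from body.get("accelerators", [])
        url = next_page_url(body)
```

Because it is a generator, pages are fetched lazily, so a caller that stops early avoids unnecessary requests against the rate limit.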

401

Authentication failed due to invalid credentials.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

404

The resource was not found.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

429

The API rate limit has been exceeded.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

500

There was a server error.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

default

There was an unexpected error.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

Response

200

{
  "accelerators": [
    {
      "created_at": "2024-01-09T20:44:32Z",
      "id": "5b5c619c-359c-44ca-87e2-47e98170c02f",
      "name": "mi300x1-ghfpsf",
      "role": "prefill_and_decode",
      "slug": "gpu-mi300x1-192gb",
      "status": "active"
    }
  ],
  "links": {
    "pages": {
      "first": "https://api.digitalocean.com/v2/dedicated-inferences/6b5c619c-359c-44ca-87e2-47e98170c01d/accelerators?page=1&per_page=20",
      "last": "https://api.digitalocean.com/v2/dedicated-inferences/6b5c619c-359c-44ca-87e2-47e98170c01d/accelerators?page=1&per_page=20"
    }
  },
  "meta": {
    "total": 1
  }
}

401

{
  "id": "unauthorized",
  "message": "Unable to authenticate you."
}

404

{
  "id": "not_found",
  "message": "The resource you requested could not be found."
}

429

{
  "id": "too_many_requests",
  "message": "API rate limit exceeded."
}

500

{
  "id": "server_error",
  "message": "Unexpected server-side error"
}

default

{
  "id": "example_error",
  "message": "some error message"
}
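
The 200 example above can be consumed with a few lines of client code. The following is a minimal sketch, not an official client: `summarize_accelerators` is a hypothetical helper name, and it assumes the body has already been parsed from JSON and that `meta.total` counts accelerators across all pages.

```python
from collections import Counter


def summarize_accelerators(body: dict) -> dict:
    """Summarize a parsed List Accelerators response body."""
    accelerators = body.get("accelerators", [])
    return {
        # Assumption: meta.total is the count across all pages.
        # Fall back to the size of this page if meta is absent.
        "total": body.get("meta", {}).get("total", len(accelerators)),
        # Number of accelerators per status value, e.g. {"active": 1}.
        "by_status": dict(Counter(a.get("status") for a in accelerators)),
        # IDs usable with the Get a Dedicated Inference Accelerator endpoint.
        "active_ids": [a["id"] for a in accelerators if a.get("status") == "active"],
    }


# Body copied from the 200 example above (links omitted for brevity).
sample = {
    "accelerators": [
        {
            "created_at": "2024-01-09T20:44:32Z",
            "id": "5b5c619c-359c-44ca-87e2-47e98170c02f",
            "name": "mi300x1-ghfpsf",
            "role": "prefill_and_decode",
            "slug": "gpu-mi300x1-192gb",
            "status": "active",
        }
    ],
    "meta": {"total": 1},
}

print(summarize_accelerators(sample))
```

Reading fields with `.get` keeps the helper tolerant of the schema marking every accelerator property as optional.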

GET Get a Dedicated Inference Accelerator

/v2/dedicated-inferences/{dedicated_inference_id}/accelerators/{accelerator_id}

Authorizations: bearer_auth (1 scope)

Http: Bearer

Required scopes: dedicated_inference:read

Retrieve a single accelerator by ID for a Dedicated Inference instance. Send a GET request to /v2/dedicated-inferences/{dedicated_inference_id}/accelerators/{accelerator_id}.

Path Parameters

dedicated_inference_id string (uuid) required

Example: 6b5c619c-359c-44ca-87e2-47e98170c01d

A unique identifier for a Dedicated Inference instance.

accelerator_id string (uuid) required

Example: 5b5c619c-359c-44ca-87e2-47e98170c02f

A unique identifier for a Dedicated Inference accelerator.

Request: `/v2/dedicated-inferences/{dedicated_inference_id}/accelerators/{accelerator_id}`

cURL

curl -i -X GET "https://api.digitalocean.com/v2/dedicated-inferences/6b5c619c-359c-44ca-87e2-47e98170c01d/accelerators/5b5c619c-359c-44ca-87e2-47e98170c02f" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DO_TOKEN"
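
The same request can be prepared from Python using only the standard library. This is a sketch under stated assumptions: `build_get_accelerator_request` is a hypothetical helper, not part of any DigitalOcean SDK, and the token shown is a placeholder. The function only builds the request; nothing is sent until it is passed to `urllib.request.urlopen`.

```python
import urllib.request
import uuid

API_BASE = "https://api.digitalocean.com"


def build_get_accelerator_request(
    dedicated_inference_id: str, accelerator_id: str, token: str
) -> urllib.request.Request:
    """Build (but do not send) a GET request for a single accelerator."""
    # Both path parameters are documented as UUIDs; uuid.UUID raises
    # ValueError for malformed values before any network call is made.
    di_id = uuid.UUID(dedicated_inference_id)
    acc_id = uuid.UUID(accelerator_id)
    url = f"{API_BASE}/v2/dedicated-inferences/{di_id}/accelerators/{acc_id}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


# Placeholder token; a real one would come from the environment.
request = build_get_accelerator_request(
    "6b5c619c-359c-44ca-87e2-47e98170c01d",
    "5b5c619c-359c-44ca-87e2-47e98170c02f",
    "dop_v1_example",
)
print(request.get_method(), request.full_url)
```

Validating the IDs locally avoids spending a rate-limited request on a call that could only return 404.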

Responses

200

A single accelerator object.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

created_at string (date-time) optional read-only

Example: 2024-01-09T20:44:32Z

id string (uuid) optional read-only

Example: 5b5c619c-359c-44ca-87e2-47e98170c02f

Unique ID of the accelerator.

name string optional read-only

Example: mi300x1-ghfpsf

Name of the accelerator.

role string optional read-only

Example: prefill_decode

Role of the accelerator (e.g. prefill_decode).

slug string optional read-only

Example: gpu-mi300x1-192gb

DigitalOcean GPU slug.

status string optional read-only

Example: active

Status of the accelerator.

401

Authentication failed due to invalid credentials.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

404

The resource was not found.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

429

The API rate limit has been exceeded.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

500

There was a server error.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

default

There was an unexpected error.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

Response

200

{
  "created_at": "2024-01-09T20:44:32Z",
  "id": "5b5c619c-359c-44ca-87e2-47e98170c02f",
  "name": "mi300x1-ghfpsf",
  "role": "prefill_decode",
  "slug": "gpu-mi300x1-192gb",
  "status": "active"
}

401

{
  "id": "unauthorized",
  "message": "Unable to authenticate you."
}

404

{
  "id": "not_found",
  "message": "The resource you requested could not be found."
}

429

{
  "id": "too_many_requests",
  "message": "API rate limit exceeded."
}

500

{
  "id": "server_error",
  "message": "Unexpected server-side error"
}

default

{
  "id": "example_error",
  "message": "some error message"
}
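
Every error response above shares the same `id`, `message`, and optional `request_id` fields, so a client can map them to one exception type. The sketch below is illustrative only: `ApiError` and `error_from_response` are hypothetical names, and treating `ratelimit-reset` as the moment to retry after a 429 is an assumption based on the header description (a Unix epoch time at which the oldest request expires).

```python
from typing import Mapping, Optional


class ApiError(Exception):
    """Error built from the documented id/message/request_id fields."""

    def __init__(
        self,
        status: int,
        error_id: str,
        message: str,
        request_id: Optional[str] = None,
        retry_in: Optional[int] = None,
    ) -> None:
        super().__init__(f"{status} {error_id}: {message}")
        self.status = status
        self.error_id = error_id
        self.message = message
        self.request_id = request_id  # include this in support tickets
        self.retry_in = retry_in      # seconds to wait; only set for 429


def error_from_response(
    status: int, headers: Mapping[str, str], body: Mapping[str, str], now: float
) -> ApiError:
    """Convert an error response into an ApiError without raising it."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    retry_in = None
    if status == 429 and "ratelimit-reset" in lowered:
        # Assumption: waiting until the reset time frees at least one request.
        retry_in = max(0, int(lowered["ratelimit-reset"]) - int(now))
    return ApiError(
        status,
        body.get("id", "unknown"),
        body.get("message", ""),
        body.get("request_id"),
        retry_in,
    )


err = error_from_response(
    429,
    {"ratelimit-reset": "1700000060"},
    {"id": "too_many_requests", "message": "API rate limit exceeded."},
    now=1700000000,
)
print(err, err.retry_in)
```

In real use the caller would pass `time.time()` as `now`; taking it as a parameter keeps the function deterministic and easy to test.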

GET Get Dedicated Inference CA Certificate

/v2/dedicated-inferences/{dedicated_inference_id}/ca

Authorizations: bearer_auth (1 scope)

Http: Bearer

Required scopes: dedicated_inference:read

Retrieve the base64-encoded CA certificate for a Dedicated Inference instance. The certificate is required for private endpoint connectivity. Send a GET request to /v2/dedicated-inferences/{dedicated_inference_id}/ca.

Path Parameters

dedicated_inference_id string (uuid) required

Example: 6b5c619c-359c-44ca-87e2-47e98170c01d

A unique identifier for a Dedicated Inference instance.

Request: `/v2/dedicated-inferences/{dedicated_inference_id}/ca`

cURL

curl -i -X GET "https://api.digitalocean.com/v2/dedicated-inferences/6b5c619c-359c-44ca-87e2-47e98170c01d/ca" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DO_TOKEN"

Responses

200

The base64-encoded CA certificate for the Dedicated Inference instance. Required for private endpoint connectivity.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

cert string required

Base64-encoded CA certificate.

401

Authentication failed due to invalid credentials.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

404

The resource was not found.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

429

The API rate limit has been exceeded.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

500

There was a server error.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

default

There was an unexpected error.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

Response

200

{
  "cert": "LS0tLS1CRUdJTi..."
}

401

{
  "id": "unauthorized",
  "message": "Unable to authenticate you."
}

404

{
  "id": "not_found",
  "message": "The resource you requested could not be found."
}

429

{
  "id": "too_many_requests",
  "message": "API rate limit exceeded."
}

500

{
  "id": "server_error",
  "message": "Unexpected server-side error"
}

default

{
  "id": "example_error",
  "message": "some error message"
}
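
Because `cert` is base64-encoded, a client has to decode it before use. The following is a minimal sketch, assuming the decoded value is a PEM certificate, which the `LS0tLS1CRUdJTi...` prefix in the example suggests since it decodes to `-----BEGIN `. `decode_ca_cert` is a hypothetical helper name, and the certificate body below is a placeholder, not real certificate data.

```python
import base64


def decode_ca_cert(body: dict) -> str:
    """Decode the base64 `cert` field of the response into PEM text."""
    # b64decode raises binascii.Error (a ValueError subclass) on bad padding.
    pem = base64.b64decode(body["cert"]).decode("ascii")
    # Assumption: the payload is PEM, so it should open with a BEGIN marker.
    if not pem.lstrip().startswith("-----BEGIN"):
        raise ValueError("decoded value does not look like a PEM certificate")
    return pem


# Placeholder PEM used only to exercise the round trip.
placeholder_pem = (
    "-----BEGIN CERTIFICATE-----\n"
    "cGxhY2Vob2xkZXI=\n"
    "-----END CERTIFICATE-----\n"
)
response_body = {"cert": base64.b64encode(placeholder_pem.encode("ascii")).decode("ascii")}

decoded = decode_ca_cert(response_body)
print(decoded.splitlines()[0])
```

One option is to write the decoded text to a file and pass its path to `ssl.create_default_context(cafile=...)` when connecting to the private endpoint; how the endpoint expects the CA to be supplied is not stated in this reference.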

GET List Dedicated Inference Tokens

/v2/dedicated-inferences/{dedicated_inference_id}/tokens

Authorizations: bearer_auth (1 scope)

Http: Bearer

Required scopes: dedicated_inference_tokens:read

List all access tokens for a Dedicated Inference instance. Token values are not returned; each entry contains only id, name, created_at, and is_managed. Send a GET request to /v2/dedicated-inferences/{dedicated_inference_id}/tokens.

Path Parameters

dedicated_inference_id string (uuid) required

Example: 6b5c619c-359c-44ca-87e2-47e98170c01d

A unique identifier for a Dedicated Inference instance.

Query Parameters

per_page integer 1 – 200 optional

Example: 2

Number of items returned per page

Default: 20

page integer >= 1 optional

Example: 1

Which 'page' of paginated results to return.

Default: 1

Request: `/v2/dedicated-inferences/{dedicated_inference_id}/tokens`

cURL

curl -i -X GET "https://api.digitalocean.com/v2/dedicated-inferences/6b5c619c-359c-44ca-87e2-47e98170c01d/tokens?page=1&per_page=20" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DO_TOKEN"

Responses

200

The response is a JSON object with a tokens key whose value is an array of token objects. Each object contains id, name, created_at, and is_managed; value is not returned. Pagination uses the same links and meta structure as other list endpoints (e.g. VPC peerings).

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

tokens array of object optional

created_at string (date-time) optional read-only

Example: 2024-01-09T20:44:32Z

id string (uuid) optional read-only

Example: 01333f14-a903-4b8e-92b3-363a767aa052

Unique ID of the token.

is_managed boolean optional read-only

Example: false

When true, the token is managed by DigitalOcean (for example, system-provisioned). When false, the token was created by the user.

name string optional read-only

Example: first-token

Name of the token.

value string optional read-only

Example: di_xxxxxxxxxxxxxxxxxxxxxxxx

Token value. It is returned only once, in the response to the create request, and is omitted from list responses. Store it securely.

links object optional

pages anyOf optional

One of:

Forward Links

last string optional

Example: https://api.digitalocean.com/v2/images?page=2

URI of the last page of the results.

next string optional

Example: https://api.digitalocean.com/v2/images?page=2

URI of the next page of the results.

Backward Links

first string optional

Example: https://api.digitalocean.com/v2/images?page=1

URI of the first page of the results.

prev string optional

Example: https://api.digitalocean.com/v2/images?page=1

URI of the previous page of the results.

meta object required

401

Authentication failed due to invalid credentials.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

404

The resource was not found.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

429

The API rate limit has been exceeded.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

500

There was a server error.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

default

There was an unexpected error.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

Response

200

{
  "links": {
    "pages": {
      "first": "https://api.digitalocean.com/v2/dedicated-inferences/6b5c619c-359c-44ca-87e2-47e98170c01d/tokens?page=1&per_page=20",
      "last": "https://api.digitalocean.com/v2/dedicated-inferences/6b5c619c-359c-44ca-87e2-47e98170c01d/tokens?page=1&per_page=20"
    }
  },
  "meta": {
    "total": 0
  },
  "tokens": []
}
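
The 200 body carries the token list together with `links.pages` and `meta.total`. The following is a minimal Python sketch of reading those fields, assuming the body has already been received as a string. `summarize_token_page` is a hypothetical helper name, not part of any DigitalOcean client library.

```python
import json
from urllib.parse import parse_qs, urlparse


def summarize_token_page(body: str) -> dict:
    """Return the tokens, the total count, and the last page number of a list response."""
    data = json.loads(body)
    pages = data.get("links", {}).get("pages", {})

    last_page = None
    last_url = pages.get("last")
    if last_url:
        # The page number is carried in the `page` query parameter of the link.
        page_values = parse_qs(urlparse(last_url).query).get("page")
        if page_values:
            last_page = int(page_values[0])

    return {
        "tokens": data.get("tokens", []),
        "total": data.get("meta", {}).get("total", 0),
        "last_page": last_page,
    }
```

A caller could compare `last_page` with the page it just requested to decide whether more pages remain.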

401

{
  "id": "unauthorized",
  "message": "Unable to authenticate you."
}

404

{
  "id": "not_found",
  "message": "The resource you requested could not be found."
}

429

{
  "id": "too_many_requests",
  "message": "API rate limit exceeded."
}

500

{
  "id": "server_error",
  "message": "Unexpected server-side error"
}

default

{
  "id": "example_error",
  "message": "some error message"
}

POST Create a Dedicated Inference Token

/v2/dedicated-inferences/{dedicated_inference_id}/tokens

Authorizations: bearer_auth (1 scope)

Http: Bearer

Required scopes: dedicated_inference_tokens:create

Create a new access token for a Dedicated Inference instance. Send a POST request to /v2/dedicated-inferences/{dedicated_inference_id}/tokens with a name. The token value is returned only once in the response; store it securely.

Path Parameters

dedicated_inference_id string (uuid) required

Example: 6b5c619c-359c-44ca-87e2-47e98170c01d

A unique identifier for a Dedicated Inference instance.

Request Body: `application/json`

name string required

Example: new-inference-token

Name for the new token.

Request: `/v2/dedicated-inferences/{dedicated_inference_id}/tokens`

Payload

Content type application/json

{
  "name": "new-inference-token"
}

cURL

curl -i -X POST "https://api.digitalocean.com/v2/dedicated-inferences/6b5c619c-359c-44ca-87e2-47e98170c01d/tokens" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DO_TOKEN" \
  -d '{"name": "new-inference-token"}'
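
The same request can be sketched in Python with only the standard library. This is an illustration under assumptions, not an official client: `build_create_token_request` and `send_create_token_request` are hypothetical helper names, and the API token is passed in by the caller.

```python
import json
import urllib.request

API_BASE = "https://api.digitalocean.com"


def build_create_token_request(
    dedicated_inference_id: str, name: str, api_token: str
) -> urllib.request.Request:
    """Build the POST request that creates an access token with the given name."""
    url = f"{API_BASE}/v2/dedicated-inferences/{dedicated_inference_id}/tokens"
    payload = json.dumps({"name": name}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_token}",
        },
    )


def send_create_token_request(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON body.

    urllib raises urllib.error.HTTPError for 4xx and 5xx responses.
    """
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read())
```

Keeping request construction separate from sending makes the URL, headers, and payload easy to inspect before any network call is made.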

Responses

202

Token created; value is returned only once. Store securely.

Response Headers

ratelimit-limit integer

The default limit on the number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

token object optional

Access token for authenticating to Dedicated Inference endpoints.

created_at string (date-time) optional read-only

Example: 2024-01-09T20:44:32Z

id string (uuid) optional read-only

Example: 01333f14-a903-4b8e-92b3-363a767aa052

Unique ID of the token.

is_managed boolean optional read-only

Example: false

When true, the token is managed by DigitalOcean (for example, system-provisioned). When false, the token was created by the user.

name string optional read-only

Example: first-token

Name of the token.

value string optional read-only

Example: di_xxxxxxxxxxxxxxxxxxxxxxxx

Token value; only returned once on create. Store securely.

401

Authentication failed due to invalid credentials.

Response Headers

ratelimit-limit integer

The default limit on the number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

404

The resource was not found.

Response Headers

ratelimit-limit integer

The default limit on the number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

429

The API rate limit has been exceeded.

Response Headers

ratelimit-limit integer

The default limit on the number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.
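
When a 429 is returned, the `ratelimit-reset` header gives the Unix epoch time at which the oldest request expires. A minimal Python sketch of turning that header into a wait time follows. `seconds_until_reset` is a hypothetical helper, and waiting until the reset time is one possible client-side policy rather than documented guidance.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return the seconds remaining until the time given in ratelimit-reset.

    Returns 0.0 when the header is missing or the reset time has already passed.
    """
    current = time.time() if now is None else now
    # HTTP header names are case-insensitive, so normalize before the lookup.
    lowered = {key.lower(): value for key, value in headers.items()}
    reset = lowered.get("ratelimit-reset")
    if reset is None:
        return 0.0
    return max(0.0, float(reset) - current)
```

A caller might sleep for the returned number of seconds before retrying the request.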

500

There was a server error.

Response Headers

ratelimit-limit integer

The default limit on the number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

default

There was an unexpected error.

Response Headers

ratelimit-limit integer

The default limit on the number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

Response

202

{
  "token": {
    "created_at": "2024-01-09T20:44:32Z",
    "id": "01333f14-a903-4b8e-92b3-363a767aa052",
    "is_managed": false,
    "name": "first-token",
    "value": "di_xxxxxxxxxxxxxxxxxxxxxxxx"
  }
}
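
Because `token.value` is returned only once, a client needs to persist it at creation time. The sketch below shows one way to do that in Python, assuming a local file with owner-only permissions is an acceptable store. `save_token_value` and the file path are hypothetical, and a secrets manager may be a better fit in production.

```python
import json
import os
from typing import Optional


def save_token_value(response_body: str, path: str) -> Optional[str]:
    """Write token.value from a create response to `path` and return the token ID."""
    token = json.loads(response_body).get("token") or {}
    value = token.get("value")
    if not value:
        raise ValueError("response does not contain token.value")

    # O_EXCL refuses to overwrite an existing file. On POSIX systems,
    # mode 0o600 restricts the new file to its owner.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(value)
    return token.get("id")
```

The returned ID is what a later revoke request would reference. The stored value is what clients present to the Dedicated Inference endpoints.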

401

{
  "id": "unauthorized",
  "message": "Unable to authenticate you."
}

404

{
  "id": "not_found",
  "message": "The resource you requested could not be found."
}

429

{
  "id": "too_many_requests",
  "message": "API rate limit exceeded."
}

500

{
  "id": "server_error",
  "message": "Unexpected server-side error"
}

default

{
  "id": "example_error",
  "message": "some error message"
}
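
All error responses above share the same schema: a required `id` and `message`, and an optional `request_id`. The following Python sketch turns such a body into an exception. `ApiError` and `parse_error` are hypothetical names introduced for illustration.

```python
import json
from typing import Optional


class ApiError(Exception):
    """Error built from the id, message, and optional request_id of an error body."""

    def __init__(self, status: int, error_id: str, message: str,
                 request_id: Optional[str] = None) -> None:
        super().__init__(f"{status} {error_id}: {message}")
        self.status = status
        self.error_id = error_id
        self.message = message
        self.request_id = request_id


def parse_error(status: int, body: str) -> ApiError:
    """Decode an error body; `id` and `message` are required by the schema."""
    data = json.loads(body)
    return ApiError(status, data["id"], data["message"], data.get("request_id"))
```

Keeping `request_id` on the exception makes it easy to include when reporting bugs or opening support tickets.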

DELETE Revoke a Dedicated Inference Token

/v2/dedicated-inferences/{dedicated_inference_id}/tokens/{token_id}

Authorizations: bearer_auth (1 scope)

Http: Bearer

Required scopes: dedicated_inference_tokens:delete

Revoke (delete) an access token for a Dedicated Inference instance. Send a DELETE request to /v2/dedicated-inferences/{dedicated_inference_id}/tokens/{token_id}.

Path Parameters

dedicated_inference_id string (uuid) required

Example: 6b5c619c-359c-44ca-87e2-47e98170c01d

A unique identifier for a Dedicated Inference instance.

token_id string (uuid) required

Example: f11d4795-c1db-4ac3-9aa6-a0ea3c58877e

A unique identifier for a Dedicated Inference access token.

Request: `/v2/dedicated-inferences/{dedicated_inference_id}/tokens/{token_id}`

cURL

curl -i -X DELETE "https://api.digitalocean.com/v2/dedicated-inferences/6b5c619c-359c-44ca-87e2-47e98170c01d/tokens/f11d4795-c1db-4ac3-9aa6-a0ea3c58877e" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DO_TOKEN"
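
A Python equivalent of the request above, sketched with the standard library only. `build_revoke_token_request` and `revoke_token` are hypothetical helper names. The sketch treats a 204 status as the only success case, matching the responses listed for this endpoint.

```python
import urllib.request


def build_revoke_token_request(
    dedicated_inference_id: str, token_id: str, api_token: str
) -> urllib.request.Request:
    """Build the DELETE request that revokes one access token."""
    url = (
        "https://api.digitalocean.com/v2/dedicated-inferences/"
        f"{dedicated_inference_id}/tokens/{token_id}"
    )
    # A DELETE request has no body, so only the Authorization header is set.
    return urllib.request.Request(
        url,
        method="DELETE",
        headers={"Authorization": f"Bearer {api_token}"},
    )


def revoke_token(request: urllib.request.Request) -> bool:
    """Send the request and report whether the API answered 204.

    urllib raises urllib.error.HTTPError for 4xx and 5xx responses.
    """
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.status == 204
```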

Responses

204

Token revoked.

Response Headers

ratelimit-limit integer

The default limit on the number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

401

Authentication failed due to invalid credentials.

Response Headers

ratelimit-limit integer

The default limit on the number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

404

The resource was not found.

Response Headers

ratelimit-limit integer

The default limit on the number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

429

The API rate limit has been exceeded.

Response Headers

ratelimit-limit integer

The default limit on the number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

500

There was a server error.

Response Headers

ratelimit-limit integer

The default limit on the number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

default

There was an unexpected error.

Response Headers

ratelimit-limit integer

The default limit on the number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

Response

401

{
  "id": "unauthorized",
  "message": "Unable to authenticate you."
}

404

{
  "id": "not_found",
  "message": "The resource you requested could not be found."
}

429

{
  "id": "too_many_requests",
  "message": "API rate limit exceeded."
}

500

{
  "id": "server_error",
  "message": "Unexpected server-side error"
}

default

{
  "id": "example_error",
  "message": "some error message"
}
