Batch Inference

Validated on 20 Apr 2026 • Last edited on 21 May 2026

Batch Inference is an asynchronous processing capability that helps you scale high-volume AI projects more efficiently. It is suited to heavy-duty workloads such as large-scale data classification, evaluations, and content enrichment: you can submit thousands or even millions of requests in a single job, with a guaranteed results window of 24 hours. By using off-peak GPU capacity, Batch Inference provides high-performance LLM access at a significantly lower price than the standard synchronous APIs, making it a cost-effective choice for non-interactive workloads.

Base URL https://inference.do-ai.run

Endpoints

POST Create a Batch Inference Input File
PUT Upload a Batch Inference Input File
GET List Batch Inference Jobs
POST Create a Batch Inference Job
GET Retrieve a Batch Inference Job
GET Get Batch Inference Results Download Links
POST Cancel a Batch Inference Job

POST Create a Batch Inference Input File

/v1/batches/files

Authorizations: inference_bearer_auth

Http: Bearer

OAuth Authentication

In order to interact with the DigitalOcean API, you or your application must authenticate.

The DigitalOcean API handles this through OAuth, an open standard for authorization. OAuth allows you to delegate access to your account. Scopes can be used to grant full access, read-only access, or access to a specific set of endpoints.

You can generate an OAuth token by visiting the Apps & API section of the DigitalOcean control panel for your account.

An OAuth token functions as a complete authentication request. In effect, it acts as a substitute for a username and password pair.

Because of this, it is absolutely essential that you keep your OAuth tokens secure. In fact, upon generation, the web interface will only display each token a single time in order to prevent the token from being compromised.

DigitalOcean access tokens begin with an identifiable prefix in order to distinguish them from other similar tokens.

dop_v1_ for personal access tokens generated in the control panel
doo_v1_ for tokens generated by applications using the OAuth flow
dor_v1_ for OAuth refresh tokens
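
The prefixes listed above can be checked on the client before a request is sent. The following is a minimal sketch and not part of any DigitalOcean SDK; the name `token_kind` and the labels it returns are hypothetical.

```python
# Hypothetical helper: classify a DigitalOcean token by its documented prefix.
TOKEN_PREFIXES = {
    "dop_v1_": "personal access token",
    "doo_v1_": "OAuth application token",
    "dor_v1_": "OAuth refresh token",
}


def token_kind(token: str) -> str:
    """Return a label for the token's prefix, or "unknown" if none matches."""
    for prefix, label in TOKEN_PREFIXES.items():
        if token.startswith(prefix):
            return label
    return "unknown"


print(token_kind("dop_v1_example"))  # personal access token
```

A value that matches none of the prefixes is not necessarily invalid; it only means the prefix is not one of the three listed here.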

Authenticate with a Bearer Authorization Header

Serverless Inference:

curl -X POST -H "Authorization: Bearer $DIGITALOCEAN_TOKEN" "https://inference.do-ai.run/v1/chat/completions"

Agent Inference:

curl -X POST -H "Authorization: Bearer $AGENT_ACCESS_KEY" "https://{your-agent-url}.agents.do-ai.run/v1/chat/completions?agent=true"

Note: Agent Inference APIs use an agent_access_key (endpoint access key) instead of a DigitalOcean OAuth token. The agent_access_key is provided when you provision an agent endpoint and is scoped to that specific agent. It is not interchangeable with DigitalOcean OAuth tokens (dop_v1_*, doo_v1_*, dor_v1_*), which are used with Serverless Inference and the control-plane API at https://api.digitalocean.com.

Creates a file record and returns a file_id plus a short-lived presigned PUT URL (typically valid for ~15 minutes). Upload the raw JSONL bytes to upload_url (see PUT /<upload_url>) before calling POST /v1/batches.

Request Body: `application/json`

file_name string required

Example: batch_requests.jsonl

The name of the file you plan to upload. The name must end with .jsonl (case-insensitive), and the file must contain one request per line in the schema expected by the target provider.
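
The two constraints on the input file can be checked locally before a file intent is created. This is a sketch under stated assumptions: the function names are hypothetical, blank lines are skipped rather than rejected, and the server's own validation may be stricter than this.

```python
import json


def is_valid_batch_file_name(file_name: str) -> bool:
    """True if the name ends with .jsonl, compared case-insensitively."""
    return file_name.lower().endswith(".jsonl")


def invalid_jsonl_lines(text: str) -> list[int]:
    """Return 1-based numbers of non-empty lines that are not JSON objects."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # assumption: blank lines are skipped, not rejected
        try:
            parsed = json.loads(line)
        except json.JSONDecodeError:
            bad.append(number)
            continue
        if not isinstance(parsed, dict):
            bad.append(number)
    return bad
```

The check only confirms that each line is a JSON object; it does not verify the per-request schema expected by the target provider.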

Request: `/v1/batches/files`

Payload

Content type application/json

{
  "file_name": "batch_requests.jsonl"
}

cURL

curl -sS -X POST "https://inference.do-ai.run/v1/batches/files" \
  -H "Authorization: Bearer $DIGITALOCEAN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "file_name": "batch_requests.jsonl"
  }' | jq

Python

import json
import os
from pathlib import Path

from pydo import Client

client = Client(token=os.environ.get("DIGITALOCEAN_TOKEN"))

input_path = Path("batch_requests.jsonl")
requests = [
    {
        "custom_id": "q-1",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "llama3.3-70b-instruct",
            "messages": [
                {"role": "user", "content": "One fun fact about octopuses."}
            ],
            "max_tokens": 128,
        },
    },
    {
        "custom_id": "q-2",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "llama3.3-70b-instruct",
            "messages": [
                {"role": "user", "content": "One fun fact about sharks."}
            ],
            "max_tokens": 128,
        },
    },
]
input_path.write_text("\n".join(json.dumps(r) for r in requests) + "\n")

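# POST /v1/batches/files only reserves a file_id and a presigned upload_url;
# the JSONL bytes are uploaded separately with PUT.
intent = client.batches.files.create(file_name=input_path.name)

print("file_id:   ", intent["file_id"])
print("upload_url:", intent["upload_url"])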

JavaScript

import { InferenceClient } from "@digitalocean/dots";

const client = new InferenceClient({
    apiKey: process.env.DIGITALOCEAN_TOKEN,
});

const intent = await client.files.create({
    file_name: "batch_requests.jsonl",
});

console.log("file_id:   ", intent.file_id);
console.log("upload_url:", intent.upload_url);

Responses

201

File intent created.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

The number of requests in your hourly quota that remain before you hit your request limit. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.
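
The rate limit headers can drive a simple client-side pause. The sketch below assumes the headers arrive in a mapping keyed by lowercase names (many HTTP libraries expose a case-insensitive mapping instead); the function names are hypothetical.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Seconds until ratelimit-reset (Unix epoch time); 0 if past or absent."""
    reset = headers.get("ratelimit-reset")
    if reset is None:
        return 0.0
    current = time.time() if now is None else now
    return max(0.0, float(reset) - current)


def should_pause(headers: Mapping[str, str]) -> bool:
    """True when ratelimit-remaining is present and has reached zero."""
    remaining = headers.get("ratelimit-remaining")
    return remaining is not None and int(remaining) <= 0
```

A caller might sleep for `seconds_until_reset(headers)` whenever `should_pause(headers)` is true, or after receiving a 429 response.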

Response Schema: application/json

expires_at string (date-time) optional Nullable

Example: 2026-04-24T19:34:19Z

When upload_url expires.

file_id string (uuid) required

Example: a1b2c3d4-e5f6-4789-90ab-cdef12345678

Pass this value as file_id when creating a batch job.

upload_url string (uri) required

Short-lived presigned PUT URL (typically valid for ~15 minutes). If it expires before upload, create a new file intent.
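
Because upload_url is short-lived, it can help to compare expires_at with the current time before starting an upload. This is a minimal sketch: the function names and the 30-second safety margin are assumptions, and a null expires_at is treated as "attempt the upload anyway".

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def parse_timestamp(value: str) -> datetime:
    """Parse a timestamp such as 2026-04-24T19:34:19Z into an aware datetime."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def upload_url_is_usable(
    expires_at: Optional[str], now: datetime, margin_seconds: int = 30
) -> bool:
    """True if at least margin_seconds remain; True when expires_at is null."""
    if expires_at is None:
        return True  # assumption: without an expiry, just attempt the upload
    return parse_timestamp(expires_at) - now >= timedelta(seconds=margin_seconds)


print(upload_url_is_usable("2026-04-24T19:34:19Z", datetime.now(timezone.utc)))
```

When the function returns False, the documented recovery is to create a new file intent rather than reuse the old URL.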

400

There was an error parsing the request body.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.
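
The error schema above has two required fields and one optional field, which maps naturally onto a small data class. The following sketch uses hypothetical names (`ApiError`, `parse_error`, `describe`) and is not taken from any DigitalOcean client library.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ApiError:
    id: str
    message: str
    request_id: Optional[str] = None  # optional per the schema


def parse_error(body: str) -> ApiError:
    """Build an ApiError from a JSON error body; id and message are required."""
    data = json.loads(body)
    return ApiError(
        id=data["id"],
        message=data["message"],
        request_id=data.get("request_id"),
    )


def describe(error: ApiError) -> str:
    """One-line summary that includes request_id when present."""
    suffix = f" (request_id: {error.request_id})" if error.request_id else ""
    return f"{error.id}: {error.message}{suffix}"
```

Including request_id in logs makes it available when reporting bugs or opening support tickets.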

401

Authentication failed due to invalid credentials.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

403

The authenticated principal does not have permission to perform this action on the requested resource.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

422

Unprocessable Entity

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

429

The API rate limit has been exceeded.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

500

There was a server error.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

default

There was an unexpected error.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

Response

201

{
  "expires_at": "2026-04-24T19:34:19Z",
  "file_id": "a1b2c3d4-e5f6-4789-90ab-cdef12345678",
  "upload_url": "https://batch-inference.nyc3.digitaloceanspaces.com/uploads/a1b2c3d4-e5f6-4789-90ab-cdef12345678.jsonl?X-Amz-Algorithm=AWS4-HMAC-SHA256\u0026X-Amz-Expires=900\u0026X-Amz-Signature=..."
}
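
In the 201 example, `\u0026` is the JSON escape for `&`, so any JSON parser yields an ordinary query string. The sketch below decodes the example body and reads the X-Amz-Expires parameter for inspection only. That this parameter is present in real URLs is an assumption based on the example; expires_at is the documented field for expiry.

```python
import json
from urllib.parse import parse_qs, urlsplit

# The 201 example body; the raw string keeps the \u0026 escapes for json.loads.
body = r'''{
  "expires_at": "2026-04-24T19:34:19Z",
  "file_id": "a1b2c3d4-e5f6-4789-90ab-cdef12345678",
  "upload_url": "https://batch-inference.nyc3.digitaloceanspaces.com/uploads/a1b2c3d4-e5f6-4789-90ab-cdef12345678.jsonl?X-Amz-Algorithm=AWS4-HMAC-SHA256\u0026X-Amz-Expires=900\u0026X-Amz-Signature=..."
}'''

upload_url = json.loads(body)["upload_url"]
query = parse_qs(urlsplit(upload_url).query)

# Read-only inspection; the URL itself is still passed to PUT unchanged.
lifetime_seconds = int(query["X-Amz-Expires"][0])
print(lifetime_seconds // 60, "minutes")  # 15 minutes
```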

400

{
  "id": "bad_request",
  "message": "error parsing request body",
  "request_id": "4851a473-1621-42ea-b2f9-5071c0ea8414"
}

401

{
  "id": "unauthorized",
  "message": "Unable to authenticate you."
}

403

{
  "id": "forbidden",
  "message": "You do not have permission to perform this action."
}

422

{
  "id": "unprocessable_entity",
  "message": "request payload validation failed",
  "request_id": "4851a473-1621-42ea-b2f9-5071c0ea8414"
}

429

{
  "id": "too_many_requests",
  "message": "API rate limit exceeded."
}

500

{
  "id": "server_error",
  "message": "Unexpected server-side error"
}

default

{
  "id": "example_error",
  "message": "some error message"
}

PUT Upload a Batch Inference Input File

/<upload_url>

Uploads the raw JSONL bytes to the presigned upload_url returned by POST /v1/batches/files.

The URL is dynamic — do not construct it. Use the upload_url value from the previous step verbatim. Its host, path, and query parameters are part of the short-lived (~15 minute) signature and change per request; the server and path shown here are illustrative. If the URL expires before the upload completes, create a new file intent and retry.

POST /v1/batches performs a HEAD check on the uploaded object and will reject the batch if this upload has not completed.

Send the raw JSONL bytes verbatim. Presigned PUT URLs are signature-sensitive to request headers — prefer application/octet-stream or omit Content-Type entirely. A custom value (for example application/jsonl) can cause signature mismatches unless the URL was signed for that exact header.

Request Body: `application/octet-stream`

Raw JSONL bytes — one request per line.

Request: `/<upload_url>`

Payload

Content type application/octet-stream

"string"

cURL

# UPLOAD_URL is the exact upload_url returned by POST /v1/batches/files.
# Use it verbatim; do not modify the host, path, or query string.
#
# Send the raw JSONL bytes with --data-binary so line endings and UTF-8
# are preserved. The presigned URL is signature-sensitive: prefer
# application/octet-stream (or omit Content-Type entirely) — a custom
# value such as application/jsonl can break signature matching unless
# the URL was signed for that exact header.
curl -X PUT "$UPLOAD_URL" \
  -H "Content-Type: application/octet-stream" \
  --data-binary "@batch_requests.jsonl"

Python

# Two-step upload flow:
#   1. Reserve a file_id + presigned upload_url via client.batches.files.create.
#   2. PUT the raw JSONL bytes to upload_url.
#
# The presigned URL is short-lived (~15 minutes) and signature-sensitive —
# use it verbatim and prefer Content-Type application/octet-stream (or
# omit the header entirely). A custom value such as application/jsonl
# can break signature matching.
import os
from pathlib import Path

import requests
from pydo import Client

client = Client(token=os.environ.get("DIGITALOCEAN_TOKEN"))

input_path = Path("batch_requests.jsonl")

# Step 1: reserve the upload slot.
intent = client.batches.files.create(file_name=input_path.name)
upload_url = intent["upload_url"]
file_id = intent["file_id"]

# Step 2: PUT the JSONL bytes to the presigned URL.
with input_path.open("rb") as fh:
    put = requests.put(
        upload_url,
        data=fh,
        headers={"Content-Type": "application/octet-stream"},
        timeout=60,
    )
put.raise_for_status()

print("uploaded file_id:", file_id)

JavaScript

// Two-step upload flow:
//   1. Reserve a file_id + presigned upload_url via client.files.create.
//   2. PUT the raw JSONL bytes to upload_url.
//
// The presigned URL is short-lived (~15 minutes) and signature-sensitive —
// use it verbatim and prefer Content-Type application/octet-stream (or
// omit the header entirely). A custom value such as application/jsonl
// can break signature matching.
import { readFile } from "node:fs/promises";
import { InferenceClient } from "@digitalocean/dots";

const client = new InferenceClient({
    apiKey: process.env.DIGITALOCEAN_TOKEN,
});

// Step 1: reserve the upload slot.
const intent = await client.files.create({ file_name: "batch_requests.jsonl" });

// Step 2: PUT the JSONL bytes to the presigned URL.
const body = await readFile("batch_requests.jsonl");
const res = await fetch(intent.upload_url, {
    method: "PUT",
    headers: { "Content-Type": "application/octet-stream" },
    body,
});
if (!res.ok) {
    throw new Error(`Upload failed: HTTP ${res.status} ${res.statusText}`);
}

console.log("uploaded file_id:", intent.file_id);

Responses

200

Upload accepted by object storage.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

401

Authentication failed due to invalid credentials.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

403

Signature mismatch or URL expired. Create a new file intent with POST /v1/batches/files and upload again.
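
The recovery path for a 403 can be sketched as a small retry loop. To keep the example independent of any HTTP or SDK call, the two network steps are passed in as callables; `create_intent` and `put_bytes` are hypothetical names, and the attempt limit is an assumption.

```python
from typing import Callable, Mapping


def upload_with_fresh_intent(
    create_intent: Callable[[], Mapping[str, str]],
    put_bytes: Callable[[str], int],
    max_attempts: int = 2,
) -> str:
    """Upload via a presigned URL, requesting a new intent after each 403.

    create_intent returns a mapping with file_id and upload_url.
    put_bytes PUTs the file to the given URL and returns the HTTP status code.
    Returns the file_id of the intent whose upload succeeded.
    """
    for _ in range(max_attempts):
        intent = create_intent()
        status = put_bytes(intent["upload_url"])
        if status == 200:
            return intent["file_id"]
        if status != 403:
            raise RuntimeError(f"upload failed with HTTP {status}")
        # 403: signature mismatch or expired URL, so loop and create a new intent.
    raise RuntimeError("upload still rejected with 403 after retries")
```

A 403 that persists across fresh intents is more likely a header mismatch (for example a custom Content-Type) than an expired URL, so retrying alone will not resolve it.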

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

429

The API rate limit has been exceeded.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

500

There was a server error.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

default

There was an unexpected error.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

Response

401

{
  "id": "unauthorized",
  "message": "Unable to authenticate you."
}

429

{
  "id": "too_many_requests",
  "message": "API rate limit exceeded."
}

500

{
  "id": "server_error",
  "message": "Unexpected server-side error"
}

default

{
  "id": "example_error",
  "message": "some error message"
}

GET List Batch Inference Jobs

/v1/batches

Authorizations: inference_bearer_auth

Http: Bearer

Returns a cursor-paginated list of batch jobs, ordered newest first. Use limit to control page size and after to page forward using the last_id from the previous response.

Query Parameters

after string (uuid) optional

Example: 7b2e9c1a-6f4d-4d9b-a0f1-5c4b7e2f8a12

Cursor for pagination. Pass the last_id value from the previous response to fetch the next page. Omit for the first page.

limit integer 1 – 100 optional

Example: 20

Maximum number of batches to return per page.

Default: 20

status string (enum) optional

Example: in_progress

Optional filter restricting results to batches in the given lifecycle state.
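
Cursor pagination with after can be sketched as a generator that follows last_id for as long as the response reports has_more. The `fetch_page` callable is hypothetical: it stands in for whatever performs GET /v1/batches with the given cursor and returns the parsed response body.

```python
from typing import Callable, Iterator, Optional


def iter_batches(
    fetch_page: Callable[[Optional[str]], dict],
) -> Iterator[dict]:
    """Yield every batch across pages, following last_id while has_more is true.

    fetch_page(after) is a hypothetical callable that performs GET /v1/batches
    with the given cursor (None for the first page) and returns the parsed body.
    """
    after: Optional[str] = None
    while True:
        page = fetch_page(after)
        yield from page.get("data") or []
        after = page.get("last_id")
        if not page.get("has_more") or after is None:
            break
```

With the Python client shown in this section, `fetch_page` could wrap client.batches.list, assuming that method accepts an after argument, which this page does not show.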

Request: `/v1/batches`

cURL

curl -sS -X GET "https://inference.do-ai.run/v1/batches?limit=20" \
  -H "Authorization: Bearer $DIGITALOCEAN_TOKEN" \
  -H "Content-Type: application/json" | jq

Python

import os

from pydo import Client

client = Client(token=os.environ.get("DIGITALOCEAN_TOKEN"))

resp = client.batches.list(limit=20)

for b in resp.get("data") or []:
    print(f"{b.get('batch_id'):40}  {b.get('status'):12}  {b.get('created_at')}")

print("has_more:", resp.get("has_more"))
print("last_id: ", resp.get("last_id"))

JavaScript

import { InferenceClient } from "@digitalocean/dots";

const client = new InferenceClient({
    apiKey: process.env.DIGITALOCEAN_TOKEN,
});

const page = await client.batches.list({ limit: 20 });

for (const b of page.data ?? []) {
    console.log(`${b.batch_id}\t${b.status}\t${b.created_at}`);
}

console.log("has_more:", page.has_more);
console.log("last_id: ", page.last_id);

Responses

200

Page of batch jobs.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

data array of object required

Batch jobs on this page, ordered newest first.

Child properties of each data item:

batch_id string (uuid) required

Example: 0e9d1d35-3d1e-4d66-9a2f-8c7e0f6b3e21

Unique identifier for the batch job.

cancelled_at string (date-time) optional Nullable

Example: 2026-04-24T19:45:11Z

completed_at string (date-time) optional Nullable

Example: 2026-04-24T20:15:30Z

completion_window string, one of: 24h required

Example: 24h

created_at string (date-time) required

Example: 2026-04-24T19:19:19Z

endpoint string optional

Example: /v1/chat/completions

Inference endpoint each request is dispatched to.

error_file_id string (uuid) optional Nullable

Error sidecar file. Null when no errors were produced.

errors array of object optional Nullable

Top-level errors that prevented the batch from completing.

Child properties of each errors item:

code string optional

Example: invalid_input_file

line integer optional Nullable

Example: 42

1-based JSONL line number, if applicable.

message string optional

Example: Line 42: missing required field 'custom_id'.

expires_at string (date-time) optional Nullable

Example: 2026-04-25T19:19:19Z

Derived from created_at plus completion_window.

failed_at string (date-time) optional Nullable

Example: 2026-04-24T19:50:00Z

finalizing_at string (date-time) optional Nullable

Example: 2026-04-24T20:10:42Z

in_progress_at string (date-time) optional Nullable

Example: 2026-04-24T19:20:05Z

input_file_id string (uuid) required

Example: a1b2c3d4-e5f6-4789-90ab-cdef12345678

The uploaded JSONL input file.

metadata object optional Nullable

Metadata attached at creation.

output_file_id string (uuid) optional Nullable

Output JSONL file. Populated once the job completes.

provider string, one of: openai, anthropic required

Example: openai

request_counts object optional

Aggregate request counts.

Child properties of request_counts:

completed integer optional

Example: 0

failed integer optional

Example: 0

total integer optional

Example: 10000

request_id string optional

Example: c7e3ad1e-20c3-4e47-9bf2-6f2a4d6a2f11

The idempotency key supplied at creation.

status string (enum) required

Example: in_progress

Lifecycle status. Terminal states: completed, failed, expired, cancelled.

first_id string (uuid) optional Nullable

Example: 0e9d1d35-3d1e-4d66-9a2f-8c7e0f6b3e21

ID of the first batch on this page. Null when the page is empty.

has_more boolean required

Example: false

Whether additional batches exist beyond this page.

last_id string (uuid) optional Nullable

Example: 7b2e9c1a-6f4d-4d9b-a0f1-5c4b7e2f8a12

ID of the last batch on this page. Pass as after to fetch the next page.

object string, one of: list required

Example: list

The object type, always list.
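
Two details of the batch object lend themselves to small helpers: the set of terminal states, and the note that expires_at is derived from created_at plus completion_window. The sketch below uses hypothetical function names and handles only the 24h window listed in the schema.

```python
from datetime import datetime, timedelta

# Terminal states as listed in the status field description.
TERMINAL_STATES = {"completed", "failed", "expired", "cancelled"}


def is_terminal(batch: dict) -> bool:
    """True when the batch has reached a state it will not leave."""
    return batch.get("status") in TERMINAL_STATES


def derived_expiry(created_at: str, completion_window: str = "24h") -> datetime:
    """Compute created_at plus completion_window; only 24h is handled."""
    if completion_window != "24h":
        raise ValueError(f"unsupported completion_window: {completion_window}")
    created = datetime.fromisoformat(created_at.replace("Z", "+00:00"))
    return created + timedelta(hours=24)


batch = {
    "status": "in_progress",
    "created_at": "2026-04-24T19:19:19Z",
    "completion_window": "24h",
}
print(is_terminal(batch))  # False
print(derived_expiry(batch["created_at"], batch["completion_window"]).isoformat())
# 2026-04-25T19:19:19+00:00
```

The computed value matches the expires_at example in the schema (2026-04-25T19:19:19Z) for the created_at example shown there.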

401

Authentication failed due to invalid credentials.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

403

The authenticated principal does not have permission to perform this action on the requested resource.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

429

The API rate limit has been exceeded.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.
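After a 429 response, the ratelimit-reset header can be used to decide how long to wait before retrying. A minimal sketch, assuming the headers are available as a mapping with lowercase keys (HTTP clients differ here) and that `seconds_until_reset` is a hypothetical helper name:

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how long to wait before retrying a rate-limited request.

    Reads ratelimit-reset (Unix epoch seconds) and never returns a
    negative value. A missing header yields 0.0.
    """
    current = time.time() if now is None else now
    reset = headers.get("ratelimit-reset")
    if reset is None:
        return 0.0
    return max(0.0, float(reset) - current)
```

The `now` parameter exists only to make the calculation testable; callers would normally omit it.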

500

There was a server error.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

default

There was an unexpected error.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

Response

200

{
  "data": [
    {
      "batch_id": "0e9d1d35-3d1e-4d66-9a2f-8c7e0f6b3e21",
      "cancelled_at": "2026-04-24T19:45:11Z",
      "completed_at": "2026-04-24T20:15:30Z",
      "completion_window": "24h",
      "created_at": "2026-04-24T19:19:19Z",
      "endpoint": "/v1/chat/completions",
      "error_file_id": "497f6eca-6276-4993-bfeb-53cbbbba6f08",
      "errors": [],
      "expires_at": "2026-04-25T19:19:19Z",
      "failed_at": "2026-04-24T19:50:00Z",
      "finalizing_at": "2026-04-24T20:10:42Z",
      "in_progress_at": "2026-04-24T19:20:05Z",
      "input_file_id": "a1b2c3d4-e5f6-4789-90ab-cdef12345678",
      "output_file_id": "497f6eca-6276-4993-bfeb-53cbbbba6f08",
      "provider": "openai",
      "request_id": "c7e3ad1e-20c3-4e47-9bf2-6f2a4d6a2f11",
      "status": "in_progress"
    }
  ],
  "first_id": "0e9d1d35-3d1e-4d66-9a2f-8c7e0f6b3e21",
  "has_more": false,
  "last_id": "7b2e9c1a-6f4d-4d9b-a0f1-5c4b7e2f8a12",
  "object": "list"
}

401

{
  "id": "unauthorized",
  "message": "Unable to authenticate you."
}

403

{
  "id": "forbidden",
  "message": "You do not have permission to perform this action."
}

429

{
  "id": "too_many_requests",
  "message": "API rate limit exceeded."
}

500

{
  "id": "server_error",
  "message": "Unexpected server-side error"
}

default

{
  "id": "example_error",
  "message": "some error message"
}

POST Create a Batch Inference Job

/v1/batches

Authorizations: inference_bearer_auth

Http: Bearer

Submits a batch job against a previously uploaded JSONL input file. The upload must have completed before this call; otherwise the request is rejected.

Supply a unique request_id to make the submission idempotent — retries with the same value return the existing job. When provider is openai, the url on each JSONL line must match endpoint.
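Because a mismatch between endpoint and the url on a JSONL line can cause problems, it may be worth checking the input file locally before uploading it. The following is a sketch only: `check_openai_jsonl` is a hypothetical helper, and treating all four OpenAI-style fields (custom_id, method, url, body) as mandatory is an assumption rather than documented server behavior.

```python
import json
from typing import Iterable, List, Tuple

# Assumption: every OpenAI-style line carries all four fields.
REQUIRED_FIELDS = ("custom_id", "method", "url", "body")


def check_openai_jsonl(lines: Iterable[str], endpoint: str) -> List[Tuple[int, str]]:
    """Return (line_number, problem) pairs for an OpenAI-style input file.

    Line numbers are 1-based. Blank lines are skipped.
    """
    problems: List[Tuple[int, str]] = []
    for number, raw in enumerate(lines, start=1):
        if not raw.strip():
            continue
        try:
            record = json.loads(raw)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "line is not a JSON object"))
            continue
        missing = [name for name in REQUIRED_FIELDS if name not in record]
        if missing:
            problems.append((number, "missing field(s): " + ", ".join(missing)))
        elif record["url"] != endpoint:
            problems.append((number, f"url {record['url']!r} does not match {endpoint!r}"))
    return problems
```

An empty result means every non-blank line parsed, carried the assumed fields, and targeted the same endpoint that the job will be created with.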

Request Body: `application/json`

completion_window string, one of: 24h required

Example: 24h

Time window in which the job must complete. Jobs that do not finish in time transition to expired.

endpoint string, one of: /v1/responses, /v1/chat/completions optional

Example: /v1/chat/completions

Inference endpoint each request is dispatched to. Required when provider is openai and must match the url on every JSONL line. Must be omitted when provider is anthropic.

file_id string (uuid) required

Example: a1b2c3d4-e5f6-4789-90ab-cdef12345678

The file_id returned by POST /v1/batches/files.

metadata object optional Nullable

Example: {"dataset":"prompts_v1","team":"ml-eval"}

Optional string-valued metadata to attach to the job.

provider string, one of: openai, anthropic required

Example: openai

The inference provider whose JSONL schema the input file conforms to. openai follows the OpenAI Batch API input schema (custom_id, method, url, body); anthropic follows the Anthropic Message Batches JSONL conventions.

request_id string required

Example: c7e3ad1e-20c3-4e47-9bf2-6f2a4d6a2f11

Client-supplied idempotency key. Retries with the same value return the existing job instead of creating a duplicate.
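The provider and endpoint rules described above can be enforced client-side before the request is sent. A minimal sketch, assuming a hypothetical `build_batch_payload` helper that only assembles the JSON body and does not call the API:

```python
import uuid
from typing import Dict, Optional

OPENAI_ENDPOINTS = ("/v1/responses", "/v1/chat/completions")


def build_batch_payload(
    file_id: str,
    provider: str,
    endpoint: Optional[str] = None,
    metadata: Optional[Dict[str, str]] = None,
    request_id: Optional[str] = None,
) -> dict:
    """Assemble a POST /v1/batches body, enforcing the provider/endpoint rules."""
    if provider == "openai":
        if endpoint not in OPENAI_ENDPOINTS:
            raise ValueError(
                "openai batches require endpoint to be one of " + ", ".join(OPENAI_ENDPOINTS)
            )
    elif provider == "anthropic":
        if endpoint is not None:
            raise ValueError("endpoint must be omitted when provider is anthropic")
    else:
        raise ValueError(f"unsupported provider: {provider!r}")

    payload = {
        "completion_window": "24h",  # the only documented value
        "file_id": file_id,
        "provider": provider,
        # Generate a key only when the caller did not supply one.
        "request_id": request_id or str(uuid.uuid4()),
    }
    if endpoint is not None:
        payload["endpoint"] = endpoint
    if metadata:
        payload["metadata"] = metadata
    return payload
```

Failing fast on an invalid combination avoids a round trip that the server would reject anyway.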

Request: `/v1/batches`

Payload

Content type application/json

Examples

Anthropic provider (endpoint omitted):

{
  "completion_window": "24h",
  "file_id": "a1b2c3d4-e5f6-4789-90ab-cdef12345678",
  "metadata": {
    "dataset": "prompts_v1",
    "team": "ml-eval"
  },
  "provider": "anthropic",
  "request_id": "2f1a7d9e-8c03-4d2c-9b7e-6f8e2b1a4c77"
}

OpenAI provider, /v1/chat/completions:

{
  "completion_window": "24h",
  "endpoint": "/v1/chat/completions",
  "file_id": "a1b2c3d4-e5f6-4789-90ab-cdef12345678",
  "provider": "openai",
  "request_id": "c7e3ad1e-20c3-4e47-9bf2-6f2a4d6a2f11"
}

OpenAI provider, /v1/responses:

{
  "completion_window": "24h",
  "endpoint": "/v1/responses",
  "file_id": "a1b2c3d4-e5f6-4789-90ab-cdef12345678",
  "provider": "openai",
  "request_id": "9f7b9d4a-4e6c-4a27-8e35-1b0e4c5a9a12"
}

cURL

# OpenAI provider: endpoint is required (/v1/responses or /v1/chat/completions)
curl -sS -X POST "https://inference.do-ai.run/v1/batches" \
  -H "Authorization: Bearer $DIGITALOCEAN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "file_id": "a1b2c3d4-e5f6-4789-90ab-cdef12345678",
    "provider": "openai",
    "endpoint": "/v1/chat/completions",
    "completion_window": "24h",
    "request_id": "c7e3ad1e-20c3-4e47-9bf2-6f2a4d6a2f11"
  }'

# Anthropic provider: do not send endpoint
curl -sS -X POST "https://inference.do-ai.run/v1/batches" \
  -H "Authorization: Bearer $DIGITALOCEAN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "file_id": "a1b2c3d4-e5f6-4789-90ab-cdef12345678",
    "provider": "anthropic",
    "completion_window": "24h",
    "request_id": "2f1a7d9e-8c03-4d2c-9b7e-6f8e2b1a4c77"
  }'

Python

import os
import uuid

from pydo import Client

client = Client(token=os.environ.get("DIGITALOCEAN_TOKEN"))

batch = client.batches.create(
    file_id=os.environ["BATCH_INPUT_FILE_ID"],
    provider="openai",
    endpoint="/v1/chat/completions",
    completion_window="24h",
    request_id=str(uuid.uuid4()),
)

print("batch_id:", batch.get("batch_id"))
print("status:  ", batch.get("status"))

JavaScript

import { randomUUID } from "node:crypto";
import { InferenceClient } from "@digitalocean/dots";

const client = new InferenceClient({
    apiKey: process.env.DIGITALOCEAN_TOKEN,
});

const batch = await client.batches.create({
    file_id: process.env.BATCH_INPUT_FILE_ID,
    provider: "openai",
    endpoint: "/v1/chat/completions",
    completion_window: "24h",
    request_id: randomUUID(),
});

console.log("batch_id:", batch.batch_id);
console.log("status:  ", batch.status);
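Since retries with the same request_id return the existing job, a retry loop should fix the key once and reuse it on every attempt instead of generating a new UUID each time. A minimal Python sketch, assuming a hypothetical `submit` callable that sends the payload to POST /v1/batches and raises ConnectionError on a transient network failure. The retry count and delay are illustrative defaults.

```python
import time
import uuid
from typing import Callable


def create_with_retry(
    submit: Callable[[dict], dict],
    payload: dict,
    attempts: int = 3,
    delay_seconds: float = 1.0,
) -> dict:
    """Submit a batch job, reusing one request_id across every attempt.

    submit is a hypothetical callable that posts the payload and returns
    the parsed job, raising ConnectionError on a transient failure.
    """
    # Fix the idempotency key once, before the first attempt.
    body = {**payload, "request_id": payload.get("request_id") or str(uuid.uuid4())}
    for attempt in range(1, attempts + 1):
        try:
            return submit(body)
        except ConnectionError:
            if attempt == attempts:
                raise
            time.sleep(delay_seconds)
    raise ValueError("attempts must be at least 1")
```

If a response is lost after the server accepted the job, the next attempt carries the same key, so it should resolve to the job that already exists rather than a duplicate.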

Responses

201

Batch job accepted. Poll GET /v1/batches/{batch_id} for status.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

batch_id string (uuid) required

Example: 0e9d1d35-3d1e-4d66-9a2f-8c7e0f6b3e21

Unique identifier for the batch job.

cancelled_at string (date-time) optional Nullable

Example: 2026-04-24T19:45:11Z

completed_at string (date-time) optional Nullable

Example: 2026-04-24T20:15:30Z

completion_window string, one of: 24h required

Example: 24h

created_at string (date-time) required

Example: 2026-04-24T19:19:19Z

endpoint string optional

Example: /v1/chat/completions

Inference endpoint each request is dispatched to.

error_file_id string (uuid) optional Nullable

Error sidecar file. Null when no errors were produced.

errors array of object optional Nullable

Top-level errors that prevented the batch from completing.

Child properties:

code string optional

Example: invalid_input_file

line integer optional Nullable

Example: 42

1-based JSONL line number, if applicable.

message string optional

Example: Line 42: missing required field 'custom_id'.

expires_at string (date-time) optional Nullable

Example: 2026-04-25T19:19:19Z

Derived from created_at plus completion_window.

failed_at string (date-time) optional Nullable

Example: 2026-04-24T19:50:00Z

finalizing_at string (date-time) optional Nullable

Example: 2026-04-24T20:10:42Z

in_progress_at string (date-time) optional Nullable

Example: 2026-04-24T19:20:05Z

input_file_id string (uuid) required

Example: a1b2c3d4-e5f6-4789-90ab-cdef12345678

The uploaded JSONL input file.

metadata object optional Nullable

Metadata attached at creation.

output_file_id string (uuid) optional Nullable

Output JSONL file. Populated once the job completes.

provider string, one of: openai, anthropic required

Example: openai

request_counts object optional

Aggregate request counts.

Child properties:

completed integer optional

Example: 0

failed integer optional

Example: 0

total integer optional

Example: 10000

request_id string optional

Example: c7e3ad1e-20c3-4e47-9bf2-6f2a4d6a2f11

The idempotency key supplied at creation.

status string (enum) required

Example: in_progress

Lifecycle status. Terminal states: completed, failed, expired, cancelled.
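The expires_at field is described as created_at plus completion_window. As a worked example of that arithmetic, the sketch below reproduces the sample values from the schema. It assumes timestamps use second precision with a trailing Z, as in the examples, and `derive_expires_at` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

# The only documented completion_window value.
WINDOWS = {"24h": timedelta(hours=24)}

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def derive_expires_at(created_at: str, completion_window: str = "24h") -> str:
    """Add the completion window to a created_at timestamp (UTC, Z suffix)."""
    created = datetime.strptime(created_at, TIMESTAMP_FORMAT).replace(tzinfo=timezone.utc)
    expires = created + WINDOWS[completion_window]
    return expires.strftime(TIMESTAMP_FORMAT)


print(derive_expires_at("2026-04-24T19:19:19Z"))  # 2026-04-25T19:19:19Z
```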

400

There was an error parsing the request body.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

401

Authentication failed due to invalid credentials.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

403

The authenticated principal does not have permission to perform this action on the requested resource.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

404

The resource was not found.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

409

The request could not be completed due to a conflict.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

422

The request payload failed validation.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

429

The API rate limit has been exceeded.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

500

There was a server error.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

default

There was an unexpected error.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

Response

201

{
  "batch_id": "0e9d1d35-3d1e-4d66-9a2f-8c7e0f6b3e21",
  "cancelled_at": "2026-04-24T19:45:11Z",
  "completed_at": "2026-04-24T20:15:30Z",
  "completion_window": "24h",
  "created_at": "2026-04-24T19:19:19Z",
  "endpoint": "/v1/chat/completions",
  "error_file_id": "497f6eca-6276-4993-bfeb-53cbbbba6f08",
  "errors": [
    {
      "code": "invalid_input_file",
      "line": 42,
      "message": "Line 42: missing required field 'custom_id'."
    }
  ],
  "expires_at": "2026-04-25T19:19:19Z",
  "failed_at": "2026-04-24T19:50:00Z",
  "finalizing_at": "2026-04-24T20:10:42Z",
  "in_progress_at": "2026-04-24T19:20:05Z",
  "input_file_id": "a1b2c3d4-e5f6-4789-90ab-cdef12345678",
  "output_file_id": "497f6eca-6276-4993-bfeb-53cbbbba6f08",
  "provider": "openai",
  "request_counts": {
    "completed": 0,
    "failed": 0,
    "total": 10000
  },
  "request_id": "c7e3ad1e-20c3-4e47-9bf2-6f2a4d6a2f11",
  "status": "in_progress"
}

400

{
  "id": "bad_request",
  "message": "error parsing request body",
  "request_id": "4851a473-1621-42ea-b2f9-5071c0ea8414"
}

401

{
  "id": "unauthorized",
  "message": "Unable to authenticate you."
}

403

{
  "id": "forbidden",
  "message": "You do not have permission to perform this action."
}

404

{
  "id": "not_found",
  "message": "The resource you requested could not be found."
}

409

{
  "id": "conflict",
  "message": "The request could not be completed due to a conflict."
}

422

{
  "id": "unprocessable_entity",
  "message": "request payload validation failed",
  "request_id": "4851a473-1621-42ea-b2f9-5071c0ea8414"
}

429

{
  "id": "too_many_requests",
  "message": "API rate limit exceeded."
}

500

{
  "id": "server_error",
  "message": "Unexpected server-side error"
}

default

{
  "id": "example_error",
  "message": "some error message"
}
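All error responses share the same id, message, and optional request_id fields, so a single helper can turn any of them into a line suitable for logs or support tickets. A minimal sketch, with `describe_error` as a hypothetical helper name and an output format chosen only for illustration:

```python
from typing import Optional


def describe_error(status_code: int, body: dict) -> str:
    """Render an error response as a single line for logs or tickets."""
    parts = [f"HTTP {status_code}", body.get("id", "unknown"), body.get("message", "")]
    line = " | ".join(part for part in parts if part)
    request_id: Optional[str] = body.get("request_id")
    if request_id:
        # request_id is optional; include it when present.
        line += f" (request_id={request_id})"
    return line
```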

GET Retrieve a Batch Inference Job

/v1/batches/{batch_id}

Authorizations: inference_bearer_auth

Http: Bearer

Returns the current state of a batch job. Poll until status reaches a terminal value (completed, failed, expired, or cancelled).
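The polling pattern can be sketched as a small loop. This is an illustration under stated assumptions: `fetch` is a hypothetical zero-argument callable that retrieves the job and returns the decoded JSON body, and the 30-second interval and poll limit are arbitrary defaults, not documented recommendations.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"completed", "failed", "expired", "cancelled"}


def wait_for_batch(
    fetch: Callable[[], dict],
    interval_seconds: float = 30.0,
    max_polls: int = 1000,
) -> dict:
    """Poll until the job reaches a terminal status, then return it."""
    for _ in range(max_polls):
        batch = fetch()
        if batch.get("status") in TERMINAL_STATUSES:
            return batch
        time.sleep(interval_seconds)
    raise TimeoutError("batch did not reach a terminal status")
```

Bounding the number of polls keeps a stalled job from blocking the caller indefinitely.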

Path Parameters

batch_id string (uuid) required

Example: 0e9d1d35-3d1e-4d66-9a2f-8c7e0f6b3e21

The batch job identifier.

Request: `/v1/batches/{batch_id}`

cURL

curl -sS -X GET "https://inference.do-ai.run/v1/batches/0e9d1d35-3d1e-4d66-9a2f-8c7e0f6b3e21" \
  -H "Authorization: Bearer $DIGITALOCEAN_TOKEN" \
  -H "Content-Type: application/json"

Python

import os

from pydo import Client

client = Client(token=os.environ.get("DIGITALOCEAN_TOKEN"))

batch = client.batches.retrieve(os.environ["BATCH_ID"])

print("batch_id:      ", batch.get("batch_id"))
print("status:        ", batch.get("status"))
print("request_counts:", batch.get("request_counts"))
print("output_file_id:", batch.get("output_file_id"))

JavaScript

import { InferenceClient } from "@digitalocean/dots";

const client = new InferenceClient({
    apiKey: process.env.DIGITALOCEAN_TOKEN,
});

const batch = await client.batches.retrieve(process.env.BATCH_ID);

console.log("batch_id:      ", batch.batch_id);
console.log("status:        ", batch.status);
console.log("request_counts:", batch.request_counts);
console.log("output_file_id:", batch.output_file_id);

Responses

200

The batch job.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

batch_id string (uuid) required

Example: 0e9d1d35-3d1e-4d66-9a2f-8c7e0f6b3e21

Unique identifier for the batch job.

cancelled_at string (date-time) optional Nullable

Example: 2026-04-24T19:45:11Z

completed_at string (date-time) optional Nullable

Example: 2026-04-24T20:15:30Z

completion_window string, one of: 24h required

Example: 24h

created_at string (date-time) required

Example: 2026-04-24T19:19:19Z

endpoint string optional

Example: /v1/chat/completions

Inference endpoint each request is dispatched to.

error_file_id string (uuid) optional Nullable

Error sidecar file. Null when no errors were produced.

errors array of object optional Nullable

Top-level errors that prevented the batch from completing.

Child properties:

code string optional

Example: invalid_input_file

line integer optional Nullable

Example: 42

1-based JSONL line number, if applicable.

message string optional

Example: Line 42: missing required field 'custom_id'.

expires_at string (date-time) optional Nullable

Example: 2026-04-25T19:19:19Z

Derived from created_at plus completion_window.

failed_at string (date-time) optional Nullable

Example: 2026-04-24T19:50:00Z

finalizing_at string (date-time) optional Nullable

Example: 2026-04-24T20:10:42Z

in_progress_at string (date-time) optional Nullable

Example: 2026-04-24T19:20:05Z

input_file_id string (uuid) required

Example: a1b2c3d4-e5f6-4789-90ab-cdef12345678

The uploaded JSONL input file.

metadata object optional Nullable

Metadata attached at creation.

output_file_id string (uuid) optional Nullable

Output JSONL file. Populated once the job completes.

provider string, one of: openai, anthropic required

Example: openai

request_counts object optional

Aggregate request counts.

Child properties:

completed integer optional

Example: 0

failed integer optional

Example: 0

total integer optional

Example: 10000

request_id string optional

Example: c7e3ad1e-20c3-4e47-9bf2-6f2a4d6a2f11

The idempotency key supplied at creation.

status string (enum) required

Example: in_progress

Lifecycle status. Terminal states: completed, failed, expired, cancelled.
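The request_counts object can be reduced to a single progress figure while a job is running. A minimal sketch, assuming that completed and failed are disjoint counts that together approach total. The source does not state this explicitly, and `batch_progress` is a hypothetical helper name.

```python
def batch_progress(batch: dict) -> float:
    """Return the fraction of requests finished, from 0.0 to 1.0.

    Assumes completed and failed do not overlap. request_counts and its
    fields are optional, so missing values are treated as zero.
    """
    counts = batch.get("request_counts") or {}
    total = counts.get("total") or 0
    if total <= 0:
        return 0.0
    done = (counts.get("completed") or 0) + (counts.get("failed") or 0)
    return min(1.0, done / total)
```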

401

Authentication failed due to invalid credentials.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

403

The authenticated principal does not have permission to perform this action on the requested resource.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

404

The resource was not found.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

429

The API rate limit has been exceeded.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

500

There was a server error.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

default

There was an unexpected error.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

Response

200

{
  "batch_id": "0e9d1d35-3d1e-4d66-9a2f-8c7e0f6b3e21",
  "cancelled_at": "2026-04-24T19:45:11Z",
  "completed_at": "2026-04-24T20:15:30Z",
  "completion_window": "24h",
  "created_at": "2026-04-24T19:19:19Z",
  "endpoint": "/v1/chat/completions",
  "error_file_id": "497f6eca-6276-4993-bfeb-53cbbbba6f08",
  "errors": [
    {
      "code": "invalid_input_file",
      "line": 42,
      "message": "Line 42: missing required field 'custom_id'."
    }
  ],
  "expires_at": "2026-04-25T19:19:19Z",
  "failed_at": "2026-04-24T19:50:00Z",
  "finalizing_at": "2026-04-24T20:10:42Z",
  "in_progress_at": "2026-04-24T19:20:05Z",
  "input_file_id": "a1b2c3d4-e5f6-4789-90ab-cdef12345678",
  "output_file_id": "497f6eca-6276-4993-bfeb-53cbbbba6f08",
  "provider": "openai",
  "request_counts": {
    "completed": 0,
    "failed": 0,
    "total": 10000
  },
  "request_id": "c7e3ad1e-20c3-4e47-9bf2-6f2a4d6a2f11",
  "status": "in_progress"
}
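
Because errors[].line is a 1-based JSONL line number, each top-level error can be mapped back to the record that caused it in a local copy of the input file. The following is a minimal sketch of that lookup, not part of any SDK: the helper name offending_lines and the sample records are hypothetical, and only the custom_id field name is taken from the example error message above.

```python
import json
from typing import List, Tuple


def offending_lines(job: dict, input_lines: List[str]) -> List[Tuple[int, str, str]]:
    """Pair each top-level batch error with the input line it points at.

    `job` is a batch job object; `input_lines` is the local JSONL input
    split into lines. Hypothetical helper for illustration only.
    """
    found = []
    for err in job.get("errors") or []:  # errors is nullable
        line_no = err.get("line")  # 1-based, and may be null
        if line_no is None or not 1 <= line_no <= len(input_lines):
            continue
        # Convert the 1-based line number to a 0-based list index.
        found.append((line_no, err.get("code", ""), input_lines[line_no - 1]))
    return found


# Made-up sample data: the second record has no custom_id.
sample_input = [
    json.dumps({"custom_id": "req-1"}),
    json.dumps({"other_field": "value"}),
]
sample_job = {
    "errors": [
        {
            "code": "invalid_input_file",
            "line": 2,
            "message": "Line 2: missing required field 'custom_id'.",
        }
    ]
}

for line_no, code, text in offending_lines(sample_job, sample_input):
    print(f"line {line_no} [{code}]: {text}")
```

Errors without a line value are skipped here; a fuller version would probably report them separately, since they describe the batch as a whole.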

401

{
  "id": "unauthorized",
  "message": "Unable to authenticate you."
}

403

{
  "id": "forbidden",
  "message": "You do not have permission to perform this action."
}

404

{
  "id": "not_found",
  "message": "The resource you requested could not be found."
}

429

{
  "id": "too_many_requests",
  "message": "API rate limit exceeded."
}

500

{
  "id": "server_error",
  "message": "Unexpected server-side error"
}

default

{
  "id": "example_error",
  "message": "some error message"
}
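
The batch job object carries enough information to derive a few values on the client: whether the job has reached a terminal state, the expiry implied by created_at plus completion_window, and rough progress from request_counts. The sketch below illustrates this under stated assumptions: the helper names parse_ts and summarize are hypothetical, completion_window is assumed to always be a whole number of hours followed by "h" (only 24h is documented), and the sample values are copied from the 200 example above.

```python
from datetime import datetime, timedelta

# Terminal states as listed in the status field description.
TERMINAL_STATES = {"completed", "failed", "expired", "cancelled"}


def parse_ts(value: str) -> datetime:
    """Parse a timestamp such as 2026-04-24T19:19:19Z into an aware datetime."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize(job: dict) -> dict:
    """Derive convenience values from a batch job object (illustrative only)."""
    window = job["completion_window"]
    if not window.endswith("h"):
        raise ValueError(f"unexpected completion_window: {window!r}")
    hours = int(window[:-1])

    counts = job.get("request_counts") or {}
    total = counts.get("total") or 0
    done = (counts.get("completed") or 0) + (counts.get("failed") or 0)

    return {
        "is_terminal": job["status"] in TERMINAL_STATES,
        "expected_expiry": parse_ts(job["created_at"]) + timedelta(hours=hours),
        "progress": done / total if total else 0.0,
    }


job = {
    "completion_window": "24h",
    "created_at": "2026-04-24T19:19:19Z",
    "request_counts": {"completed": 0, "failed": 0, "total": 10000},
    "status": "in_progress",
}

summary = summarize(job)
print(summary["is_terminal"])
print(summary["expected_expiry"].isoformat())
print(summary["progress"])
```

With the sample values, the derived expiry equals the expires_at value shown in the 200 example (2026-04-25T19:19:19Z).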

GET Get Batch Inference Results Download Links

/v1/batches/{batch_id}/results

Authorizations: inference_bearer_auth

Http: Bearer

Returns short-lived presigned download URLs for the output (and optional error sidecar) of a completed batch job. If results are not yet ready, the response sets result_available: false or returns 412 Precondition Failed; in both cases, keep polling batch status and retry.

Download the artifacts soon after fetching the links, because the URLs are short-lived; the expires_at field in the response gives their expiry time. The result files themselves are retained for up to 30 days after the job completes and are then deleted.
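
Both "not ready" signals, a 200 response with result_available set to false and a 412 Precondition Failed, can be handled by the same retry loop. The sketch below shows one way to structure that loop. It is illustrative only: the function name wait_for_result_links, its parameters, and the injected fetch callable are all assumptions, and fetch stands in for whatever code performs GET /v1/batches/{batch_id}/results and returns the status code with the parsed JSON body.

```python
import time
from typing import Callable, Tuple


def wait_for_result_links(
    fetch: Callable[[], Tuple[int, dict]],
    max_attempts: int = 10,
    delay_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call `fetch` until it reports that results are available.

    `fetch` is supplied by the caller and returns (status_code, json_body).
    """
    for attempt in range(1, max_attempts + 1):
        status_code, body = fetch()
        if status_code == 200 and body.get("result_available"):
            return body
        # A 200 with result_available false and a 412 both mean "not ready yet".
        if status_code not in (200, 412):
            raise RuntimeError(f"unexpected HTTP {status_code}: {body}")
        if attempt < max_attempts:
            sleep(delay_seconds)
    raise TimeoutError(f"results not available after {max_attempts} attempts")


# Simulated responses so the sketch runs without network access.
responses = iter([
    (412, {"id": "precondition_failed",
           "message": "Batch results are not yet available. Retry once the job has completed."}),
    (200, {"result_available": False}),
    (200, {"result_available": True,
           "output_file_url": "https://example.invalid/out.jsonl"}),
])

links = wait_for_result_links(lambda: next(responses), sleep=lambda _: None)
print(links["output_file_url"])
```

A fuller client would also check the batch status between attempts and stop early if the job ends in failed or expired, since the loop above only bounds the number of attempts.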

Path Parameters

batch_id string (uuid) required

Example: 0e9d1d35-3d1e-4d66-9a2f-8c7e0f6b3e21

The batch job identifier.

Request: `/v1/batches/{batch_id}/results`

cURL

curl -sS -X GET "https://inference.do-ai.run/v1/batches/0e9d1d35-3d1e-4d66-9a2f-8c7e0f6b3e21/results" \
  -H "Authorization: Bearer $DIGITALOCEAN_TOKEN" \
  -H "Content-Type: application/json" | jq

Python

import os
from pathlib import Path

import requests
from pydo import Client

client = Client(token=os.environ.get("DIGITALOCEAN_TOKEN"))

batch_id = os.environ["BATCH_ID"]

# Fetch the presigned download links for this batch.
links = client.batches.results.retrieve(batch_id)

# result_available stays false until the job has produced results.
if not links.get("result_available"):
    print("results not ready yet; poll batch status and retry")
    raise SystemExit(0)

# Download promptly: the presigned URL is short-lived.
resp = requests.get(links["output_file_url"], timeout=60)
resp.raise_for_status()

# Save the results JSONL locally and print a short preview.
out = Path("batch_output.jsonl")
out.write_bytes(resp.content)

print("wrote:", out)
print("----- preview -----")
print(resp.text[:500])

JavaScript

import { InferenceClient } from "@digitalocean/dots";

const client = new InferenceClient({
    apiKey: process.env.DIGITALOCEAN_TOKEN,
});

const batchId = process.env.BATCH_ID;

// client.files.content resolves the result envelope and follows the
// presigned URL for you, returning the raw fetch Response.
const resp = await client.files.content(batchId);
if (!resp.ok) {
    throw new Error(`results not ready: HTTP ${resp.status}`);
}

const body = await resp.text();
const lines = body.split("\n").filter(Boolean);

console.log(`got ${lines.length} line(s); first entry:`);
console.log(lines[0]);

Responses

200

Presigned download URLs.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

batch_id string (uuid) required

Example: 0e9d1d35-3d1e-4d66-9a2f-8c7e0f6b3e21

error_file_url string (uri) optional Nullable

Presigned URL for the error sidecar JSONL, if any.

expires_at string (date-time) optional Nullable

Example: 2026-04-24T20:19:19Z

When the presigned URLs expire.

output_file_url string (uri) optional Nullable

Presigned URL for the main results JSONL.

result_available boolean required

Example: true

When false, keep polling batch status and retry later.

401

Authentication failed due to invalid credentials.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

403

The authenticated principal does not have permission to perform this action on the requested resource.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

404

The resource was not found.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

412

Results are not yet available. Poll batch status and retry.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

429

The API rate limit has been exceeded.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.
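
When a 429 is returned while polling, the ratelimit-reset header (a Unix epoch time) can be turned into a wait duration. The sketch below is a minimal illustration: the helper name seconds_until_reset is hypothetical and the header values are made up.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how long to wait before retrying, based on ratelimit-reset.

    `headers` must use lowercase names here; HTTP clients that expose a
    case-insensitive header mapping can be passed in directly.
    """
    now = time.time() if now is None else now
    reset = headers.get("ratelimit-reset")
    if reset is None:
        return 0.0  # no header present, so no wait can be derived
    # Never return a negative wait if the reset time is already in the past.
    return max(0.0, float(reset) - now)


# Made-up header values for illustration.
sample_headers = {
    "ratelimit-limit": "5000",
    "ratelimit-remaining": "0",
    "ratelimit-reset": "1777060800",
}

print(seconds_until_reset(sample_headers, now=1777060770.0))
```

Passing now explicitly keeps the function deterministic for testing; in normal use it falls back to the current time.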

500

There was a server error.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

default

There was an unexpected error.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

Response

200

{
  "batch_id": "0e9d1d35-3d1e-4d66-9a2f-8c7e0f6b3e21",
  "error_file_url": "string",
  "expires_at": "2026-04-24T20:19:19Z",
  "output_file_url": "https://batch-inference.nyc3.digitaloceanspaces.com/outputs/0e9d1d35-3d1e-4d66-9a2f-8c7e0f6b3e21.jsonl?X-Amz-Signature=...",
  "result_available": true
}

401

{
  "id": "unauthorized",
  "message": "Unable to authenticate you."
}

403

{
  "id": "forbidden",
  "message": "You do not have permission to perform this action."
}

404

{
  "id": "not_found",
  "message": "The resource you requested could not be found."
}

412

{
  "id": "precondition_failed",
  "message": "Batch results are not yet available. Retry once the job has completed."
}

429

{
  "id": "too_many_requests",
  "message": "API rate limit exceeded."
}

500

{
  "id": "server_error",
  "message": "Unexpected server-side error"
}

default

{
  "id": "example_error",
  "message": "some error message"
}
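
Since the presigned URLs expire at expires_at, a client that holds on to a results response may want to check that the links are still valid before starting a download. The sketch below shows one such check under stated assumptions: the helper name links_usable and the 60-second safety margin are hypothetical choices, and the sample values come from the 200 example above.

```python
from datetime import datetime, timezone
from typing import Optional


def links_usable(links: dict, now: Optional[datetime] = None, margin_seconds: int = 60) -> bool:
    """Return True if the results response still holds a usable output URL."""
    now = now or datetime.now(timezone.utc)
    if not links.get("result_available") or not links.get("output_file_url"):
        return False
    expires_at = links.get("expires_at")
    if expires_at is None:
        # expires_at is nullable; this sketch assumes the link is worth trying.
        return True
    expiry = datetime.fromisoformat(expires_at.replace("Z", "+00:00"))
    # Require some headroom so the download does not start right at expiry.
    return (expiry - now).total_seconds() > margin_seconds


sample_links = {
    "batch_id": "0e9d1d35-3d1e-4d66-9a2f-8c7e0f6b3e21",
    "expires_at": "2026-04-24T20:19:19Z",
    "output_file_url": "https://example.invalid/outputs/result.jsonl",
    "result_available": True,
}

early = datetime(2026, 4, 24, 20, 0, 0, tzinfo=timezone.utc)
late = datetime(2026, 4, 24, 20, 19, 0, tzinfo=timezone.utc)
print(links_usable(sample_links, now=early))
print(links_usable(sample_links, now=late))
```

If the check fails, requesting the results endpoint again should return fresh links, provided the result files are still within their retention period.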

POST Cancel a Batch Inference Job

/v1/batches/{batch_id}/cancel

Authorizations: inference_bearer_auth

Http: Bearer

Requests cancellation of a batch job. The job transitions to cancelling and, once in-flight requests drain, to cancelled. Jobs already in a terminal state (completed, failed, expired, cancelled) cannot be cancelled and return 409 Conflict. Cancellation is also rejected with 409 Conflict if the job has not yet been submitted to the upstream provider, because there is nothing to cancel until the provider batch ID is assigned.

Partial results produced before cancellation remain available via GET /v1/batches/{batch_id}/results.
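
The cancellation flow described above leads to three outcomes a caller typically has to distinguish: the job is already cancelled, the cancellation was accepted but is still draining, or the request was rejected with 409 Conflict. The sketch below maps a cancel response onto those cases. It is illustrative only; the function name next_step_after_cancel and its return labels are hypothetical.

```python
def next_step_after_cancel(status_code: int, body: dict) -> str:
    """Classify the response to POST /v1/batches/{batch_id}/cancel.

    Returns one of the hypothetical labels "done", "poll", or "conflict".
    """
    if status_code == 200:
        # The job moves to cancelling first, then to cancelled once
        # in-flight requests have drained.
        return "done" if body.get("status") == "cancelled" else "poll"
    if status_code == 409:
        # Either the job is already in a terminal state or it has not yet
        # been submitted upstream; retrieve the job to tell which.
        return "conflict"
    raise RuntimeError(f"unexpected HTTP {status_code}: {body}")


print(next_step_after_cancel(200, {"status": "cancelling"}))
print(next_step_after_cancel(200, {"status": "cancelled"}))
print(next_step_after_cancel(409, {"id": "conflict"}))
```

On "poll", the caller would keep retrieving the batch job until it reports cancelled; on "conflict", retrieving the job shows whether it had already finished or was not yet cancellable.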

Path Parameters

batch_id string (uuid) required

Example: 0e9d1d35-3d1e-4d66-9a2f-8c7e0f6b3e21

The batch job identifier.

Request: `/v1/batches/{batch_id}/cancel`

cURL

curl -sS -X POST "https://inference.do-ai.run/v1/batches/0e9d1d35-3d1e-4d66-9a2f-8c7e0f6b3e21/cancel" \
  -H "Authorization: Bearer $DIGITALOCEAN_TOKEN" \
  -H "Content-Type: application/json" | jq

Python

import os

from pydo import Client

client = Client(token=os.environ.get("DIGITALOCEAN_TOKEN"))

result = client.batches.cancel(os.environ["BATCH_ID"])

print("batch_id:    ", result.get("batch_id"))
print("status:      ", result.get("status"))
print("cancelled_at:", result.get("cancelled_at"))

JavaScript

import { InferenceClient } from "@digitalocean/dots";

const client = new InferenceClient({
    apiKey: process.env.DIGITALOCEAN_TOKEN,
});

const result = await client.batches.cancel(process.env.BATCH_ID);

console.log("batch_id:    ", result.batch_id);
console.log("status:      ", result.status);
console.log("cancelled_at:", result.cancelled_at);

Responses

200

Cancellation accepted. Returns the updated batch job.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

batch_id string (uuid) required

Example: 0e9d1d35-3d1e-4d66-9a2f-8c7e0f6b3e21

Unique identifier for the batch job.

cancelled_at string (date-time) optional Nullable

Example: 2026-04-24T19:45:11Z

completed_at string (date-time) optional Nullable

Example: 2026-04-24T20:15:30Z

completion_window string, one of: 24h required

Example: 24h

created_at string (date-time) required

Example: 2026-04-24T19:19:19Z

endpoint string optional

Example: /v1/chat/completions

Inference endpoint each request is dispatched to.

error_file_id string (uuid) optional Nullable

Error sidecar file. Null when no errors were produced.

errors array of object optional Nullable

Top-level errors that prevented the batch from completing.

code string optional

Example: invalid_input_file

line integer optional Nullable

Example: 42

1-based JSONL line number, if applicable.

message string optional

Example: Line 42: missing required field 'custom_id'.

expires_at string (date-time) optional Nullable

Example: 2026-04-25T19:19:19Z

Derived from created_at plus completion_window.

failed_at string (date-time) optional Nullable

Example: 2026-04-24T19:50:00Z

finalizing_at string (date-time) optional Nullable

Example: 2026-04-24T20:10:42Z

in_progress_at string (date-time) optional Nullable

Example: 2026-04-24T19:20:05Z

input_file_id string (uuid) required

Example: a1b2c3d4-e5f6-4789-90ab-cdef12345678

The uploaded JSONL input file.

metadata object optional Nullable

Metadata attached at creation.

output_file_id string (uuid) optional Nullable

Output JSONL file. Populated once the job completes.

provider string, one of: openai, anthropic required

Example: openai

request_counts object optional

Aggregate request counts.

completed integer optional

Example: 0

failed integer optional

Example: 0

total integer optional

Example: 10000

request_id string optional

Example: c7e3ad1e-20c3-4e47-9bf2-6f2a4d6a2f11

The idempotency key supplied at creation.

status string (enum) required

Example: in_progress

Lifecycle status. Terminal states: completed, failed, expired, cancelled.

401

Authentication failed due to invalid credentials.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

403

The authenticated principal does not have permission to perform this action on the requested resource.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

404

The resource was not found.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

409

The request could not be completed due to a conflict.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

429

The API rate limit has been exceeded.

Response Headers

ratelimit-limit integer

The default limit on number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.
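When a 429 is returned, the `ratelimit-reset` header described above can be used to decide how long to wait before retrying. The following is a minimal sketch, not an official client: the function name `seconds_until_reset` is hypothetical, and it assumes the header value is a plain integer Unix timestamp as the description states.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how many seconds to wait before retrying after a 429.

    Assumes `ratelimit-reset` holds a Unix epoch timestamp in seconds.
    Returns 0.0 if the header is missing, malformed, or already in the past.
    """
    now = time.time() if now is None else now
    # HTTP header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("ratelimit-reset")
    if raw is None:
        return 0.0
    try:
        reset_at = int(raw)
    except ValueError:
        return 0.0
    return max(0.0, reset_at - now)
```

A caller would typically sleep for the returned number of seconds and then retry the request, ideally with an upper bound on the total number of retries.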

500

There was a server error.

Response Headers

ratelimit-limit integer

The default limit on the number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.

default

There was an unexpected error.

Response Headers

ratelimit-limit integer

The default limit on the number of requests that can be made per hour and per minute. Current rate limits are 5000 requests per hour and 250 requests per minute.

ratelimit-remaining integer

ratelimit-reset integer

The time when the oldest request will expire. The value is given in Unix epoch time. See https://docs.digitalocean.com/reference/api/reference/#rate-limit for information about how requests expire.

Response Schema: application/json

id string required

Example: not_found

A short identifier corresponding to the HTTP status code returned. For example, the ID for a response returning a 404 status code would be "not_found."

message string required

Example: The resource you were accessing could not be found.

A message providing additional information about the error, including details to help resolve it when possible.

request_id string optional

Example: 4d9d8375-3c56-4925-a3e7-eb137fed17e9

Optionally, some endpoints may include a request ID that should be provided when reporting bugs or opening support tickets to help identify the issue.
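All of the error responses above share the same JSON shape: a required `id`, a required `message`, and an optional `request_id`. The sketch below shows one way to parse that shape and keep the request ID available for support tickets. The `ApiError` class and `parse_error` function are hypothetical names used for illustration only.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ApiError:
    """Error body as described by the response schema above."""
    id: str
    message: str
    request_id: Optional[str] = None  # only present on some endpoints

    def support_summary(self) -> str:
        """Format the error for logs or a support ticket."""
        base = f"{self.id}: {self.message}"
        if self.request_id:
            return f"{base} (request_id={self.request_id})"
        return base


def parse_error(body: str) -> ApiError:
    """Parse an error response body; `id` and `message` are required fields."""
    data = json.loads(body)
    return ApiError(
        id=data["id"],
        message=data["message"],
        request_id=data.get("request_id"),
    )
```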

Response

200

{
  "batch_id": "0e9d1d35-3d1e-4d66-9a2f-8c7e0f6b3e21",
  "cancelled_at": "2026-04-24T19:45:11Z",
  "completed_at": "2026-04-24T20:15:30Z",
  "completion_window": "24h",
  "created_at": "2026-04-24T19:19:19Z",
  "endpoint": "/v1/chat/completions",
  "error_file_id": "497f6eca-6276-4993-bfeb-53cbbbba6f08",
  "errors": [
    {
      "code": "invalid_input_file",
      "line": 42,
      "message": "Line 42: missing required field 'custom_id'."
    }
  ],
  "expires_at": "2026-04-25T19:19:19Z",
  "failed_at": "2026-04-24T19:50:00Z",
  "finalizing_at": "2026-04-24T20:10:42Z",
  "in_progress_at": "2026-04-24T19:20:05Z",
  "input_file_id": "a1b2c3d4-e5f6-4789-90ab-cdef12345678",
  "output_file_id": "497f6eca-6276-4993-bfeb-53cbbbba6f08",
  "provider": "openai",
  "request_counts": {
    "completed": 0,
    "failed": 0,
    "total": 10000
  },
  "request_id": "c7e3ad1e-20c3-4e47-9bf2-6f2a4d6a2f11",
  "status": "in_progress"
}
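The job object above can be reduced to a few values that are convenient for monitoring. This is a rough sketch under stated assumptions: the `summarize` helper is hypothetical, the meaning of each field is inferred from its name in the example, and only `created_at` and `expires_at` are parsed because other timestamps may not be set for every job.

```python
from datetime import datetime, timezone


def parse_ts(value: str) -> datetime:
    """Parse timestamps in the form used by the example, e.g. 2026-04-24T19:19:19Z."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def summarize(job: dict) -> dict:
    """Reduce a batch job object to a few monitoring-friendly values."""
    counts = job["request_counts"]
    total = counts["total"]
    processed = counts["completed"] + counts["failed"]
    window = parse_ts(job["expires_at"]) - parse_ts(job["created_at"])
    return {
        "status": job["status"],
        # Fraction of requests that have finished, whether successfully or not.
        "processed_fraction": processed / total if total else 0.0,
        "window_hours": window.total_seconds() / 3600,
        # Input file line numbers reported in the errors array, if any.
        "error_lines": [error["line"] for error in job.get("errors", [])],
    }
```

Applied to the example response, this reports a 24-hour window, no processed requests yet, and an input error on line 42.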

401

{
  "id": "unauthorized",
  "message": "Unable to authenticate you."
}

403

{
  "id": "forbidden",
  "message": "You do not have permission to perform this action."
}

404

{
  "id": "not_found",
  "message": "The resource you requested could not be found."
}

409

{
  "id": "conflict",
  "message": "The request could not be completed due to a conflict."
}

429

{
  "id": "too_many_requests",
  "message": "API rate limit exceeded."
}

500

{
  "id": "server_error",
  "message": "Unexpected server-side error"
}

default

{
  "id": "example_error",
  "message": "some error message"
}
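Assuming the responses above belong to the cancel endpoint listed on this page (`/v1/batches/{batch_id}/cancel`), a request could be built as sketched below using only the Python standard library. The function name `build_cancel_request` and the token value are placeholders. The base URL and Bearer header come from this page, and the sketch assumes the endpoint takes no request body.

```python
import urllib.request
from urllib.parse import quote

BASE_URL = "https://inference.do-ai.run"


def build_cancel_request(batch_id: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to cancel a batch job."""
    # Percent-encode the ID so unexpected characters cannot alter the path.
    url = f"{BASE_URL}/v1/batches/{quote(batch_id, safe='')}/cancel"
    return urllib.request.Request(
        url,
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )
```

Passing the result to `urllib.request.urlopen` sends the request. Non-2xx statuses such as 404 or 409 are raised as `urllib.error.HTTPError`, whose body follows the error schema documented above.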

