DigitalOcean Gradient™ AI Platform Pricing

Validated on 31 Oct 2025 • Last edited on 10 Nov 2025

DigitalOcean Gradient™ AI Platform lets you build fully managed AI agents with knowledge bases for retrieval-augmented generation, multi-agent routing, guardrails, and more. You can also use serverless inference to make direct requests to popular foundation models.

Gradient AI Platform has a usage-based pricing model, so costs scale with your actual usage. We charge for model usage and for additional features like knowledge bases, guardrails, and log stream insights. We display prices per million tokens and bill per thousand tokens for accuracy.

Serverless inference (direct API calls) is billed by DigitalOcean for both open-source and commercial models. Prices on this page align with each provider’s published rates for transparency.

Agents are billed by DigitalOcean for open-source models. When you use commercial models in agents or evaluations with your own provider API keys (for example, OpenAI, Anthropic), billing is handled directly by the model provider and DigitalOcean does not charge you for that model usage.

Agent creation is free. You are charged for all input and output tokens processed by the agent. Token usage depends on factors such as input length, agent instructions, attached knowledge bases, and configuration settings. To optimize usage, test your agents and adjust their parameters.

Serverless inference lets you call models directly through the API without creating an agent. All usage is billed per input and output token through your DigitalOcean account.

Foundation Model Usage

The following tables show pricing for open-source and commercial models, for both serverless inference and agent usage.

| Model | Serverless Inference (Direct API Usage) | Agent Usage |
| --- | --- | --- |
| Qwen3-32B | $0.25 per 1,000,000 input tokens; $0.55 per 1,000,000 output tokens | Not available |

Note
For serverless inference, billing is handled by DigitalOcean and aligns with Anthropic’s published rates.
For agents, when you use your own Anthropic API key, billing is handled directly by Anthropic.
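
| Model | Serverless Inference (Direct API Usage) | Agent Usage |
| --- | --- | --- |
| Claude Sonnet 4 | Billed by DigitalOcean (provider-aligned rates) | Billed directly by provider (your API key) |
| Claude 3.7 Sonnet | Billed by DigitalOcean (provider-aligned rates) | Billed directly by provider (your API key) |
| Claude 3.5 Sonnet | Billed by DigitalOcean (provider-aligned rates) | Billed directly by provider (your API key) |
| Claude Opus 4 | Billed by DigitalOcean (provider-aligned rates) | Billed directly by provider (your API key) |
| Claude 3 Opus | Billed by DigitalOcean (provider-aligned rates) | Billed directly by provider (your API key) |
| Claude 3.5 Haiku | Billed by DigitalOcean (provider-aligned rates) | Billed directly by provider (your API key) |
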
| Model | Serverless Inference (Direct API Usage) | Agent Usage |
| --- | --- | --- |
| DeepSeek-R1 distill-llama-70B | Same as Agent Usage | $0.99 per 1,000,000 input tokens; $0.99 per 1,000,000 output tokens |

Note
These models are currently in public preview. Serverless usage is billed by DigitalOcean as shown.

| Model | Serverless Inference (Direct API Usage) | Agent Usage |
| --- | --- | --- |
| Fast SDXL | $0.0011 per compute second | Not available |
| Flux Schnell | $0.0030 per megapixel | Not available |
| Stable Audio 2.5 (Text-to-Audio) | $0.00058 per compute second | Not available |
| Multilingual TTS v2 | $0.10 per 1,000 characters | Not available |

| Model | Serverless Inference (Direct API Usage) | Agent Usage |
| --- | --- | --- |
| Llama 3.1 8B | Same as Agent Usage | $0.198 per 1,000,000 input tokens; $0.198 per 1,000,000 output tokens |
| Llama 3.3 70B | Same as Agent Usage | $0.65 per 1,000,000 input tokens; $0.65 per 1,000,000 output tokens |

| Model | Serverless Inference (Direct API Usage) | Agent Usage |
| --- | --- | --- |
| NeMo | Same as Agent Usage | $0.30 per 1,000,000 input tokens; $0.30 per 1,000,000 output tokens |

Note
Billing for most OpenAI commercial models through DigitalOcean is not available. When using OpenAI models with your own API key, billing is handled directly by OpenAI.
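
| Model | Serverless Inference (Direct API Usage) | Agent Usage |
| --- | --- | --- |
| OpenAI gpt-oss-120b | Same as Agent Usage | $0.10 per 1,000,000 input tokens; $0.70 per 1,000,000 output tokens |
| OpenAI gpt-oss-20b | Same as Agent Usage | $0.05 per 1,000,000 input tokens; $0.45 per 1,000,000 output tokens |
| GPT-5 | Billed directly by provider (your API key) | Billed directly by provider (your API key) |
| GPT-5 mini | Billed directly by provider (your API key) | Billed directly by provider (your API key) |
| GPT-5 nano | Billed directly by provider (your API key) | Billed directly by provider (your API key) |
| GPT-4.1 | Billed directly by provider (your API key) | Billed directly by provider (your API key) |
| GPT-4o | Billed directly by provider (your API key) | Billed directly by provider (your API key) |
| GPT-4o mini | Billed directly by provider (your API key) | Billed directly by provider (your API key) |
| o1 | Billed directly by provider (your API key) | Billed directly by provider (your API key) |
| o3 | Billed directly by provider (your API key) | Billed directly by provider (your API key) |
| o3-mini | Billed directly by provider (your API key) | Billed directly by provider (your API key) |
| GPT-image-1 | Billed directly by provider (your API key) | Billed directly by provider (your API key) |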
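
As a rough illustration of how the per-million rates listed in this section translate into a charge, the following Python sketch multiplies token counts by the listed input and output rates. The function name `estimate_model_cost` and the dictionary keys are hypothetical labels chosen for this example, not API model identifiers. The sketch also does not model any rounding DigitalOcean applies when billing per thousand tokens.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)

# Per-million-token rates (USD) copied from the pricing listed in this section.
# The keys are illustrative labels, not official model identifiers.
RATES = {
    "llama-3.3-70b": {"input": Decimal("0.65"), "output": Decimal("0.65")},
    "gpt-oss-120b": {"input": Decimal("0.10"), "output": Decimal("0.70")},
}


def estimate_model_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD charge for one model from input and output token counts."""
    rate = RATES[model]
    total = Decimal(input_tokens) * rate["input"] + Decimal(output_tokens) * rate["output"]
    return total / TOKENS_PER_MILLION


if __name__ == "__main__":
    # 2,000,000 input tokens and 500,000 output tokens on gpt-oss-120b
    print(estimate_model_cost("gpt-oss-120b", 2_000_000, 500_000))
```

Under these assumptions, 2,000,000 input tokens and 500,000 output tokens on gpt-oss-120b come to about $0.55 ($0.20 for input plus $0.35 for output).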

Knowledge Bases

Knowledge bases are billed for both indexing and storage:

  • Indexing tokens: We charge for tokens required to generate embeddings. Pricing is the same for manual and auto-indexing. Charges apply only when changes are detected (new, updated, or deleted files/URLs). If auto-indexing is paused or no changes are found, there are no charges.

    For example, a 10 MB dataset is about 3 million tokens, and a 1 GB dataset is about 250 million tokens.

    Actual costs depend on the embedding model:

    | Model | Price |
    | --- | --- |
    | all-mini-lm-l6-v2 | $0.009 per 1,000,000 input tokens |
    | multi-qa-mpnet-base-dot-v1 | $0.009 per 1,000,000 input tokens |
    | gte-large-en-v1.5 | $0.09 per 1,000,000 input tokens |

    One token is roughly four characters (approximately 75 words per 100 tokens). Non-Latin scripts, emojis, or binary data may increase token counts.

  • Storage: Embeddings are stored in OpenSearch. See OpenSearch pricing.
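
To make the indexing arithmetic concrete, here is a minimal Python sketch that estimates token counts from character counts using the four-characters-per-token rule of thumb, then applies the embedding rates listed for indexing. The helper names `estimate_tokens` and `estimate_indexing_cost` are hypothetical. Real token counts depend on the content, so treat the result as an approximation. Storage costs are not included.

```python
from decimal import Decimal

# Per-million-token indexing rates (USD) for the embedding models listed in this section.
EMBEDDING_RATES = {
    "all-mini-lm-l6-v2": Decimal("0.009"),
    "multi-qa-mpnet-base-dot-v1": Decimal("0.009"),
    "gte-large-en-v1.5": Decimal("0.09"),
}

CHARS_PER_TOKEN = 4  # rough rule of thumb; non-Latin scripts may need more tokens


def estimate_tokens(char_count: int) -> int:
    """Approximate a token count from a character count, rounding up."""
    return -(-char_count // CHARS_PER_TOKEN)


def estimate_indexing_cost(token_count: int, model: str) -> Decimal:
    """Estimate the USD indexing charge for a number of tokens (storage excluded)."""
    return Decimal(token_count) * EMBEDDING_RATES[model] / Decimal(1_000_000)


if __name__ == "__main__":
    # Roughly a 1 GB dataset, using the 250-million-token figure from the text
    print(estimate_indexing_cost(250_000_000, "gte-large-en-v1.5"))
    print(estimate_indexing_cost(250_000_000, "all-mini-lm-l6-v2"))
```

With the 250 million token estimate for a 1 GB dataset, indexing comes to about $22.50 with gte-large-en-v1.5 and about $2.25 with all-mini-lm-l6-v2.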

Guardrails

Charges apply for all tokens processed through guardrails:

| Guardrail | Price |
| --- | --- |
| Content Moderation | $0.20 per 1,000,000 tokens |
| Jailbreak Detection | $0.20 per 1,000,000 tokens |
| Sensitive Data Detection | $0.34 per 1,000,000 tokens |

Costs are per token. Creating, editing, or duplicating guardrails has no additional cost.
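
The following Python sketch shows one way to estimate guardrail charges from the rates listed in this section. It assumes that each attached guardrail processes, and is billed for, the same set of tokens. That is an assumption made for this example and is not stated on this page. The function name `estimate_guardrail_cost` is hypothetical.

```python
from decimal import Decimal

# Per-million-token guardrail rates (USD) listed in this section.
GUARDRAIL_RATES = {
    "Content Moderation": Decimal("0.20"),
    "Jailbreak Detection": Decimal("0.20"),
    "Sensitive Data Detection": Decimal("0.34"),
}


def estimate_guardrail_cost(tokens: int, guardrails: list[str]) -> Decimal:
    """Estimate the USD charge, assuming every listed guardrail bills the same tokens."""
    per_million = sum((GUARDRAIL_RATES[name] for name in guardrails), Decimal(0))
    return Decimal(tokens) * per_million / Decimal(1_000_000)


if __name__ == "__main__":
    # 10,000,000 tokens passing through two guardrails
    print(estimate_guardrail_cost(
        10_000_000, ["Content Moderation", "Sensitive Data Detection"]
    ))
```

Under that assumption, 10,000,000 tokens passing through both Content Moderation and Sensitive Data Detection would cost about $5.40.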

Functions

If you attach DigitalOcean Functions to your agent, you are billed for that usage according to Functions pricing.

Agent Evaluations

Agent evaluations are charged by token usage at the same rates as model usage.

Log Stream Insights

Log Stream Insights uses a third-party model to analyze agent trace data. You are charged per token:

| Tokens | Price |
| --- | --- |
| Input | $4.00 per 1,000,000 tokens |
| Output | $20.00 per 1,000,000 tokens |
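
Because output tokens cost five times as much as input tokens here, it can help to estimate the two sides separately. The sketch below does this in Python. The function name `estimate_insights_cost` and the token counts in the example are hypothetical, and actual token usage depends on the trace data being analyzed.

```python
from decimal import Decimal

# Log Stream Insights rates (USD per 1,000,000 tokens) listed in this section.
INPUT_RATE = Decimal("4.00")
OUTPUT_RATE = Decimal("20.00")


def estimate_insights_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD charge for Log Stream Insights from token counts."""
    total = Decimal(input_tokens) * INPUT_RATE + Decimal(output_tokens) * OUTPUT_RATE
    return total / Decimal(1_000_000)


if __name__ == "__main__":
    # 1,500,000 input tokens of trace data and 200,000 output tokens of analysis
    print(estimate_insights_cost(1_500_000, 200_000))
```

For example, 1,500,000 input tokens and 200,000 output tokens come to about $10.00 ($6.00 for input plus $4.00 for output).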
