For agents, when you use your own Anthropic API key, billing is handled directly by Anthropic.
DigitalOcean Gradient™ AI Platform Pricing
Validated on 31 Oct 2025 • Last edited on 10 Nov 2025
DigitalOcean Gradient™ AI Platform lets you build fully-managed AI agents with knowledge bases for retrieval-augmented generation, multi-agent routing, guardrails, and more, or use serverless inference to make direct requests to popular foundation models.
Gradient AI Platform has a usage-based pricing model, so costs scale with your actual usage. We charge for model usage and for additional features like knowledge bases, guardrails, and log stream insights. We display prices per million tokens and bill per thousand tokens for accuracy.
Serverless inference (direct API calls) is billed by DigitalOcean for both open-source and commercial models. Prices on this page align with each provider’s published rates for transparency.
Agents are billed by DigitalOcean for open-source models. When you use commercial models in agents or evaluations with your own provider API keys (for example, OpenAI, Anthropic), billing is handled directly by the model provider and DigitalOcean does not charge you for that model usage.
Agent creation is free. You are charged for all input and output tokens processed by the agent. Token usage depends on factors such as input length, agent instructions, attached knowledge bases, and configuration settings. To optimize usage, test your agents and adjust their parameters.
Serverless inference lets you call models directly through the API without creating an agent. All usage is billed per input and output token through your DigitalOcean account.
Foundation Model Usage
The following shows pricing for open-source and commercial models for serverless inference and agent usage.
| Model | Serverless Inference (Direct API Usage) | Agent Usage |
|---|---|---|
| Qwen3-32B | $0.25 per 1,000,000 input tokens $0.55 per 1,000,000 output tokens |
Not available |
For agents, when you use your own Anthropic API key, billing is handled directly by Anthropic.
| Model | Serverless Inference (Direct API Usage) | Agent Usage |
|---|---|---|
| Claude Sonnet 4 Claude 3.7 Sonnet Claude 3.5 Sonnet Claude Opus 4 Claude 3 Opus Claude 3.5 Haiku |
Billed by DigitalOcean (provider-aligned rates). | Billed directly by provider (your API key). |
| Model | Serverless Inference (Direct API Usage) | Agent Usage |
|---|---|---|
| DeepSeek-R1 distill-llama-70B | Same as Agent Usage | $0.99 per 1,000,000 input tokens $0.99 per 1,000,000 output tokens |
| Model | Serverless Inference (Direct API Usage) | Agent Usage |
|---|---|---|
| Fast SDXL | $0.0011 per compute second | Not available |
| Flux Schnell | $0.0030 per megapixel | Not available |
| Stable Audio 2.5 (Text-to-Audio) | $0.00058 per compute second | Not available |
| Multilingual TTS v2 | $0.10 per 1000 characters | Not available |
| Model | Serverless Inference (Direct API Usage) | Agent Usage |
|---|---|---|
| Llama 3.1 8B | Same as Agent Usage | $0.198 per 1,000,000 input tokens $0.198 per 1,000,000 output tokens |
| Llama 3.3 70B | Same as Agent Usage | $0.65 per 1,000,000 input tokens $0.65 per 1,000,000 output tokens |
| Model | Serverless Inference (Direct API Usage) | Agent Usage |
|---|---|---|
| NeMo | Same as Agent Usage | $0.30 per 1,000,000 input tokens $0.30 per 1,000,000 output tokens |
| Model | Serverless Inference (Direct API Usage) | Agent Usage |
|---|---|---|
| OpenAI gpt-oss-120b | Same as Agent Usage | $0.10 per 1,000,000 input tokens $0.70 per 1,000,000 output tokens |
| OpenAI gpt-oss-20b | Same as Agent Usage | $0.05 per 1,000,000 input tokens $0.45 per 1,000,000 output tokens |
| GPT-5 GPT-5 mini GPT-5 nano GPT-4.1 GPT-4o GPT-4o mini o1 o3 o3-mini GPT-image-1 |
Billed directly by provider (your API key). | Billed directly by provider (your API key). |
Knowledge Bases
Knowledge bases are billed for both indexing and storage:
-
Indexing tokens: We charge for tokens required to generate embeddings. Pricing is the same for manual and auto-indexing. Charges apply only when changes are detected (new, updated, or deleted files/URLs). If auto-indexing is paused or no changes are found, there are no charges.
For example, a 10 MB dataset is about 3 million tokens, and a 1 GB dataset is about 250 million tokens.
Actual costs depend on the embedding model:
Model Price all-mini-lm-l6-v2$0.009 per 1,000,000 input tokens multi-qa-mpnet-base-dot-v1$0.009 per 1,000,000 input tokens gte-large-en-v1.5$0.09 per 1,000,000 input tokens One token is roughly four characters (approximately 75 words per 100 tokens). Non-Latin scripts, emojis, or binary data may increase token counts.
-
Storage: Embeddings are stored in OpenSearch. See OpenSearch pricing.
Guardrails
Charges apply for all tokens processed through guardrails:
| Guardrail | Price |
|---|---|
| Content Moderation | $0.20 per 1,000,000 tokens |
| Jailbreak Detection | $0.20 per 1,000,000 tokens |
| Sensitive Data Detection | $0.34 per 1,000,000 tokens |
Costs are per token. Creating, editing, or duplicating guardrails has no additional cost.
Functions
If you attach DigitalOcean Functions to your agent, you are billed at functions pricing.
Agent Evaluations
Agent evaluations are charged by token usage at the same rates as model usage.
Log Stream Insights
Log Stream Insights uses a third-party model to analyze agent trace data. You are charged per token:
| Tokens | Price |
|---|---|
| Input | $4.00 per 1,000,000 tokens |
| Output | $20.00 per 1,000,000 tokens |