DigitalOcean Gradient™ AI Platform Pricing
Validated on 18 Dec 2025 • Last edited on 30 Dec 2025
DigitalOcean Gradient™ AI Platform lets you build fully managed AI agents with knowledge bases for retrieval-augmented generation, multi-agent routing, guardrails, and more, or use serverless inference to make direct requests to popular foundation models.
Gradient AI Platform has a usage-based pricing model, so costs scale with your actual usage. We charge for model usage through serverless inference, the Agent Development Kit (ADK), and agents created using the DigitalOcean Control Panel, CLI, or API, as well as for additional features like knowledge bases, guardrails, and log stream insights. Creating an agent is free. We display prices per million tokens and bill per thousand tokens for accuracy.
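As a rough illustration of how a displayed per-million-token price translates into a bill metered in thousand-token units, here is a minimal sketch. The rates are taken from the Llama 3.1 Instruct-8B row in the tables below; the function name and request sizes are illustrative, not part of any DigitalOcean API.

```python
# Sketch: converting per-1M display prices into a per-1K-metered bill.
# Rates are the Llama 3.1 Instruct-8B prices from the pricing tables below;
# estimate_cost is a hypothetical helper, not a DigitalOcean API.

PRICE_PER_1M_INPUT = 0.198   # USD per 1M input tokens
PRICE_PER_1M_OUTPUT = 0.198  # USD per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate cost in USD, scaling the per-1M price to per-1K billing units."""
    cost = (input_tokens / 1_000) * (PRICE_PER_1M_INPUT / 1_000)
    cost += (output_tokens / 1_000) * (PRICE_PER_1M_OUTPUT / 1_000)
    return cost

# 2M input tokens + 500K output tokens at these rates:
print(f"${estimate_cost(2_000_000, 500_000):.4f}")  # → $0.4950
```

Actual invoices may round differently; this only shows the order of magnitude of token charges.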
Serverless inference is billed by DigitalOcean for both open-source and commercial models, with prices aligned to each provider's published rates for transparency. If your ADK agent deployment uses a DigitalOcean-hosted model through serverless inference, you are charged for usage of those model keys.
Agent usage is billed by DigitalOcean for open-source models. You are charged for all input and output tokens processed by the agent. Token usage depends on factors such as input length, agent instructions, attached knowledge bases, and configuration settings. To optimize usage, test your agents and adjust their parameters.
Usage for commercial models in agents or evaluations with your own provider API keys (for example, OpenAI key or Anthropic key) is billed directly by the provider. DigitalOcean does not charge you for that model usage.
Foundation Model Usage
The following shows pricing for open-source and commercial models for serverless inference, ADK, and agent usage.
| Model | Serverless Inference and ADK | Agent Usage |
|---|---|---|
| Qwen3-32B | $0.25 per 1M input tokens<br>$0.55 per 1M output tokens | Not available |
Claude Sonnet 4.5 and Claude Sonnet 4 support an input context window of up to 1M tokens.
| Model | Serverless Inference and ADK | Agent Usage |
|---|---|---|
| Claude Sonnet 4.5 | For prompts ≤ 200K tokens: $3.00 per 1M input tokens, $15.00 per 1M output tokens.<br>For prompts > 200K tokens: $6.00 per 1M input tokens, $22.50 per 1M output tokens. | Billed directly by Anthropic when using your own API key. |
| Claude Sonnet 4 | For prompts ≤ 200K tokens: $3.00 per 1M input tokens, $15.00 per 1M output tokens.<br>For prompts > 200K tokens: $6.00 per 1M input tokens, $22.50 per 1M output tokens. | Billed directly by Anthropic when using your own API key. |
| Claude 3.7 Sonnet | $3.00 per 1M input tokens<br>$15.00 per 1M output tokens | Billed directly by Anthropic when using your own API key. |
| Claude 3.5 Sonnet | $3.00 per 1M input tokens<br>$15.00 per 1M output tokens | Billed directly by Anthropic when using your own API key. |
| Claude 3.5 Haiku | $0.80 per 1M input tokens<br>$4.00 per 1M output tokens | Billed directly by Anthropic when using your own API key. |
| Claude Opus 4.5 | $5.00 per 1M input tokens<br>$25.00 per 1M output tokens | Billed directly by Anthropic when using your own API key. |
| Claude Opus 4.1 | $15.00 per 1M input tokens<br>$75.00 per 1M output tokens | Billed directly by Anthropic when using your own API key. |
| Claude Opus 4 | $15.00 per 1M input tokens<br>$75.00 per 1M output tokens | Billed directly by Anthropic when using your own API key. |
| Claude 3 Opus | $15.00 per 1M input tokens<br>$75.00 per 1M output tokens | Billed directly by Anthropic when using your own API key. |
| Model | Serverless Inference and ADK | Agent Usage |
|---|---|---|
| DeepSeek R1 Distill Llama 70B | $0.99 per 1M input tokens<br>$0.99 per 1M output tokens | Same as serverless inference. |
| Model | Serverless Inference | Agent Usage |
|---|---|---|
| Fast SDXL | $0.0011 per compute second | Not available |
| Flux Schnell | $0.0030 per megapixel | Not available |
| Stable Audio 2.5 (Text-to-Audio) | $0.00058 per compute second | Not available |
| Multilingual TTS v2 | $0.10 per 1000 characters | Not available |
| Model | Serverless Inference and ADK | Agent Usage |
|---|---|---|
| Llama 3.3 Instruct-70B | $0.65 per 1M input tokens<br>$0.65 per 1M output tokens | Same as serverless inference. |
| Llama 3.1 Instruct-8B | $0.198 per 1M input tokens<br>$0.198 per 1M output tokens | Same as serverless inference. |
| Model | Serverless Inference and ADK | Agent Usage |
|---|---|---|
| NeMo | $0.30 per 1M input tokens<br>$0.30 per 1M output tokens | Same as serverless inference. |
| Model | Serverless Inference and ADK | Agent Usage |
|---|---|---|
| gpt-oss-120b | $0.10 per 1M input tokens<br>$0.70 per 1M output tokens | Same as serverless inference. |
| gpt-oss-20b | $0.05 per 1M input tokens<br>$0.45 per 1M output tokens | Same as serverless inference. |
| GPT-5 | $1.25 per 1M input tokens<br>$10.00 per 1M output tokens | Billed directly by OpenAI when using your own API key. |
| GPT-5 mini | $0.25 per 1M input tokens<br>$2.00 per 1M output tokens | Billed directly by OpenAI when using your own API key. |
| GPT-5 nano | $0.05 per 1M input tokens<br>$0.40 per 1M output tokens | Billed directly by OpenAI when using your own API key. |
| GPT-4.1 | $2.00 per 1M input tokens<br>$8.00 per 1M output tokens | Billed directly by OpenAI when using your own API key. |
| GPT-4o | $2.50 per 1M input tokens<br>$10.00 per 1M output tokens | Billed directly by OpenAI when using your own API key. |
| GPT-4o mini | $0.15 per 1M input tokens<br>$0.60 per 1M output tokens | Billed directly by OpenAI when using your own API key. |
| o1 | $15.00 per 1M input tokens<br>$60.00 per 1M output tokens | Billed directly by OpenAI when using your own API key. |
| o3 | $2.00 per 1M input tokens<br>$8.00 per 1M output tokens | Billed directly by OpenAI when using your own API key. |
| o3-mini | $1.10 per 1M input tokens<br>$4.40 per 1M output tokens | Billed directly by OpenAI when using your own API key. |
| GPT-image-1 | $5.00 per 1M input tokens<br>$40.00 per 1M output tokens | Billed directly by OpenAI when using your own API key. |
Knowledge Bases
Knowledge bases are billed for both indexing and storage:
- Indexing tokens: We charge for the tokens required to generate embeddings. Pricing is the same for manual and auto-indexing. Charges apply only when changes are detected (new, updated, or deleted files or URLs). If auto-indexing is paused or no changes are found, there are no charges.

  For example, a 10 MB dataset is about 3 million tokens, and a 1 GB dataset is about 250 million tokens.

  Actual costs depend on the embedding model:

  | Model | Price |
  |---|---|
  | all-mini-lm-l6-v2 | $0.009 per 1M input tokens |
  | multi-qa-mpnet-base-dot-v1 | $0.009 per 1M input tokens |
  | gte-large-en-v1.5 | $0.09 per 1M input tokens |

  One token is roughly four characters (approximately 75 words per 100 tokens). Non-Latin scripts, emojis, or binary data may increase token counts.
- Storage: Embeddings are stored in OpenSearch. See OpenSearch pricing.
- Chunking: The chunking method you choose affects indexing cost because each algorithm embeds a different number of tokens. All indexing and re-indexing jobs are billed based on the total tokens embedded.
  - Section-based and fixed-length chunking are the most cost-efficient. They rely on simple splitting and do not perform semantic analysis, resulting in minimal and predictable token usage.
  - Semantic chunking is more expensive because it uses the embedding model twice: once to detect semantic boundaries and once to embed the final chunks. This typically results in 1.5 to 3 times more indexing tokens, more total chunks, and a higher re-indexing cost when settings change.
  - Hierarchical chunking produces both parent and child embeddings, slightly increasing indexing cost. The main cost impact is during retrieval: agents receive both the child chunk and its parent chunk, increasing the number of tokens sent to the model for each lookup.

Any change to chunking settings requires re-indexing the affected data source, which always consumes additional tokens. Chunking does not incur a separate charge; costs depend on embedding token usage and OpenSearch storage, and vary by embedding model. For detailed behavior and tuning guidance, see the chunking reference page and chunking best practices.
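Putting the rules of thumb above together, indexing cost can be estimated from dataset size, the per-model embedding rate, and a chunking-overhead multiplier. This is a back-of-the-envelope sketch, assuming ~4 characters per token and the gte-large-en-v1.5 rate from the table; `indexing_cost` and the multiplier values are illustrative, not a DigitalOcean API.

```python
# Back-of-the-envelope knowledge base indexing cost.
# Assumptions: ~1 token per 4 characters (per the docs' rule of thumb),
# gte-large-en-v1.5 rate from the embedding model table above, and a
# chunking multiplier of 1.5x-3x for semantic chunking (the quoted range).

EMBED_PRICE_PER_1M = 0.09  # USD per 1M input tokens, gte-large-en-v1.5

def indexing_cost(dataset_bytes: int, chunking_multiplier: float = 1.0) -> float:
    """Estimate USD cost to embed a dataset of the given size."""
    tokens = (dataset_bytes / 4) * chunking_multiplier  # ~4 chars per token
    return tokens / 1_000_000 * EMBED_PRICE_PER_1M

ten_mb = 10 * 1024**2  # ~2.6M tokens by this estimate; the docs round to ~3M
print(f"${indexing_cost(ten_mb):.4f}")       # section-based / fixed-length
print(f"${indexing_cost(ten_mb, 3.0):.4f}")  # worst-case semantic chunking
```

Real token counts depend on the content (non-Latin scripts or binary data inflate them), so treat this as a planning estimate, not a quote.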
Guardrails
Charges apply for all tokens processed through guardrails:
| Guardrail | Price |
|---|---|
| Content Moderation | $0.20 per 1,000,000 tokens |
| Jailbreak Detection | $0.20 per 1,000,000 tokens |
| Sensitive Data Detection | $0.34 per 1,000,000 tokens |
Costs are per token. Creating, editing, or duplicating guardrails has no additional cost.
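To see how guardrail charges add up, here is a small sketch assuming each attached guardrail is charged independently on the same tokens (the rates come from the table above; the function and the 50K-token request size are illustrative assumptions).

```python
# Sketch: per-guardrail token charges, using the rates from the table above.
# Assumption (not stated in the docs): each enabled guardrail is billed
# independently on the same tokens; guardrail_cost is a hypothetical helper.

RATES = {  # USD per 1M tokens
    "content_moderation": 0.20,
    "jailbreak_detection": 0.20,
    "sensitive_data_detection": 0.34,
}

def guardrail_cost(tokens: int, enabled: list[str]) -> float:
    """Sum the per-token charge for every enabled guardrail."""
    return sum(tokens / 1_000_000 * RATES[g] for g in enabled)

# 50K tokens passing through all three guardrails:
print(f"${guardrail_cost(50_000, list(RATES)):.4f}")  # → $0.0370
```

Guardrail charges are small per request but apply to every token screened, so they scale with overall agent traffic.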
Functions
If you attach DigitalOcean Functions to your agent, you are billed at functions pricing.
Agent Evaluations
Agent evaluations are charged by token usage at the same rates as model usage.
Log Stream Insights
Log Stream Insights uses a third-party model to analyze agent trace data. You are charged per token:
| Tokens | Price |
|---|---|
| Input | $1.10 per 1,000,000 tokens |
| Output | $4.40 per 1,000,000 tokens |
Agent Development Kit (Public Preview)
You are not charged for using the Agent Development Kit during public preview. However, you are billed for other Gradient AI Platform features you use with your agent deployment:
- If you are using a DigitalOcean-hosted model through serverless inference in your agent deployment, you are charged for usage of those model keys.
- For agent evaluations, token usage is charged to the agent's model keys. For example, if your agent uses a serverless inference endpoint key, any token usage is charged to that key. If the agent uses a third-party model key, or a key to a model not hosted on DigitalOcean, you are charged by the hosting provider.
- If you enable Log Stream Insights for your agent deployment, you are charged for Log Stream Insights tokens when new insights are generated.