DigitalOcean AI Platform Pricing

Validated on 1 May 2026 • Last edited on 1 May 2026

DigitalOcean AI Platform lets you build fully managed AI agents with knowledge bases for retrieval-augmented generation, multi-agent routing, guardrails, and more.

DigitalOcean AI Platform has a usage-based pricing model, so costs scale with your actual usage.

We charge for model usage in the Agent Development Kit (ADK), currently in public preview, and for agents created using the DigitalOcean Control Panel, CLI, or API, as well as for additional features like knowledge bases, guardrails, and log stream insights. We display prices per million tokens but bill per thousand tokens for accuracy.

Agent creation is free. Agent usage is billed by DigitalOcean: you are charged for all input and output tokens the agent processes. Token usage depends on factors such as input length, agent instructions, attached knowledge bases, and configuration settings. To optimize usage, [test your agents](/products/ai-platform/how-to/test-agents) and adjust their parameters.

If you use a DigitalOcean-hosted model for your agent deployment with the ADK, you are charged for the token usage billed to those model keys.

Usage for commercial models in agents or evaluations with your own provider API keys (for example, an OpenAI or Anthropic key) is billed directly by the provider. DigitalOcean does not charge you for that model usage.
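Because prices are displayed per million tokens but billed per thousand, per-request costs come down to simple arithmetic. The sketch below is a rough illustration only; the $3.00 rate is an example figure, not a quoted price for any particular model.

```python
# Rough token-cost arithmetic: prices are displayed per 1M tokens,
# but billing is calculated per 1,000 tokens. The rate used below is
# an illustrative placeholder, not an authoritative price.

def token_cost(tokens: int, price_per_million: float) -> float:
    """Cost for a token count at a per-1M-token rate, accrued per 1K tokens."""
    thousands = tokens / 1_000
    return thousands * (price_per_million / 1_000)

# Example: 45,000 input tokens at $3.00 per 1M input tokens.
print(round(token_cost(45_000, 3.00), 4))  # 0.135
```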

Foundation Model Usage

The following tables show pricing for open-source and commercial models for ADK and agent usage.

Anthropic Models
Note
When using Anthropic commercial models with your own model API keys, billing is handled directly by Anthropic at the provider’s rates.

Claude Sonnet 4.6, Sonnet 4.5, and Sonnet 4 support an input context window of up to 1M tokens.

| Model | ADK | Agent |
|-------|-----|-------|
| Claude Sonnet 4.6 | Prompts ≤200K tokens: $3.00 per 1M input tokens, $15.00 per 1M output tokens<br>Prompts >200K tokens: $6.00 per 1M input tokens, $22.50 per 1M output tokens<br>Prompt caching: $3.75 per 1M cache creation (5m) input tokens, $6.00 per 1M cache creation (1h) input tokens, $0.30 per 1M cache read input tokens | Same as serverless inference |
| Claude Sonnet 4.5 | Prompts ≤200K tokens: $3.00 per 1M input tokens, $15.00 per 1M output tokens<br>Prompts >200K tokens: $6.00 per 1M input tokens, $22.50 per 1M output tokens<br>Prompt caching: $3.75 per 1M cache creation (5m) input tokens, $6.00 per 1M cache creation (1h) input tokens, $0.30 per 1M cache read input tokens | Same as serverless inference |
| Claude Sonnet 4 | Prompts ≤200K tokens: $3.00 per 1M input tokens, $15.00 per 1M output tokens<br>Prompts >200K tokens: $6.00 per 1M input tokens, $22.50 per 1M output tokens<br>Prompt caching: $3.75 per 1M cache creation (5m) input tokens, $6.00 per 1M cache creation (1h) input tokens, $0.30 per 1M cache read input tokens | Same as serverless inference |
| Claude Haiku 4.5 | Input/output tokens: $1.00 per 1M input tokens, $5.00 per 1M output tokens<br>Prompt caching: $1.25 per 1M cache creation (5m) input tokens, $2.00 per 1M cache creation (1h) input tokens, $1.00 per 1M cache read input tokens | Same as serverless inference |
| Claude Opus 4.7 | Input/output tokens: $5.00 per 1M input tokens, $25.00 per 1M output tokens<br>Prompt caching: $6.25 per 1M cache creation (5m) input tokens, $10.00 per 1M cache creation (1h) input tokens, $0.50 per 1M cache read input tokens | Same as serverless inference |
| Claude Opus 4.6 | Prompts ≤200K tokens: $5.00 per 1M input tokens, $25.00 per 1M output tokens<br>Prompts >200K tokens: $10.00 per 1M input tokens, $37.50 per 1M output tokens<br>Prompt caching: $6.25 per 1M cache creation (5m) input tokens, $10.00 per 1M cache creation (1h) input tokens, $0.50 per 1M cache read input tokens | Same as serverless inference |
| Claude Opus 4.5 | Input/output tokens: $5.00 per 1M input tokens, $25.00 per 1M output tokens<br>Prompt caching: $6.25 per 1M cache creation (5m) input tokens, $10.00 per 1M cache creation (1h) input tokens, $0.50 per 1M cache read input tokens | Same as serverless inference |
| Claude Opus 4.1 | Input/output tokens: $15.00 per 1M input tokens, $75.00 per 1M output tokens<br>Prompt caching: $18.75 per 1M cache creation (5m) input tokens, $30.00 per 1M cache creation (1h) input tokens, $1.50 per 1M cache read input tokens | Same as serverless inference |
| Claude Opus 4 | Input/output tokens: $15.00 per 1M input tokens, $75.00 per 1M output tokens<br>Prompt caching: $18.75 per 1M cache creation (5m) input tokens, $30.00 per 1M cache creation (1h) input tokens, $1.50 per 1M cache read input tokens | Same as serverless inference |
Arcee Models
| Model | ADK | Agent |
|-------|-----|-------|
| Trinity Large (Public Preview) | Input/output tokens: $0.25 per 1M input tokens, $0.90 per 1M output tokens<br>Prompt caching: $0.06 per 1M cache read input tokens | Not supported |
fal Models
| Model | Serverless Inference | Agent Usage |
|-------|----------------------|-------------|
| Fast SDXL | $0.0011 per compute second | Not supported |
| Flux Schnell | $0.0030 per megapixel | Not supported |
| Stable Audio 2.5 (Text-to-Audio) | $0.00058 per compute second | Not supported |
| Multilingual TTS v2 | $0.10 per 1,000 characters | Not supported |
OpenAI Models
Note
When using OpenAI commercial models with your own model API keys, billing is handled directly by OpenAI at the provider’s rates.
| Model | ADK | Agent |
|-------|-----|-------|
| gpt-oss-120b | Input/output tokens: $0.10 per 1M input tokens, $0.70 per 1M output tokens | Same as serverless inference |
| gpt-oss-20b | Input/output tokens: $0.05 per 1M input tokens, $0.45 per 1M output tokens | Same as serverless inference |
| GPT-5.4 | Input/output tokens: $2.50 per 1M input tokens, $15.00 per 1M output tokens<br>Prompt caching: $0.25 per 1M cache read input tokens | Same as serverless inference |
| GPT-5.4 mini | Input/output tokens: $0.75 per 1M input tokens, $4.50 per 1M output tokens<br>Prompt caching: $0.075 per 1M cache read input tokens | Not supported |
| GPT-5.4 nano | Input/output tokens: $0.20 per 1M input tokens, $1.25 per 1M output tokens<br>Prompt caching: $0.02 per 1M cache read input tokens | Not supported |
| GPT-5.4 pro | Input/output tokens: $30.00 per 1M input tokens, $180.00 per 1M output tokens | Not supported |
| GPT-5.3-Codex | Input/output tokens: $1.75 per 1M input tokens, $14.00 per 1M output tokens<br>Prompt caching: $0.175 per 1M cache read input tokens | Not supported |
| GPT-5.2 | Input/output tokens: $1.75 per 1M input tokens, $14.00 per 1M output tokens<br>Prompt caching: $0.175 per 1M cache read input tokens | Same as serverless inference |
| GPT-5.2 pro | Input/output tokens: $21.00 per 1M input tokens, $168.00 per 1M output tokens | Same as serverless inference |
| GPT-5.1-Codex-Max | Input/output tokens: $1.25 per 1M input tokens, $10.00 per 1M output tokens<br>Prompt caching: $0.125 per 1M cache read input tokens | Same as serverless inference |
| GPT-5 | Input/output tokens: $1.25 per 1M input tokens, $10.00 per 1M output tokens<br>Prompt caching: $0.125 per 1M cache read input tokens | Same as serverless inference |
| GPT-5 mini | Input/output tokens: $0.25 per 1M input tokens, $2.00 per 1M output tokens<br>Prompt caching: $0.025 per 1M cache read input tokens | Same as serverless inference |
| GPT-5 nano | Input/output tokens: $0.05 per 1M input tokens, $0.40 per 1M output tokens<br>Prompt caching: $0.005 per 1M cache read input tokens | Same as serverless inference |
| GPT-4.1 | Input/output tokens: $2.00 per 1M input tokens, $8.00 per 1M output tokens<br>Prompt caching: $0.50 per 1M cache read input tokens | Same as serverless inference |
| GPT-4o | Input/output tokens: $2.50 per 1M input tokens, $10.00 per 1M output tokens<br>Prompt caching: $1.25 per 1M cache read input tokens | Same as serverless inference |
| GPT-4o mini | Input/output tokens: $0.15 per 1M input tokens, $0.60 per 1M output tokens<br>Prompt caching: $0.075 per 1M cache read input tokens | Same as serverless inference |
| o1 | Input/output tokens: $15.00 per 1M input tokens, $60.00 per 1M output tokens<br>Prompt caching: $7.50 per 1M cache read input tokens | Same as serverless inference |
| o3 | Input/output tokens: $2.00 per 1M input tokens, $8.00 per 1M output tokens<br>Prompt caching: $0.50 per 1M cache read input tokens | Same as serverless inference |
| o3-mini | Input/output tokens: $1.10 per 1M input tokens, $4.40 per 1M output tokens<br>Prompt caching: $0.55 per 1M cache read input tokens | Same as serverless inference |
| GPT-image-1 | Input/output tokens: $5.00 per 1M input tokens, $40.00 per 1M output tokens<br>Prompt caching: $1.25 per 1M cache read input tokens | Not supported |
| GPT Image 1.5 | Input/output tokens: $5.00 per 1M input tokens, $10.00 per 1M output tokens<br>Prompt caching: $1.00 per 1M cache read input tokens | Not supported |
| GPT Image 2 | Text input: $5.00 per 1M tokens, text output: $0.00 per 1M tokens, text cache read: $1.25 per 1M tokens<br>Image input: $8.00 per 1M tokens, image output: $30.00 per 1M tokens, image cache read: $2.00 per 1M tokens | Not supported |
DigitalOcean-Hosted Models
| Provider | Model | ADK | Agent |
|----------|-------|-----|-------|
| Alibaba | Qwen3-32B | Input/output tokens: $0.25 per 1M input tokens, $0.55 per 1M output tokens | Not supported |
| Alibaba | Qwen3 Coder Flash | Input/output tokens: $0.45 per 1M input tokens, $1.70 per 1M output tokens | Not supported |
| Alibaba | Qwen 3.5 397B A17B | Input/output tokens: $0.55 per 1M input tokens, $3.50 per 1M output tokens | Not supported |
| Alibaba | Qwen 3 TTS (1.7B) | $20.00 per 1M character tokens | Not supported |
| Alibaba | Wan2.2-T2V-A14B | $0.60 per 1M video tokens | Not supported |
| DeepSeek | DeepSeek R1 Distill Llama 70B | Input/output tokens: $0.99 per 1M input tokens, $0.99 per 1M output tokens | Same as serverless inference |
| DeepSeek | DeepSeek V4 Pro | Input/output tokens: $1.74 per 1M input tokens, $3.48 per 1M output tokens | Same as serverless inference |
| DeepSeek | DeepSeek V3.2 | Input/output tokens: $0.50 per 1M input tokens, $1.60 per 1M output tokens | Same as serverless inference |
| Google | Gemma 4 | Input/output tokens: $0.18 per 1M input tokens, $0.50 per 1M output tokens | Same as serverless inference |
| Intfloat | E5 Mistral 7B Instruct | Input/output tokens: $0.09 per 1M input tokens, $0.00 per 1M output tokens | Not supported |
| MiniMax | M2.5 (Public Preview) | Input/output tokens: $0.30 per 1M input tokens, $1.20 per 1M output tokens | Same as serverless inference |
| Moonshot AI | Kimi K2.5 | Input/output tokens: $0.50 per 1M input tokens, $2.70 per 1M output tokens | Same as serverless inference |
| Meta | Llama 3.3 Instruct-70B | Input/output tokens: $0.65 per 1M input tokens, $0.65 per 1M output tokens | Same as serverless inference |
| Meta | Llama 4 Maverick 17B 128E Instruct | Input/output tokens: $0.25 per 1M input tokens, $0.87 per 1M output tokens | Same as serverless inference |
| Mistral AI | Ministral 3 14B Instruct | Input/output tokens: $0.20 per 1M input tokens, $0.20 per 1M output tokens | Same as serverless inference |
| NVIDIA | Nemotron-3-Super-120B (Public Preview) | Input/output tokens: $0.30 per 1M input tokens, $0.65 per 1M output tokens | Same as serverless inference |
| NVIDIA | Nemotron Nano 3 Omni | Input/output tokens: $0.50 per 1M input tokens, $0.90 per 1M output tokens | Not supported |
| NVIDIA | Nemotron Nano 12B v2 VL | Input/output tokens: $0.20 per 1M input tokens, $0.60 per 1M output tokens | Not supported |
| Stability AI | Stable Diffusion 3.5 Large | $0.08 per 1M image tokens | Not supported |
| Z.ai | GLM 5 | Input/output tokens: $1.00 per 1M input tokens, $3.20 per 1M output tokens | Same as serverless inference |

Knowledge Bases

Knowledge base pricing is shown per million tokens, but billing is calculated per thousand tokens.

You’re billed for indexing and retrieval tokens, reranking tokens, and storage:

  • Tokens used for indexing and retrieval query vectorization: We charge for tokens used to generate embeddings during indexing and to vectorize user queries during retrieval. Both use the same embeddings model pricing.

    Indexing pricing is the same for manual and auto-indexing. Indexing charges apply only when changes are detected, such as new, updated, or deleted files or URLs. If auto-indexing is paused or no changes are found, there are no indexing charges.

    Note
    Retrieval requests sent through an MCP server are billed the same as retrieval requests sent directly to the knowledge base retrieve endpoint. This includes the tokens used to vectorize the retrieval query with the selected embeddings model.

    For example, a 10 MB dataset is about 3 million tokens, and a 1 GB dataset is about 250 million tokens.

    Actual costs depend on the embeddings model:

    | Model | Price |
    |-------|-------|
    | all-mini-lm-l6-v2 | $0.009 per 1M input tokens |
    | multi-qa-mpnet-base-dot-v1 | $0.009 per 1M input tokens |
    | gte-large-en-v1.5 | $0.09 per 1M input tokens |
    | Qwen3 Embedding 0.6B | $0.04 per 1M input tokens |
    | BGE-M3 | $0.02 per 1M input tokens |
    | E5 Large V2 | $0.02 per 1M input tokens |
    Note
    One token is roughly four characters (approximately 75 words per 100 tokens). Non-Latin scripts, emojis, or binary data may increase token counts.
  • Reranking tokens: If reranking is enabled, tokens used to rerank results are billed based on the selected reranking model. For supported reranking models, see available reranking models.

    | Model | Price |
    |-------|-------|
    | BGE Reranker v2 m3 | $0.01 per 1M reranking tokens |
  • Storage: Embeddings are stored in OpenSearch. See OpenSearch pricing.
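A rough indexing-cost estimate can combine the ~4-characters-per-token heuristic above with the selected embeddings model's price. The helper below is a hypothetical sketch, not an official calculator; the model names and prices come from the embeddings table above, and actual token counts vary with content.

```python
# Rough knowledge-base indexing cost estimate, assuming ~4 bytes
# (characters) per token. Prices are per 1M input tokens, from the
# embeddings model table; actual token counts depend on your data.

EMBED_PRICE_PER_1M = {
    "all-mini-lm-l6-v2": 0.009,
    "gte-large-en-v1.5": 0.09,
}

def estimate_index_cost(dataset_bytes: int, model: str) -> float:
    tokens = dataset_bytes / 4  # ~4 characters per token heuristic
    return tokens / 1_000_000 * EMBED_PRICE_PER_1M[model]

# A 1 GB dataset is roughly 250M tokens:
print(round(estimate_index_cost(1_000_000_000, "gte-large-en-v1.5"), 2))  # 22.5
```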

Chunking itself has no separate charge; its cost is reflected in embedding token usage, OpenSearch storage, and the selected embeddings model.

Chunking strategy cost depends on how many tokens the strategy embeds and returns:

  • Section-based and fixed length chunking are the most cost-efficient because they use simple splitting and have predictable token usage.
  • Semantic chunking costs more because it uses the embeddings model to detect semantic boundaries and embed final chunks, often resulting in 1.5 to 3 times more indexing tokens.
  • Hierarchical chunking slightly increases indexing cost by creating parent and child embeddings. It can also increase retrieval cost because agents receive both child and parent chunks for each lookup.

Changing your chunking strategy or configuration requires re-indexing the affected data source, which consumes additional tokens. For guidance on chunking configurations and best practices, see our chunking parameters reference and chunking best practices.
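The relative cost of the strategies above can be sketched with the 1.5x-3x semantic-chunking multiplier from the list. The baseline token count and the exact multipliers here are illustrative assumptions, not measured values.

```python
# Illustrative indexing-cost comparison across chunking strategies.
# BASE_TOKENS and the semantic multipliers are assumptions for
# illustration; real overhead depends on your data and configuration.

BASE_TOKENS = 3_000_000   # roughly a 10 MB dataset, per the docs
GTE_PRICE_PER_1M = 0.09   # gte-large-en-v1.5, from the table above

def index_cost(tokens: float) -> float:
    return tokens / 1_000_000 * GTE_PRICE_PER_1M

estimates = {
    "section/fixed-length": BASE_TOKENS,       # simple splitting, predictable
    "semantic (1.5x)": BASE_TOKENS * 1.5,      # boundary detection overhead
    "semantic (3x)": BASE_TOKENS * 3,
}

for strategy, tokens in estimates.items():
    print(f"{strategy}: ${index_cost(tokens):.2f}")
```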

If you use RAG Playground, answer generation is billed separately based on the selected serverless inference model. Free tokens for RAG Playground are not separate; they are shared with Model Playground.

Guardrails

Charges apply for all tokens processed through guardrails:

| Guardrail | Price |
|-----------|-------|
| Content Moderation | $0.20 per 1M tokens |
| Jailbreak Detection | $0.20 per 1M tokens |
| Sensitive Data Detection | $0.34 per 1M tokens |

Guardrail charges are based solely on token usage. Creating, editing, or duplicating guardrails has no additional cost.
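Since each enabled guardrail processes the same tokens independently, total guardrail cost is the sum across enabled guardrails. The sketch below uses the rates from the table above; the dictionary keys are illustrative names, not API identifiers.

```python
# Rough guardrail cost sketch: charges apply to all tokens processed
# through each enabled guardrail, at the per-1M-token rates above.
# The key names here are illustrative, not official API identifiers.

GUARDRAIL_PRICE_PER_1M = {
    "content_moderation": 0.20,
    "jailbreak_detection": 0.20,
    "sensitive_data_detection": 0.34,
}

def guardrail_cost(tokens: int, enabled: list[str]) -> float:
    return sum(tokens / 1_000_000 * GUARDRAIL_PRICE_PER_1M[g] for g in enabled)

# 2M tokens through content moderation + sensitive data detection:
print(round(guardrail_cost(2_000_000, ["content_moderation",
                                       "sensitive_data_detection"]), 2))  # 1.08
```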

Functions

If you attach DigitalOcean Functions to your agent, you are billed at functions pricing.

Agent Evaluations

Agent evaluations are charged by token usage at the same rates as model usage.

Log Stream Insights

Log Stream Insights uses a third-party model to analyze agent trace data. You are charged per token:

| Tokens | Price |
|--------|-------|
| Input | $1.10 per 1M tokens |
| Output | $4.40 per 1M tokens |

Agent Development Kit Public Preview

You are not charged for using the Agent Development Kit during public preview. However, you are billed for other DigitalOcean AI Platform features you use with your agent deployment:

  • If you use a DigitalOcean-hosted model through serverless inference in your agent deployment, you are charged for the token usage billed to those model keys.

  • For agent evaluations, token usage is charged to the agent model keys. For example, if your agent uses a serverless inference endpoint key, any token usage is charged to that key. If the agent uses a third-party model key, or a key to a model not hosted on DigitalOcean, you are charged by the hosting provider.

  • If you enable Log Stream Insights for your agent deployment, you are charged for tokens when new insights are generated.

Note
At general availability, agent deployment hosting will be charged per GiB-second. We will also charge for judge input and output tokens, which are the tokens used to judge the agent's inputs and outputs against the test case's chosen metrics. These costs are waived during public preview.
