DigitalOcean Gradient™ AI Platform Pricing

Validated on 18 Dec 2025 • Last edited on 30 Dec 2025

DigitalOcean Gradient™ AI Platform lets you build fully managed AI agents with knowledge bases for retrieval-augmented generation, multi-agent routing, guardrails, and more, or use serverless inference to make direct requests to popular foundation models.

Gradient AI Platform has a usage-based pricing model, so costs scale with your actual usage. We charge for model usage across serverless inference, the Agent Development Kit (ADK), and agents created using the DigitalOcean Control Panel, CLI, or API, as well as for additional features like knowledge bases, guardrails, and log stream insights. Agent creation itself is free. We display prices per million tokens and bill per thousand tokens for accuracy.
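As a rough sketch of how displayed rates translate into charges, the helper below converts a per-1M-token price into a cost for a given token count, metered in the per-1K units described above. The exact rounding behavior on partial thousands is an assumption for illustration, not a billing guarantee.

```python
# Sketch: converting a displayed per-1M-token price into an estimated charge.
# Assumption: usage is metered in 1,000-token units; how partial thousands
# round is not specified here, so this treats them fractionally.

def billed_cost(tokens: int, price_per_million: float) -> float:
    """Estimate the charge in USD for `tokens` at `price_per_million`."""
    price_per_thousand = price_per_million / 1000  # per-1K billing unit
    units = tokens / 1000                          # number of 1K-token units
    return units * price_per_thousand

# Example: 250,000 input tokens on a $0.65-per-1M-token model
print(round(billed_cost(250_000, 0.65), 4))  # 0.1625
```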

Serverless inference is billed by DigitalOcean for both open-source and commercial models. Prices align with each provider’s published rates for transparency. If your ADK agent deployment uses a DigitalOcean-hosted model through serverless inference, you are charged for usage on those model keys.

Agent usage is billed by DigitalOcean for open-source models. You are charged for all input and output tokens processed by the agent. Token usage depends on factors such as input length, agent instructions, attached knowledge bases, and configuration settings. To optimize usage, test your agents and adjust their parameters.

Usage for commercial models in agents or evaluations with your own provider API keys (for example, OpenAI key or Anthropic key) is billed directly by the provider. DigitalOcean does not charge you for that model usage.

Foundation Model Usage

The following shows pricing for open-source and commercial models for serverless inference, ADK, and agent usage.

| Model | Serverless Inference and ADK | Agent Usage |
|-------|------------------------------|-------------|
| Qwen3-32B | $0.25 per 1M input tokens; $0.55 per 1M output tokens | Not available |

Claude Sonnet 4.5 and Claude Sonnet 4 support an input context window of up to 1M tokens.

| Model | Serverless Inference and ADK | Agent Usage |
|-------|------------------------------|-------------|
| Claude Sonnet 4.5 | Prompts ≤ 200K tokens: $3.00 per 1M input tokens; $15.00 per 1M output tokens. Prompts > 200K tokens: $6.00 per 1M input tokens; $22.50 per 1M output tokens | Billed directly by Anthropic when using your own API key. |
| Claude Sonnet 4 | Prompts ≤ 200K tokens: $3.00 per 1M input tokens; $15.00 per 1M output tokens. Prompts > 200K tokens: $6.00 per 1M input tokens; $22.50 per 1M output tokens | Billed directly by Anthropic when using your own API key. |
| Claude 3.7 Sonnet | $3.00 per 1M input tokens; $15.00 per 1M output tokens | Billed directly by Anthropic when using your own API key. |
| Claude 3.5 Sonnet | $3.00 per 1M input tokens; $15.00 per 1M output tokens | Billed directly by Anthropic when using your own API key. |
| Claude 3.5 Haiku | $0.80 per 1M input tokens; $4.00 per 1M output tokens | Billed directly by Anthropic when using your own API key. |
| Claude Opus 4.5 | $5.00 per 1M input tokens; $25.00 per 1M output tokens | Billed directly by Anthropic when using your own API key. |
| Claude Opus 4.1 | $15.00 per 1M input tokens; $75.00 per 1M output tokens | Billed directly by Anthropic when using your own API key. |
| Claude Opus 4 | $15.00 per 1M input tokens; $75.00 per 1M output tokens | Billed directly by Anthropic when using your own API key. |
| Claude 3 Opus | $15.00 per 1M input tokens; $75.00 per 1M output tokens | Billed directly by Anthropic when using your own API key. |
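The tiered Claude Sonnet rates can be sketched as a small estimator. Assumptions for illustration: the tier is selected by the prompt (input) size, and the selected tier's rates apply to both the input and output tokens of that request.

```python
# Sketch: estimated cost of one Claude Sonnet 4.5 / Sonnet 4 request under
# the two-tier rates above. Assumption: tier is chosen by prompt size, and
# that tier's rates apply to the whole request.

def sonnet_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost in USD for the Sonnet tiered pricing."""
    if input_tokens <= 200_000:
        in_rate, out_rate = 3.00, 15.00   # USD per 1M tokens
    else:
        in_rate, out_rate = 6.00, 22.50   # long-context tier
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

print(round(sonnet_cost(150_000, 2_000), 3))  # 0.48
print(round(sonnet_cost(250_000, 2_000), 3))  # 1.545
```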
| Model | Serverless Inference and ADK | Agent Usage |
|-------|------------------------------|-------------|
| DeepSeek R1 Distill Llama 70B | $0.99 per 1M input tokens; $0.99 per 1M output tokens | Same as serverless inference. |
Note
These models are currently in public preview. Serverless usage is billed by DigitalOcean as shown.

| Model | Serverless Inference | Agent Usage |
|-------|----------------------|-------------|
| Fast SDXL | $0.0011 per compute second | Not available |
| Flux Schnell | $0.0030 per megapixel | Not available |
| Stable Audio 2.5 (Text-to-Audio) | $0.00058 per compute second | Not available |
| Multilingual TTS v2 | $0.10 per 1000 characters | Not available |
| Model | Serverless Inference and ADK | Agent Usage |
|-------|------------------------------|-------------|
| Llama 3.3 Instruct-70B | $0.65 per 1M input tokens; $0.65 per 1M output tokens | Same as serverless inference. |
| Llama 3.1 Instruct-8B | $0.198 per 1M input tokens; $0.198 per 1M output tokens | Same as serverless inference. |
| Model | Serverless Inference and ADK | Agent Usage |
|-------|------------------------------|-------------|
| NeMo | $0.30 per 1M input tokens; $0.30 per 1M output tokens | Same as serverless inference. |
| Model | Serverless Inference and ADK | Agent Usage |
|-------|------------------------------|-------------|
| gpt-oss-120b | $0.10 per 1M input tokens; $0.70 per 1M output tokens | Same as serverless inference. |
| gpt-oss-20b | $0.05 per 1M input tokens; $0.45 per 1M output tokens | Same as serverless inference. |
| GPT-5 | $1.25 per 1M input tokens; $10.00 per 1M output tokens | Billed directly by OpenAI when using your own API key. |
| GPT-5 mini | $0.25 per 1M input tokens; $2.00 per 1M output tokens | Billed directly by OpenAI when using your own API key. |
| GPT-5 nano | $0.05 per 1M input tokens; $0.40 per 1M output tokens | Billed directly by OpenAI when using your own API key. |
| GPT-4.1 | $2.00 per 1M input tokens; $8.00 per 1M output tokens | Billed directly by OpenAI when using your own API key. |
| GPT-4o | $2.50 per 1M input tokens; $10.00 per 1M output tokens | Billed directly by OpenAI when using your own API key. |
| GPT-4o mini | $0.15 per 1M input tokens; $0.60 per 1M output tokens | Billed directly by OpenAI when using your own API key. |
| o1 | $15.00 per 1M input tokens; $60.00 per 1M output tokens | Billed directly by OpenAI when using your own API key. |
| o3 | $2.00 per 1M input tokens; $8.00 per 1M output tokens | Billed directly by OpenAI when using your own API key. |
| o3-mini | $1.10 per 1M input tokens; $4.40 per 1M output tokens | Billed directly by OpenAI when using your own API key. |
| GPT-image-1 | $5.00 per 1M input tokens; $40.00 per 1M output tokens | Billed directly by OpenAI when using your own API key. |

Knowledge Bases

Knowledge bases are billed for both indexing and storage:

  • Indexing tokens: We charge for tokens required to generate embeddings. Pricing is the same for manual and auto-indexing. Charges apply only when changes are detected (new, updated, or deleted files/URLs). If auto-indexing is paused or no changes are found, there are no charges.

    For example, a 10 MB dataset is about 3 million tokens, and a 1 GB dataset is about 250 million tokens.

    Actual costs depend on the embedding model:

    | Model | Price |
    |-------|-------|
    | all-mini-lm-l6-v2 | $0.009 per 1M input tokens |
    | multi-qa-mpnet-base-dot-v1 | $0.009 per 1M input tokens |
    | gte-large-en-v1.5 | $0.09 per 1M input tokens |

    One token is roughly four characters (approximately 75 words per 100 tokens). Non-Latin scripts, emojis, or binary data may increase token counts.

  • Storage: Embeddings are stored in OpenSearch. See OpenSearch pricing.

  • Chunking: The chunking method you choose affects indexing cost because each algorithm embeds a different number of tokens. All indexing and re-indexing jobs are billed based on the total tokens embedded.

    • Section-based and fixed-length chunking are the most cost-efficient. They rely on simple splitting and do not perform semantic analysis, resulting in minimal and predictable token usage.
    • Semantic chunking is more expensive because it uses the embedding model twice: once to detect semantic boundaries and once to embed the final chunks. This typically results in 1.5 to 3 times more indexing tokens, more total chunks, and a higher re-indexing cost when settings change.
    • Hierarchical chunking produces both parent and child embeddings, slightly increasing indexing cost. The main cost impact is during retrieval: agents receive both the child chunk and its parent chunk, increasing the number of tokens sent to the model for each lookup.

    Any change to chunking settings requires re-indexing the affected data source, which always consumes additional tokens. Chunking does not incur a separate charge; costs depend on embedding token usage and OpenSearch storage, and vary by embedding model. For detailed behavior and tuning guidance, see the chunking reference page and chunking best practices.
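The indexing guidance above can be turned into a rough estimator. Assumptions for illustration: the token count is derived from the approximate conversion given earlier (10 MB ≈ 3 million tokens), and the chunking multiplier models the 1.5x to 3x overhead of semantic chunking; real token counts vary with content and chunking method.

```python
# Sketch: rough knowledge-base indexing cost from dataset size.
# Assumptions: ~300K tokens per MB (from the 10 MB ≈ 3M tokens rule of
# thumb above); chunking_multiplier approximates semantic-chunking overhead.

TOKENS_PER_MB = 300_000

EMBEDDING_PRICES = {  # USD per 1M input tokens, from the table above
    "all-mini-lm-l6-v2": 0.009,
    "multi-qa-mpnet-base-dot-v1": 0.009,
    "gte-large-en-v1.5": 0.09,
}

def indexing_cost(dataset_mb: float, model: str,
                  chunking_multiplier: float = 1.0) -> float:
    """Estimate indexing cost; use chunking_multiplier ≈ 1.5-3 for semantic chunking."""
    tokens = dataset_mb * TOKENS_PER_MB * chunking_multiplier
    return tokens * EMBEDDING_PRICES[model] / 1_000_000

# 10 MB dataset (~3M tokens) with the cheapest embedding model
print(round(indexing_cost(10, "all-mini-lm-l6-v2"), 3))  # 0.027
# Same dataset with semantic chunking at the high end of the overhead range
print(round(indexing_cost(10, "all-mini-lm-l6-v2", chunking_multiplier=3), 3))  # 0.081
```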

Guardrails

Charges apply for all tokens processed through guardrails:

| Guardrail | Price |
|-----------|-------|
| Content Moderation | $0.20 per 1M tokens |
| Jailbreak Detection | $0.20 per 1M tokens |
| Sensitive Data Detection | $0.34 per 1M tokens |

Creating, editing, or duplicating guardrails incurs no additional cost.
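Because guardrail charges stack on top of model usage, a per-request estimate combines both rates. Assumption for illustration: the guardrail processes the same tokens the model sees, which may differ from actual guardrail token counts.

```python
# Sketch: estimated per-request cost for an agent on an open-source model
# with Content Moderation enabled. Rates are taken from the tables above.
# Assumption: the guardrail processes the same tokens as the model.

MODEL_IN = MODEL_OUT = 0.65   # Llama 3.3 Instruct-70B, USD per 1M tokens
MODERATION = 0.20             # Content Moderation, USD per 1M tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Model cost plus guardrail cost for one request, in USD."""
    model = (input_tokens * MODEL_IN + output_tokens * MODEL_OUT) / 1_000_000
    guard = (input_tokens + output_tokens) * MODERATION / 1_000_000
    return model + guard

# 4,000 input tokens and 1,000 output tokens
print(round(request_cost(4_000, 1_000), 5))  # 0.00425
```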

Functions

If you attach DigitalOcean Functions to your agent, you are billed at functions pricing.

Agent Evaluations

Agent evaluations are charged by token usage at the same rates as model usage.

Log Stream Insights

Log Stream Insights uses a third-party model to analyze agent trace data. You are charged per token:

| Tokens | Price |
|--------|-------|
| Input | $1.10 per 1M tokens |
| Output | $4.40 per 1M tokens |

Agent Development Kit (Public Preview)

You are not charged for using the Agent Development Kit during public preview. However, you are billed for other Gradient AI Platform features you use with your agent deployment:

  • If you are using a DigitalOcean hosted model through serverless inference in your agent deployment, you are charged for those model keys.

  • For agent evaluations, token usage is charged to the agent model keys. For example, if your agent uses a serverless inference endpoint key, any token usage is charged to that key. If the agent uses a third-party model key, or a key to a model not hosted on DigitalOcean, you are charged by the hosting provider.

  • If you enable Log Stream Insights for your agent deployment, you are charged for Log Stream Insights tokens when new insights are generated.

Note
At General Availability, agent deployment hosting will be charged, measured in GiB-seconds. We will also charge for judge input and output tokens, which are the tokens used to judge the agent's inputs and outputs against the test case's chosen metrics. These costs are waived during public preview.
