GenAI Platform Pricing

Validated on 9 May 2025 • Last edited on 28 May 2025

The DigitalOcean GenAI Platform lets you work with popular foundation models and build GPU-powered AI agents with fully managed deployment, or send direct requests using serverless inference. Create agents that incorporate guardrails, functions, agent routing, and retrieval-augmented generation (RAG) pipelines with knowledge bases.

The GenAI Platform follows a usage-based pricing model, charging based on the models you select, their frequency of use, and additional features like knowledge bases and guardrails. This pricing model ensures that costs adjust according to your usage, helping you manage spending as your needs evolve.

Note
We display pricing per million tokens but bill usage per thousand tokens to provide more accurate charges and avoid overcharging.

An agent’s token usage depends on several factors, including the length of the user’s input, complexity of agent instructions, knowledge base data, and other configurations. To optimize your token usage, test your agents and adjust their settings accordingly.
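As a rough illustration of the billing note above, the helper below converts a displayed per-million-token rate into the per-thousand-token rate actually metered. The function names are illustrative, not part of any DigitalOcean SDK, and how fractional 1,000-token blocks are rounded is not specified on this page:

```python
def charge_for_usage(tokens_used: int, rate_per_million: float) -> float:
    """Estimate the charge when rates are displayed per 1,000,000 tokens
    but usage is billed per 1,000 tokens."""
    rate_per_thousand = rate_per_million / 1_000  # e.g. $0.99/1M -> $0.00099/1K
    billable_blocks = tokens_used / 1_000         # usage metered in 1K-token units
    return billable_blocks * rate_per_thousand

# Example: 450,000 input tokens at a displayed rate of $0.99 per 1M tokens
print(round(charge_for_usage(450_000, 0.99), 4))  # 0.4455
```

Billing per thousand tokens means a short conversation is charged for close to its exact usage rather than being rounded up to a full million-token block.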

Open Source Models (Agent Usage)

We charge for all input and output tokens used by an agent. You can use the Model Playground for free, but daily token limits apply.

Model Price
DeepSeek-R1-distill-llama-70B $0.99 per 1,000,000 input tokens; $0.99 per 1,000,000 output tokens
Llama 3.1 8B $0.198 per 1,000,000 input tokens; $0.198 per 1,000,000 output tokens
Llama 3.1 70B $0.70 per 1,000,000 input tokens; $0.70 per 1,000,000 output tokens
Llama 3.3 70B $0.65 per 1,000,000 input tokens; $0.65 per 1,000,000 output tokens
Mistral NeMo $0.30 per 1,000,000 input tokens; $0.30 per 1,000,000 output tokens

We charge for agent token usage, not for agent creation.
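The rates above can be captured in a small lookup table for estimating agent costs. This is an illustrative sketch, not a DigitalOcean API; it assumes, per the table, that input and output tokens are priced equally for these models:

```python
# Per-1M-token rates from the table above (input and output priced equally).
OPEN_SOURCE_RATES = {
    "DeepSeek-R1-distill-llama-70B": 0.99,
    "Llama 3.1 8B": 0.198,
    "Llama 3.1 70B": 0.70,
    "Llama 3.3 70B": 0.65,
    "Mistral NeMo": 0.30,
}

def agent_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost: input and output tokens both billed at the model's rate."""
    rate = OPEN_SOURCE_RATES[model]
    return (input_tokens + output_tokens) / 1_000_000 * rate

# Example: 2M input tokens + 1M output tokens on Llama 3.3 70B
print(round(agent_cost("Llama 3.3 70B", 2_000_000, 1_000_000), 2))  # 1.95
```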

Serverless Inference (Direct API Usage)

Serverless inference lets you send API requests directly to supported models without creating an agent.

  • For open source models, serverless inference uses the same per-token pricing as agent usage.
  • For commercial models (such as OpenAI and Anthropic), pricing is determined by the model provider. You use your provider’s API tokens, and your provider bills you directly.

All usage is billed per input and output token.
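For a sense of what a direct request looks like, the sketch below builds a chat-completion request body. The OpenAI-compatible schema, the model identifier, and the function name are all assumptions for illustration; check the platform's API reference for the actual endpoint and request format:

```python
import json

# Hypothetical request builder. Assumes an OpenAI-compatible chat-completions
# schema; the schema and model name are assumptions, not confirmed by this page.
def build_inference_request(model: str, prompt: str, max_tokens: int = 256) -> str:
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,  # caps output tokens, and therefore output cost
    }
    return json.dumps(body)

payload = build_inference_request("llama3.3-70b-instruct", "Summarize RAG in one line.")
print(json.loads(payload)["model"])  # llama3.3-70b-instruct
```

Setting `max_tokens` is one practical lever for bounding the output-token side of a per-token bill.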

Commercial Models

Pricing for input and output tokens for commercial models follows the provider’s standard rates. Because you use your own API tokens, the provider bills your account with them directly.


Knowledge Bases

Creating a knowledge base involves two actions:

  1. Transforming the provided data into vector embeddings (indexing). You are charged based on the number of tokens indexed into the knowledge base. For example, at a rate of $0.009 per one million tokens, a 10 MB dataset becomes about 2.5 million tokens and costs around $0.0225, and a 1 GB dataset becomes about 250 million tokens and costs about $2.25. Actual pricing depends on the model you choose, since each model may have a different token cost.

    Each token represents roughly four characters. At scale, 100 tokens is about 75 words. Using non-Latin characters, emojis, or binary data may increase the token count.

  2. Storing the resulting vector embeddings. Storage is billed at OpenSearch pricing.

Here are the indexing token rates for embedding models by size:

Model Price
all-mini-lm-l6-v2 $0.009 per 1,000,000 input tokens
multi-qa-mpnet-base-dot-v1 $0.009 per 1,000,000 input tokens
gte-large-en-v1.5 $0.09 per 1,000,000 input tokens
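To sanity-check the indexing examples above, a rough estimator using the stated ~4 characters per token (names are illustrative, and real tokenization varies with the data, so treat the output as a ballpark):

```python
def estimate_indexing_cost(dataset_bytes: int, rate_per_million: float) -> tuple[float, float]:
    """Rough indexing estimate: ~4 characters (bytes, for Latin text) per token.
    Non-Latin characters, emojis, or binary data may increase the token count."""
    tokens = dataset_bytes / 4
    cost = tokens / 1_000_000 * rate_per_million
    return tokens, cost

# 10 MB dataset indexed at $0.009 per 1M tokens (all-mini-lm-l6-v2's rate)
tokens, cost = estimate_indexing_cost(10_000_000, 0.009)
print(f"{tokens:,.0f} tokens, ${cost:.4f}")  # 2,500,000 tokens, $0.0225
```

The same function reproduces the 1 GB example: about 250 million tokens for roughly $2.25 at the $0.009 rate.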

Guardrails

We charge for all input and output tokens to the model, at a rate that depends on the guardrail:

Guardrail Price
Content Moderation $3.00 per 1,000,000 tokens
Jailbreak Detection $3.00 per 1,000,000 tokens
Sensitive Data Detection $0.34 per 1,000,000 tokens

We charge per token, not for guardrail creation. Editing or duplicating a guardrail does not change its price.
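If an agent uses several guardrails, their per-token charges add up. The sketch below assumes each attached guardrail processes the same input and output tokens, which is my reading of the table above rather than a documented guarantee; names are illustrative:

```python
# Per-1M-token guardrail rates from the table above (labels are my own).
GUARDRAIL_RATES = {
    "content_moderation": 3.00,
    "jailbreak_detection": 3.00,
    "sensitive_data_detection": 0.34,
}

def guardrail_cost(tokens: int, guardrails: list[str]) -> float:
    """Estimated USD cost, assuming each guardrail processes all of the tokens."""
    return sum(tokens / 1_000_000 * GUARDRAIL_RATES[g] for g in guardrails)

# 500,000 tokens through content moderation plus sensitive data detection
print(round(guardrail_cost(500_000, ["content_moderation", "sensitive_data_detection"]), 2))  # 1.67
```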

Functions

If you add DigitalOcean Functions to your agent, you are charged based on Functions pricing.
