GenAI Platform Pricing
Validated on 9 May 2025 • Last edited on 28 May 2025
The DigitalOcean GenAI Platform lets you work with popular foundation models and build GPU-powered AI agents with fully managed deployment, or send direct requests using serverless inference. Create agents that incorporate guardrails, functions, agent routing, and retrieval-augmented generation (RAG) pipelines with knowledge bases.
The GenAI Platform follows a usage-based pricing model, charging based on the models you select, their frequency of use, and additional features like knowledge bases and guardrails. This pricing model ensures that costs adjust according to your usage, helping you manage spending as your needs evolve.
An agent’s token usage depends on several factors, including the length of the user’s input, complexity of agent instructions, knowledge base data, and other configurations. To optimize your token usage, test your agents and adjust their settings accordingly.
Open Source Models (Agent Usage)
We charge for all input and output tokens used by an agent. You can use the Model Playground for free, but daily token limits apply.
Model | Price per 1,000,000 Input Tokens | Price per 1,000,000 Output Tokens |
---|---|---|
DeepSeek-R1-distill-llama-70B | $0.99 | $0.99 |
Llama 3.1 8B | $0.198 | $0.198 |
Llama 3.1 70B | $0.70 | $0.70 |
Llama 3.3 70B | $0.65 | $0.65 |
Mistral NeMo | $0.30 | $0.30 |
We charge for agent token usage, not for agent creation.
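As an illustration of the per-token billing described above, the following sketch estimates agent cost from token counts. The rates are copied from the table in this section; the token counts in the example are hypothetical.

```python
# Estimate agent usage cost from token counts and per-million-token rates.
# Rates (USD per 1,000,000 tokens) are taken from the pricing table above.

RATES = {
    # model: (input rate, output rate) per 1,000,000 tokens
    "DeepSeek-R1-distill-llama-70B": (0.99, 0.99),
    "Llama 3.1 8B": (0.198, 0.198),
    "Llama 3.3 70B": (0.65, 0.65),
}

def agent_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for the given agent token usage."""
    in_rate, out_rate = RATES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# Hypothetical month: 2M input + 1M output tokens on Llama 3.3 70B.
print(round(agent_cost("Llama 3.3 70B", 2_000_000, 1_000_000), 2))  # 1.95
```

Because input and output tokens are billed separately, agents that produce long responses can cost noticeably more than the size of the user's prompt alone suggests.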
Serverless Inference (Direct API Usage)
Serverless inference lets you send API requests directly to supported models without creating an agent.
- For open source models, serverless inference uses the same per-token pricing as agent usage.
- For commercial models (such as OpenAI and Anthropic), pricing is determined by the model provider. You use your provider’s API tokens, and your provider bills you directly.
All usage is billed per input and output token.
Commercial Models
Pricing for input and output tokens for commercial models follows each provider's standard rates. Because you use your own API tokens, the model provider bills your account with them directly.
Knowledge Bases
Creating a knowledge base involves two billed actions:

- Transforming the provided data into vector embeddings (indexing). You are charged based on the number of tokens indexed into the knowledge base. For example, a 10 MB dataset becomes about 2.5 million tokens and costs around $0.0225 if the model charges $0.009 per one million tokens. A 1 GB dataset becomes about 250 million tokens and would cost about $2.25 at the same rate. Actual pricing depends on the model you choose, since each model may have a different token cost.

  Each token represents roughly four characters. At scale, 100 tokens is about 75 words. Using non-Latin characters, emojis, or binary data may increase the token count.

- Storing the vector embeddings, which is billed at OpenSearch pricing.
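The indexing estimates above can be reproduced with the rule of thumb that one token is roughly four characters of Latin text. This is a rough sketch; actual token counts vary with content, and the $0.009 rate is just the example rate from this section.

```python
# Rough knowledge base indexing cost: tokens ~= bytes / 4 (Latin text),
# cost = tokens * rate per 1,000,000 tokens. The default rate is the
# $0.009/1M example from this section; the model you choose may differ.

def indexing_cost(dataset_bytes: int, rate_per_million: float = 0.009) -> float:
    """Estimate the one-time indexing cost in USD for a dataset."""
    tokens = dataset_bytes / 4  # ~4 characters per token
    return tokens * rate_per_million / 1_000_000

print(round(indexing_cost(10_000_000), 4))     # 10 MB -> 0.0225
print(round(indexing_cost(1_000_000_000), 2))  # 1 GB  -> 2.25
```

Note that non-Latin text, emojis, or binary data tokenize less efficiently, so treat these figures as lower-bound estimates.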
Here are the indexing token rates for embedding models by size:
Model | Price |
---|---|
all-mini-lm-l6-v2 | $0.009 per 1,000,000 input tokens |
multi-qa-mpnet-base-dot-v1 | $0.009 per 1,000,000 input tokens |
gte-large-en-v1.5 | $0.09 per 1,000,000 input tokens |
Guardrails
We charge for all input and output tokens that pass through a guardrail, at a rate that depends on the guardrail type:
Guardrail | Price |
---|---|
Content Moderation | $3.00 per 1,000,000 tokens |
Jailbreak Detection | $3.00 per 1,000,000 tokens |
Sensitive Data Detection | $0.34 per 1,000,000 tokens |
We charge for guardrail token usage, not for guardrail creation. Editing or duplicating a guardrail does not change its price.
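Because guardrails bill per token processed, their cost stacks on top of the model's own token charges. A short sketch using the rates from the table above (the token count is hypothetical):

```python
# Guardrail token rates (USD per 1,000,000 tokens), from the table above.
GUARDRAIL_RATES = {
    "Content Moderation": 3.00,
    "Jailbreak Detection": 3.00,
    "Sensitive Data Detection": 0.34,
}

def guardrail_cost(tokens: int, guardrails: list[str]) -> float:
    """Cost of running the given guardrails over `tokens` input+output tokens."""
    return sum(tokens * GUARDRAIL_RATES[g] / 1_000_000 for g in guardrails)

# Hypothetical 500k tokens through two guardrails:
print(round(guardrail_cost(500_000, ["Content Moderation",
                                     "Sensitive Data Detection"]), 2))  # 1.67
```

Enabling multiple guardrails multiplies the per-token surcharge, so it is worth enabling only the guardrails your agent actually needs.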
Functions
If you add DigitalOcean Functions to your agent, you are charged based on Functions pricing.