DigitalOcean Knowledge Bases Pricing
Validated on 23 Apr 2026 • Last edited on 27 Apr 2026
DigitalOcean Knowledge Bases let you store, index, and retrieve data from private files, websites, Spaces buckets, and other sources to power retrieval-augmented generation with your own content.
Knowledge base pricing is shown per million tokens, but billing is calculated per thousand tokens.
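As a quick illustration of that granularity, the sketch below converts a listed per-million-token price into the per-thousand-token rate that billing is calculated against. The model choice and token count are illustrative, and any rounding behavior is an assumption rather than documented billing logic.

```python
# Sketch: convert a listed per-1M-token price into the per-1K-token rate
# that billing is calculated against. Numbers are illustrative only.

PRICE_PER_MILLION = 0.09                         # e.g. gte-large-en-v1.5, $0.09 per 1M input tokens
price_per_thousand = PRICE_PER_MILLION / 1_000   # $0.00009 per 1K tokens

tokens_used = 250_000                            # hypothetical indexing job
cost = (tokens_used / 1_000) * price_per_thousand
print(f"{tokens_used:,} tokens -> ${cost:.5f}")  # 250,000 tokens -> $0.02250
```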
You’re billed for indexing, reranking (if enabled), and storage:
- Tokens used for indexing and retrieval query vectorization: We charge for tokens used to generate embeddings during indexing and to vectorize user queries during retrieval. Both use the same embeddings model pricing.
Indexing pricing is the same for manual and auto-indexing. Indexing charges apply only when changes are detected, such as new, updated, or deleted files or URLs. If auto-indexing is paused or no changes are found, there are no indexing charges.
Note Retrieval requests sent through an MCP server are billed the same as retrieval requests sent directly to the knowledge base retrieve endpoint. This includes the tokens used to vectorize the retrieval query with the selected embeddings model.
For example, a 10 MB dataset is about 3 million tokens, and a 1 GB dataset is about 250 million tokens; for a worked cost estimate, see the sketch after this list.
Actual costs depend on the embeddings model:
| Model | Price |
| --- | --- |
| all-mini-lm-l6-v2 | $0.009 per 1M input tokens |
| multi-qa-mpnet-base-dot-v1 | $0.009 per 1M input tokens |
| gte-large-en-v1.5 | $0.09 per 1M input tokens |
| Qwen3 Embedding 0.6B | $0.04 per 1M tokens |
| BGE-M3 | $0.02 per 1M tokens |
| E5 Large V2 | $0.02 per 1M tokens |

Note One token is roughly four characters (approximately 75 words per 100 tokens). Non-Latin scripts, emojis, or binary data may increase token counts.
- Reranking tokens: If reranking is enabled, tokens used to rerank results are billed based on the selected reranking model. For supported reranking models, see available reranking models.
| Model | Price |
| --- | --- |
| BGE Reranker v2 m3 | $0.01 per 1M reranking tokens |
- Storage: Embeddings are stored in OpenSearch. See OpenSearch pricing.
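To make the arithmetic above concrete, here is a minimal Python sketch that estimates indexing and reranking cost from the numbers quoted above (roughly 3 million tokens per 10 MB of text, and the per-million-token prices in the tables). The tokens-per-MB ratio, the chosen models, and the reranking token count are illustrative assumptions; actual token counts depend on your content.

```python
# Sketch: rough cost estimate for indexing a dataset and reranking results.
# Ratios and token counts are assumptions for illustration only.

EMBEDDING_PRICES = {            # USD per 1M input tokens (from the table above)
    "all-mini-lm-l6-v2": 0.009,
    "gte-large-en-v1.5": 0.09,
    "BGE-M3": 0.02,
}
RERANK_PRICE = 0.01             # USD per 1M reranking tokens (BGE Reranker v2 m3)

TOKENS_PER_MB = 300_000         # assumption: ~3M tokens per 10 MB of text

def indexing_cost(dataset_mb: float, model: str) -> float:
    """Estimate one-time indexing cost for a dataset of the given size."""
    tokens = dataset_mb * TOKENS_PER_MB
    return tokens / 1_000_000 * EMBEDDING_PRICES[model]

def rerank_cost(rerank_tokens: int) -> float:
    """Estimate the cost of reranking tokens across retrieval requests."""
    return rerank_tokens / 1_000_000 * RERANK_PRICE

if __name__ == "__main__":
    # Example: index a ~1,000 MB dataset with gte-large-en-v1.5.
    print(f"Indexing:  ${indexing_cost(1_000, 'gte-large-en-v1.5'):.2f}")   # ~$27.00
    # Example: rerank 5M tokens of retrieved candidates in a month.
    print(f"Reranking: ${rerank_cost(5_000_000):.2f}")                      # $0.05
```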
Chunking has no separate charge; its cost is reflected in embedding token usage, the OpenSearch database, and the selected embeddings model.
The cost of a chunking strategy depends on how many tokens it embeds and returns:
- Section-based and fixed length chunking are the most cost-efficient because they use simple splitting and have predictable token usage.
- Semantic chunking costs more because it uses the embeddings model to detect semantic boundaries and embed final chunks, often resulting in 1.5 to 3 times more indexing tokens (see the sketch after this list).
- Hierarchical chunking slightly increases indexing cost by creating parent and child embeddings. It can also increase retrieval cost because agents receive both child and parent chunks for each lookup.
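As a rough illustration of how these differences affect indexing spend, the sketch below applies per-strategy token multipliers to a baseline token count. The multiplier values and the baseline are assumptions drawn from the ranges described above, not measured figures.

```python
# Sketch: compare estimated indexing cost across chunking strategies.
# Multipliers are assumptions based on the ranges described above.

BASELINE_TOKENS = 3_000_000       # e.g. a ~10 MB dataset
PRICE_PER_MILLION = 0.02          # e.g. BGE-M3, $0.02 per 1M tokens

STRATEGY_MULTIPLIER = {
    "fixed-length": 1.0,          # simple splitting, predictable token usage
    "section-based": 1.0,
    "semantic": 2.0,              # often 1.5x to 3x more indexing tokens
    "hierarchical": 1.2,          # extra parent embeddings on top of child chunks
}

for strategy, mult in STRATEGY_MULTIPLIER.items():
    tokens = BASELINE_TOKENS * mult
    cost = tokens / 1_000_000 * PRICE_PER_MILLION
    print(f"{strategy:13s} ~{tokens / 1e6:.1f}M tokens -> ${cost:.3f}")
```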
Changing your chunking strategy or configuration requires re-indexing the affected data source, which consumes additional tokens. For guidance on chunking configurations and best practices, see our chunking parameters reference and chunking best practices.
If you use RAG Playground, answer generation is billed separately based on the selected serverless inference model. Free tokens for RAG Playground are not separate; they are shared with Model Playground.
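To see how the pieces combine for a single RAG Playground request, here is a minimal sketch that sums query vectorization, reranking, and answer generation costs. The per-request token counts and the generation price are hypothetical placeholders; actual generation pricing depends on the serverless inference model you select.

```python
# Sketch: estimated per-request cost of a RAG Playground query, combining
# retrieval and answer generation. Token counts and the generation price
# are hypothetical; check pricing for your selected serverless inference model.

EMBED_PRICE = 0.02 / 1_000_000       # e.g. BGE-M3, per embedding token
RERANK_PRICE = 0.01 / 1_000_000      # BGE Reranker v2 m3, per reranking token
GEN_PRICE = 0.60 / 1_000_000         # placeholder generation price per token

query_tokens = 50                    # tokens to vectorize the retrieval query (assumption)
rerank_tokens = 4_000                # tokens sent to the reranker (assumption)
generation_tokens = 800              # input + output tokens for the answer (assumption)

cost = (query_tokens * EMBED_PRICE
        + rerank_tokens * RERANK_PRICE
        + generation_tokens * GEN_PRICE)
print(f"Estimated cost per request: ${cost:.6f}")
```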