DigitalOcean Knowledge Bases Features
Validated on 23 Apr 2026 • Last edited on 27 Apr 2026
DigitalOcean Knowledge Bases let you store, index, and retrieve data from private files, websites, Spaces buckets, and other sources to power retrieval-augmented generation with your own content.
A knowledge base is a private repository of unstructured content, such as files, folders, and URLs, that improves agent responses using retrieval-augmented generation (RAG). Knowledge bases store source data in DigitalOcean Spaces object storage and store indexes in a DigitalOcean OpenSearch cluster.
You can add data sources to knowledge bases from Spaces buckets, local files, seed or site map URLs, Dropbox folders, and Amazon S3 buckets.
Embeddings Models
Embeddings models convert unstructured data into vector embeddings so AI agents can find content that matches a user’s input.
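As a rough illustration of how vector embeddings make matching possible, the sketch below compares a query vector against a few chunk vectors using cosine similarity. The vectors, chunk names, and the `nearest_chunks` helper are made up for this example; they are not output from any DigitalOcean embeddings model, and the platform's actual similarity metric is not stated here.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)

def nearest_chunks(query_vec, chunk_vecs, top_k=2):
    """Return (chunk_id, score) pairs sorted by similarity, best first."""
    scored = [(cid, cosine_similarity(query_vec, vec))
              for cid, vec in chunk_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy 3-dimensional vectors (assumed values); real embeddings have far
# more dimensions.
chunk_vecs = {
    "billing-faq": [0.9, 0.1, 0.0],
    "api-auth": [0.1, 0.9, 0.2],
    "release-notes": [0.0, 0.2, 0.9],
}
query_vec = [0.8, 0.2, 0.1]

top = nearest_chunks(query_vec, chunk_vecs)
print(top)
```

Chunks whose vectors point in nearly the same direction as the query vector score close to 1 and are returned first.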
Activity Logs
Activity logs give you visibility into indexing jobs for each knowledge base. You can view recent activity and download CSVs for debugging.
Retrieve
The retrieve feature lets you query a knowledge base for relevant chunks, apply metadata filters, and review the results for use in your applications and agent workflows. Each user query is vectorized using the knowledge base’s selected embeddings model.
Using the Control Panel or API, you can run semantic, keyword, or hybrid searches, review the scored chunks, and generate live Gradient SDK (Python) and cURL examples from the current query.
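To show how the three search modes relate, here is a minimal sketch that blends a semantic score with a keyword score using a weighted sum. This is one common way to build a hybrid score, not necessarily how Knowledge Bases computes it. The `alpha` weight, the `keyword_score` helper, and the sample chunks and scores are all assumptions for illustration.

```python
def keyword_score(query, text):
    """Fraction of query terms found in the text (a crude keyword match)."""
    terms = query.lower().split()
    words = set(text.lower().split())
    if not terms:
        return 0.0
    return sum(1 for term in terms if term in words) / len(terms)

def hybrid_score(semantic, keyword, alpha=0.5):
    """Blend two scores in [0, 1].

    alpha=1.0 is purely semantic, alpha=0.0 is purely keyword.
    """
    return alpha * semantic + (1 - alpha) * keyword

# Assumed semantic scores, as if returned by a vector search.
chunks = [
    {"id": "c1", "text": "Rotate API tokens every 90 days", "semantic": 0.62},
    {"id": "c2", "text": "Credentials should be refreshed regularly", "semantic": 0.81},
]
query = "rotate api tokens"

ranked = sorted(
    chunks,
    key=lambda c: hybrid_score(c["semantic"], keyword_score(query, c["text"])),
    reverse=True,
)
print([c["id"] for c in ranked])
```

In this toy data, the chunk with the exact query terms overtakes the chunk with the higher semantic score once keyword matching is given weight.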
You can optionally enable reranking to re-score and reorder retrieved chunks so the most relevant results appear first.
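Conceptually, reranking is a second pass over chunks that were already retrieved. The sketch below assumes a pluggable scoring function; `overlap_scorer` is a hypothetical stand-in for a real reranking model, and the chunk fields are invented for the example.

```python
def rerank(query, chunks, score_fn, top_n=None):
    """Re-score each chunk against the query and return them best first.

    The original retrieval score is kept; a new 'rerank_score' is added.
    """
    rescored = [
        dict(chunk, rerank_score=score_fn(query, chunk["text"]))
        for chunk in chunks
    ]
    rescored.sort(key=lambda c: c["rerank_score"], reverse=True)
    return rescored if top_n is None else rescored[:top_n]

def overlap_scorer(query, text):
    """Hypothetical stand-in for a reranking model: Jaccard word overlap."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    union = q | t
    return len(q & t) / len(union) if union else 0.0

# Assumed first-pass results, ordered by their retrieval score.
retrieved = [
    {"id": "a", "text": "Spaces buckets store objects", "score": 0.90},
    {"id": "b", "text": "how to index a Spaces bucket", "score": 0.85},
]

reranked = rerank("index a Spaces bucket", retrieved, overlap_scorer)
print([c["id"] for c in reranked])
```

The second pass can promote a chunk that ranked lower in the first pass, which is the effect reranking is meant to have.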
Knowledge base retrieval is also available through an MCP server for querying, filtering, and retrieving chunks. For setup information, see Knowledge Bases MCP Tools.
RAG Playground
RAG Playground lets you test how a selected serverless inference model answers a query using content retrieved from a knowledge base. You can enter a query, choose a model, and adjust settings such as system instructions, max tokens, and temperature.
RAG Playground shows the generated answer alongside retrieved chunks, including source details, page numbers, relevance scores, and which chunks were used in the response.
Auto-Indexing
Auto-indexing keeps data sources up to date by re-indexing changes on a recurring schedule.
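How changes are detected is not described here. As one possible approach, a scheduled job could compare content fingerprints between runs and re-index only what differs. The sketch below assumes that approach; `plan_reindex` and the file names are hypothetical.

```python
import hashlib

def fingerprint(content: bytes) -> str:
    """SHA-256 hex digest used as a content fingerprint."""
    return hashlib.sha256(content).hexdigest()

def plan_reindex(previous_hashes, current_files):
    """Decide what a scheduled run needs to do.

    previous_hashes: file name -> fingerprint saved by the last run
    current_files:   file name -> current content as bytes
    Returns (changed_or_new, removed, new_hashes).
    """
    new_hashes = {name: fingerprint(data) for name, data in current_files.items()}
    changed = sorted(
        name for name, digest in new_hashes.items()
        if previous_hashes.get(name) != digest
    )
    removed = sorted(name for name in previous_hashes if name not in new_hashes)
    return changed, removed, new_hashes

# State saved by an earlier (assumed) run.
previous = {
    "guide.md": fingerprint(b"version 1"),
    "faq.md": fingerprint(b"unchanged"),
    "old.md": fingerprint(b"retired page"),
}
# What the data source contains now.
current = {
    "guide.md": b"version 2",
    "faq.md": b"unchanged",
    "new.md": b"brand new page",
}

changed, removed, new_hashes = plan_reindex(previous, current)
print(changed, removed)
```

Only the modified and newly added files are queued for indexing, and files that disappeared are flagged for removal from the index.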
Chunking
Chunking controls how documents are split before indexing. You configure chunking per data source, so you can use different strategies in the same knowledge base.
Data Services supports section-based, semantic, hierarchical, and fixed-length chunking for different document types, retrieval patterns, and cost needs.
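Of these strategies, fixed-length chunking is the simplest to picture. The sketch below splits text into character windows with a small overlap between neighbors. The `chunk_size` and `overlap` names are assumptions for this example and are not the platform's parameter names; a production chunker would more likely count tokens than characters.

```python
def fixed_length_chunks(text, chunk_size=200, overlap=20):
    """Split text into windows of chunk_size characters.

    Consecutive windows share `overlap` characters so that content cut at
    a boundary still appears whole in one of the two neighbors.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be at least 0 and less than chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks

sample = "abcdefghij"
pieces = fixed_length_chunks(sample, chunk_size=4, overlap=1)
print(pieces)
```

Larger chunks carry more context per retrieved result, while smaller chunks make matches more precise; overlap trades some extra storage for fewer ideas split across a boundary.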
For details and recommendations, see our chunking best practices and chunking parameters reference.