DigitalOcean Knowledge Bases Features

Validated on 23 Apr 2026 • Last edited on 27 Apr 2026

DigitalOcean Knowledge Bases let you store, index, and retrieve data from private files, websites, Spaces buckets, and other sources to power retrieval-augmented generation with your own content.

A knowledge base is a private repository of unstructured content, such as files, folders, and URLs, that improves agent responses using retrieval-augmented generation (RAG). Knowledge bases store source data in DigitalOcean Spaces object storage and store indexes in a DigitalOcean OpenSearch cluster.

You can add data sources to knowledge bases from Spaces buckets, local files, seed or site map URLs, Dropbox folders, and Amazon S3 buckets.

Embeddings Models

Embeddings models convert unstructured data into vector embeddings so that AI agents can find content that matches a user’s input.
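To make the idea of matching by vector concrete, here is a minimal sketch. It does not use a real embeddings model: `toy_embed` is a hypothetical stand-in that turns text into word counts, and `cosine_similarity` and `best_match` are illustrative helper names, not part of any DigitalOcean API. A production embeddings model produces dense vectors that capture meaning rather than exact words, but the comparison step is similar in spirit.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embeddings model: word counts as a sparse vector."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse vectors (0.0 when either is empty)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(query: str, chunks: list[str]) -> str:
    """Return the chunk whose vector is closest to the query vector."""
    query_vec = toy_embed(query)
    return max(chunks, key=lambda c: cosine_similarity(query_vec, toy_embed(c)))
```

For example, `best_match("how do I reset my password", [...])` picks the chunk that shares the most weighted terms with the query.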

Activity Logs

Activity logs give you visibility into indexing jobs for each knowledge base. You can view recent activity and download CSVs for debugging.

Retrieve

The retrieve feature lets you query a knowledge base for relevant chunks, apply metadata filters, and review the results for use in your applications and agent workflows. Each user query is vectorized using the knowledge base’s selected embeddings model.
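The filter-then-rank flow described above can be sketched in a few lines. This is an illustration only: the chunk layout (`text`, `score`, `metadata`), the filter format (exact key and value matches), and the function names `matches` and `retrieve` are assumptions made for the example, not the service's actual request or response schema.

```python
def matches(metadata: dict, filters: dict) -> bool:
    """True when every filter key is present in metadata with an equal value."""
    return all(key in metadata and metadata[key] == value for key, value in filters.items())


def retrieve(chunks: list[dict], filters: dict, top_k: int = 3) -> list[dict]:
    """Apply metadata filters, then return the top_k chunks by descending score."""
    kept = [chunk for chunk in chunks if matches(chunk["metadata"], filters)]
    return sorted(kept, key=lambda chunk: chunk["score"], reverse=True)[:top_k]


# Hypothetical scored chunks, as if a similarity search had already run.
scored_chunks = [
    {"text": "Rotate keys every 90 days.", "score": 0.82, "metadata": {"team": "security"}},
    {"text": "Invoices are sent monthly.", "score": 0.91, "metadata": {"team": "billing"}},
    {"text": "Use scoped access keys.", "score": 0.77, "metadata": {"team": "security"}},
]

security_hits = retrieve(scored_chunks, {"team": "security"}, top_k=2)
```

Here `security_hits` contains only the two security chunks, highest score first, even though the billing chunk scored higher overall.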

Using the Control Panel or API, you can run semantic, keyword, or hybrid searches, review scored chunks, and generate live Gradient SDK (Python) and cURL examples from the current query.
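A hybrid search has to merge a keyword ranking and a semantic ranking into one list. This page does not say how Knowledge Bases combines them, so the sketch below uses reciprocal rank fusion (RRF), a common general-purpose technique, purely as an illustration. The function name, the chunk IDs, and the constant `k = 60` (a conventional default for RRF) are assumptions.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of chunk IDs into one list using RRF.

    Each chunk earns 1 / (k + rank) from every ranking it appears in;
    chunks are returned by descending total, with ties broken by ID.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda chunk_id: (-scores[chunk_id], chunk_id))


keyword_ranking = ["a", "b", "c"]   # hypothetical keyword search order
semantic_ranking = ["b", "d", "a"]  # hypothetical semantic search order
fused = reciprocal_rank_fusion([keyword_ranking, semantic_ranking])
# fused == ["b", "a", "d", "c"]
```

Chunks that rank well in both lists ("b" and "a") rise above chunks that appear in only one, which is the behavior a hybrid search is meant to provide.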

You can optionally enable reranking to re-score and reorder retrieved chunks so the most relevant results appear first.
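Conceptually, reranking is a second scoring pass over chunks that have already been retrieved. The sketch below shows that shape under stated assumptions: `overlap_score` is a deliberately simple, hypothetical scorer (the fraction of query terms found in the chunk), whereas a real reranker would typically be a dedicated model. Neither function name comes from the DigitalOcean API.

```python
from typing import Callable, Optional


def overlap_score(query: str, chunk: str) -> float:
    """Hypothetical scorer: fraction of distinct query terms that appear in the chunk."""
    query_terms = set(query.lower().split())
    if not query_terms:
        return 0.0
    chunk_terms = set(chunk.lower().split())
    return len(query_terms & chunk_terms) / len(query_terms)


def rerank(
    query: str,
    chunks: list[str],
    score_fn: Callable[[str, str], float] = overlap_score,
    top_n: Optional[int] = None,
) -> list[str]:
    """Re-score each retrieved chunk and return them from most to least relevant."""
    rescored = [(score_fn(query, chunk), chunk) for chunk in chunks]
    # The sort is stable, so chunks with equal scores keep their retrieval order.
    rescored.sort(key=lambda pair: pair[0], reverse=True)
    ordered = [chunk for _, chunk in rescored]
    return ordered if top_n is None else ordered[:top_n]
```

Passing a different `score_fn` swaps the scoring strategy without changing the reorder step, which mirrors how reranking sits on top of retrieval as an optional stage.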

Knowledge base retrieval is also available through an MCP server for querying, filtering, and retrieving chunks. For setup information, see Knowledge Bases MCP Tools.

RAG Playground

RAG Playground lets you test how a selected serverless inference model answers a query using content retrieved from a knowledge base. You can enter a query, choose a model, and adjust settings such as system instructions, max tokens, and temperature.

RAG Playground shows the generated answer alongside retrieved chunks, including source details, page numbers, relevance scores, and which chunks were used in the response.
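To show how retrieved chunks and settings such as system instructions can come together before a model generates an answer, here is a minimal prompt-assembly sketch. The prompt layout, the chunk fields (`source`, `page`, `text`), and the function name `build_rag_prompt` are assumptions for illustration; the page does not document how RAG Playground formats its prompts.

```python
def build_rag_prompt(system_instructions: str, query: str, chunks: list[dict]) -> str:
    """Combine instructions, numbered context chunks, and the user query into one prompt."""
    context_lines = []
    for number, chunk in enumerate(chunks, start=1):
        # Number each chunk so an answer can cite which chunks it relied on.
        context_lines.append(
            f"[{number}] ({chunk['source']}, page {chunk['page']}) {chunk['text']}"
        )
    context = "\n".join(context_lines)
    return f"{system_instructions}\n\nContext:\n{context}\n\nQuestion: {query}"


# Hypothetical retrieved chunks with source details and page numbers.
example_prompt = build_rag_prompt(
    "Answer using only the context. Cite chunk numbers.",
    "How often should keys be rotated?",
    [
        {"source": "guide.pdf", "page": 3, "text": "Rotate keys every 90 days."},
        {"source": "faq.md", "page": 1, "text": "Keys can be scoped per bucket."},
    ],
)
```

Numbering the chunks in the prompt is one simple way to trace which chunks contributed to a response.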

Auto-Indexing

Auto-indexing keeps data sources up to date by re-indexing changes on a recurring schedule.
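One way a scheduled job can decide what needs re-indexing is to compare content fingerprints between runs. The sketch below illustrates that idea with SHA-256 hashes; it is an assumption about how change detection could work, not a description of how auto-indexing is implemented. The names `fingerprint` and `plan_reindex` are hypothetical.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return a SHA-256 hex digest that changes whenever the content changes."""
    return hashlib.sha256(content).hexdigest()


def plan_reindex(previous: dict[str, str], current_files: dict[str, bytes]) -> dict[str, list[str]]:
    """Compare stored fingerprints with current file contents.

    previous maps path -> fingerprint from the last run;
    current_files maps path -> raw bytes as they exist now.
    """
    current = {path: fingerprint(data) for path, data in current_files.items()}
    return {
        "added": sorted(p for p in current if p not in previous),
        "changed": sorted(p for p in current if p in previous and current[p] != previous[p]),
        "removed": sorted(p for p in previous if p not in current),
    }
```

Only the paths listed under `added` and `changed` would need new embeddings, and paths under `removed` would be dropped from the index, which keeps each scheduled run proportional to what actually changed.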

Chunking

Chunking controls how documents are split before indexing. You can configure chunking per data source and use different strategies within the same knowledge base.

Data Services supports section-based, semantic, hierarchical, and fixed-length chunking for different document types, retrieval patterns, and cost needs.
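Of the strategies listed, fixed-length chunking is the simplest to illustrate. The sketch below splits text into windows of a set size with a small overlap so that content near a boundary appears in both neighboring chunks. It counts characters for simplicity, whereas real chunkers usually count tokens; the function name and the parameters `chunk_size` and `overlap` are assumptions and may not match the names in the chunking parameters reference.

```python
def fixed_length_chunks(text: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into chunks of up to chunk_size characters, overlapping by overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be at least 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


pieces = fixed_length_chunks("abcdefghij", chunk_size=4, overlap=1)
# pieces == ["abcd", "defg", "ghij"]
```

Larger chunks keep more context together, while smaller chunks with overlap produce more, finer-grained vectors to index, which is the trade-off between retrieval patterns and cost that the strategies above address.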

For details and recommendations, see our chunking best practices and chunking parameters reference.
