How to Create, Edit, and Destroy Knowledge Bases
Validated on 28 Apr 2025 • Last edited on 22 May 2025
DigitalOcean GenAI Platform lets you build GPU-powered AI agents with fully-managed deployment. Agents can use pre-built or custom foundation models, incorporate function and agent routes, and implement RAG pipelines with knowledge bases.
A knowledge base stores private data sources such as unstructured files, Spaces folders, or web pages to supplement an agent’s training data and improve response accuracy. Using retrieval-augmented generation (RAG), agents can search and reference external data to deliver more accurate, up-to-date, and domain-specific answers.
When you create a knowledge base, we automatically index your data by transforming it into vector embeddings, numerical representations that capture the meaning of the text and help agents efficiently find relevant information. These embeddings are stored in a Managed OpenSearch database, which appears in your Databases list and is scalable anytime for better performance.
Create a Knowledge Base Using the Control Panel
To create a knowledge base from the DigitalOcean Control Panel, in the left-hand menu, click GenAI Platform, click the Knowledge Bases tab, then click Create Knowledge Base to open the creation page.
In the Configure your knowledge base section, either keep the autogenerated name or choose a unique name using 3 to 63 characters, including only letters, numbers, dashes, and periods.
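The naming rule above can be checked before you submit the form. The following is a minimal sketch, assuming the rule means 3 to 63 ASCII letters, digits, dashes, or periods. The function name `is_valid_kb_name` is hypothetical, and the uniqueness requirement is not checked because it depends on your account.

```python
import re

# Assumed reading of the rule: 3-63 characters drawn only from
# ASCII letters, digits, dashes, and periods.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9.-]{3,63}")


def is_valid_kb_name(name: str) -> bool:
    """Return True if `name` fits the stated length and character rules."""
    return _NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("docs-kb.v2", "kb", "my kb"):
        print(candidate, is_valid_kb_name(candidate))
```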
Select Your Data Sources
In the Select data sources to index section, click Select data sources to open the data source selection window, then click the dropdown menu to select a data source type: Spaces bucket or folder, File upload, or URL for web crawling.
You can add multiple types of data sources to a knowledge base and include as many as needed. To save processing time and cost, organize your files in dedicated Spaces buckets, specific folders, or local storage containing only relevant files.
Supported File Formats
We support a wide range of text-based file formats, including .csv, .eml, .epub, .xls, .xlsx, .html, .md, .odt, .pdf, .txt, .rst, .rtf, .tsv, .doc, .docx, .xml, .json, and .jsonl.
Presentation files (.ppt, .pptx) are partially supported. We extract text but do not process images or other visual content. Image files (such as .png, .jpeg, .tiff, and .bmp) are not currently supported.
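To pre-screen a folder before adding it as a data source, you could classify files by extension. This is a rough sketch based only on the formats listed above. The function name `support_level` is hypothetical, and treating every unlisted extension as unsupported is an assumption.

```python
from pathlib import Path

# Extension sets copied from the format lists above.
FULLY_SUPPORTED = {
    ".csv", ".eml", ".epub", ".xls", ".xlsx", ".html", ".md", ".odt", ".pdf",
    ".txt", ".rst", ".rtf", ".tsv", ".doc", ".docx", ".xml", ".json", ".jsonl",
}
TEXT_ONLY = {".ppt", ".pptx"}  # text is extracted, visual content is skipped


def support_level(filename: str) -> str:
    """Classify a file as 'supported', 'partial', or 'unsupported'."""
    suffix = Path(filename).suffix.lower()
    if suffix in FULLY_SUPPORTED:
        return "supported"
    if suffix in TEXT_ONLY:
        return "partial"
    # Assumption: anything not listed above (including images) is unsupported.
    return "unsupported"
```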
You can add any of the following data sources: a Spaces bucket or folder, uploaded files, or a URL for web crawling.
For smooth uploads, keep batches under 100 files, each no larger than 2 GB. For larger files or batches, use the DigitalOcean API.
After selecting your data source, click Add selected data source. If needed, you can add more files later.
View your selected data sources and check the Status of each:
- Ready, the data source is uploaded and ready for indexing.
- Error, the upload or processing failed. Remove the data source and try again. If it fails again, contact support.
- Uploading, the data source is still uploading and not ready for indexing.
To avoid delays, upload fewer than 100 files at a time, each under 2 GB. For larger uploads, use the DigitalOcean API. If uploads continue to stall, contact support.
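The upload limits above can be applied when preparing local files. The sketch below splits files into batches of fewer than 100 and sets aside any file at or above 2 GB. The function name `plan_batches` is hypothetical, and reading "2 GB" as 2 × 10^9 bytes is an assumption.

```python
from __future__ import annotations

from typing import Iterable

MAX_FILES_PER_BATCH = 99      # "fewer than 100 files at a time"
MAX_FILE_BYTES = 2 * 1000**3  # assumption: 2 GB means 2 * 10^9 bytes


def plan_batches(
    files: Iterable[tuple[str, int]],
) -> tuple[list[list[str]], list[str]]:
    """Split (name, size_in_bytes) pairs into upload batches.

    Returns (batches, oversized). Files at or above the size limit are
    listed in `oversized` (a conservative reading of "under 2 GB").
    """
    batches: list[list[str]] = []
    oversized: list[str] = []
    current: list[str] = []
    for name, size in files:
        if size >= MAX_FILE_BYTES:
            oversized.append(name)
            continue
        current.append(name)
        if len(current) == MAX_FILES_PER_BATCH:
            batches.append(current)
            current = []
    if current:
        batches.append(current)
    return batches, oversized
```

Files returned in `oversized` are candidates for the API route mentioned above.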
Knowledge bases require a new or existing OpenSearch database to store the vector embeddings created from your data. Below the list, Estimated Size shows the total size of all uploaded data. Use this value to estimate the final embedding size and allocate at least twice that amount to ensure your database is properly sized to store embeddings. This may affect costs based on OpenSearch pricing.
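The sizing guidance above is simple arithmetic: take the Estimated Size and allocate at least double. A minimal sketch follows, with a hypothetical function name; the factor of 2 comes from the text, but it is a lower bound and not a guarantee of the final embedding size.

```python
import math


def minimum_database_size(estimated_size_bytes: int, factor: float = 2.0) -> int:
    """Return the minimum storage, in bytes, suggested for the embeddings.

    `factor` defaults to 2.0, following the "at least twice" guidance.
    """
    if estimated_size_bytes < 0:
        raise ValueError("estimated_size_bytes must be non-negative")
    return math.ceil(estimated_size_bytes * factor)


if __name__ == "__main__":
    # Example: 5 GB of uploaded data (using 10^9 bytes per GB).
    print(minimum_database_size(5 * 1000**3))
```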
Choose Your OpenSearch Database
In the Where should your knowledge base live? section, under the OpenSearch database options sub-section, select either Use existing to connect to an existing OpenSearch database or Create new to provision a new one.
Choose Your Embedding Model
An embedding model converts your data into vector embeddings, which are stored in your OpenSearch database. In the How much will I pay? section, click the Embeddings model dropdown menu, then select a model. You can’t change the model after creating your knowledge base. We offer multiple embedding models for different use cases, and indexing costs depend on the selected model and the size of your data.
The pricing table estimates token counts and indexing costs based on your dataset size and the model’s token rate. Each row shows the Dataset Size, the approximate Token Count, and the estimated Indexing Cost. Larger datasets generate more tokens, which increases the indexing cost. Pricing scales linearly with both model and data size, and you only pay for successfully indexed data. Final costs may vary. For more details, see our embedding model pricing.
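Because pricing scales linearly, a rough estimate is the token count multiplied by the model's token rate. The sketch below assumes the rate is quoted in US dollars per million tokens, which is an assumption about units. The function name is hypothetical and the rate in the example is a placeholder, not a real price.

```python
def estimate_indexing_cost(token_count: int, usd_per_million_tokens: float) -> float:
    """Estimate indexing cost as a linear function of token count.

    Assumes the rate is quoted per one million tokens.
    """
    if token_count < 0 or usd_per_million_tokens < 0:
        raise ValueError("inputs must be non-negative")
    return token_count / 1_000_000 * usd_per_million_tokens


if __name__ == "__main__":
    # 0.5 is a placeholder rate for illustration only.
    print(estimate_indexing_cost(2_000_000, 0.5))
```

Doubling the token count doubles the estimate, which matches the linear scaling described above. Final costs may still differ from this estimate.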
Finalize Details
In the Final Details section, under the Select a project sub-section, choose the project where you want the knowledge base to live. You can use the default project or select another, and attach the knowledge base to agents in any project.
Under the Tags sub-section, add tags to help organize and filter your knowledge base. Tags can include letters, numbers, colons, dashes, and underscores. Choose a tag name, then press ENTER or SPACEBAR to add it. Use the arrow keys to navigate and the BACKSPACE key to remove tags.
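If you generate tags programmatically, you can check them against the allowed characters first. This is a minimal sketch with a hypothetical function name. It assumes ASCII letters and digits and does not enforce a maximum length, because none is stated above.

```python
import re

# Allowed characters per the text: letters, numbers, colons, dashes, underscores.
_TAG_PATTERN = re.compile(r"[A-Za-z0-9:_-]+")


def is_valid_tag(tag: str) -> bool:
    """Return True if `tag` is non-empty and uses only allowed characters."""
    return _TAG_PATTERN.fullmatch(tag) is not None
```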
After adding your knowledge base to a project and providing your tags, click Create Knowledge Base.
Provisioning Your Knowledge Base
After creation, your knowledge base appears under the GenAI Platform’s Knowledge Bases tab and begins indexing its data sources.
To track indexing progress, go to the Knowledge Bases tab, find your knowledge base, then check the last indexing time. Click the knowledge base to view detailed progress, including updates for each data source, tokens indexed, and any sources still processing. The list updates automatically, and agents begin using the updated embeddings as soon as they become available.
Provisioning typically takes five minutes or longer while the system processes, embeds, and stores your data. After indexing completes, go to the knowledge base's Overview tab, then view the EMBEDDINGS DETAILS section for a summary of the indexing results, including final costs.
If indexing takes longer than expected, click Stop job to cancel it, then Re-run job to restart it. If issues persist, contact support.
After the knowledge base is created, you can add more data sources to it, attach it to an existing agent, or include it during agent creation. You can also edit its name, project, and tags under the knowledge base's Settings tab.
Create a Knowledge Base Using the API
To create a knowledge base using the DigitalOcean API, provide a name, an embedding model, a data source, a project ID, and a datacenter region. You can also specify the ID of an existing OpenSearch database. If you don’t, a new one is created and automatically sized to about twice the size of your data to accommodate embeddings.
To list available embedding models and their IDs, call the /v2/gen-ai/models endpoint with the usecases query parameter. After creation, your data sources are indexed. For details, see Index Data Using the API.
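As an illustration of the model lookup, the sketch below only builds the request URL. The base URL `https://api.digitalocean.com` and the example value `MODEL_USECASE_KNOWLEDGEBASE` are assumptions not stated on this page, so confirm both against the API reference. The function name is hypothetical.

```python
from urllib.parse import urlencode

API_BASE = "https://api.digitalocean.com"  # assumed API host


def models_url(usecase: str) -> str:
    """Build the model-listing URL filtered by the `usecases` query parameter."""
    return f"{API_BASE}/v2/gen-ai/models?{urlencode({'usecases': usecase})}"


if __name__ == "__main__":
    # The use-case value below is an assumed example, not confirmed by this page.
    print(models_url("MODEL_USECASE_KNOWLEDGEBASE"))
```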
After creating a knowledge base, you can list all available knowledge bases, view details, or update the knowledge base.
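The creation request described in this section might be assembled as follows. This is a sketch under stated assumptions: the POST method, the API host, bearer-token authentication, and every JSON field name (`embedding_model_uuid`, `project_id`, `region`, `datasources`, `database_id`) are assumptions to verify against the API reference. The function builds the request without sending it.

```python
from __future__ import annotations

import json
import urllib.request

API_BASE = "https://api.digitalocean.com"  # assumed API host


def build_create_request(
    token: str,
    name: str,
    embedding_model_uuid: str,
    project_id: str,
    region: str,
    datasources: list[dict],
    database_id: str | None = None,
) -> urllib.request.Request:
    """Build (but do not send) a knowledge base creation request.

    All JSON field names here are assumed, not confirmed by this page.
    """
    payload = {
        "name": name,
        "embedding_model_uuid": embedding_model_uuid,
        "project_id": project_id,
        "region": region,
        "datasources": datasources,
    }
    # Omitting the database ID corresponds to letting a new database be created.
    if database_id is not None:
        payload["database_id"] = database_id
    return urllib.request.Request(
        f"{API_BASE}/v2/gen-ai/knowledge_bases",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the returned object to `urllib.request.urlopen` with a valid token.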
Edit an Existing Knowledge Base
You can edit an existing knowledge base to change its name, project, or tags, and view details like the model in use, attached agents, and the OpenSearch database storing its data.
To make changes from the DigitalOcean Control Panel, in the left-hand menu, click GenAI Platform, click the Knowledge Bases tab, select the knowledge base you want to edit, then open its Settings tab. In the Settings section, click Edit next to the section you want to update, then click Submit to apply your changes.
You can edit the following attributes:
- Knowledge base info, change the knowledge base name or select a different project.
- Tags, add or remove tags.
- Destroy, destroy the knowledge base.
You can view but not edit the following sections:
- Embeddings Model, shows the model in use and the token rate for indexing events.
- Associated agents, lists the agents using the knowledge base. You can attach it to any agent as needed, or leave it unattached.
- OpenSearch DB, shows the database in use and its region. To manage databases, see our OpenSearch documentation.
Destroy a Knowledge Base Using the Control Panel
If a knowledge base is no longer needed, you can permanently delete it along with its embeddings and automated backups. This process is irreversible, triggers redeployment of any agents using it, and may affect their performance. Destroying a knowledge base does not delete the associated OpenSearch database, but you can delete the database separately if needed.
To delete a knowledge base from the DigitalOcean Control Panel, in the left-hand menu, click GenAI Platform, click the Knowledge Bases tab, find the knowledge base you want to destroy, then on the right of it, click …, then select Destroy.
In the confirmation window, type the knowledge base name to confirm deletion, then click Destroy to complete the deletion.
Destroy a Knowledge Base Using the API
To destroy a knowledge base using the DigitalOcean API, provide its unique identifier. You can retrieve available knowledge bases and their IDs using the /v2/gen-ai/knowledge_bases endpoint.
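A destroy request could be built as sketched below. The DELETE method, the API host, the bearer-token header, and the URL shape (the identifier appended to the endpoint path) are assumptions to confirm against the API reference. The function name is hypothetical, and the request is built without being sent because the operation is irreversible.

```python
import urllib.request
from urllib.parse import quote

API_BASE = "https://api.digitalocean.com"  # assumed API host


def build_destroy_request(token: str, knowledge_base_id: str) -> urllib.request.Request:
    """Build (but do not send) a request to destroy one knowledge base.

    Assumes the identifier is appended to the knowledge_bases endpoint path.
    """
    # Percent-encode the ID so it cannot alter the URL path.
    safe_id = quote(knowledge_base_id, safe="")
    return urllib.request.Request(
        f"{API_BASE}/v2/gen-ai/knowledge_bases/{safe_id}",
        headers={"Authorization": f"Bearer {token}"},
        method="DELETE",
    )
```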