How to Create, Edit, and Destroy Agent Knowledge Bases
Validated on 28 Apr 2025 • Last edited on 28 Jul 2025
GradientAI Platform lets you build fully-managed AI agents with knowledge bases for retrieval-augmented generation, multi-agent routing, guardrails, and more, or use serverless inference to make direct requests to popular foundation models.
A knowledge base stores private data sources such as unstructured files, Spaces folders, or web pages to supplement an agent’s training data and improve response accuracy. Using retrieval-augmented generation (RAG), agents can search and reference external data to deliver more accurate, up-to-date, and domain-specific answers.
When you create a knowledge base, we automatically index your data by transforming it into vector embeddings, numerical representations that capture the meaning of the text and help agents efficiently find relevant information. These embeddings are stored in a Managed OpenSearch database, which appears in your Databases list. You can scale the database to increase its performance.
Knowledge bases support the following data sources:
- DigitalOcean Spaces buckets or specific folders.
- Direct file uploads from your local machine.
- Public websites crawled at a URL you specify.
- Amazon S3 buckets.
Each knowledge base requires at least one data source. You can add more or remove data sources after creation.
Create a Knowledge Base Using the Control Panel
To create a knowledge base from the DigitalOcean Control Panel, in the left-hand menu, click Agent Platform, click the Knowledge Bases tab, then click Create Knowledge Base to open the creation page.
In the Configure your knowledge base section, either keep the autogenerated name or choose a unique name using 3 to 63 characters, including only letters, numbers, dashes, and periods.
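The naming rule above can be checked locally before you submit the form or an API request. The following is a minimal sketch, not the platform's own validator: the function name is hypothetical, it assumes both uppercase and lowercase letters are accepted, and it cannot check uniqueness, which only the platform can verify.

```python
import re

# Mirrors the rule stated above: 3 to 63 characters, using only
# letters, numbers, dashes, and periods. Accepting both letter cases
# is an assumption.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9.-]{3,63}")


def is_valid_kb_name(name: str) -> bool:
    """Return True if the name satisfies the length and character rule."""
    return _NAME_PATTERN.fullmatch(name) is not None
```

For example, `is_valid_kb_name("docs-kb.v2")` passes, while a name containing a space or only two characters does not.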
Select Your Data Sources
In the Select data sources to index section, click Select data sources to open the data source selection window, then click the dropdown menu to select a data source type.
You can add multiple types of data sources to a knowledge base and include as many as needed. To save processing time and cost, organize your files in dedicated Spaces buckets, specific folders, or local storage containing only relevant files.
Knowledge bases support the following text-based file formats: .csv, .eml, .epub, .xls, .xlsx, .html, .md, .odt, .pdf, .txt, .rst, .rtf, .tsv, .doc, .docx, .xml, .json, and .jsonl.
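To keep a bucket or upload folder limited to relevant files, you can pre-filter file names against the supported formats listed above. This is a sketch under stated assumptions: the helper name is hypothetical, and treating extensions as case-insensitive is an assumption rather than documented behavior.

```python
from pathlib import Path

# Extensions copied from the supported formats listed above.
SUPPORTED_EXTENSIONS = {
    ".csv", ".eml", ".epub", ".xls", ".xlsx", ".html", ".md", ".odt",
    ".pdf", ".txt", ".rst", ".rtf", ".tsv", ".doc", ".docx", ".xml",
    ".json", ".jsonl",
}


def split_by_support(filenames):
    """Split file names into (supported, unsupported) lists.

    Only the final extension is considered, compared in lowercase
    (case-insensitivity is an assumption).
    """
    supported, unsupported = [], []
    for name in filenames:
        if Path(name).suffix.lower() in SUPPORTED_EXTENSIONS:
            supported.append(name)
        else:
            unsupported.append(name)
    return supported, unsupported
```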
You can add any of the following data sources:
You can add entire DigitalOcean Spaces buckets or select specific folders to organize files in your knowledge base. The system indexes all supported file formats in selected buckets and folders, regardless of privacy settings.
To add a Spaces bucket, select Spaces Bucket or folder and choose the buckets you want to index. You can also click the + beside buckets to expand their contents and select specific folders within a bucket to limit the indexed content.
For optimal performance and indexing quality, we recommend using five or fewer buckets and uploading only indexing data to your buckets.
You can upload files from your local machine to your Space to be indexed.
You can add a public URL for our web crawler to crawl and index content from. Depending on the behavior you select, the crawler follows HTML links on the site, indexes text and certain image types, ignores videos and navigation links, and respects robots.txt rules.
To add a URL to crawl, select URL for web crawling. In the Seed URL field, enter the public URL you want to crawl.
Under the Crawling Rules section, define the crawl scope:
- Scoped crawls only the seed URL.
- Path crawls the seed URL and all pages within the same path.
- Domain crawls all pages in the same domain.
- Subdomains crawls the domain and all its subdomains.
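One way to reason about the four scopes is as a predicate that decides whether a discovered link would be followed. The sketch below is only an interpretation of the descriptions above, not the crawler's actual logic: the function name and scope strings are hypothetical, and it assumes "domain" means the seed URL's exact host and "subdomains" means that host plus any host ending in it.

```python
from urllib.parse import urlsplit


def in_scope(seed: str, candidate: str, scope: str) -> bool:
    """Return True if candidate falls inside the crawl scope of seed."""
    s, c = urlsplit(seed), urlsplit(candidate)
    s_host = (s.hostname or "").lower()
    c_host = (c.hostname or "").lower()

    if scope == "scoped":
        # Only the seed URL itself (a trailing slash is ignored).
        return (c_host, c.path.rstrip("/"), c.query) == (
            s_host, s.path.rstrip("/"), s.query)
    if scope == "path":
        # The seed URL and any page beneath the same path.
        base = s.path if s.path.endswith("/") else s.path + "/"
        return c_host == s_host and (
            c.path == s.path or c.path.startswith(base))
    if scope == "domain":
        # Any page on the same host.
        return c_host == s_host
    if scope == "subdomains":
        # The same host or any subdomain of it.
        return c_host == s_host or c_host.endswith("." + s_host)
    raise ValueError(f"unknown scope: {scope!r}")
```

With a seed of `https://example.com/docs`, the link `https://example.com/docs/api` is inside the path scope but outside the scoped crawl, and `https://api.example.com/` is only reached with the subdomains scope.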
Select the Index embedded media option to index supported images and other media encountered during the crawl.
To verify the crawl completed, re-add the same seed URL as a new data source. If it shows zero tokens, the original crawl indexed all content and you can delete the duplicate.
You can add an Amazon S3 bucket as a data source to your knowledge base.
To add an S3 bucket, select Amazon S3 Bucket, and then provide the following credentials in the fields provided:
- Access Key ID, the IAM access key ID for your S3 bucket
- Secret Key, the secret key associated with your access key ID
- Bucket Name, the name of the S3 bucket you want to index
- Region, the AWS region where your S3 bucket is located, such as us-east-1 or eu-west-1
Click the + button to add another S3 bucket as a data source.
You can add a Dropbox folder as a data source to your knowledge base.
To add a Dropbox folder, select Dropbox, then click Connect. This opens a new window where you can log in to your Dropbox account and authorize the connection. Once you’ve authorized the connection, you can select a folder to index back in the Select data source window.
View your selected data sources and check the Status of each:
- Ready, the data source is uploaded and ready for indexing.
- Error, the upload or processing failed. Remove the data source and try again. If it fails again, contact support.
- Uploading, the data source is still uploading and not ready for indexing.
To avoid delays, upload fewer than 100 files at a time, each under 2 GB. For larger uploads, use the DigitalOcean API. If uploads continue to stall, contact support.
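If you script your uploads, the limits above can be applied by splitting files into batches ahead of time. This is a minimal sketch with hypothetical names. It assumes "2 GB" means 2 × 1024³ bytes and that you already know each file's size in bytes.

```python
MAX_FILES_PER_BATCH = 99        # "fewer than 100 files at a time"
MAX_FILE_BYTES = 2 * 1024 ** 3  # "each under 2 GB" (binary GB is an assumption)


def plan_upload_batches(files):
    """Group (name, size_bytes) pairs into upload batches.

    Returns (batches, too_large): batches is a list of lists of file
    names, each shorter than 100 entries; too_large lists files at or
    over the size limit, which are better suited to the API route.
    """
    ok = [name for name, size in files if size < MAX_FILE_BYTES]
    too_large = [name for name, size in files if size >= MAX_FILE_BYTES]
    batches = [
        ok[i:i + MAX_FILES_PER_BATCH]
        for i in range(0, len(ok), MAX_FILES_PER_BATCH)
    ]
    return batches, too_large
```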
Knowledge bases require a new or existing OpenSearch database to store the vector embeddings created from your data. Below the list, Estimated Size shows the total size of all uploaded data. Use this value to estimate the final embedding size and allocate at least twice that amount to ensure your database is properly sized to store embeddings. This may affect costs based on OpenSearch pricing.
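The sizing guidance above is simple arithmetic: take the Estimated Size and allocate at least twice that amount. A small sketch follows, with a hypothetical function name. The unit is whatever unit the Estimated Size is shown in, and rounding up to a whole unit is an assumption made for convenience.

```python
import math


def recommended_storage(estimated_size: float, factor: float = 2.0) -> int:
    """Return the storage to allocate, rounded up to a whole unit.

    The default factor of 2 follows the "at least twice" guidance above.
    """
    return math.ceil(estimated_size * factor)
```

For an Estimated Size of 7.5 GB, this suggests allocating 15 GB.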
Choose Your OpenSearch Database
In the Where should your knowledge base live? section, under the OpenSearch database options sub-section, select either Use existing to connect to an existing OpenSearch database or Create new to provision a new one.
If you choose Use existing, under the Select an OpenSearch database section, click the dropdown menu, then select the database you want to use. If it already contains data, it may limit how much new data you can index. You only pay for successfully indexed data.
If you choose Create new, under the Choose a datacenter region section, select the default datacenter region for your knowledge base, or click Additional datacenter regions to choose a different one. We recommend choosing the same region as your GradientAI Platform agents to reduce latency.
New databases are automatically sized to the smallest option that fits your data. We recommend allocating about twice the size of your original dataset to efficiently store embeddings.
Choose Your Embedding Model
An embedding model converts your data into vector embeddings which are stored in your OpenSearch database. In the How much will I pay? section, click the Embeddings model dropdown menu, then select a model. You can’t change the model after creating your knowledge base. We offer multiple embedding models for different use cases, and indexing costs depend on the selected model and the size of your data.
The pricing table estimates token counts and indexing costs based on your dataset size and the model’s token rate. Each row shows the Dataset Size, the approximate Token Count, and the estimated Indexing Cost. Larger datasets generate more tokens, which increases the indexing cost. Pricing scales linearly with both model and data size, and you only pay for successfully indexed data. Final costs may vary. For more details, see our embedding model pricing.
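Because pricing scales linearly, a rough indexing estimate is the token count multiplied by the model's token rate. The sketch below assumes the rate is quoted per million tokens. The function name is hypothetical and the rate in the usage note is a made-up placeholder, so take real rates from the embedding model pricing page.

```python
def estimate_indexing_cost(token_count: int, usd_per_million_tokens: float) -> float:
    """Estimate indexing cost in USD as a linear function of tokens.

    usd_per_million_tokens is the model's token rate. Quoting it per
    million tokens is an assumption about how the rate is expressed.
    """
    return token_count / 1_000_000 * usd_per_million_tokens
```

With a placeholder rate of 0.09 USD per million tokens, 2,000,000 tokens would cost about 0.18 USD. Doubling the dataset doubles the estimate.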
Finalize Details
In the Final Details section, under the Select a project sub-section, choose the project where you want the knowledge base to live. You can use the default project or select another, and attach the knowledge base to agents in any project.
Under the Tags sub-section, add tags to help organize and filter your knowledge base. Tags can include letters, numbers, colons, dashes, and underscores. Choose a tag name, then press ENTER or SPACEBAR to add it. Use the arrow keys to navigate and the BACKSPACE key to remove tags.
After adding your knowledge base to a project and providing your tags, click Create Knowledge Base.
Provisioning Your Knowledge Base
After creation, your knowledge base appears under the GradientAI Platform’s Knowledge Bases tab and begins indexing its data sources.
To track indexing progress, go to the Knowledge Bases tab, find your knowledge base, then check the last indexing time. Click the knowledge base to view detailed progress, including updates for each data source, tokens indexed, and any sources still processing. The list updates automatically, and agents begin using the updated embeddings as soon as they become available.
Provisioning typically takes five minutes or longer while the system processes, embeds, and stores your data. After indexing completes, go to the knowledge base’s Overview tab, then under the EMBEDDINGS DETAILS section, see a summary of the indexing results, including final costs.
If indexing takes longer than expected, click Stop job to cancel it, then Re-run job to restart it. If issues persist, contact support.
Create a Knowledge Base Using the API
To create a knowledge base using the DigitalOcean API, provide a name, an embedding model, a data source, a project ID, and a datacenter region. You can also specify the ID of an existing OpenSearch database. If you don’t, a new one is created and automatically sized to about twice the size of your data to accommodate embeddings.
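The request described above might be assembled as in the following sketch. It only builds the request and does not send it. The JSON field names (`embedding_model_uuid`, `project_id`, `region`, `datasources`, `database_id`), the POST method, and the shape of each data source entry are assumptions to verify against the API reference. Only the `/v2/gen-ai/knowledge_bases` path comes from this page.

```python
import json
import urllib.request

API_BASE = "https://api.digitalocean.com"


def build_create_kb_request(token, name, embedding_model_uuid, project_id,
                            region, datasources, database_id=None):
    """Build (but do not send) a request that creates a knowledge base."""
    body = {
        "name": name,
        "embedding_model_uuid": embedding_model_uuid,  # assumed field name
        "project_id": project_id,                      # assumed field name
        "region": region,                              # assumed field name
        "datasources": datasources,                    # assumed field name
    }
    if database_id is not None:
        # Omit to have a new OpenSearch database created, as described above.
        body["database_id"] = database_id              # assumed field name
    return urllib.request.Request(
        f"{API_BASE}/v2/gen-ai/knowledge_bases",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the returned object to `urllib.request.urlopen` and parse the JSON response.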
To list available embedding models and their IDs, call the /v2/gen-ai/models endpoint with the usecases query parameter. After creation, your data sources are indexed.
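A request for the embedding model list could be built as in this sketch. The endpoint path and the `usecases` parameter come from the text above. The parameter value `MODEL_USECASE_KNOWLEDGEBASE` is an assumption, so check the API reference for the accepted values.

```python
import urllib.parse
import urllib.request

API_BASE = "https://api.digitalocean.com"


def build_list_models_request(token, usecase="MODEL_USECASE_KNOWLEDGEBASE"):
    """Build (but do not send) a request listing models for a use case.

    The default usecase value is an assumption, not confirmed here.
    """
    query = urllib.parse.urlencode({"usecases": usecase})
    return urllib.request.Request(
        f"{API_BASE}/v2/gen-ai/models?{query}",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )
```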
After creating a knowledge base, you can list all available knowledge bases, view details, or update the knowledge base.
Add, Remove, or Reindex Data Sources Using the Control Panel
You can add, remove, or reindex existing knowledge base data sources as needed.
To add, remove, or reindex a data source using the DigitalOcean Control Panel, in the left-hand menu, click Agent Platform, click the Knowledge Bases tab, find and then select the knowledge base you want to update, then click the Data sources tab.
Add Data Sources
To add a data source from the Data sources tab, click Add source.
On the Add Data Source page, click Select data source and then select a data source from the dropdown menu. For detailed information about each data source type, see the Select Your Data Sources section of the create workflow.
After adding new data sources, review the estimated price to index the data under the Indexing event summary section and then click Index added source. The data sources are added to the knowledge base and the data is automatically indexed.
Remove Data Sources
To remove a data source from the Data sources tab, click the … menu beside the data source you want to remove and then click Remove source from the dropdown menu.
In the Remove data source modal, enter the name of the data source to confirm its removal, and then click Destroy to remove it.
After removal, the knowledge base automatically reindexes the remaining data sources.
Reindex Data Sources
To manually reindex a data source from the Data sources tab, click the … menu beside the data source you want to reindex and then click Update source from the dropdown menu.
In the confirmation window, click Update source to reindex the data. You are only charged for any new data found during the indexing.
Add, Remove, or Reindex a Data Source Using the API
You can add, remove, or reindex existing knowledge base data sources as needed using the API.
Add a Data Source
To add a data source using the API, provide the knowledge base's unique identifier and specify the Spaces bucket, folder, file, or URL to use. For detailed information about each data source type, see the Select Your Data Sources section of the create workflow. To retrieve knowledge base IDs, use the /v2/gen-ai/knowledge_bases endpoint.
After adding a data source, start indexing it using the API to make the content available for retrieval.
To confirm the data source was added, list the knowledge base’s data sources.
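Adding a Spaces folder and then listing data sources to confirm it might look like the sketch below, which only builds the requests. The `/data_sources` sub-path, the `spaces_data_source` body, and its field names are assumptions to check against the API reference.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.digitalocean.com"


def _data_sources_url(kb_id):
    # Assumed sub-path under the documented knowledge_bases endpoint.
    kb = urllib.parse.quote(kb_id, safe="")
    return f"{API_BASE}/v2/gen-ai/knowledge_bases/{kb}/data_sources"


def build_add_spaces_source_request(token, kb_id, bucket_name, region,
                                    item_path=""):
    """Build (but do not send) a request adding a Spaces bucket or folder."""
    body = {
        "spaces_data_source": {          # assumed body shape
            "bucket_name": bucket_name,
            "item_path": item_path,      # empty selects the whole bucket (assumed)
            "region": region,
        }
    }
    return urllib.request.Request(
        _data_sources_url(kb_id),
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )


def build_list_data_sources_request(token, kb_id):
    """Build (but do not send) a request listing a knowledge base's sources."""
    return urllib.request.Request(
        _data_sources_url(kb_id),
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )
```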
Remove a Data Source
To remove a data source using the API, provide the knowledge base ID and the specific data source ID. This detaches the data source from the knowledge base but does not delete the original source file or URL.
You can find data source IDs by listing the knowledge base’s data sources.
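A removal request needs both IDs in the URL. The sketch below builds the request without sending it. The DELETE method and the `/data_sources/{id}` sub-path follow common REST conventions and are assumptions, not details confirmed on this page.

```python
import urllib.parse
import urllib.request

API_BASE = "https://api.digitalocean.com"


def build_remove_data_source_request(token, kb_id, data_source_id):
    """Build (but do not send) a request detaching a data source.

    As noted above, this detaches the source from the knowledge base;
    the original file or URL is left untouched.
    """
    kb = urllib.parse.quote(kb_id, safe="")
    ds = urllib.parse.quote(data_source_id, safe="")
    return urllib.request.Request(
        f"{API_BASE}/v2/gen-ai/knowledge_bases/{kb}/data_sources/{ds}",
        headers={"Authorization": f"Bearer {token}"},
        method="DELETE",
    )
```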
Index a Data Source
To index a data source using the API, create an indexing job with the knowledge base ID and data source ID. Use the Create Indexing Job endpoint to start the process.
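Creating an indexing job could be sketched as follows. The `/v2/gen-ai/indexing_jobs` path and the `knowledge_base_uuid` and `data_source_uuids` field names are assumptions. Confirm them in the Create Indexing Job reference before use.

```python
import json
import urllib.request

API_BASE = "https://api.digitalocean.com"


def build_create_indexing_job_request(token, kb_id, data_source_ids):
    """Build (but do not send) a request that starts an indexing job."""
    body = {
        "knowledge_base_uuid": kb_id,                # assumed field name
        "data_source_uuids": list(data_source_ids),  # assumed field name
    }
    return urllib.request.Request(
        f"{API_BASE}/v2/gen-ai/indexing_jobs",       # assumed path
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )
```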
You can check the job status using the Get Indexing Job endpoint.
After indexing completes, use the Get Knowledge Base endpoint to confirm completion and review the final token count and indexing cost.
If the job takes longer than expected, cancel it using the Cancel Indexing Job endpoint, then restart it. If issues persist, contact support for assistance.
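The check-then-cancel flow above amounts to polling with a limit. The sketch below keeps the HTTP call out of the loop: `get_status` is a hypothetical callable you supply that wraps the Get Indexing Job endpoint and returns the job's status string. The status names are placeholders, since the real values are not listed here.

```python
import time


def wait_for_job(get_status, terminal_statuses, max_checks=60,
                 interval_s=10.0, sleep=time.sleep):
    """Poll until the job reaches a terminal status or checks run out.

    Returns the terminal status, or None if the limit was reached, in
    which case the caller can cancel the job and restart it.
    """
    for _ in range(max_checks):
        status = get_status()
        if status in terminal_statuses:
            return status
        sleep(interval_s)
    return None
```

A `None` result corresponds to the "takes longer than expected" case: cancel the job, create a new one, and contact support if the problem repeats.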
Edit Knowledge Base Settings
You can edit an existing knowledge base to change its name, project, or tags, and view details like its embedding model, attached agents, and the OpenSearch database storing its data.
To make changes from the DigitalOcean Control Panel, in the left-hand menu, click Agent Platform, click the Knowledge Bases tab, select the knowledge base you want to edit, then open its Settings tab. In the Settings section, click Edit next to the section you want to update, then click Submit to apply your changes.
You can edit the following attributes:
- Knowledge base info, change the knowledge base name or select a different project.
- Tags, add or remove tags.
- Destroy, destroy the knowledge base.
You can view but not edit the following sections:
- Embeddings Model shows the model in use and the token rate for indexing events.
- Associated agents lists the agents using the knowledge base. You can attach it to any agent as needed or leave it unattached.
- OpenSearch DB shows the database in use and its region. To manage databases, see our OpenSearch documentation.
Destroy a Knowledge Base Using the Control Panel
If you no longer need a knowledge base, you can permanently and irreversibly delete it along with its embeddings and automated backups. Destroying a knowledge base does not delete the associated OpenSearch database, but you can delete the database separately.
Deleting a knowledge base triggers redeployment of any agents using it and may affect their performance.
To delete a knowledge base from the DigitalOcean Control Panel, in the left-hand menu, click Agent Platform, click the Knowledge Bases tab, find the knowledge base you want to destroy, click the … menu to its right, then select Destroy.
In the confirmation window, type the knowledge base name to confirm deletion, then click Destroy to complete the deletion.
Destroy a Knowledge Base Using the API
To destroy a knowledge base using the DigitalOcean API, provide its unique identifier. You can retrieve available knowledge bases and their IDs using the /v2/gen-ai/knowledge_bases endpoint.
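A destroy request might be built as in this sketch, which does not send anything. Appending the ID to the documented `/v2/gen-ai/knowledge_bases` path and using the DELETE method follow REST convention and are assumptions. Because deletion is irreversible, confirm the ID against the list endpoint first.

```python
import urllib.parse
import urllib.request

API_BASE = "https://api.digitalocean.com"


def build_destroy_kb_request(token, kb_id):
    """Build (but do not send) a request that destroys a knowledge base."""
    if not kb_id:
        # Guard against accidentally targeting the collection URL.
        raise ValueError("kb_id must be a non-empty identifier")
    kb = urllib.parse.quote(kb_id, safe="")
    return urllib.request.Request(
        f"{API_BASE}/v2/gen-ai/knowledge_bases/{kb}",
        headers={"Authorization": f"Bearer {token}"},
        method="DELETE",
    )
```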