pydo.genai.create_knowledge_base()
Generated on 4 Jun 2026 from pydo version v0.35.0
Usage
client.genai.create_knowledge_base(
    body={
        "database_id": "12345678-1234-1234-1234-123456789012",
        "datasources": [...],
        "embedding_model_uuid": "12345678-1234-1234-1234-123456789012",
        ...,
    },
)
Description
To create a knowledge base, send a POST request to /v2/gen-ai/knowledge_bases.
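A minimal sketch of the call described above, assuming an authenticated pydo client is passed in as `client`. The helper name `create_minimal_knowledge_base` is hypothetical and not part of pydo; it uses only fields listed under Parameters and omits `datasources`, so the knowledge base is created without sources.

```python
from typing import Any, Dict


def create_minimal_knowledge_base(
    client: Any,
    name: str,
    region: str,
    project_id: str,
    embedding_model_uuid: str,
) -> Any:
    """Create a knowledge base with no data sources attached.

    `client` is assumed to be an authenticated pydo client, for example
    pydo.Client(token=os.environ["DIGITALOCEAN_TOKEN"]).
    """
    body: Dict[str, Any] = {
        "name": name,
        "region": region,
        "project_id": project_id,
        "embedding_model_uuid": embedding_model_uuid,
    }
    # Sends POST /v2/gen-ai/knowledge_bases through the generated client.
    return client.genai.create_knowledge_base(body=body)
```

Passing the client in as an argument keeps the helper easy to exercise with a stub object before pointing it at a real account.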
Parameters
database_id string optional
  Example: "12345678-1234-1234-1234-123456789012"
  Identifier of the DigitalOcean OpenSearch database this knowledge base will use. If not provided, a new database is created for the knowledge base in the same region as the knowledge base.
datasources array of objects optional
  Optional data sources to attach at creation. Omit or use an empty list to create the knowledge base without sources, then add sources (with chunking strategy and sizes) using Add a Data Source to a Knowledge Base. When provided, see Organize Data Sources for best practices.
  Child properties:
    aws_data_source object optional
      AWS S3 Data Source
      Child properties:
        bucket_name string optional
          Example: example name
          Spaces bucket name
        item_path string optional
          Example: example string
        key_id string optional
          Example: 123e4567-e89b-12d3-a456-426614174000
          The AWS Key ID
        region string optional
          Example: example string
          Region of bucket
        secret_key string optional
          Example: example string
          The AWS Secret Key
    bucket_name string optional
      Example: example name
      Deprecated, moved to data_source_details
    bucket_region string optional
      Example: example string
      Deprecated, moved to data_source_details
    chunking_algorithm string optional
    chunking_options object optional
      Child properties:
        child_chunk_size integer optional
          Example: 350
        max_chunk_size integer optional
          Example: 750
          Common options
        parent_chunk_size integer optional
          Example: 1000
          Hierarchical options
        semantic_threshold number optional
          Example: 0.5
          Semantic options
    dropbox_data_source object optional
      Dropbox Data Source
      Child properties:
        folder string optional
          Example: example string
        refresh_token string optional
          Example: example string
          Refresh token. You can obtain a refresh token by following the OAuth2 flow. See /v2/gen-ai/oauth2/dropbox/tokens for reference.
    file_upload_data_source object optional
      File to upload as data source for knowledge base.
      Child properties:
        original_file_name string optional
          Example: example name
          The original file name
        size_in_bytes string optional
          Example: 12345
          The size of the file in bytes
        stored_object_key string optional
          Example: example string
          The object key the file was stored as
    google_drive_data_source object optional
      Google Drive Data Source
      Child properties:
        folder_id string optional
          Example: 123e4567-e89b-12d3-a456-426614174000
        refresh_token string optional
          Example: example string
          Refresh token. You can obtain a refresh token by following the OAuth2 flow. See /v2/gen-ai/oauth2/google/tokens for reference.
    item_path string optional
      Example: example string
    spaces_data_source object optional
      Spaces Bucket Data Source
      Child properties:
        bucket_name string optional
          Example: example name
          Spaces bucket name
        item_path string optional
          Example: example string
        region string optional
          Example: example string
          Region of bucket
    web_crawler_data_source object optional
      Web Crawler Data Source
      Child properties:
        base_url string optional
          Example: example string
          The base URL to crawl.
        crawling_option string optional
          Options for specifying how URLs found on pages should be handled.
          - UNKNOWN: Default unknown value
          - SCOPED: Only include the base URL.
          - PATH: Crawl the base URL and linked pages within the URL path.
          - DOMAIN: Crawl the base URL and linked pages within the same domain.
          - SUBDOMAINS: Crawl the base URL and linked pages for any subdomain.
          - SITEMAP: Crawl URLs discovered in the sitemap.
        embed_media boolean optional
          Example: True
          Whether to ingest and index media (images, etc.) on web pages.
        exclude_tags array of strings optional
          Example: ['example string']
          Tags to exclude from web pages while crawling.
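The data source fields above can be assembled with small helpers. The following is only a sketch: the function names are hypothetical, the set of crawling options is copied from the list in this reference, the bucket and URL values are placeholders, and leaving unset optional fields out of the object (rather than sending nulls) is an assumption.

```python
from typing import Any, Dict, List, Optional

# Values copied from the crawling_option list in this reference.
CRAWLING_OPTIONS = {"UNKNOWN", "SCOPED", "PATH", "DOMAIN", "SUBDOMAINS", "SITEMAP"}


def _drop_none(fields: Dict[str, Any]) -> Dict[str, Any]:
    """Leave out optional fields that were not set (an assumption)."""
    return {key: value for key, value in fields.items() if value is not None}


def web_crawler_source(
    base_url: str,
    crawling_option: str,
    embed_media: Optional[bool] = None,
    exclude_tags: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build one datasources entry that uses web_crawler_data_source."""
    if crawling_option not in CRAWLING_OPTIONS:
        raise ValueError(f"unknown crawling_option: {crawling_option!r}")
    return {"web_crawler_data_source": _drop_none({
        "base_url": base_url,
        "crawling_option": crawling_option,
        "embed_media": embed_media,
        "exclude_tags": exclude_tags,
    })}


def spaces_source(
    bucket_name: str,
    region: str,
    item_path: Optional[str] = None,
) -> Dict[str, Any]:
    """Build one datasources entry that uses spaces_data_source."""
    return {"spaces_data_source": _drop_none({
        "bucket_name": bucket_name,
        "region": region,
        "item_path": item_path,
    })}


# Placeholder values; pass this list as body["datasources"].
datasources = [
    web_crawler_source("https://example.com/docs", "PATH", exclude_tags=["nav"]),
    spaces_source("example-bucket", "tor1"),
]
```

Checking `crawling_option` locally catches a misspelled value before any request is sent.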
embedding_model_uuid string optional
  Example: "12345678-1234-1234-1234-123456789012"
  Identifier for the embedding model.
name string optional
  Example: "My Knowledge Base"
  Name of the knowledge base.
project_id string optional
  Example: "12345678-1234-1234-1234-123456789012"
  Identifier of the DigitalOcean project this knowledge base will belong to.
region string optional
  Example: "tor1"
  The datacenter region to deploy the knowledge base in.
reranking_config object optional
  Configuration for cross-encoder reranking during retrieval.
  Child properties:
    enabled boolean optional
      Example: True
      Whether reranking is enabled for retrieval
    model string optional
      Example: "bge-reranker-v2-m3"
      Reranker model internal name
size string optional
tags array of strings optional
  Example: ['example string']
  Tags to organize your knowledge base.
vpc_uuid string optional
  Example: "12345678-1234-1234-1234-123456789012"
  The VPC to deploy the knowledge base database in
Request Sample
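Every field in this reference is marked optional, so a request body can be built from whichever settings are known. The following is a sketch rather than an official sample: `build_body` is a hypothetical helper, the values mirror the examples in this reference, and dropping unset fields from the body is an assumption.

```python
from typing import Any, Dict, List, Optional


def build_body(
    name: str,
    region: str,
    embedding_model_uuid: str,
    project_id: Optional[str] = None,
    database_id: Optional[str] = None,
    vpc_uuid: Optional[str] = None,
    tags: Optional[List[str]] = None,
    datasources: Optional[List[Dict[str, Any]]] = None,
    reranker_model: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble a create_knowledge_base body, skipping unset fields."""
    body: Dict[str, Any] = {
        "name": name,
        "region": region,
        "embedding_model_uuid": embedding_model_uuid,
        "project_id": project_id,
        "database_id": database_id,
        "vpc_uuid": vpc_uuid,
        "tags": tags,
        "datasources": datasources,
    }
    if reranker_model is not None:
        # Enable reranking only when a reranker model name is supplied.
        body["reranking_config"] = {"enabled": True, "model": reranker_model}
    return {key: value for key, value in body.items() if value is not None}


body = build_body(
    name="My Knowledge Base",
    region="tor1",
    embedding_model_uuid="12345678-1234-1234-1234-123456789012",
    tags=["example string"],
    reranker_model="bge-reranker-v2-m3",
)
# With an authenticated pydo client:
# response = client.genai.create_knowledge_base(body=body)
```

Because `database_id` is left out here, the description of that parameter applies: a new database is created in the same region as the knowledge base.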
More Information
See /v2/gen-ai/knowledge_bases in the API reference for additional detail on responses, headers, parameters, and more.