DigitalOcean Knowledge Bases Limits

Validated on 23 Apr 2026 • Last edited on 27 Apr 2026

DigitalOcean Knowledge Bases let you store, index, and retrieve data from private files, websites, Spaces buckets, and other sources to power retrieval-augmented generation with your own content.

Limits

Data Source Limits

  • Estimates are available only for locally uploaded files and Spaces buckets. External sources cannot be estimated.

  • Estimates are approximate and may differ from the final extractable text due to file structure, parsing behavior, or non-text (binary) content.

  • For web crawling data sources, the crawler indexes up to 5,500 pages and skips inaccessible or disallowed links to prevent excessively large indexing jobs.

  • The size of Amazon S3 buckets is not shown in the Control Panel. To view the size of an S3 bucket, check it in AWS.

  • You cannot currently reindex a previously crawled seed URL. To reindex the content, delete the seed URL, and then add it again to start a new crawl.

  • Knowledge bases partially support indexing PowerPoint files (.ppt, .pptx). Text is extracted, but images and other visual content are not processed.

  • Indexing image files (such as .png, .jpeg, .tiff, and .bmp) is not currently supported.
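
The file-type limits above can be checked before you upload. The following is a minimal sketch, not part of any DigitalOcean tooling: the function names classify_file and group_files are hypothetical, and the extension sets contain only the extensions named in this list, which is not exhaustive for images.

```python
from pathlib import PurePath

# Extensions named in the limits above. The image list in the documentation
# is introduced with "such as", so extend this set for your own files.
PARTIAL_EXTENSIONS = {".ppt", ".pptx"}  # text is extracted, visuals are skipped
UNSUPPORTED_EXTENSIONS = {".png", ".jpeg", ".tiff", ".bmp"}  # images are not indexed


def classify_file(name: str) -> str:
    """Return 'unsupported', 'partial', or 'unchecked' for a file name."""
    suffix = PurePath(name).suffix.lower()
    if suffix in UNSUPPORTED_EXTENSIONS:
        return "unsupported"
    if suffix in PARTIAL_EXTENSIONS:
        return "partial"
    # Any other extension is outside this list; the sketch makes no claim about it.
    return "unchecked"


def group_files(names):
    """Group file names by the classification above."""
    groups = {"unsupported": [], "partial": [], "unchecked": []}
    for name in names:
        groups[classify_file(name)].append(name)
    return groups
```

For example, group_files(["logo.png", "deck.pptx", "notes.md"]) places logo.png under "unsupported" and deck.pptx under "partial", so you can review those files before adding the data source.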

Indexing Limits

  • Indexable size cannot be predicted for web crawls, GitHub repositories, external URLs, APIs, or any source that cannot be inspected beforehand.

  • You cannot reindex individual data sources within a knowledge base. To pick up changes in any data source, you must reindex all of the knowledge base's data sources.

  • Auto-indexing of your data sources currently runs at most once per day, up to seven days a week.

Chunking Limits

  • Chunk sizes (max_chunk_size, parent_chunk_size, child_chunk_size) must remain within the token limits of the selected embeddings model.

  • All chunking strategies enforce a minimum chunk size of approximately 100 tokens.

  • Chunking settings apply per data source, not globally.

  • Changing the chunking strategy after a data source is created is not supported. To change strategies, you must remove the data source, and then re-add it with the preferred strategy.
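
Because the chunking strategy cannot be changed after a data source is created, it can help to check the chunk sizes first. The following is a minimal sketch under stated assumptions: validate_chunk_sizes is a hypothetical helper, the settings are passed as a plain dictionary, and model_token_limit is a value you look up yourself for the selected embeddings model. The 100-token floor is the approximate minimum stated above.

```python
MIN_CHUNK_TOKENS = 100  # approximate minimum from the limits above

CHUNK_SIZE_KEYS = ("max_chunk_size", "parent_chunk_size", "child_chunk_size")


def validate_chunk_sizes(settings: dict, model_token_limit: int) -> list:
    """Return a list of problems; an empty list means the sizes pass both checks.

    model_token_limit is an assumption supplied by the caller, not a value
    this sketch looks up.
    """
    problems = []
    for key in CHUNK_SIZE_KEYS:
        if key not in settings:
            continue  # only the sizes used by the chosen strategy need to be set
        value = settings[key]
        if value < MIN_CHUNK_TOKENS:
            problems.append(
                f"{key}={value} is below the ~{MIN_CHUNK_TOKENS}-token minimum"
            )
        if value > model_token_limit:
            problems.append(
                f"{key}={value} exceeds the model limit of {model_token_limit} tokens"
            )
    return problems
```

With a hypothetical model limit of 8192 tokens, validate_chunk_sizes({"parent_chunk_size": 10000, "child_chunk_size": 50}, 8192) reports one problem for each key, while {"max_chunk_size": 512} passes.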

Activity Limits

  • Only the 15 most recent activities are listed in a knowledge base’s Activity tab. To keep a copy of past indexing job logs, download the CSV file after each indexing job.

  • You cannot access your knowledge base’s activity logs through the DigitalOcean API. Activity logs are only available in the Control Panel.

  • A knowledge base’s activity logs currently track only indexing jobs.
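
Since only the 15 most recent activities are listed and the logs are not available through the API, one way to keep a history is to archive each CSV file you download from the Control Panel. The following is a minimal sketch: archive_activity_csv is a hypothetical helper, the paths are whatever you choose locally, and it makes no assumption about the CSV's columns.

```python
import shutil
from datetime import datetime, timezone
from pathlib import Path


def archive_activity_csv(downloaded_csv, archive_dir):
    """Copy a downloaded activity CSV into archive_dir under a timestamped name."""
    source = Path(downloaded_csv)
    target_dir = Path(archive_dir)
    target_dir.mkdir(parents=True, exist_ok=True)

    # UTC timestamp with one-second resolution; two calls within the same
    # second for the same file name would overwrite each other.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    target = target_dir / f"{source.stem}-{stamp}{source.suffix}"

    shutil.copy2(source, target)  # copy2 also preserves file timestamps
    return target
```

Calling archive_activity_csv after each download leaves one timestamped copy per indexing job in the archive directory, independent of the 15-entry limit in the Activity tab.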
