DigitalOcean Gradient™ AI Website Crawler

Last verified 13 Jul 2026

DigitalOcean Knowledge Bases let you store, index, and retrieve data from private files, websites, Spaces buckets, and other sources to power retrieval-augmented generation with your own content.

When you specify a website URL as a data source for your knowledge base, DigitalOcean uses a custom user agent named DigitalOceanGradientAICrawler/1.0 to index the website content. To prevent excessively large indexing jobs, the crawler indexes up to 5,500 pages. It also skips inaccessible or disallowed links.
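As a rough pre-check against the 5,500-page cap, you could count the URLs listed in a site map before adding the site as a data source. The sketch below is illustrative only: it assumes a standard sitemaps.org `<urlset>` document, the helper names are hypothetical, and it does not reflect how the crawler itself counts pages.

```python
import xml.etree.ElementTree as ET

PAGE_LIMIT = 5500  # page cap stated in the text above
# Standard namespace used by sitemaps.org <urlset> documents.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def count_sitemap_urls(sitemap_xml: str) -> int:
    """Count the <loc> entries of a single <urlset> site map (hypothetical helper)."""
    root = ET.fromstring(sitemap_xml)
    return len(root.findall("sm:url/sm:loc", SITEMAP_NS))


def fits_within_limit(sitemap_xml: str, limit: int = PAGE_LIMIT) -> bool:
    """Return True if the site map lists no more URLs than the cap."""
    return count_sitemap_urls(sitemap_xml) <= limit


# Small inline sample so the sketch runs without network access.
SAMPLE_SITEMAP = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/</loc></url>"
    "<url><loc>https://example.com/docs/</loc></url>"
    "</urlset>"
)

if __name__ == "__main__":
    print(count_sitemap_urls(SAMPLE_SITEMAP), fits_within_limit(SAMPLE_SITEMAP))
```

A site map that lists more URLs than the cap suggests that some pages may not be indexed, so it can help to point the data source at a narrower section of the site.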

Depending on the crawling behavior you select, the crawler follows HTML links on the site, indexes text and certain image types, and ignores videos and navigation links. It respects the website’s robots.txt rules, including any Disallow directives and rules that use the wildcard *.
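To estimate which pages a robots.txt file would block for this crawler, you can evaluate the rules locally. The following is a minimal sketch using Python's standard `urllib.robotparser`. It assumes the crawler matches the robots.txt `User-agent` token `DigitalOceanGradientAICrawler`, and the sample rules, the `is_allowed` helper, and the example.com URLs are hypothetical. The crawler's own rule matching may differ from Python's parser.

```python
from urllib.robotparser import RobotFileParser

# User agent string named in the text above.
CRAWLER_USER_AGENT = "DigitalOceanGradientAICrawler/1.0"

# Hypothetical robots.txt: rules for every crawler via the wildcard *.
WILDCARD_RULES = """\
User-agent: *
Disallow: /private/
"""

# Hypothetical robots.txt: rules aimed at this crawler only (assumed token).
CRAWLER_RULES = """\
User-agent: DigitalOceanGradientAICrawler
Disallow: /drafts/
"""


def is_allowed(robots_txt: str, url: str) -> bool:
    """Return True if the given robots.txt text permits the crawler to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse in memory, no network request
    return parser.can_fetch(CRAWLER_USER_AGENT, url)


if __name__ == "__main__":
    for rules, url in [
        (WILDCARD_RULES, "https://example.com/private/report.html"),
        (WILDCARD_RULES, "https://example.com/docs/"),
        (CRAWLER_RULES, "https://example.com/drafts/post.html"),
    ]:
        print(url, is_allowed(rules, url))
```

Running the check against your own robots.txt before adding a seed URL can show why some pages might be missing from the index.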

For guidance on adding seed URLs and site map URLs to a DigitalOcean knowledge base, see Add a Data Source.
