Chunking Parameters

Validated on 20 Apr 2026 • Last edited on 8 May 2026

Chunking divides documents into smaller units for indexing and retrieval. This reference describes chunking parameters and how they interact with embeddings models.

For guidance on choosing a strategy, see our chunking best practices.

Parameters

The following parameters determine how each chunking strategy divides and structures your documents. All parameters must remain within the embeddings model’s token limits.

| Parameter | Applies To | Definition | Recommendation |
| --- | --- | --- | --- |
| max_chunk_size | Section-Based, Semantic, Fixed Length | Maximum number of tokens per chunk. Minimum is 100; the maximum depends on the embeddings model. For fixed-length chunking, this value is the exact split size. | Section-Based: 800 for stable, readable chunks. Fixed Length: 500 for predictable cost and performance. Semantic: 700 for balanced semantic precision and a manageable chunk count. |
| semantic_threshold | Semantic | Sensitivity to semantic shifts, from 0.0 to 1.0. Lower values allow more variation and produce fewer chunks; higher values enforce stricter similarity and may split sentences. | 0.5 balances chunk quantity with meaningful semantic grouping. |
| parent_chunk_size | Hierarchical | Token size of parent chunks, which provide broad context. Must be larger than child_chunk_size. | 1500 for wide context windows without excessive token cost. |
| child_chunk_size | Hierarchical | Token size of child chunks, which are used for retrieval. Must be smaller than parent_chunk_size. | 300 for focused retrieval. |
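To illustrate how these parameters interact, here is a minimal Python sketch. It is not the platform's implementation: it approximates tokens with list elements (a real pipeline would use the embeddings model's tokenizer), and all function names are illustrative.

```python
def fixed_length_chunks(tokens, max_chunk_size=500):
    """Split a token list into chunks of exactly max_chunk_size tokens;
    only the final chunk may be shorter."""
    return [tokens[i:i + max_chunk_size]
            for i in range(0, len(tokens), max_chunk_size)]

def hierarchical_chunks(tokens, parent_chunk_size=1500, child_chunk_size=300):
    """Build large parent chunks for context, then split each parent into
    smaller child chunks used for retrieval."""
    assert parent_chunk_size > child_chunk_size  # documented constraint
    parents = fixed_length_chunks(tokens, parent_chunk_size)
    return [(parent, fixed_length_chunks(parent, child_chunk_size))
            for parent in parents]

def semantic_boundary(similarity, semantic_threshold=0.5):
    """Decide whether to split between two sentences given the cosine
    similarity of their embeddings. A higher threshold demands stricter
    similarity, so it splits more often."""
    return similarity < semantic_threshold
```

For example, a 3200-token document with the recommended hierarchical sizes yields three parent chunks (1500, 1500, and 200 tokens), and each full parent splits into five 300-token children.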

Model-specific recommendations:

• GTE Large (v1.5): 400
• E5 Large (v2): 256
• BGE M3: 400
• All-MiniLM-L6-v2: 128
• Multi-QA-mpnet-base-dot-v1: 256
• Qwen3 Embedding 0.6B: 400
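A small sketch of how you might apply these recommendations in client code. The dictionary mirrors the values listed above; the model keys, function name, and defaulting behavior are illustrative assumptions, not a platform API.

```python
# Per-model recommended chunk sizes, copied from the list above.
# Key spellings are illustrative, not official model identifiers.
RECOMMENDED_CHUNK_SIZE = {
    "gte-large-v1.5": 400,
    "e5-large-v2": 256,
    "bge-m3": 400,
    "all-minilm-l6-v2": 128,
    "multi-qa-mpnet-base-dot-v1": 256,
    "qwen3-embedding-0.6b": 400,
}

MIN_CHUNK_SIZE = 100  # documented minimum for max_chunk_size

def effective_chunk_size(model, requested=None):
    """Return the model's recommended size by default; if a size is
    requested, enforce the documented minimum of 100 tokens."""
    if requested is None:
        return RECOMMENDED_CHUNK_SIZE[model]
    return max(MIN_CHUNK_SIZE, requested)
```

Note that the model-specific maximum (the embeddings model's token limit) is not checked here, since those limits vary by model and deployment.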
