Chunking Parameters

Validated on 20 Apr 2026 • Last edited on 8 May 2026

Chunking divides documents into smaller units for indexing and retrieval. This reference describes chunking parameters and how they interact with embeddings models.

For guidance on choosing a strategy, see our chunking best practices.

Parameters

The following parameters determine how each chunking strategy divides and structures your documents. All parameters must remain within the embeddings model’s token limits.

| Parameter | Applies To | Definition | Recommendation |
| --- | --- | --- | --- |
| max_chunk_size | Section-Based, Semantic, Fixed Length | Maximum number of tokens per chunk. Minimum is 100; the maximum depends on the embeddings model. For fixed-length chunking, this value is the exact split size. | Section-Based: 800 for stable, readable chunks. Fixed Length: 500 for predictable cost and performance. Semantic: 700 for balanced semantic precision and a manageable chunk count. |
| semantic_threshold | Semantic | Sensitivity to semantic shifts, from 0.0 to 1.0. Lower values allow more variation and produce fewer chunks; higher values enforce stricter similarity and may split sentences. | 0.5 balances chunk quantity with meaningful semantic grouping. |
| parent_chunk_size | Hierarchical | Token size of parent chunks, which provide broad context. Must be larger than child_chunk_size. | 1500 for wide context windows without excessive token cost. |
| child_chunk_size | Hierarchical | Token size of child chunks, which are used for retrieval. Must be smaller than parent_chunk_size. | 300 for focused retrieval. |
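To illustrate how these parameters interact, here is a minimal Python sketch. It is not the platform's implementation: it approximates tokens with list elements (a real pipeline would use the embeddings model's tokenizer), and all function names are illustrative.

```python
def fixed_length_chunks(tokens, max_chunk_size=500):
    """Split a token list into chunks of exactly max_chunk_size tokens;
    only the final chunk may be shorter."""
    return [tokens[i:i + max_chunk_size]
            for i in range(0, len(tokens), max_chunk_size)]

def hierarchical_chunks(tokens, parent_chunk_size=1500, child_chunk_size=300):
    """Build large parent chunks for context, then split each parent into
    smaller child chunks used for retrieval."""
    assert parent_chunk_size > child_chunk_size  # documented constraint
    parents = fixed_length_chunks(tokens, parent_chunk_size)
    return [(parent, fixed_length_chunks(parent, child_chunk_size))
            for parent in parents]

def semantic_boundary(similarity, semantic_threshold=0.5):
    """Decide whether to split between two sentences given the cosine
    similarity of their embeddings. A higher threshold demands stricter
    similarity, so it splits more often."""
    return similarity < semantic_threshold
```

For example, a 3200-token document with the recommended hierarchical sizes yields three parent chunks (1500, 1500, and 200 tokens), and each full parent splits into five 300-token children.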

Model-specific recommendations:

• GTE Large (v1.5): 400
• E5 Large (v2): 256
• BGE M3: 400
• All-MiniLM-L6-v2: 128
• Multi-QA-mpnet-base-dot-v1: 256
• Qwen3 Embedding 0.6B: 400
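A small sketch of how you might apply these recommendations in client code. The dictionary mirrors the values listed above; the model keys, function name, and defaulting behavior are illustrative assumptions, not a platform API.

```python
# Per-model recommended chunk sizes, copied from the list above.
# Key spellings are illustrative, not official model identifiers.
RECOMMENDED_CHUNK_SIZE = {
    "gte-large-v1.5": 400,
    "e5-large-v2": 256,
    "bge-m3": 400,
    "all-minilm-l6-v2": 128,
    "multi-qa-mpnet-base-dot-v1": 256,
    "qwen3-embedding-0.6b": 400,
}

MIN_CHUNK_SIZE = 100  # documented minimum for max_chunk_size

def effective_chunk_size(model, requested=None):
    """Return the model's recommended size by default; if a size is
    requested, enforce the documented minimum of 100 tokens."""
    if requested is None:
        return RECOMMENDED_CHUNK_SIZE[model]
    return max(MIN_CHUNK_SIZE, requested)
```

Note that the model-specific maximum (the embeddings model's token limit) is not checked here, since those limits vary by model and deployment.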
