Vector Search on OpenSearch

Validated on 27 Apr 2026 • Last edited on 27 Apr 2026

DigitalOcean Managed OpenSearch for vector search uses the same managed OpenSearch engine available under Managed Databases. It bundles the k-NN, ML Commons, and Neural Search plugins for vector similarity search, hybrid vector and keyword search, and remote embedding models.

This reference explains what happens underneath the OpenSearch vector how-tos: embeddings, distance metrics, approximate versus exact k-NN, HNSW parameters, engines, hybrid search, and where embeddings originate. Read it once and refer back when a parameter behaves unexpectedly.

What a Vector Embedding Is

An embedding is a list of floating-point numbers that represents content (text, an image, audio, or code) in a space where distance correlates with similarity. Two sentences that mean similar things produce embeddings that are close together; two sentences that mean different things produce embeddings that are far apart.

Common embedding model families:

  • Text: OpenAI text-embedding-3-small/large, Cohere Embed v3, Voyage, bge-large, GTE-large, e5-large, sentence-transformers.
  • Code: Voyage Code, Jina Code, OpenAI text-embedding-3-large.
  • Multimodal: CLIP and OpenCLIP for joint image plus text.

The model’s dimension is fixed (typically 384, 512, 768, 1024, 1536, or 3072). OpenSearch requires the knn_vector dimension to match the model exactly; changing models means re-indexing.
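
As a concrete sketch, here is an index created with the opensearch-py client whose knn_vector dimension matches a 1536-dimensional model such as text-embedding-3-small. The connection details, index name, and field name are placeholders:

```python
# Minimal sketch: create an index whose knn_vector dimension matches the model.
from opensearchpy import OpenSearch

client = OpenSearch(
    hosts=["https://your-cluster-hostname:25060"],  # placeholder connection details
    http_auth=("doadmin", "your-password"),
    use_ssl=True,
)

client.indices.create(
    index="docs",
    body={
        "settings": {"index": {"knn": True}},  # enable k-NN for this index
        "mappings": {
            "properties": {
                "embedding": {
                    "type": "knn_vector",
                    "dimension": 1536,  # must equal the model's output dimension
                }
            }
        },
    },
)
```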

Distance Metrics

OpenSearch supports four similarity functions. Use whichever the embedding model recommends.

  • Cosine (cosinesimil): The angle between two vectors, ignoring magnitude. Most popular for text embeddings because model outputs are not always L2-normalized.
  • Inner product (innerproduct): The dot product. Equivalent to cosine on L2-normalized vectors and faster to compute.
  • Euclidean (l2): Straight-line distance. Common for image embeddings (CLIP, OpenCLIP) and sentence-BERT variants trained with L2 loss.
  • Manhattan (l1): Sum of absolute differences. Rare.

Cosine and inner product give identical rankings on L2-normalized vectors. If your model outputs L2-normalized embeddings, prefer innerproduct: it skips the normalization step at query time and is measurably faster at scale.
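
A quick NumPy sketch of that equivalence, using random stand-in vectors rather than real embeddings:

```python
# Sketch: after L2 normalization, cosine similarity and inner product produce
# identical rankings, so innerproduct can skip normalization at query time.
import numpy as np

rng = np.random.default_rng(0)
docs = rng.normal(size=(5, 8))  # 5 toy "embeddings" of dimension 8
query = rng.normal(size=8)

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

normed = docs / np.linalg.norm(docs, axis=1, keepdims=True)
q_normed = query / np.linalg.norm(query)

rank_cosine = np.argsort([-cosine(d, query) for d in docs])
rank_dot = np.argsort(-(normed @ q_normed))

assert (rank_cosine == rank_dot).all()  # same ordering either way
```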

Exact Versus Approximate k-NN

Finding the k nearest neighbors exactly requires comparing the query vector against every vector in the index (O(n)). That works for thousands of vectors. For millions, it is too slow.

Approximate k-NN (ANN) trades a small amount of recall for dramatic speedups. Instead of comparing against every vector, the query traverses a data structure that routes toward the nearest region of vector space. Typical recall on well-tuned HNSW indexes is 95-99% at 100x-1000x the speed of exact search.

Use exact k-NN (via script_score with knn_score) when:

  • The candidate set is under about 10,000 documents (or a filter narrows it that far).
  • You are re-ranking ANN results and need perfect ordering on the top hits.
  • You are testing or benchmarking.

Otherwise, use ANN through the knn query.
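
For illustration, here is the same top-10 search expressed both ways. The embedding field name and the truncated query vector are placeholders:

```python
# Sketch: exact k-NN (script_score) versus approximate k-NN (knn query).
query_vector = [0.1, 0.2, 0.3]  # placeholder; use the model's full-dimension output

# Exact k-NN: scores every candidate document with the knn_score script.
exact = {
    "size": 10,
    "query": {
        "script_score": {
            "query": {"match_all": {}},  # or a filter that narrows the candidate set
            "script": {
                "source": "knn_score",
                "lang": "knn",
                "params": {
                    "field": "embedding",
                    "query_value": query_vector,
                    "space_type": "cosinesimil",
                },
            },
        }
    },
}

# Approximate k-NN: traverses the HNSW graph instead of scanning every vector.
approximate = {
    "size": 10,
    "query": {"knn": {"embedding": {"vector": query_vector, "k": 10}}},
}

# results = client.search(index="docs", body=exact)        # exact
# results = client.search(index="docs", body=approximate)  # ANN
```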

HNSW, the Algorithm Behind k-NN

HNSW stands for Hierarchical Navigable Small World graph. Every vector becomes a node in a multi-layer graph: higher layers are sparse long-range connections, lower layers are dense local connections. Queries start at the top layer, greedily walk toward the query vector, descend a layer, and repeat until they reach layer 0.

Three parameters matter (the sketch after this list shows where each one lives):

  • m: Edges per node. Larger m improves recall but increases memory cost linearly.
  • ef_construction: Build-time candidate pool size. Larger values produce a better graph but slow down indexing.
  • ef_search: Query-time candidate pool size. The only knob you can change without re-indexing; raise it first when tuning recall.
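
A sketch of where each knob lives, with illustrative values rather than tuned recommendations:

```python
# m and ef_construction are fixed at index creation, inside the knn_vector
# field's "method" block in the mapping:
method = {
    "name": "hnsw",
    "engine": "faiss",
    "space_type": "innerproduct",
    "parameters": {
        "m": 16,                 # edges per node; baked into the graph at build time
        "ef_construction": 128,  # build-time candidate pool; also baked in
    },
}
# ...used as: {"type": "knn_vector", "dimension": 1536, "method": method}

# ef_search is a dynamic index setting, so it can be raised without re-indexing:
# client.indices.put_settings(
#     index="docs",
#     body={"index": {"knn.algo_param.ef_search": 256}},
# )
```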

HNSW memory scales roughly as num_vectors * (dim * 4 + m * 8) bytes for FP32 vectors. The m * 8 term is the graph edge overhead (about 128 bytes per vector at the default m=16). Quantization (product or scalar, Faiss only) reduces the first term at the cost of some recall.
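
Plugging sample numbers into that formula:

```python
# Worked example: one million 1536-dimensional FP32 vectors at the default m=16.
num_vectors, dim, m = 1_000_000, 1536, 16
bytes_needed = num_vectors * (dim * 4 + m * 8)
print(f"{bytes_needed / 2**30:.2f} GiB")  # ~5.84 GiB before any quantization
```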

OpenSearch k-NN Engines Compared

  • Faiss (default in 2.19): Facebook’s vector library. High throughput at scale, supports product and scalar quantization, and (since k-NN 2.9) efficient filtering during traversal.
  • Lucene: Native OpenSearch HNSW. Tight Lucene segment integration and efficient filtering during traversal. Good for workloads up to around 10M vectors per shard.
  • NMSLIB (deprecated): Earlier HNSW implementation. Kept for backward compatibility only.

See OpenSearch methods and engines for the current feature matrix.

Hybrid Vector and Keyword Search

Pure vector search excels at semantic matches. Pure BM25 excels at exact matches such as names, identifiers, and rare tokens. Hybrid search combines them, running both queries and blending the scores. See Run Hybrid Searches for the mechanics.

The key insight is score normalization. BM25 returns unbounded scores. k-NN returns [0, 1]. You cannot simply add them. OpenSearch’s normalization processor rescales each sub-query’s scores into a common range (min-max or L2) and combines them with a weighted mean before ranking.
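
As an illustrative sketch, a min-max pipeline and a hybrid query look like this. The pipeline name, field names, weights, and query text are placeholders, and "client" is the opensearch-py client from the earlier index-creation sketch:

```python
# Register a search pipeline with the normalization processor, then run a
# hybrid query through it.
pipeline = {
    "phase_results_processors": [
        {
            "normalization-processor": {
                "normalization": {"technique": "min_max"},
                "combination": {
                    "technique": "arithmetic_mean",
                    # weight the BM25 sub-query 0.3 and the vector sub-query 0.7
                    "parameters": {"weights": [0.3, 0.7]},
                },
            }
        }
    ]
}
client.transport.perform_request(
    "PUT", "/_search/pipeline/hybrid-pipeline", body=pipeline
)

query_vector = [0.1, 0.2, 0.3]  # placeholder; use the model's full-dimension output
hybrid_query = {
    "query": {
        "hybrid": {
            "queries": [
                {"match": {"text": {"query": "managed database backups"}}},  # BM25
                {"knn": {"embedding": {"vector": query_vector, "k": 10}}},   # vector
            ]
        }
    }
}
results = client.search(
    index="docs", body=hybrid_query, params={"search_pipeline": "hybrid-pipeline"}
)
```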

Where Embeddings Come From

OpenSearch stores and searches vectors; by default it does not produce them. You have two architectural choices:

  • Client-side embeddings (the default): Your application calls the embedding provider, then sends float arrays to OpenSearch. Explicit, portable across databases, and easy to debug, but every service that writes to OpenSearch must know how to embed.
  • Server-side via ML Commons: OpenSearch calls the provider through a registered connector, so your application only sends raw text. Adds one moving part (the connector) and reduces visibility into provider calls.

Both are fully supported. See Register a Remote Embedding Model for the server-side path.
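
A minimal sketch of the client-side path, assuming OpenAI's Python SDK and the text-embedding-3-small model as one example provider; the index and field names are illustrative, and "client" is the opensearch-py client from the earlier sketch:

```python
# Embed with the provider's SDK, then index the raw float array.
from openai import OpenAI

embedder = OpenAI()  # reads OPENAI_API_KEY from the environment

text = "Managed OpenSearch bundles the k-NN plugin."
response = embedder.embeddings.create(model="text-embedding-3-small", input=text)
vector = response.data[0].embedding  # 1536 floats for this model

client.index(
    index="docs",
    body={"text": text, "embedding": vector},
    refresh=True,  # make the document searchable immediately; fine for demos
)
```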

When to Use Managed OpenSearch for Vectors

DigitalOcean Managed OpenSearch is a good vector-database choice when:

  • You need hybrid search (vector plus keyword) as a first-class feature.
  • You already run OpenSearch for logs, analytics, or full-text search and want to add vectors without adding another system.
  • You need fine-grained access control and multi-tenancy.
  • You want a REST-first surface.

See Choosing an Engine for the full decision framework.
