Vector Search Concepts

Validated on 27 Apr 2026 • Last edited on 27 Apr 2026

DigitalOcean Vector Databases are managed clusters purpose-built for vector similarity search, supporting Weaviate, OpenSearch, and PostgreSQL (pgvector) for retrieval-augmented generation (RAG), semantic search, and other AI workloads.

This page explains the concepts that apply to all three DigitalOcean Vector Database engines. Read it once for background, then refer back to it when a parameter or tuning option behaves unexpectedly.

What a Vector Embedding Is

An embedding is a list of floating-point numbers that represents a piece of content (text, an image, audio, or code) in a space where distance between vectors correlates with similarity between the original items. Two sentences that mean similar things produce embeddings that are close together; two sentences that mean different things produce embeddings that are far apart.

Embeddings come from models trained on large corpora. Common families include:

  • Text: OpenAI text-embedding-3-small and text-embedding-3-large, Cohere Embed v3, Voyage, bge-large, gte-large, e5-large, and sentence-transformers.
  • Code: Voyage Code, Jina Code, and text-embedding-3-large, which works well for code despite the name.
  • Multimodal: CLIP and OpenCLIP for joint image and text embeddings.

A model’s output dimension is fixed. Common values are 384, 512, 768, 1024, 1536, and 3072. Every engine requires the dimension of a vector field or column to match the model exactly. Changing models means re-computing and re-indexing every embedding.
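
For example, you can confirm a model’s output dimension before defining a schema. A minimal sketch using the OpenAI Python SDK (assumes the openai package is installed and OPENAI_API_KEY is set; any embedding provider works the same way):

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    resp = client.embeddings.create(
        model="text-embedding-3-small",
        input="How do I resize a Droplet?",
    )
    vector = resp.data[0].embedding
    print(len(vector))  # 1536 -- the dimension your vector field or column must declare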

Distance Metrics

Similarity between two vectors is computed using a distance metric. The metric is determined by the embedding model, not by you. Use whichever metric the model’s documentation recommends.

  • Cosine: The angle between two vectors, ignoring magnitude. The most common choice for text embeddings.
  • Inner product (dot product): Equivalent to cosine when both vectors are L2-normalized, and faster to compute. Some recommendation-system models are trained for it.
  • Euclidean (L2): Straight-line distance in vector space. Common for image embeddings and models trained with L2 loss.
  • Manhattan (L1): Sum of absolute differences. Rare; use only when the model specifies it.

Cosine and inner product produce identical rankings on L2-normalized vectors. If your model outputs L2-normalized embeddings, inner product skips the normalization step at query time and is measurably faster at scale.
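
A short NumPy check of that claim: cosine similarity on the raw vectors and a plain dot product on L2-normalized copies produce the same scores, and therefore the same ranking.

    import numpy as np

    rng = np.random.default_rng(0)
    docs = rng.normal(size=(5, 768))  # five stand-in document embeddings
    query = rng.normal(size=768)

    # Cosine similarity on the raw vectors
    cosine = (docs @ query) / (np.linalg.norm(docs, axis=1) * np.linalg.norm(query))

    # Dot product after L2-normalizing once, up front
    docs_n = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    dot = docs_n @ (query / np.linalg.norm(query))

    print(np.allclose(cosine, dot))                               # True: same scores
    print(np.array_equal(np.argsort(-cosine), np.argsort(-dot)))  # True: same ranking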

Exact Versus Approximate Nearest Neighbor

Finding the k nearest neighbors exactly requires comparing the query vector against every vector in the index, which is O(n). That is fine for thousands of vectors but quickly becomes slow as the dataset grows.

Approximate nearest-neighbor (ANN) indexes trade a small amount of recall for a dramatic speed-up. Instead of comparing against every vector, the query traverses a data structure that routes toward the nearest region of vector space, comparing against only a small sample. Typical recall on well-tuned ANN indexes is 95% to 99% at 100 to 1,000 times the speed of exact search.

Exact search is still the right choice when:

  • Your dataset contains fewer than about 10,000 vectors.
  • A filter narrows the candidate set to a few thousand rows.
  • You are re-ranking ANN results and need perfect ordering on the top hits.
  • You are benchmarking or validating recall.

For anything larger, use an ANN index.
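
In the exact-search range, a brute-force scan is a few lines of code and returns perfect results. A sketch with NumPy, assuming cosine similarity:

    import numpy as np

    def exact_knn(query, vectors, k=10):
        """Exact k-nearest-neighbor search: compares against every vector, O(n)."""
        sims = (vectors @ query) / (
            np.linalg.norm(vectors, axis=1) * np.linalg.norm(query)
        )
        top = np.argsort(-sims)[:k]  # indices of the k most similar vectors
        return top, sims[top]

    vectors = np.random.default_rng(1).normal(size=(10_000, 768))
    indices, scores = exact_knn(vectors[0], vectors, k=5)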

HNSW

HNSW (Hierarchical Navigable Small World graph) is the most common ANN index across all three Vector Database engines. It builds a multi-layer graph of vectors: higher layers are sparse long-range connections, lower layers are dense local connections. Queries start at the top layer, greedily walk toward the query vector, descend a layer, and repeat. The behavior is analogous to a skip list, but for metric spaces. See the HNSW paper by Malkov and Yashunin for details.

Three parameters control HNSW behavior:

  • m: The number of edges per node. Larger values increase recall and memory cost roughly linearly. Typical range: 8 to 64. Default: 16.
  • ef_construction: Candidate pool size during graph build. Larger values build a higher-quality graph at the cost of build time. Typical range: 100 to 512.
  • ef_search (sometimes efSearch): Candidate pool size at query time. This is the only parameter you can change without rebuilding the index. Raise it first when tuning recall.
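
All three knobs appear in every HNSW implementation under similar names. A sketch using hnswlib, a standalone open-source HNSW library (used here only to show where each parameter plugs in; it is not one of the managed engines):

    import hnswlib
    import numpy as np

    dim = 768
    vectors = np.random.default_rng(2).normal(size=(50_000, dim)).astype(np.float32)

    index = hnswlib.Index(space="cosine", dim=dim)
    index.init_index(max_elements=50_000, M=16, ef_construction=200)  # build-time knobs
    index.add_items(vectors)

    index.set_ef(100)  # ef_search: the only knob you can change without a rebuild
    labels, distances = index.knn_query(vectors[:1], k=10)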

HNSW memory for FP32 vectors scales roughly as:

num_vectors * (dimensions * 4 + m * 8) bytes

The m * 8 term is the graph edge overhead, adding about 128 bytes per vector at the default m=16. Quantization (product or scalar quantization) reduces the first term at the cost of some recall.
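
For example, one million 1536-dimensional FP32 vectors at the default m=16 need roughly 6.3 GB before quantization:

    num_vectors, dims, m = 1_000_000, 1536, 16
    total_bytes = num_vectors * (dims * 4 + m * 8)
    print(total_bytes)  # 6_272_000_000 bytes, about 6.3 GB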

Hybrid Search

Pure vector search excels at semantic matches: paraphrases, conceptual similarity, and cross-language matching. Pure keyword search (BM25) excels at exact matches: product codes, names, identifiers, and rare tokens. Hybrid search runs both queries and combines the scores, giving better relevance than either alone.

All three DigitalOcean Vector Database engines support hybrid search, though the mechanics differ:

  • OpenSearch: A first-class hybrid compound query plus a search pipeline with a normalization processor. See Run Hybrid Vector and Keyword Searches.
  • Weaviate: A built-in hybrid operator on any collection, with alpha controlling the BM25 to vector balance.
  • PostgreSQL: Application-level fusion using reciprocal rank fusion (RRF) over a tsvector full-text search and a pgvector similarity search. See Advanced Vector Workloads.

The key insight in all cases is score normalization: BM25 returns unbounded scores while vector indexes return bounded similarity scores, so the two cannot be added directly. Hybrid search implementations rescale each sub-query’s scores into a common range before combining them.
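
Reciprocal rank fusion sidesteps normalization entirely by combining rank positions instead of raw scores. A minimal sketch, assuming each sub-query returns document IDs in best-first order:

    def rrf(keyword_ids, vector_ids, k=60):
        """Reciprocal rank fusion: score each document by rank, not raw score."""
        scores = {}
        for ranking in (keyword_ids, vector_ids):
            for rank, doc_id in enumerate(ranking):
                scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
        return sorted(scores, key=scores.get, reverse=True)

    print(rrf(["a", "b", "c"], ["c", "a", "d"]))  # ['a', 'c', 'b', 'd']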

Where Embeddings Come From

Vector Databases store and search vectors; they do not (by default) produce them. Your application has two architectural choices:

  • Client-side embeddings: Your application calls the embedding provider, then sends float arrays to the database. Explicit, portable across engines, and easy to debug, but every service that writes to the database must know how to embed.
  • Server-side embeddings: The database calls the provider itself (Weaviate vectorization modules, OpenSearch ML Commons, or pg-side HTTP calls through an extension). Your application sends raw text. Simpler application code, but more moving parts in the database.

Both are supported. For OpenSearch server-side embeddings, see Register a Remote Embedding Model with ML Commons. Weaviate ships first-party vectorization modules for OpenAI, Cohere, Hugging Face, and other providers. Managed PostgreSQL does not ship HTTP-calling extensions, so embeddings are generated in application code.
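
A client-side pipeline looks the same regardless of engine. A sketch using the OpenAI Python SDK; the final upsert call is hypothetical and stands in for whichever write method your engine’s client exposes:

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def embed_batch(texts):
        """One provider call per batch; returns one vector per input text."""
        resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
        return [item.embedding for item in resp.data]

    docs = ["Resize a Droplet", "Attach a block storage volume"]
    vectors = embed_batch(docs)

    # Hypothetical, engine-agnostic write; replace with your client's insert call:
    # db.upsert(ids=["doc-1", "doc-2"], vectors=vectors, payloads=docs)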
