Vector Search Quickstart for OpenSearch
Validated on 27 Apr 2026 • Last edited on 27 Apr 2026
DigitalOcean Managed OpenSearch for vector search uses the same managed OpenSearch engine available under Managed Databases. It bundles the k-NN, ML Commons, and Neural Search plugins for vector similarity search, hybrid vector and keyword search, and remote embedding models.
This quickstart walks you through creating a DigitalOcean Managed OpenSearch cluster, configuring it as a vector store, indexing a handful of sample embeddings, and running a k-Nearest Neighbor (k-NN) similarity query. It takes about 15 minutes.
These plugins come preinstalled on DigitalOcean managed OpenSearch 2.19 clusters, so you can create a k-NN index and run similarity queries as soon as the cluster is online.
Prerequisites
To complete this quickstart, you need:
- A DigitalOcean account. You provision the cluster from the Control Panel.
- A terminal with `curl` installed, or Python 3.9 or later with `opensearch-py` (>=2.4.0). Both approaches are shown below.
If you already have a managed OpenSearch cluster running version 2.14 or later, you can skip Step 1 and start at Step 2 by pointing the requests in this guide at your existing cluster.
Step 1: Create an OpenSearch Vector Database
- In the Control Panel, click Create, then Vector Database.
- Select OpenSearch 2.19 as the engine.
- For this quickstart, the default Basic Shared CPU plan (1 vCPU, 2 GB RAM, 40 GiB disk) is enough.
- Choose the region closest to your application and name the cluster.
- Click Create Vector Database Cluster.
For production vector workloads, size for RAM: OpenSearch holds the HNSW graph in memory. See Create a Cluster for full sizing guidance.
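For a back-of-the-envelope estimate, the OpenSearch k-NN documentation approximates HNSW graph memory for its native engines at roughly 1.1 × (4 × dimension + 8 × m) bytes per vector; the exact figure varies by engine and encoding, so treat the sketch below as a rough guide rather than a sizing guarantee:

```python
def hnsw_memory_bytes(num_vectors: int, dimension: int, m: int = 16) -> float:
    # Approximation from the OpenSearch k-NN docs for HNSW graphs:
    # about 1.1 * (4 * dimension + 8 * m) bytes per vector.
    return 1.1 * (4 * dimension + 8 * m) * num_vectors

# One million 768-dimensional vectors with m=16 need roughly 3.5 GB of RAM.
print(f"{hnsw_memory_bytes(1_000_000, 768) / 1e9:.2f} GB")
```

If the estimate approaches your plan's RAM, move up a plan size before indexing rather than after.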
Step 2: Secure the Cluster and Collect Connection Details
While the cluster provisions, open its Overview tab:
- Under Trusted Sources, add your workstation IP or a DigitalOcean resource. Only listed sources can connect.
- Copy the host, port, and `doadmin` password.
- Export them as environment variables:

```bash
export OPENSEARCH_HOST="<your-cluster-host>"
export OPENSEARCH_PORT="25060"
export OPENSEARCH_USER="doadmin"
export OPENSEARCH_PASSWORD="<your-doadmin-password>"
export OS="https://$OPENSEARCH_USER:$OPENSEARCH_PASSWORD@$OPENSEARCH_HOST:$OPENSEARCH_PORT"
```

- Verify connectivity:

```bash
curl -sS "$OS/" | jq '.version.number'
```

You should see `"2.19.x"`.
Step 3: Create a k-NN Index
Create an index that stores 4-dimensional vectors. In production you typically use 384, 768, 1024, or 1536 dimensions depending on your embedding model.
```bash
curl -X PUT "$OS/articles" -H 'Content-Type: application/json' -d '{
  "settings": {
    "index": {
      "knn": true,
      "knn.algo_param.ef_search": 100
    }
  },
  "mappings": {
    "properties": {
      "title": { "type": "text" },
      "body": { "type": "text" },
      "embedding": {
        "type": "knn_vector",
        "dimension": 4,
        "method": {
          "name": "hnsw",
          "engine": "lucene",
          "space_type": "cosinesimil",
          "parameters": { "m": 16, "ef_construction": 128 }
        }
      }
    }
  }
}'
```

- `"knn": true` enables the k-NN plugin for this index.
- `"type": "knn_vector"` declares the vector field; `dimension` must match your embedding model.
- `"engine": "lucene"` uses OpenSearch's native HNSW implementation, which supports efficient filtered search. Use Faiss for very large indexes (greater than 10 million vectors).
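If you would rather create the index from Python, the same request body can be built as a plain dict and passed to `opensearch-py`'s `client.indices.create`. This sketch mirrors the curl request above; the helper name `knn_index_body` is invented here for illustration:

```python
def knn_index_body(dimension: int, m: int = 16, ef_construction: int = 128) -> dict:
    # Hypothetical helper mirroring the curl request body: a k-NN-enabled
    # index with one Lucene HNSW vector field plus two text fields.
    return {
        "settings": {"index": {"knn": True, "knn.algo_param.ef_search": 100}},
        "mappings": {
            "properties": {
                "title": {"type": "text"},
                "body": {"type": "text"},
                "embedding": {
                    "type": "knn_vector",
                    "dimension": dimension,
                    "method": {
                        "name": "hnsw",
                        "engine": "lucene",
                        "space_type": "cosinesimil",
                        "parameters": {"m": m, "ef_construction": ef_construction},
                    },
                },
            }
        },
    }

# With a connected client (see the optional Python section below):
# client.indices.create(index="articles", body=knn_index_body(4))
```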
Step 4: Index Sample Vectors
Load four tiny documents with pre-computed embeddings.
```bash
curl -X POST "$OS/articles/_bulk" -H 'Content-Type: application/x-ndjson' --data-binary '
{ "index": { "_id": "1" } }
{ "title": "Coffee brewing basics", "body": "Pour-over, espresso, and cold brew compared.", "embedding": [0.91, 0.10, 0.05, 0.02] }
{ "index": { "_id": "2" } }
{ "title": "Best espresso machines", "body": "A buyer guide for home espresso setups.", "embedding": [0.88, 0.15, 0.07, 0.04] }
{ "index": { "_id": "3" } }
{ "title": "Intro to deep learning", "body": "Neural networks, backpropagation, activations.", "embedding": [0.05, 0.92, 0.18, 0.10] }
{ "index": { "_id": "4" } }
{ "title": "Hiking trails near Denver", "body": "Five scenic day hikes within an hour of the city.", "embedding": [0.12, 0.08, 0.90, 0.22] }
'
```

OpenSearch responds with `"errors": false` when the bulk request succeeds.
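When your embeddings come from code rather than a hand-written file, you can assemble the same NDJSON payload in Python. `to_bulk_ndjson` below is a hypothetical helper; the string it returns is what the `_bulk` endpoint expects, and you can POST it with any HTTP client:

```python
import json

def to_bulk_ndjson(docs: list[tuple[str, dict]]) -> str:
    # The _bulk endpoint expects alternating lines: an action-metadata
    # line, then the document source, ending with a trailing newline.
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_id": doc_id}}))
        lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"

payload = to_bulk_ndjson([
    ("1", {"title": "Coffee brewing basics", "embedding": [0.91, 0.10, 0.05, 0.02]}),
])
# POST payload to "$OS/articles/_bulk" with Content-Type: application/x-ndjson.
```

In real ingestion pipelines, `opensearch-py`'s bulk helpers can do this interleaving for you.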
Step 5: Run Your First k-NN Query
Find the two documents closest to a query vector that looks like a coffee-related embedding:
```bash
curl -X POST "$OS/articles/_search" -H 'Content-Type: application/json' -d '{
  "size": 2,
  "query": {
    "knn": {
      "embedding": {
        "vector": [0.90, 0.12, 0.06, 0.03],
        "k": 2
      }
    }
  }
}'
```

You should see the two coffee articles ranked highest, with `_score` values close to 1.0. OpenSearch normalizes cosine similarity so that higher scores are better.
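Because the index uses the Lucene engine, you can also restrict the nearest-neighbor search with a metadata filter. The sketch below builds such a query body; the `filtered_knn_query` helper and the `match` filter on `body` are illustrative choices, not part of this quickstart's data model:

```python
def filtered_knn_query(vector: list[float], k: int, filter_clause: dict) -> dict:
    # A k-NN search limited to documents matching the filter; with the
    # Lucene engine the filter is applied during graph traversal rather
    # than as a post-filter on the top-k results.
    return {
        "size": k,
        "query": {
            "knn": {
                "embedding": {
                    "vector": vector,
                    "k": k,
                    "filter": filter_clause,
                }
            }
        },
    }

# Example: only consider articles whose body mentions espresso.
query = filtered_knn_query([0.90, 0.12, 0.06, 0.03], 2, {"match": {"body": "espresso"}})
```

Filtered k-NN is covered in more depth in Index and Query Vectors, linked under Next Steps.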
Optional: The Same Query in Python
```python
import os

from opensearchpy import OpenSearch

client = OpenSearch(
    hosts=[{
        "host": os.environ["OPENSEARCH_HOST"],
        "port": int(os.environ.get("OPENSEARCH_PORT", 25060)),
    }],
    http_auth=(os.environ["OPENSEARCH_USER"], os.environ["OPENSEARCH_PASSWORD"]),
    use_ssl=True,
    verify_certs=True,
)

resp = client.search(
    index="articles",
    body={
        "size": 2,
        "query": {
            "knn": {
                "embedding": {
                    "vector": [0.90, 0.12, 0.06, 0.03],
                    "k": 2,
                }
            }
        },
    },
)

for hit in resp["hits"]["hits"]:
    print(hit["_score"], hit["_source"]["title"])
```

Next Steps
- Create a k-NN Index: tune engines, space types, and HNSW parameters for your workload.
- Index and Query Vectors: bulk ingestion, filtered k-NN, and exact search.
- Run Hybrid Searches: combine BM25 with vector similarity.
- Register a Remote Embedding Model: let OpenSearch call your embedding service directly.
For upstream OpenSearch vector documentation, see the official OpenSearch vector-search docs.