Vector Search Quickstart for OpenSearch

Validated on 27 Apr 2026 • Last edited on 27 Apr 2026

DigitalOcean Managed OpenSearch for vector search uses the same managed OpenSearch engine available under Managed Databases. It bundles the k-NN, ML Commons, and Neural Search plugins for vector similarity search, hybrid vector and keyword search, and remote embedding models.

This quickstart walks you through creating a DigitalOcean Managed OpenSearch cluster, configuring it as a vector store, indexing a handful of sample embeddings, and running a k-Nearest Neighbor (k-NN) similarity query. It takes about 15 minutes.

These plugins come preinstalled on DigitalOcean managed OpenSearch clusters running 2.19, so you can create a k-NN index and run similarity queries as soon as the cluster is online.

Prerequisites

To complete this quickstart, you need:

  • A DigitalOcean account. You provision the cluster from the Control Panel.
  • A terminal with curl installed. Optionally, Python 3.9 or later with opensearch-py (>= 2.4.0) to run the Python version of the final query.

If you already have a managed OpenSearch cluster running version 2.14 or later, you can skip Step 1 and start at Step 2 by pointing the requests in this guide at your existing cluster.

Step 1: Create an OpenSearch Vector Database

  1. In the Control Panel, click Create, then Vector Database.
  2. Select OpenSearch 2.19 as the engine.
  3. For this quickstart, the default Basic Shared CPU plan (1 vCPU, 2 GB RAM, 40 GiB disk) is enough.
  4. Choose the region closest to your application and name the cluster.
  5. Click Create Vector Database Cluster.

For production vector workloads, size for RAM: OpenSearch holds the HNSW graph in memory. See Create a Cluster for full sizing guidance.
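As a rough rule of thumb from the k-NN plugin's sizing guidance, an HNSW index needs about 1.1 × (4 × dimension + 8 × m) bytes of native memory per vector. A minimal sketch (the function name is illustrative):

```python
def estimate_hnsw_memory_bytes(num_vectors: int, dimension: int, m: int = 16) -> float:
    """Rough native-memory estimate for an HNSW graph, using the k-NN
    plugin sizing rule of thumb: 1.1 * (4 * dimension + 8 * m) bytes per vector."""
    return 1.1 * (4 * dimension + 8 * m) * num_vectors

# One million 768-dimensional vectors with m=16 need roughly 3.5 GB of RAM,
# before accounting for replicas or the rest of the JVM and OS footprint.
print(estimate_hnsw_memory_bytes(1_000_000, 768) / 1e9)
```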

Step 2: Secure the Cluster and Collect Connection Details

While the cluster provisions, open its Overview tab:

  1. Under Trusted Sources, add your workstation IP or a DigitalOcean resource. Only listed sources can connect.

  2. Copy the host, port, and doadmin password.

  3. Export them as environment variables:

export OPENSEARCH_HOST="<your-cluster-host>"
export OPENSEARCH_PORT="25060"
export OPENSEARCH_USER="doadmin"
export OPENSEARCH_PASSWORD="<your-doadmin-password>"
export OS="https://$OPENSEARCH_USER:$OPENSEARCH_PASSWORD@$OPENSEARCH_HOST:$OPENSEARCH_PORT"

  4. Verify connectivity:
curl -sS "$OS/" | jq '.version.number'

You should see "2.19.x".

Step 3: Create a k-NN Index

Create an index that stores 4-dimensional vectors. In production you typically use 384, 768, 1024, or 1536 dimensions depending on your embedding model.

curl -X PUT "$OS/articles" -H 'Content-Type: application/json' -d '{
  "settings": {
    "index": {
      "knn": true,
      "knn.algo_param.ef_search": 100
    }
  },
  "mappings": {
    "properties": {
      "title": { "type": "text" },
      "body":  { "type": "text" },
      "embedding": {
        "type": "knn_vector",
        "dimension": 4,
        "method": {
          "name": "hnsw",
          "engine": "lucene",
          "space_type": "cosinesimil",
          "parameters": { "m": 16, "ef_construction": 128 }
        }
      }
    }
  }
}'
  • "knn": true enables the k-NN plugin for this index.
  • "type": "knn_vector" declares the vector field; dimension must match your embedding model.
  • "engine": "lucene" selects the Lucene engine’s HNSW implementation, which supports efficient filtered search. Use the Faiss engine for very large indexes (greater than 10 million vectors).
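If you prefer Python, the same settings and mappings can be built as a dict and passed to an opensearch-py client such as the one in the optional section at the end. The helper name below is illustrative:

```python
# Same index body as the curl request above, expressed as a Python dict.
ARTICLES_INDEX_BODY = {
    "settings": {
        "index": {"knn": True, "knn.algo_param.ef_search": 100}
    },
    "mappings": {
        "properties": {
            "title": {"type": "text"},
            "body": {"type": "text"},
            "embedding": {
                "type": "knn_vector",
                "dimension": 4,
                "method": {
                    "name": "hnsw",
                    "engine": "lucene",
                    "space_type": "cosinesimil",
                    "parameters": {"m": 16, "ef_construction": 128},
                },
            },
        }
    },
}

def create_articles_index(client, name: str = "articles"):
    """Create the index using an opensearchpy.OpenSearch client instance."""
    return client.indices.create(index=name, body=ARTICLES_INDEX_BODY)
```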

Step 4: Index Sample Vectors

Load four tiny documents with pre-computed embeddings.

curl -X POST "$OS/articles/_bulk" -H 'Content-Type: application/x-ndjson' --data-binary '
{ "index": { "_id": "1" } }
{ "title": "Coffee brewing basics",    "body": "Pour-over, espresso, and cold brew compared.",      "embedding": [0.91, 0.10, 0.05, 0.02] }
{ "index": { "_id": "2" } }
{ "title": "Best espresso machines",   "body": "A buyer guide for home espresso setups.",          "embedding": [0.88, 0.15, 0.07, 0.04] }
{ "index": { "_id": "3" } }
{ "title": "Intro to deep learning",   "body": "Neural networks, backpropagation, activations.",   "embedding": [0.05, 0.92, 0.18, 0.10] }
{ "index": { "_id": "4" } }
{ "title": "Hiking trails near Denver","body": "Five scenic day hikes within an hour of the city.","embedding": [0.12, 0.08, 0.90, 0.22] }
'

OpenSearch responds with "errors": false when the bulk request succeeds.
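The same bulk payload can be assembled in Python. One way (names here are illustrative) is to serialize alternating action and document lines in NDJSON form, ending with the trailing newline the _bulk API requires:

```python
import json

DOCS = [
    ("1", "Coffee brewing basics", "Pour-over, espresso, and cold brew compared.", [0.91, 0.10, 0.05, 0.02]),
    ("2", "Best espresso machines", "A buyer guide for home espresso setups.", [0.88, 0.15, 0.07, 0.04]),
    ("3", "Intro to deep learning", "Neural networks, backpropagation, activations.", [0.05, 0.92, 0.18, 0.10]),
    ("4", "Hiking trails near Denver", "Five scenic day hikes within an hour of the city.", [0.12, 0.08, 0.90, 0.22]),
]

def build_bulk_payload(docs):
    """Serialize alternating action/document lines as NDJSON.
    The _bulk API requires a trailing newline after the last line."""
    lines = []
    for _id, title, body, embedding in docs:
        lines.append(json.dumps({"index": {"_id": _id}}))
        lines.append(json.dumps({"title": title, "body": body, "embedding": embedding}))
    return "\n".join(lines) + "\n"

payload = build_bulk_payload(DOCS)
# Send with an opensearch-py client: client.bulk(body=payload, index="articles")
```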

Step 5: Run Your First k-NN Query

Find the two documents closest to a query vector that looks like a coffee-related embedding:

curl -X POST "$OS/articles/_search" -H 'Content-Type: application/json' -d '{
  "size": 2,
  "query": {
    "knn": {
      "embedding": {
        "vector": [0.90, 0.12, 0.06, 0.03],
        "k": 2
      }
    }
  }
}'

You should see the two coffee articles ranked highest, with _score values close to 1.0. With the Lucene engine and the cosinesimil space type, OpenSearch maps cosine similarity into the range 0 to 1, so higher scores mean closer vectors.
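You can sanity-check the ranking offline: computing cosine similarity between the query vector and each stored embedding locally reproduces the order OpenSearch returns. This is a standalone check, not an API call:

```python
import math

def cosine(a, b):
    """Plain cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

query = [0.90, 0.12, 0.06, 0.03]
embeddings = {
    "1": [0.91, 0.10, 0.05, 0.02],  # Coffee brewing basics
    "2": [0.88, 0.15, 0.07, 0.04],  # Best espresso machines
    "3": [0.05, 0.92, 0.18, 0.10],  # Intro to deep learning
    "4": [0.12, 0.08, 0.90, 0.22],  # Hiking trails near Denver
}

# Sort document IDs by similarity to the query, most similar first.
ranked = sorted(embeddings, key=lambda k: cosine(query, embeddings[k]), reverse=True)
print(ranked[:2])  # the two coffee documents come out on top
```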

Optional: The Same Query in Python

import os
from opensearchpy import OpenSearch

client = OpenSearch(
    hosts=[{
        "host": os.environ["OPENSEARCH_HOST"],
        "port": int(os.environ.get("OPENSEARCH_PORT", 25060)),
    }],
    http_auth=(os.environ["OPENSEARCH_USER"], os.environ["OPENSEARCH_PASSWORD"]),
    use_ssl=True,        # managed clusters accept TLS connections only
    verify_certs=True,
)

resp = client.search(
    index="articles",
    body={
        "size": 2,
        "query": {
            "knn": {
                "embedding": {
                    "vector": [0.90, 0.12, 0.06, 0.03],
                    "k": 2,
                }
            }
        },
    },
)

for hit in resp["hits"]["hits"]:
    print(hit["_score"], hit["_source"]["title"])

Next Steps

For upstream OpenSearch vector documentation, see the official OpenSearch vector-search docs.

Warning
DigitalOcean Vector Databases are billed by the hour for as long as they exist. If you are experimenting, destroy the cluster from the Settings tab when you are finished. Destroying a cluster deletes all indexes and vectors irreversibly.
