Register a Remote Embedding Model with ML Commons

Validated on 27 Apr 2026 • Last edited on 27 Apr 2026

Vector search on DigitalOcean Managed OpenSearch uses the same managed OpenSearch engine available under Managed Databases. The service bundles the k-NN, ML Commons, and Neural Search plugins, which together provide vector similarity search, hybrid vector-and-keyword search, and remote embedding models.

Instead of embedding text in your application and sending the resulting float arrays to OpenSearch, you can let OpenSearch call the embedding service itself. The ML Commons plugin registers a connector to an external HTTPS endpoint, then exposes it as a model that other OpenSearch features (Neural Search, ingest pipelines, agents) can reference by ID.

This guide walks through the complete setup for OpenAI’s text-embedding-3-small model. The same pattern applies to Bedrock, Cohere, SageMaker, Voyage, and any HTTP-compatible endpoint.

Remote models are optional. Many teams prefer generating embeddings in the application because it is simpler to reason about. See Vector Search Concepts for a comparison.

Prerequisites

  • A DigitalOcean-managed OpenSearch cluster. The ML Commons and Neural Search plugins are enabled by default.
  • An API key for your embedding provider. This guide uses OPENAI_API_KEY.
  • Admin access to the cluster. The doadmin user has the required ml_full_access role.

Step 1: Allow ML Commons to Call External Endpoints

By default, ML Commons restricts which domains it can reach. Allowlist the provider endpoints you plan to use (the example below covers OpenAI, Bedrock, and Cohere) and enable model access control:

curl -X PUT "$OS/_cluster/settings" -H 'Content-Type: application/json' -d '{
  "persistent": {
    "plugins.ml_commons.trusted_connector_endpoints_regex": [
      "^https://api\\.openai\\.com/.*$",
      "^https://bedrock-runtime\\..*\\.amazonaws\\.com/.*$",
      "^https://api\\.cohere\\.com/.*$"
    ],
    "plugins.ml_commons.only_run_on_ml_node": "false",
    "plugins.ml_commons.model_access_control_enabled": "true",
    "plugins.ml_commons.native_memory_threshold": "99"
  }
}'
Warning
The only_run_on_ml_node setting is false because DigitalOcean Basic and General-Purpose plans do not have dedicated ML nodes. For memory-intensive local models, upgrade to a Memory-Optimized plan or keep using remote models.
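
To confirm the settings took effect, read them back. A quick check, assuming $OS holds your cluster URL with credentials (for example, https://doadmin:<password>@<host>:25060) and that jq is installed:

curl "$OS/_cluster/settings?flat_settings=true" | jq '.persistent'
# -> the regex list and the three flags set above appear under "persistent"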

Step 2: Create a Connector to OpenAI

A connector encapsulates the endpoint, authentication, and the request/response mapping between OpenSearch and the provider. Store your API key as a connector credential so it stays encrypted at rest.

curl -X POST "$OS/_plugins/_ml/connectors/_create" \
  -H 'Content-Type: application/json' -d '{
  "name":        "OpenAI embeddings connector",
  "description": "text-embedding-3-small",
  "version":     "1",
  "protocol":    "http",
  "parameters":  { "model": "text-embedding-3-small" },
  "credential":  { "openAI_key": "'"$OPENAI_API_KEY"'" },
  "actions": [
    {
      "action_type":  "predict",
      "method":       "POST",
      "url":          "https://api.openai.com/v1/embeddings",
      "headers":      {
        "Authorization": "Bearer ${credential.openAI_key}",
        "Content-Type":  "application/json"
      },
      "request_body": "{ \"input\": ${parameters.input}, \"model\": \"${parameters.model}\" }",
      "pre_process_function":  "connector.pre_process.openai.embedding",
      "post_process_function": "connector.post_process.openai.embedding"
    }
  ]
}'

The response includes a connector_id. Save it for the next step.
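
If you script the setup, you can capture the ID directly. A minimal sketch using jq, with the request body from above saved to a file named connector.json (the file name and the CONNECTOR_ID variable are just conventions):

CONNECTOR_ID=$(curl -s -X POST "$OS/_plugins/_ml/connectors/_create" \
  -H 'Content-Type: application/json' -d @connector.json | jq -r '.connector_id')
echo "$CONNECTOR_ID"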

OpenSearch ships with pre- and post-processing helper functions for OpenAI, Bedrock, and Cohere. For custom providers, see the connector blueprints.

Step 3: Register and Deploy the Model

Models in ML Commons belong to a model group. Create a group, register the model against your connector, and deploy it:

# 3a. Create a model group
curl -X POST "$OS/_plugins/_ml/model_groups/_register" \
  -H 'Content-Type: application/json' -d '{
  "name":        "openai-embeddings",
  "description": "OpenAI text embedding models",
  "access_mode": "private"
}'
# -> returns "model_group_id"

# 3b. Register the model against the connector
curl -X POST "$OS/_plugins/_ml/models/_register" \
  -H 'Content-Type: application/json' -d '{
  "name":           "openai/text-embedding-3-small",
  "function_name":  "remote",
  "model_group_id": "<MODEL_GROUP_ID_FROM_3A>",
  "description":    "Remote OpenAI embedding model",
  "connector_id":   "<CONNECTOR_ID_FROM_STEP_2>"
}'
# -> returns a task_id; poll its status:
curl "$OS/_plugins/_ml/tasks/<TASK_ID>"
# -> when COMPLETED, extract "model_id"

# 3c. Deploy the model
curl -X POST "$OS/_plugins/_ml/models/<MODEL_ID>/_deploy"

Step 4: Test the Model with a Predict Call

curl -X POST "$OS/_plugins/_ml/models/<MODEL_ID>/_predict" \
  -H 'Content-Type: application/json' -d '{
  "parameters": {
    "input": ["Hello world", "OpenSearch is a search engine"]
  }
}'

The response contains one data array per input, each a list of 1,536 floats (the dimension of text-embedding-3-small). If you see this, the connector works.
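
Abbreviated, the response looks roughly like this (exact field names can vary between OpenSearch versions):

{
  "inference_results": [
    {
      "output": [
        {
          "name": "sentence_embedding",
          "data_type": "FLOAT32",
          "shape": [1536],
          "data": [-0.0071, 0.0213, ...]
        }
      ]
    }
  ]
}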

Step 5: Use the Model in a Neural Search Ingest Pipeline

An ingest pipeline with the text_embedding processor embeds a field automatically at index time, so your application never has to call the embedding API directly.

curl -X PUT "$OS/_ingest/pipeline/nlp-ingest-pipeline" \
  -H 'Content-Type: application/json' -d '{
  "description": "Auto-embed the body field into embedding",
  "processors": [
    {
      "text_embedding": {
        "model_id":  "<MODEL_ID>",
        "field_map": { "body": "embedding" }
      }
    }
  ]
}'
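
Before attaching the pipeline to an index, you can dry-run it with the ingest simulate API. Because the text_embedding processor calls the remote model, this also confirms end-to-end connectivity:

curl -X POST "$OS/_ingest/pipeline/nlp-ingest-pipeline/_simulate" \
  -H 'Content-Type: application/json' -d '{
  "docs": [
    { "_source": { "body": "OpenSearch scales horizontally across shards" } }
  ]
}'
# -> each simulated doc gains an "embedding" array of 1,536 floats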

curl -X PUT "$OS/neural-documents" -H 'Content-Type: application/json' -d '{
  "settings": {
    "index": {
      "knn": true,
      "default_pipeline": "nlp-ingest-pipeline"
    }
  },
  "mappings": {
    "properties": {
      "title":     { "type": "text" },
      "body":      { "type": "text" },
      "embedding": {
        "type": "knn_vector",
        "dimension": 1536,
        "method": {
          "name":       "hnsw",
          "engine":     "lucene",
          "space_type": "cosinesimil"
        }
      }
    }
  }
}'

curl -X POST "$OS/neural-documents/_doc" -H 'Content-Type: application/json' -d '{
  "title": "OpenSearch at scale",
  "body":  "OpenSearch scales horizontally across shards..."
}'
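
Once the index refreshes (within about one second by default), you can verify that the pipeline populated the vector field with a standard exists query:

curl -X POST "$OS/neural-documents/_search" -H 'Content-Type: application/json' -d '{
  "_source": ["title"],
  "query": { "exists": { "field": "embedding" } }
}'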

Step 6: Run a Neural Query

The neural query type calls the model to embed query text, then runs k-NN. Your application sends raw text; OpenSearch handles the rest.

curl -X POST "$OS/neural-documents/_search" -H 'Content-Type: application/json' -d '{
  "size": 5,
  "query": {
    "neural": {
      "embedding": {
        "query_text": "how does opensearch scale",
        "model_id":   "<MODEL_ID>",
        "k": 5
      }
    }
  }
}'

A neural query can be used anywhere a knn query can, including as a sub-query inside a hybrid query. See Run Hybrid Searches.
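
For illustration, a minimal hybrid query might look like the sketch below. Hybrid queries need a search pipeline with a normalization processor to combine keyword and vector scores; <HYBRID_PIPELINE> is a placeholder for one you have created, as covered in Run Hybrid Searches:

curl -X POST "$OS/neural-documents/_search?search_pipeline=<HYBRID_PIPELINE>" \
  -H 'Content-Type: application/json' -d '{
  "size": 5,
  "query": {
    "hybrid": {
      "queries": [
        { "match": { "body": "opensearch scale" } },
        {
          "neural": {
            "embedding": {
              "query_text": "how does opensearch scale",
              "model_id":   "<MODEL_ID>",
              "k": 5
            }
          }
        }
      ]
    }
  }
}'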

Next Steps

  • Monitor token usage: Every neural query and every document passing through the text_embedding processor calls the upstream provider. Budget accordingly and cache query embeddings where possible.
  • Rotate credentials: Use the update-connector API to rotate API keys without re-registering the model, as sketched after this list.
  • Try other providers: See supported connectors for Bedrock, Cohere, SageMaker, and custom HTTP connectors.
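
A minimal rotation sketch, assuming a cluster version that supports the update connector API (OpenSearch 2.12 or later) and a NEW_OPENAI_API_KEY environment variable holding the replacement key; depending on version, you may need to undeploy the model before updating its connector:

curl -X PUT "$OS/_plugins/_ml/connectors/<CONNECTOR_ID>" \
  -H 'Content-Type: application/json' -d '{
  "credential": { "openAI_key": "'"$NEW_OPENAI_API_KEY"'" }
}'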
