Validated on 27 Apr 2026 • Last edited on 27 Apr 2026
DigitalOcean Managed PostgreSQL for vector search uses the same managed PostgreSQL engine available under Managed Databases, with the pgvector and pgvectorscale extensions for storing and querying vector embeddings alongside relational data.
This how-to enables the pgvector extension on a DigitalOcean Managed PostgreSQL cluster, designs a table for embeddings, and ingests vectors from your application. It covers psql, Python, Node.js, and Go.
The vector extension is supported on PostgreSQL 13 and later.
You need CREATEDB and extension-install permission; the doadmin user has this by default.
If you have not created a cluster yet, follow the PostgreSQL Quickstart. For connection string retrieval and SSL setup, see Connect to a PostgreSQL Cluster.
Connect to the target database with psql. Each database in a cluster has its own extension registry, so enable pgvector in every database where you intend to store vectors.
psql "postgresql://doadmin:PASSWORD@HOST:25060/defaultdb?sslmode=require"

CREATE EXTENSION IF NOT EXISTS vector;

The extension name is vector. CREATE EXTENSION pgvector; fails with ERROR: could not open extension control file. Always use CREATE EXTENSION vector;.
If you are on a PostgreSQL version that does not support the pgvector release you need, upgrade the cluster in place. On the cluster’s Overview page, scroll to Current Version and click Upgrade Now.
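If you prefer to verify the extension from application code rather than a manual psql session, a minimal sketch with psycopg might look like the following. The catalog queries are standard PostgreSQL; DATABASE_URL is an assumed environment variable holding your cluster's connection string.

```python
import os

# Catalog queries: the pgvector version available to install, and the
# version actually enabled in the connected database.
AVAILABLE_SQL = "SELECT default_version FROM pg_available_extensions WHERE name = 'vector'"
INSTALLED_SQL = "SELECT extversion FROM pg_extension WHERE extname = 'vector'"

def ensure_vector(conn):
    """Enable pgvector if missing and return the installed version."""
    with conn.cursor() as cur:
        cur.execute("CREATE EXTENSION IF NOT EXISTS vector")
        cur.execute(INSTALLED_SQL)
        row = cur.fetchone()
        return row[0] if row else None

if os.environ.get("DATABASE_URL"):
    import psycopg  # deferred so the module loads without a database
    with psycopg.connect(os.environ["DATABASE_URL"]) as conn:
        print("pgvector version:", ensure_vector(conn))
        conn.commit()
```

Running ensure_vector once per database at deploy time covers the per-database extension registry described above.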
Choose the vector type that fits your model output and storage budget. The dimension must exactly match the output size of the embedding model you use; mixing dimensions in a single column is not allowed.
vector(n): Single-precision floats, four bytes per element. The default choice. Supports dimensions up to 16,000.
halfvec(n): Half-precision floats, two bytes per element. Roughly halves storage and index size with a small accuracy trade-off.
sparsevec(n): Sparse vectors stored as index/value pairs. Useful for high-dimensional vectors that are mostly zero, such as SPLADE outputs.

Add the columns your application actually needs: identifiers, the original text, metadata used for filtering, and timestamps. Keep the embedding separate from any large payloads you do not need on the hot path.
CREATE TABLE documents (
id bigserial PRIMARY KEY,
source text NOT NULL,
content text NOT NULL,
tenant_id uuid NOT NULL,
embedding vector(1536) NOT NULL,
created_at timestamptz NOT NULL DEFAULT now()
);
CREATE INDEX documents_tenant_created_idx
ON documents (tenant_id, created_at DESC);

Keep a metadata column such as tenant_id or source so you can scope vector searches with an ordinary WHERE clause. Combining filters with similarity search is one of the main reasons to store vectors in PostgreSQL rather than a separate vector store.
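As a sketch of such a scoped search against the documents table above: pgvector's <=> operator computes cosine distance (<-> is L2 distance), and the WHERE clause is ordinary SQL, so existing filters and indexes apply unchanged. The function and variable names here are illustrative.

```python
import os

# Cosine-distance search restricted to one tenant. The query vector is
# bound as a text literal and cast to the vector type in SQL.
SEARCH_SQL = """
SELECT id, source, left(content, 80) AS snippet,
       embedding <=> %s::vector AS distance
FROM documents
WHERE tenant_id = %s
ORDER BY embedding <=> %s::vector
LIMIT 10
"""

def search(conn, query_vec, tenant_id):
    """Return the ten closest documents for a single tenant."""
    literal = "[" + ",".join(str(x) for x in query_vec) + "]"
    with conn.cursor() as cur:
        cur.execute(SEARCH_SQL, (literal, tenant_id, literal))
        return cur.fetchall()

if os.environ.get("DATABASE_URL"):
    import psycopg  # deferred so the module loads without a database
    with psycopg.connect(os.environ["DATABASE_URL"]) as conn:
        rows = search(conn, [0.0] * 1536, "11111111-1111-1111-1111-111111111111")
```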
Generate embeddings in your ingest pipeline and write them to PostgreSQL. The examples below use OpenAI-compatible APIs, but the shape is the same for any provider: the output is a list of floats that you bind as a parameter to the INSERT statement.
DigitalOcean AI Platform hosts embedding models including GTE Large v1.5, Qwen3 Embedding 0.6B, All-MiniLM-L6-v2, and Multi-QA-mpnet-base-dot-v1. Consult the DigitalOcean AI Platform documentation for the current set of models and the API surface available to your application.
from openai import OpenAI
import os
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
resp = client.embeddings.create(
model="text-embedding-3-small",
input=["DigitalOcean Managed PostgreSQL supports pgvector."],
)
vector = resp.data[0].embedding  # list of 1536 floats

from sentence_transformers import SentenceTransformer
model = SentenceTransformer("BAAI/bge-base-en-v1.5")
vector = model.encode("DigitalOcean Managed PostgreSQL supports pgvector").tolist()

Managed PostgreSQL does not include HTTP client extensions such as pg_net or http, so you cannot call an external embedding API from inside a SQL function. Treat embedding generation as part of your ingest pipeline.
pgvector accepts a vector as a string of the form [v1, v2, ..., vn] bound as a text or array parameter. The canonical drivers for each language handle this automatically when you register the pgvector adapter.
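If your driver has no pgvector adapter, you can build that literal yourself; a minimal helper (the function name is illustrative):

```python
def to_vector_literal(values):
    """Format a sequence of floats as a pgvector input literal, e.g. '[1.0,2.0]'."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"

# Bind the result as an ordinary text parameter and cast in SQL:
#   INSERT INTO documents (embedding) VALUES (%s::vector)
print(to_vector_literal([0.013, -0.024, 0.117]))  # [0.013,-0.024,0.117]
```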
For ad-hoc loads, supply the vector literal directly:
INSERT INTO documents (source, tenant_id, content, embedding) VALUES (
'docs',
'11111111-1111-1111-1111-111111111111',
'DigitalOcean Managed PostgreSQL supports pgvector.',
'[0.013, -0.024, 0.117, ...]'::vector
);import os
import psycopg
from pgvector.psycopg import register_vector
with psycopg.connect(os.environ["DATABASE_URL"]) as conn:
register_vector(conn)
with conn.cursor() as cur:
cur.execute(
"""
INSERT INTO documents (source, tenant_id, content, embedding)
VALUES (%s, %s, %s, %s)
""",
("docs", tenant_id, content, vector),
)
conn.commit()

import pg from "pg";
import pgvector from "pgvector/pg";
const pool = new pg.Pool({
connectionString: process.env.DATABASE_URL,
ssl: { rejectUnauthorized: true },
});
pool.on("connect", async (client) => {
await pgvector.registerTypes(client);
});
await pool.query(
`INSERT INTO documents (source, tenant_id, content, embedding)
VALUES ($1, $2, $3, $4)`,
["docs", tenantId, content, pgvector.toSql(vector)],
);

import (
"context"
"os"
"github.com/jackc/pgx/v5"
"github.com/jackc/pgx/v5/pgxpool"
"github.com/pgvector/pgvector-go"
pgxvec "github.com/pgvector/pgvector-go/pgx"
)
ctx := context.Background()
cfg, _ := pgxpool.ParseConfig(os.Getenv("DATABASE_URL"))
cfg.AfterConnect = func(ctx context.Context, conn *pgx.Conn) error {
return pgxvec.RegisterTypes(ctx, conn)
}
pool, _ := pgxpool.NewWithConfig(ctx, cfg)
_, err := pool.Exec(ctx, `
INSERT INTO documents (source, tenant_id, content, embedding)
VALUES ($1, $2, $3, $4)`,
"docs", tenantID, content, pgvector.NewVector(vec),
)
if err != nil {
panic(err)
}

For bulk ingest, prefer COPY or multi-row INSERT inside a single transaction. Batch a few hundred to a few thousand rows per round trip to cut network overhead. Create the HNSW index after the initial load; building an index on a populated table is much faster than maintaining it during a bulk insert.
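One way to batch rows for COPY with psycopg is sketched below; the batch size of 1,000 and the helper names are illustrative, and COPY receives the embedding in its text form.

```python
def chunks(rows, size=1000):
    """Yield successive batches of rows, one round trip each."""
    for i in range(0, len(rows), size):
        yield rows[i:i + size]

def copy_documents(conn, rows):
    """Bulk-load (source, tenant_id, content, embedding) tuples via COPY."""
    with conn.cursor() as cur:
        with cur.copy(
            "COPY documents (source, tenant_id, content, embedding) FROM STDIN"
        ) as copy:
            for source, tenant_id, content, vec in rows:
                # COPY takes the vector as its text literal form.
                literal = "[" + ",".join(str(x) for x in vec) + "]"
                copy.write_row((source, tenant_id, content, literal))
    conn.commit()
```

Calling copy_documents once per batch from chunks keeps each transaction bounded while still amortizing network overhead.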
-- Extension status
\dx vector
-- Sample a few rows (strip the embedding for readability)
SELECT id, source, tenant_id, left(content, 60) AS snippet
FROM documents
ORDER BY created_at DESC
LIMIT 5;
-- Confirm dimension matches the table definition
SELECT vector_dims(embedding) AS dims
FROM documents
LIMIT 1;

If vector_dims returns a value that does not match the dimension in the column definition, resolve it before indexing. A dimension mismatch causes every insert with a differently sized vector to fail.
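It is cheaper to catch a mismatch at ingest time than to let the INSERT fail; a guard like this, with EXPECTED_DIMS matching the vector(1536) column definition above, is enough (names are illustrative):

```python
EXPECTED_DIMS = 1536  # must match vector(1536) in the documents table

def validate_embedding(vec):
    """Raise before the INSERT if the vector has the wrong dimension."""
    if len(vec) != EXPECTED_DIMS:
        raise ValueError(f"expected {EXPECTED_DIMS} dims, got {len(vec)}")
    return vec
```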