How to Convert Text Into Dense Vector Representations
Validated on 28 Apr 2026 • Last edited on 20 May 2026
Inference provides a single control plane for managing inference workflows. It includes a Model Catalog where you can view available foundation models, including both DigitalOcean-hosted and third-party commercial models, compare model capabilities and pricing, use routing to match inference requests to the best-fit model, and run inference using serverless or dedicated deployments.
Use the Embeddings API to convert text into dense vector representations for use in semantic search, retrieval-augmented generation (RAG), clustering, classification, and similarity matching. Embedding models include Qwen3 Embedding 0.6B, BGE-M3, and E5-Large. They offer a range of options across multilingual support, token window sizes, and dimensionality to match your use case.
For embedding requests, send a POST request to the https://inference.do-ai.run base URL using your existing Model Access Key. Embeddings are returned synchronously as float arrays that you can store in a vector database or use directly in your application pipeline.
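Because embeddings come back as plain float arrays, similarity matching can be done with ordinary vector arithmetic. The following is a minimal sketch of ranking documents by cosine similarity, using only the Python standard library. The function names `cosine_similarity` and `rank_by_similarity` and the three-dimensional toy vectors are illustrative assumptions, not part of the API and not real model output.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimensionality")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, documents):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [
        (doc_id, cosine_similarity(query, vector))
        for doc_id, vector in documents.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors for illustration only; real embeddings have many more dimensions.
query = [1.0, 0.0, 0.0]
documents = {
    "doc-a": [0.9, 0.1, 0.0],   # points almost the same way as the query
    "doc-b": [0.0, 1.0, 0.0],   # orthogonal to the query
    "doc-c": [-1.0, 0.0, 0.0],  # points the opposite way
}

ranking = rank_by_similarity(query, documents)
print([doc_id for doc_id, _ in ranking])
```

For larger collections, a vector database or a numerical library would typically replace the pure-Python loops; the scoring logic stays the same.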
The following cURL and PyDo (Python) examples generate vector embeddings for two input strings using the Qwen3 Embedding 0.6B model:
Create a model access key and save it for use with the API.
Send a POST request to https://inference.do-ai.run/v1/embeddings.
Using cURL:
curl -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DIGITALOCEAN_TOKEN" \
  -d '{"model":"qwen3-embedding-0.6b","input":["hello world","goodbye world"],"encoding_format":"float","user":"user-1234"}' \
  "https://inference.do-ai.run/v1/embeddings"
Using PyDo, the official DigitalOcean API client for Python:
import os
from pydo import Client

# Read the key from the environment rather than hard-coding it.
client = Client(token=os.environ.get("DIGITALOCEAN_TOKEN"))

resp = client.embeddings.create(
    model="qwen3-embedding-0.6b",
    input=["hello world", "goodbye world"],
    encoding_format="float",
    user="user-1234",
)

# Print each input's index and the first 8 values of its embedding.
for item in resp.data:
    print(item.index, item.embedding[:8])
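If neither cURL nor PyDo is available, the same request can be sketched with the Python standard library alone. The helper names below (`build_embeddings_request`, `parse_embeddings`, `embed`) are hypothetical. The response shape, a `data` list whose items carry `index` and `embedding` fields, is an assumption inferred from the fields the PyDo example reads, so check it against an actual response before relying on it.

```python
import json
import os
import urllib.request

EMBEDDINGS_URL = "https://inference.do-ai.run/v1/embeddings"


def build_embeddings_request(api_key, texts, model="qwen3-embedding-0.6b"):
    """Build (but do not send) a POST request mirroring the cURL example."""
    payload = {"model": model, "input": list(texts), "encoding_format": "float"}
    return urllib.request.Request(
        EMBEDDINGS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def parse_embeddings(body):
    """Return embeddings ordered by their `index` field.

    Assumed shape: {"data": [{"index": 0, "embedding": [0.1, ...]}, ...]}
    """
    items = sorted(json.loads(body)["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def embed(texts, api_key=None):
    """Send the request and return one float list per input text."""
    # The environment variable name follows the examples above.
    api_key = api_key or os.environ["DIGITALOCEAN_TOKEN"]
    request = build_embeddings_request(api_key, texts)
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_embeddings(response.read())


# Example usage (performs a network call, so it is left commented out):
# vectors = embed(["hello world", "goodbye world"])
# print(len(vectors), len(vectors[0]))
```

Keeping request construction and response parsing in separate functions makes both easy to check without network access.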