Available Foundation and Embeddings Models for DigitalOcean AI Platform

Validated on 1 May 2026 • Last edited on 1 May 2026

DigitalOcean AI Platform lets you build fully managed AI agents with knowledge bases for retrieval-augmented generation, multi-agent routing, guardrails, and more.

The following foundation and embeddings models are available for DigitalOcean AI Platform.

Foundation Models

DigitalOcean AI Platform supports both open source and commercial foundation models.

Open source models are generally published by research labs and available under open licenses. Commercial models are proprietary, such as those from OpenAI and Anthropic. All models are available using DigitalOcean API access keys, but you can also bring your own provider API keys to access the commercial models.

We regularly update our model offerings to provide the most performant and efficient models, and deprecate older models. For information on our model deprecation policy and recommended model replacements, see Model Support Policy.

We offer the following foundation models, subject to the AI Model Terms, our Service Terms, and the Terms of Service Agreement.

You can use these models in serverless inference, dedicated inference, inference routers, or batch inference. See the model-specific usage information below.
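
As an illustrative sketch, serverless inference can be called with any OpenAI-compatible HTTP client. The base URL below and the use of an environment variable named `DO_MODEL_ACCESS_KEY` are assumptions for this example; confirm the exact endpoint and key setup in your DigitalOcean control panel.

```python
import json
import os

# Assumed OpenAI-compatible serverless inference endpoint; verify the exact
# base URL in the DigitalOcean control panel before use.
BASE_URL = "https://inference.do-ai.run/v1"

def build_chat_request(model_id: str, prompt: str) -> tuple[str, dict, str]:
    """Build the URL, headers, and JSON body for a Chat Completions call."""
    url = f"{BASE_URL}/chat/completions"
    headers = {
        # A DigitalOcean model access key, not a provider (OpenAI/Anthropic) key.
        "Authorization": f"Bearer {os.environ.get('DO_MODEL_ACCESS_KEY', '')}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model_id,
        "messages": [{"role": "user", "content": prompt}],
    })
    return url, headers, body

url, headers, body = build_chat_request("llama3.3-70b-instruct", "Hello!")
print(url)
```

The same request shape works for any model in the tables below that supports Chat Completions; only the `model` field changes.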

Anthropic Models

Anthropic models available on the DigitalOcean AI Platform support tool (function) calling, prompt caching, and other features. See the usage notes in the following table for details. Refer to the provider documentation for other supported features.

Model Model ID Max Output Tokens Use for Usage Notes Tentative End-of-Support
Claude Sonnet 4.6 anthropic-claude-4.6-sonnet 64,000 ✔️ Serverless inference
✔️ ADK
✔️ Agents
✔️ Input context window of up to 1M tokens
✔️ Prompt caching
✔️ Tool (function) calling
No sooner than February 2027
Claude Sonnet 4.5 anthropic-claude-4.5-sonnet 64,000 ✔️ Serverless inference
✔️ ADK
✔️ Agents
✔️ Input context window of up to 1M tokens
✔️ Prompt caching
✔️ Tool calling
No sooner than September 2026
Claude Sonnet 4 anthropic-claude-sonnet-4 64,000 ✔️ Serverless inference
✔️ ADK
✔️ Agents
✔️ Input context window of up to 1M tokens
✔️ Prompt caching
✔️ Tool calling
No sooner than May 2026
Claude Haiku 4.5 anthropic-claude-haiku-4.5 64,000 ✔️ Serverless inference
✔️ ADK
✔️ Agents
✔️ Prompt caching
✔️ Tool calling
No sooner than October 2026
Claude Opus 4.7 anthropic-claude-opus-4.7 128,000 ✔️ Serverless inference
✔️ ADK
✔️ Agents
✔️ Input context window of up to 1M tokens
✔️ Prompt caching
✔️ Tool calling
No sooner than April 16, 2027
Claude Opus 4.6 anthropic-claude-opus-4.6 128,000 ✔️ Serverless inference
✔️ ADK
✔️ Agents
✔️ Input context window of up to 1M tokens
✔️ Prompt caching
✔️ Tool calling
No sooner than February 2027
Claude Opus 4.5 anthropic-claude-opus-4.5 64,000 ✔️ Serverless inference
✔️ ADK
✔️ Agents
✔️ Prompt caching
✔️ Tool calling
No sooner than November 2026
Claude Opus 4.1 anthropic-claude-4.1-opus 32,000 ✔️ Serverless inference
✔️ ADK
✔️ Agents
✔️ Prompt caching
✔️ Tool calling
No sooner than August 2026
Claude Opus 4 anthropic-claude-opus-4 32,000 ✔️ Serverless inference
✔️ ADK
✔️ Agents
✔️ Prompt caching
✔️ Tool calling
No sooner than May 2026
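
The prompt caching noted above can be sketched with an Anthropic-style request body. This assumes the platform forwards the provider's `cache_control` field unchanged, which you should verify against current platform behavior; the model ID is taken from the table above.

```python
# Sketch of a prompt-caching request body in Anthropic's Messages format.
# Assumption: the platform passes `cache_control` through to the provider.
def build_cached_request(system_text: str, user_text: str) -> dict:
    return {
        "model": "anthropic-claude-4.5-sonnet",
        "max_tokens": 1024,
        "system": [
            {
                "type": "text",
                "text": system_text,
                # Mark the large, stable prefix as cacheable so repeated
                # requests reuse it instead of reprocessing it each time.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [{"role": "user", "content": user_text}],
    }
```

Caching pays off when the system prompt is long and reused across many requests, since only the short user turn varies.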
Arcee Models
Model Model ID Max Output Tokens Use for Usage Notes
Trinity Large (Public Preview) arcee-trinity-large-thinking 128,000 ✔️ Serverless inference
✔️ ADK
✔️ Chat Completions API for sending prompts.
✔️ Prompt caching.
ℹ️ Use is subject to Public Preview Terms including Arcee Terms & Conditions.
fal Models
Model Model ID Type Use for Usage Notes
Fast SDXL fal-ai/fast-sdxl Image generation ✔️ Serverless inference
✔️ ADK
ℹ️ Multimodal and generative model
Flux Schnell fal-ai/flux/schnell Image generation ✔️ Serverless inference
✔️ ADK
ℹ️ Multimodal and generative model
Stable Audio 2.5 (Text-to-Audio) fal-ai/stable-audio-25/text-to-audio Text-to-audio ✔️ Serverless inference
✔️ ADK
ℹ️ Multimodal and generative model
Multilingual TTS v2 fal-ai/elevenlabs/tts/multilingual-v2 Text-to-speech ✔️ Serverless inference
✔️ ADK
ℹ️ Multimodal and generative model
OpenAI Models

OpenAI models available on the DigitalOcean AI Platform support tool (function) calling, prompt caching, and other features. See the usage notes in the following table for details. Refer to the provider documentation for other supported features.

Model Model ID Max Output Tokens Use for Usage Notes
GPT-5.4 openai-gpt-5.4 128,000 ✔️ Serverless inference
✔️ ADK
✔️ Agents
✔️ Input context window of up to 1M tokens
✔️ Only the Responses API for sending prompts for serverless inference
✔️ Prompt caching
✔️ Tool calling
GPT-5.4 mini openai-gpt-5.4-mini 128,000 ✔️ Serverless inference
✔️ ADK
✔️ Only the Responses API for sending prompts for serverless inference
✔️ Prompt caching
✔️ Tool calling
GPT-5.4 nano openai-gpt-5.4-nano 128,000 ✔️ Serverless inference
✔️ ADK
✔️ Only the Responses API for sending prompts for serverless inference
✔️ Prompt caching
✔️ Tool calling
GPT-5.4 pro openai-gpt-5.4-pro 128,000 ✔️ Serverless inference
✔️ ADK
✔️ Only the Responses API for sending prompts for serverless inference
✔️ Tool calling
GPT-5.3-Codex openai-gpt-5.3-codex 128,000 ✔️ Serverless inference
✔️ ADK
✔️ Input context window of up to 400,000 tokens
✔️ Prompt caching
✔️ Tool calling
GPT-5.2 openai-gpt-5.2 128,000 ✔️ Serverless inference
✔️ ADK
✔️ Agents
✔️ Prompt caching
✔️ Tool calling
GPT-5.2 pro openai-gpt-5-2-pro 128,000 ✔️ Serverless inference
✔️ ADK
✔️ Agents
✔️ Prompt caching
✔️ Tool calling
GPT-5.1-Codex-Max openai-gpt-5.1-codex-max 128,000 ✔️ Serverless inference
✔️ ADK
✔️ Prompt caching
✔️ Tool calling
GPT-5 openai-gpt-5 128,000 ✔️ Serverless inference
✔️ ADK
✔️ Agents
✔️ Prompt caching
✔️ Tool calling
GPT-5 mini openai-gpt-5-mini 128,000 ✔️ Serverless inference
✔️ ADK
✔️ Agents
✔️ Prompt caching
✔️ Tool calling
GPT-5 nano openai-gpt-5-nano 128,000 ✔️ Serverless inference
✔️ ADK
✔️ Agents
✔️ Prompt caching
✔️ Tool calling
GPT-4.1 openai-gpt-4.1 32,768 ✔️ Serverless inference
✔️ ADK
✔️ Agents
✔️ Prompt caching
✔️ Tool calling
GPT-4o openai-gpt-4o 16,384 ✔️ Serverless inference
✔️ ADK
✔️ Agents
✔️ Prompt caching
✔️ Tool calling
GPT-4o mini openai-gpt-4o-mini 16,384 ✔️ Serverless inference
✔️ ADK
✔️ Agents
✔️ Prompt caching
✔️ Tool calling
o1 openai-o1 Not published ✔️ Serverless inference
✔️ ADK
✔️ Agents
✔️ Prompt caching
✔️ Tool calling
o3 openai-o3 Not published ✔️ Serverless inference
✔️ ADK
✔️ Agents
✔️ Prompt caching
✔️ Tool calling
o3-mini openai-o3-mini Not published ✔️ Serverless inference
✔️ ADK
✔️ Agents
✔️ Prompt caching
✔️ Tool calling
GPT Image 1 openai-gpt-image-1 Not published ✔️ Serverless inference
✔️ ADK
✔️ Agents
✔️ Prompt caching
✔️ Tool calling
GPT Image 1.5 openai-gpt-image-1.5 Not published ✔️ Serverless inference
✔️ ADK
GPT Image 2 openai-gpt-image-2 Not published ✔️ Serverless inference
✔️ ADK
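
The GPT-5.4 family above accepts prompts only through the Responses API for serverless inference. As a minimal sketch, a Responses API body uses an `input` field rather than the Chat Completions `messages` array; the field names follow OpenAI's Responses API and are assumed to pass through the platform unchanged.

```python
def build_responses_request(model_id: str, prompt: str) -> dict:
    """Minimal Responses API body: `input` replaces Chat Completions `messages`."""
    return {
        "model": model_id,
        "input": prompt,
        # Cap generation length; the GPT-5.4 family supports up to 128,000
        # output tokens, per the table above.
        "max_output_tokens": 1024,
    }

print(build_responses_request("openai-gpt-5.4", "Summarize this ticket."))
```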
DigitalOcean-Hosted Models
Provider Model Model ID Parameters Max Output Tokens Use for Usage Notes
Alibaba Qwen3-32B alibaba-qwen3-32b 32 billion 40,960 ✔️ Serverless inference
✔️ Dedicated inference
✔️ ADK
Alibaba Qwen3 Coder Flash qwen3-coder-flash 30 billion 65,536 ✔️ Serverless inference
✔️ Dedicated inference
✔️ ADK
✔️ Chat Completions and Responses APIs for sending prompts for serverless inference.
Alibaba Qwen 3.5 397B A17B qwen3.5-397b-a17b Not published 81,920 ✔️ Serverless inference
✔️ Dedicated inference
✔️ ADK
Alibaba Qwen 3 TTS (1.7B) qwen3-tts-voicedesign Not published Not published ✔️ Serverless inference
✔️ ADK
ℹ️ Text-to-speech. Multimodal and generative model.
Alibaba Wan2.2-T2V-A14B wan2.2-t2v-a14b Not published Not published ✔️ Serverless inference
✔️ ADK
ℹ️ Text-to-video. Multimodal and generative model.
DeepSeek DeepSeek R1 Distill Llama 70B deepseek-r1-distill-llama-70b 70 billion 32,768 ✔️ Serverless inference
✔️ Dedicated inference
✔️ ADK
✔️ Agents
ℹ️ When using in a user-facing agent, we strongly recommend adding all available guardrails for a safer conversational experience.
DeepSeek DeepSeek V4 Pro deepseek-v4-pro 1.6 trillion 1,048,576 ✔️ Serverless inference
✔️ ADK
✔️ Agents
✔️ Chat Completions and Responses APIs for sending prompts for serverless inference.
ℹ️ When using in a user-facing agent, we strongly recommend adding all available guardrails for a safer conversational experience.
DeepSeek DeepSeek V3.2 deepseek-3.2 Not published 64,000 ✔️ Serverless inference
✔️ Dedicated inference
✔️ ADK
✔️ Agents
✔️ Chat Completions and Responses APIs for sending prompts for serverless inference.
ℹ️ When using in a user-facing agent, we strongly recommend adding all available guardrails for a safer conversational experience.
Google Gemma 4 gemma-4-31B-it 31 billion 256,000 ✔️ Serverless inference
✔️ Dedicated inference
✔️ ADK
✔️ Agents
✔️ Chat Completions and Responses APIs for sending prompts for serverless inference.
Intfloat E5 Mistral 7B Instruct e5-mistral-7b-instruct 7 billion Not published ✔️ Serverless inference
✔️ Dedicated inference
✔️ ADK
ℹ️ Embedding model for retrieval and similarity.
MiniMax M2.5 (Public Preview) minimax-m2.5 230 billion 128,000 ✔️ Serverless inference
✔️ Dedicated inference
✔️ ADK
✔️ Agents
✔️ Chat Completions and Responses APIs for sending prompts for serverless inference.
ℹ️ Use is subject to Public Preview Terms including MiniMax Model License.
Moonshot AI Kimi K2.5 kimi-k2.5 1 trillion 32,768 ✔️ Serverless inference
✔️ Dedicated inference
✔️ ADK
✔️ Agents
✔️ Chat Completions and Responses APIs for sending prompts for serverless inference.
ℹ️ Use is subject to a Modified MIT license.
Meta Llama 3.3 Instruct-70B llama3.3-70b-instruct 70 billion 128,000 ✔️ Serverless inference
✔️ Dedicated inference
✔️ ADK
✔️ Agents
Meta Llama 4 Maverick 17B 128E Instruct llama-4-maverick 17 billion 16,384 ✔️ Serverless inference
✔️ Dedicated inference
✔️ ADK
✔️ Agents
✔️ Chat Completions and Responses APIs for sending prompts for serverless inference.
Mistral AI Ministral 3 14B Instruct mistral-3-14B 14 billion 128,000 ✔️ Serverless inference
✔️ Dedicated inference
✔️ ADK
✔️ Agents
✔️ Chat Completions and Responses APIs for sending prompts for serverless inference.
NVIDIA Nemotron-3-Super-120B (Public Preview) nvidia-nemotron-3-super-120b 120 billion Not published ✔️ Serverless inference
✔️ ADK
✔️ Agents
✔️ Chat Completions and Responses APIs for sending prompts for serverless inference.
ℹ️ Use is subject to Public Preview Terms including NVIDIA Model License.
NVIDIA Nemotron Nano 3 Omni nemotron-3-nano-omni Not published 65,536 ✔️ Serverless inference
✔️ ADK
✔️ Chat Completions and Responses APIs for sending prompts for serverless inference.
ℹ️ Context window 65,536 tokens.
NVIDIA Nemotron Nano 12B v2 VL nemotron-nano-12b-v2-vl 12 billion 16,384 ✔️ Serverless inference
✔️ ADK
✔️ Chat Completions and Responses APIs for sending prompts for serverless inference.
OpenAI gpt-oss-120b openai-gpt-oss-120b 117 billion 131,072 ✔️ Serverless inference
✔️ ADK
✔️ Agents
OpenAI gpt-oss-20b openai-gpt-oss-20b 21 billion 131,072 ✔️ Serverless inference
✔️ ADK
✔️ Agents
Stability AI Stable Diffusion 3.5 Large stable-diffusion-3.5-large Not published Not published ✔️ Serverless inference
✔️ ADK
ℹ️ Image generation. Multimodal and generative model.
Z.ai GLM 5 glm-5 744 billion 128,000 ✔️ Serverless inference
✔️ ADK
✔️ Agents
✔️ Chat Completions and Responses APIs for sending prompts for serverless inference.
ℹ️ Use is subject to the MIT License.

Embeddings Models

An embedding model converts data into vector embeddings. DigitalOcean stores vector embeddings in an OpenSearch database cluster for use with agent knowledge bases. The following embeddings models are available on the platform, along with their token windows and recommended chunking ranges.

Alibaba Models
Model Parameters Token Window Chunk Size Range Parent Chunk Range Child Chunk Range
GTE Large (v1.5) Not available 8192 tokens 0-750 500-1000 300-500
Qwen3 Embedding 0.6B (Multilingual) (Public Preview) 600 million 8000 tokens 0-750 500-1000 300-500
BAAI Models
Model Parameters Token Window Chunk Size Range Parent Chunk Range Child Chunk Range
BGE M3 568M 8192 tokens 0-8192 Not Specified Not Specified
Intfloat Models
Model Parameters Token Window Chunk Size Range Parent Chunk Range Child Chunk Range
E5 Large (multilingual) 560 million 514 tokens 0-512 100-512 100-500
E5 Large (v2) Not available 512 tokens 0-512 Not Specified Not Specified
UKP Lab (Technical University of Darmstadt) Models
Model Parameters Token Window Chunk Size Range Parent Chunk Range Child Chunk Range
All-MiniLM-L6-v2 22 million 256 tokens 0-256 100-256 100-200
Multi-QA-mpnet-base-dot-v1 109 million 512 tokens 0-512 100-512 100-500
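
The parent and child chunk ranges above can be illustrated with a simple two-level split. Knowledge-base ingestion counts model tokens; this sketch approximates a token as a whitespace-separated word, which is only a rough stand-in, and the default sizes are picked from within the recommended ranges.

```python
# Illustrative parent/child chunking. Approximation: one token ~= one word.
def chunk(words: list[str], size: int) -> list[list[str]]:
    """Split a word list into consecutive chunks of at most `size` words."""
    return [words[i:i + size] for i in range(0, len(words), size)]

def parent_child_chunks(text: str, parent_size: int = 750, child_size: int = 400):
    """Split text into parent chunks, then split each parent into child chunks."""
    words = text.split()
    return [(parent, chunk(parent, child_size)) for parent in chunk(words, parent_size)]
```

Parent chunks preserve broader context for the agent, while the smaller child chunks are what get embedded and matched at retrieval time.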

Reranking Models

Reranking models reorder retrieved results to improve relevance after the initial retrieval step. DigitalOcean supports the following reranking model for knowledge base retrieval:

BAAI Models
Model Parameters Usage Notes
BGE Reranker (v2) M3 Not available Can be enabled at knowledge base creation and updated after creation.
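
Conceptually, the reranking step rescores the retrieved chunks against the query and reorders them before they reach the agent. A production reranker such as BGE Reranker v2 M3 is a cross-encoder model; the word-overlap score in this sketch is only a stand-in to illustrate the reordering step, not the actual scoring method.

```python
# Toy reranker: reorder retrieved chunks by a relevance score against the
# query. The word-overlap score stands in for a real cross-encoder model.
def rerank(query: str, chunks: list[str]) -> list[str]:
    query_words = set(query.lower().split())

    def overlap(chunk: str) -> int:
        return len(query_words & set(chunk.lower().split()))

    return sorted(chunks, key=overlap, reverse=True)
```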
