For AI agents: The documentation index is at https://docs.digitalocean.com/llms.txt. Markdown versions of pages use the same URL with index.html.md in place of the HTML page (for example, append index.html.md to the directory path instead of opening the HTML document).
Available Foundation and Embeddings Models for DigitalOcean AI Platform
Validated on 1 May 2026 • Last edited on 1 May 2026
DigitalOcean AI Platform lets you build fully-managed AI agents with knowledge bases for retrieval-augmented generation, multi-agent routing, guardrails, and more.
The following foundation and embeddings models are available for DigitalOcean AI Platform.
Foundation Models
DigitalOcean AI Platform supports both open source and commercial foundation models. Open source models are generally published by research labs and made available under open licenses. Commercial models, such as those from OpenAI and Anthropic, are proprietary. All models are offered using DigitalOcean API access keys, but you can also bring your own provider's API keys to access the commercial models.
We regularly update our model offerings to provide the most performant and efficient models, and deprecate older models. For information on our model deprecation policy and recommended model replacements, see Model Support Policy.
We offer the following foundation models, subject to the AI Model Terms, our Service Terms, and the Terms of Service Agreement.
You can use these models in serverless inference, dedicated inference, inference routers, or batch inference. See the model-specific usage information below.
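The model IDs in the tables below are what you pass in an inference request. As a minimal sketch of assembling a Chat Completions request body for one of the listed IDs (the endpoint URL shown is an assumption, and `build_chat_request` is a hypothetical helper; confirm the actual serverless inference URL and authentication scheme in the DigitalOcean API reference):

```python
import json

# Hypothetical endpoint; confirm the real serverless inference URL and
# authentication in the DigitalOcean API reference before using it.
INFERENCE_URL = "https://inference.do-ai.run/v1/chat/completions"

def build_chat_request(model_id: str, prompt: str, max_tokens: int = 1024) -> dict:
    # Standard Chat Completions request shape: the model ID from the tables
    # goes in "model", and the prompt becomes a single user message.
    return {
        "model": model_id,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

body = build_chat_request("llama3.3-70b-instruct", "Summarize RAG in one sentence.")
print(json.dumps(body, indent=2))
```

The same body shape works for any model in the tables that accepts the Chat Completions API; only the `model` value changes.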
Anthropic Models
Anthropic models available on the DigitalOcean AI Platform support tool (function) calling, prompt caching, and other features. See the usage notes in the following table for details. Refer to the provider documentation for other supported features.
| Model | Model ID | Max Output Tokens | Use for | Usage Notes | Tentative End-of-Support |
|---|---|---|---|---|---|
| Claude Sonnet 4.6 | `anthropic-claude-4.6-sonnet` | 64,000 | ✔️ Serverless inference<br>✔️ ADK<br>✔️ Agents | ✔️ Input context window of up to 1M tokens<br>✔️ Prompt caching<br>✔️ Tool (function) calling | No sooner than February 2027 |
| Claude Sonnet 4.5 | `anthropic-claude-4.5-sonnet` | 64,000 | ✔️ Serverless inference<br>✔️ ADK<br>✔️ Agents | ✔️ Input context window of up to 1M tokens<br>✔️ Prompt caching<br>✔️ Tool calling | No sooner than September 2026 |
| Claude Sonnet 4 | `anthropic-claude-sonnet-4` | 64,000 | ✔️ Serverless inference<br>✔️ ADK<br>✔️ Agents | ✔️ Input context window of up to 1M tokens<br>✔️ Prompt caching<br>✔️ Tool calling | No sooner than May 2026 |
| Claude Haiku 4.5 | `anthropic-claude-haiku-4.5` | 64,000 | ✔️ Serverless inference<br>✔️ ADK<br>✔️ Agents | ✔️ Prompt caching<br>✔️ Tool calling | No sooner than October 2026 |
| Claude Opus 4.7 | `anthropic-claude-opus-4.7` | 128,000 | ✔️ Serverless inference<br>✔️ ADK<br>✔️ Agents | ✔️ Input context window of up to 1M tokens<br>✔️ Prompt caching<br>✔️ Tool calling | No sooner than April 16, 2027 |
| Claude Opus 4.6 | `anthropic-claude-opus-4.6` | 128,000 | ✔️ Serverless inference<br>✔️ ADK<br>✔️ Agents | ✔️ Input context window of up to 1M tokens<br>✔️ Prompt caching<br>✔️ Tool calling | No sooner than February 2027 |
| Claude Opus 4.5 | `anthropic-claude-opus-4.5` | 64,000 | ✔️ Serverless inference<br>✔️ ADK<br>✔️ Agents | ✔️ Prompt caching<br>✔️ Tool calling | No sooner than November 2026 |
| Claude Opus 4.1 | `anthropic-claude-4.1-opus` | 32,000 | ✔️ Serverless inference<br>✔️ ADK<br>✔️ Agents | ✔️ Prompt caching<br>✔️ Tool calling | No sooner than August 2026 |
| Claude Opus 4 | `anthropic-claude-opus-4` | 32,000 | ✔️ Serverless inference<br>✔️ ADK<br>✔️ Agents | ✔️ Prompt caching<br>✔️ Tool calling | No sooner than May 2026 |
Arcee Models
| Model | Model ID | Max Output Tokens | Use for | Usage Notes |
|---|---|---|---|---|
| Trinity Large (Public Preview) | `arcee-trinity-large-thinking` | 128,000 | ✔️ Serverless inference<br>✔️ ADK | ✔️ Chat Completions API for sending prompts.<br>✔️ Prompt caching.<br>ℹ️ Use is subject to Public Preview Terms, including Arcee Terms & Conditions. |
fal Models
| Model | Model ID | Type | Use for | Usage Notes |
|---|---|---|---|---|
| Fast SDXL | `fal-ai/fast-sdxl` | Image generation | ✔️ Serverless inference<br>✔️ ADK | ℹ️ Multimodal and generative model |
| Flux Schnell | `fal-ai/flux/schnell` | Image generation | ✔️ Serverless inference<br>✔️ ADK | ℹ️ Multimodal and generative model |
| Stable Audio 2.5 (Text-to-Audio) | `fal-ai/stable-audio-25/text-to-audio` | Text-to-audio | ✔️ Serverless inference<br>✔️ ADK | ℹ️ Multimodal and generative model |
| Multilingual TTS v2 | `fal-ai/elevenlabs/tts/multilingual-v2` | Text-to-speech | ✔️ Serverless inference<br>✔️ ADK | ℹ️ Multimodal and generative model |
OpenAI Models
OpenAI models available on the DigitalOcean AI Platform support tool (function) calling, prompt caching, and other features. See the usage notes in the following table for details. Refer to the provider documentation for other supported features.
| Model | Model ID | Max Output Tokens | Use for | Usage Notes |
|---|---|---|---|---|
| GPT-5.4 | `openai-gpt-5.4` | 128,000 | ✔️ Serverless inference<br>✔️ ADK<br>✔️ Agents | ✔️ Input context window of up to 1M tokens<br>✔️ Only the Responses API for sending prompts for serverless inference<br>✔️ Prompt caching<br>✔️ Tool calling |
| GPT-5.4 mini | `openai-gpt-5.4-mini` | 128,000 | ✔️ Serverless inference<br>✔️ ADK | ✔️ Only the Responses API for sending prompts for serverless inference<br>✔️ Prompt caching<br>✔️ Tool calling |
| GPT-5.4 nano | `openai-gpt-5.4-nano` | 128,000 | ✔️ Serverless inference<br>✔️ ADK | ✔️ Only the Responses API for sending prompts for serverless inference<br>✔️ Prompt caching<br>✔️ Tool calling |
| GPT-5.4 pro | `openai-gpt-5.4-pro` | 128,000 | ✔️ Serverless inference<br>✔️ ADK | ✔️ Only the Responses API for sending prompts for serverless inference<br>✔️ Tool calling |
| GPT-5.3-Codex | `openai-gpt-5.3-codex` | 128,000 | ✔️ Serverless inference<br>✔️ ADK | ✔️ Input context window of up to 400,000 tokens<br>✔️ Prompt caching<br>✔️ Tool calling |
| GPT-5.2 | `openai-gpt-5.2` | 128,000 | ✔️ Serverless inference<br>✔️ ADK<br>✔️ Agents | ✔️ Prompt caching<br>✔️ Tool calling |
| GPT-5.2 pro | `openai-gpt-5-2-pro` | 128,000 | ✔️ Serverless inference<br>✔️ ADK<br>✔️ Agents | ✔️ Prompt caching<br>✔️ Tool calling |
| GPT-5.1-Codex-Max | `openai-gpt-5.1-codex-max` | 128,000 | ✔️ Serverless inference<br>✔️ ADK | ✔️ Prompt caching<br>✔️ Tool calling |
| GPT-5 | `openai-gpt-5` | 128,000 | ✔️ Serverless inference<br>✔️ ADK<br>✔️ Agents | ✔️ Prompt caching<br>✔️ Tool calling |
| GPT-5 mini | `openai-gpt-5-mini` | 128,000 | ✔️ Serverless inference<br>✔️ ADK<br>✔️ Agents | ✔️ Prompt caching<br>✔️ Tool calling |
| GPT-5 nano | `openai-gpt-5-nano` | 128,000 | ✔️ Serverless inference<br>✔️ ADK<br>✔️ Agents | ✔️ Prompt caching<br>✔️ Tool calling |
| GPT-4.1 | `openai-gpt-4.1` | 32,768 | ✔️ Serverless inference<br>✔️ ADK<br>✔️ Agents | ✔️ Prompt caching<br>✔️ Tool calling |
| GPT-4o | `openai-gpt-4o` | 16,384 | ✔️ Serverless inference<br>✔️ ADK<br>✔️ Agents | ✔️ Prompt caching<br>✔️ Tool calling |
| GPT-4o mini | `openai-gpt-4o-mini` | 16,384 | ✔️ Serverless inference<br>✔️ ADK<br>✔️ Agents | ✔️ Prompt caching<br>✔️ Tool calling |
| o1 | `openai-o1` | Not published | ✔️ Serverless inference<br>✔️ ADK<br>✔️ Agents | ✔️ Prompt caching<br>✔️ Tool calling |
| o3 | `openai-o3` | Not published | ✔️ Serverless inference<br>✔️ ADK<br>✔️ Agents | ✔️ Prompt caching<br>✔️ Tool calling |
| o3-mini | `openai-o3-mini` | Not published | ✔️ Serverless inference<br>✔️ ADK<br>✔️ Agents | ✔️ Prompt caching<br>✔️ Tool calling |
| GPT Image 1 | `openai-gpt-image-1` | Not published | ✔️ Serverless inference<br>✔️ ADK<br>✔️ Agents | ✔️ Prompt caching<br>✔️ Tool calling |
| GPT Image 1.5 | `openai-gpt-image-1.5` | Not published | ✔️ Serverless inference<br>✔️ ADK | |
| GPT Image 2 | `openai-gpt-image-2` | Not published | ✔️ Serverless inference<br>✔️ ADK | |
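Several GPT-5.4-series entries above accept only the Responses API for serverless inference. As a rough sketch of how that request differs from a Chat Completions request (`build_responses_request` is a hypothetical helper; the exact schema is the provider's), the Responses API takes a single `input` field instead of a `messages` array:

```python
def build_responses_request(model_id: str, prompt: str) -> dict:
    # Responses API shape: "input" carries the prompt directly,
    # where Chat Completions would use a list of role-tagged messages.
    return {"model": model_id, "input": prompt}

req = build_responses_request("openai-gpt-5.4", "Draft a release note.")
```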
DigitalOcean-Hosted Models
| Provider | Model | Model ID | Parameters | Max Output Tokens | Use for | Usage Notes |
|---|---|---|---|---|---|---|
| Alibaba | Qwen3-32B | `alibaba-qwen3-32b` | 32 billion | 40,960 | ✔️ Serverless inference<br>✔️ Dedicated inference<br>✔️ ADK | |
| Alibaba | Qwen3 Coder Flash | `qwen3-coder-flash` | 30 billion | 65,536 | ✔️ Serverless inference<br>✔️ Dedicated inference<br>✔️ ADK | ✔️ Chat Completions and Responses APIs for sending prompts for serverless inference. |
| Alibaba | Qwen 3.5 397B A17B | `qwen3.5-397b-a17b` | Not published | 81,920 | ✔️ Serverless inference<br>✔️ Dedicated inference<br>✔️ ADK | |
| Alibaba | Qwen 3 TTS (1.7B) | `qwen3-tts-voicedesign` | Not published | Not published | ✔️ Serverless inference<br>✔️ ADK | ℹ️ Text-to-speech. Multimodal and generative model. |
| Alibaba | Wan2.2-T2V-A14B | `wan2.2-t2v-a14b` | Not published | Not published | ✔️ Serverless inference<br>✔️ ADK | ℹ️ Text-to-video. Multimodal and generative model. |
| DeepSeek | DeepSeek R1 Distill Llama 70B | `deepseek-r1-distill-llama-70b` | 70 billion | 32,768 | ✔️ Serverless inference<br>✔️ Dedicated inference<br>✔️ ADK<br>✔️ Agents | ℹ️ When using in a user-facing agent, we strongly recommend adding all available guardrails for a safer conversational experience. |
| DeepSeek | DeepSeek V4 Pro | `deepseek-v4-pro` | 1.6 trillion | 1,048,576 | ✔️ Serverless inference<br>✔️ ADK<br>✔️ Agents | ✔️ Chat Completions and Responses APIs for sending prompts for serverless inference.<br>ℹ️ When using in a user-facing agent, we strongly recommend adding all available guardrails for a safer conversational experience. |
| DeepSeek | DeepSeek V3.2 | `deepseek-3.2` | Not published | 64,000 | ✔️ Serverless inference<br>✔️ Dedicated inference<br>✔️ ADK<br>✔️ Agents | ✔️ Chat Completions and Responses APIs for sending prompts for serverless inference.<br>ℹ️ When using in a user-facing agent, we strongly recommend adding all available guardrails for a safer conversational experience. |
| Google | Gemma 4 | `gemma-4-31B-it` | 31 billion | 256,000 | ✔️ Serverless inference<br>✔️ Dedicated inference<br>✔️ ADK<br>✔️ Agents | ✔️ Chat Completions and Responses APIs for sending prompts for serverless inference. |
| Intfloat | E5 Mistral 7B Instruct | `e5-mistral-7b-instruct` | 7 billion | Not published | ✔️ Serverless inference<br>✔️ Dedicated inference<br>✔️ ADK | ℹ️ Embedding model for retrieval and similarity. |
| MiniMax | M2.5 (Public Preview) | `minimax-m2.5` | 230 billion | 128,000 | ✔️ Serverless inference<br>✔️ Dedicated inference<br>✔️ ADK<br>✔️ Agents | ✔️ Chat Completions and Responses APIs for sending prompts for serverless inference.<br>ℹ️ Use is subject to Public Preview Terms, including MiniMax Model License. |
| Moonshot AI | Kimi K2.5 | `kimi-k2.5` | 1 trillion | 32,768 | ✔️ Serverless inference<br>✔️ Dedicated inference<br>✔️ ADK<br>✔️ Agents | ✔️ Chat Completions and Responses APIs for sending prompts for serverless inference.<br>ℹ️ Use is subject to a Modified MIT license. |
| Meta | Llama 3.3 Instruct-70B | `llama3.3-70b-instruct` | 70 billion | 128,000 | ✔️ Serverless inference<br>✔️ Dedicated inference<br>✔️ ADK<br>✔️ Agents | |
| Meta | Llama 4 Maverick 17B 128E Instruct | `llama-4-maverick` | 17 billion | 16,384 | ✔️ Serverless inference<br>✔️ Dedicated inference<br>✔️ ADK<br>✔️ Agents | ✔️ Chat Completions and Responses APIs for sending prompts for serverless inference. |
| Mistral AI | Ministral 3 14B Instruct | `mistral-3-14B` | 14 billion | 128,000 | ✔️ Serverless inference<br>✔️ Dedicated inference<br>✔️ ADK<br>✔️ Agents | ✔️ Chat Completions and Responses APIs for sending prompts for serverless inference. |
| NVIDIA | Nemotron-3-Super-120B (Public Preview) | `nvidia-nemotron-3-super-120b` | 120 billion | Not published | ✔️ Serverless inference<br>✔️ ADK<br>✔️ Agents | ✔️ Chat Completions and Responses APIs for sending prompts for serverless inference.<br>ℹ️ Use is subject to Public Preview Terms, including NVIDIA Model License. |
| NVIDIA | Nemotron Nano 3 Omni | `nemotron-3-nano-omni` | Not published | 65,536 | ✔️ Serverless inference<br>✔️ ADK | ✔️ Chat Completions and Responses APIs for sending prompts for serverless inference.<br>ℹ️ Context window of 65,536 tokens. |
| NVIDIA | Nemotron Nano 12B v2 VL | `nemotron-nano-12b-v2-vl` | 12 billion | 16,384 | ✔️ Serverless inference<br>✔️ ADK | ✔️ Chat Completions and Responses APIs for sending prompts for serverless inference. |
| OpenAI | gpt-oss-120b | `openai-gpt-oss-120b` | 117 billion | 131,072 | ✔️ Serverless inference<br>✔️ ADK<br>✔️ Agents | |
| OpenAI | gpt-oss-20b | `openai-gpt-oss-20b` | 21 billion | 131,072 | ✔️ Serverless inference<br>✔️ ADK<br>✔️ Agents | |
| Stability AI | Stable Diffusion 3.5 Large | `stable-diffusion-3.5-large` | Not published | Not published | ✔️ Serverless inference<br>✔️ ADK | ℹ️ Image generation. Multimodal and generative model. |
| Z.ai | GLM 5 | `glm-5` | 744 billion | 128,000 | ✔️ Serverless inference<br>✔️ ADK<br>✔️ Agents | ✔️ Chat Completions and Responses APIs for sending prompts for serverless inference.<br>ℹ️ Use is subject to the MIT License. |
Embeddings Models
An embedding model converts data into vector embeddings. DigitalOcean stores vector embeddings in an OpenSearch database cluster for use with agent knowledge bases. The following embeddings models are available on the platform, along with their token windows and recommended chunking ranges.
Alibaba Models
| Model | Parameters | Token Window | Chunk Size Range | Parent Chunk Range | Child Chunk Range |
|---|---|---|---|---|---|
| GTE Large (v1.5) | Not available | 8192 tokens | 0-750 | 500-1000 | 300-500 |
| Qwen3 Embedding 0.6B (Multilingual) (Public Preview) | 600 million | 8000 tokens | 0-750 | 500-1000 | 300-500 |
BAAI Models
| Model | Parameters | Token Window | Chunk Size Range | Parent Chunk Range | Child Chunk Range |
|---|---|---|---|---|---|
| BGE M3 | 568 million | 8192 tokens | 0-8192 | Not Specified | Not Specified |
Intfloat Models
| Model | Parameters | Token Window | Chunk Size Range | Parent Chunk Range | Child Chunk Range |
|---|---|---|---|---|---|
| E5 Large (multilingual) | 560 million | 514 tokens | 0-512 | 100-512 | 100-500 |
| E5 Large (v2) | Not available | 512 tokens | 0-512 | Not Specified | Not Specified |
UKP Lab (Technical University of Darmstadt) Models
| Model | Parameters | Token Window | Chunk Size Range | Parent Chunk Range | Child Chunk Range |
|---|---|---|---|---|---|
| All-MiniLM-L6-v2 | 22 million | 256 tokens | 0-256 | 100-256 | 100-200 |
| Multi-QA-mpnet-base-dot-v1 | 109 million | 512 tokens | 0-512 | 100-512 | 100-500 |
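The parent and child chunk ranges above describe a two-level split: larger parent chunks preserve surrounding context for retrieval, while smaller child chunks are what gets embedded and matched. A toy sketch of that split, counting words as a rough stand-in for tokens (the platform's own chunker counts real tokens, and `parent_child_chunks` is illustrative, not the platform's implementation):

```python
def chunk_words(words, size):
    # Slice a word list into consecutive chunks of at most `size` words.
    return [words[i:i + size] for i in range(0, len(words), size)]

def parent_child_chunks(text, parent_size=500, child_size=300):
    # Split text into parent chunks, then split each parent into child
    # chunks; sizes mirror the parent/child ranges in the tables above.
    words = text.split()
    parents = chunk_words(words, parent_size)
    return [(" ".join(p), [" ".join(c) for c in chunk_words(p, child_size)])
            for p in parents]

doc = " ".join(["token"] * 1200)
pairs = parent_child_chunks(doc)  # 3 parents: 500, 500, and 200 words
```

Whatever sizes you choose, children must stay inside the model's token window or the embedding input gets truncated.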
Reranking Models
Reranking models reorder retrieved results to improve relevance after the initial retrieval step. DigitalOcean supports the following reranking model for knowledge base retrieval:
BAAI Models
| Model | Parameters | Usage Notes |
|---|---|---|
| BGE Reranker (v2) M3 | Not available | Can be enabled at knowledge base creation and updated after creation. |
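The reordering step a reranker performs can be pictured with a deliberately simplified scorer. A real cross-encoder such as BGE Reranker v2 M3 scores each (query, passage) pair with a model rather than term overlap; only the reorder-by-score step below carries over:

```python
def rerank(query_terms, candidates):
    # Toy lexical stand-in for a cross-encoder score: count how many
    # query terms appear in each passage, then sort best-first.
    def score(passage):
        words = set(passage.lower().split())
        return sum(1 for term in query_terms if term.lower() in words)
    return sorted(candidates, key=score, reverse=True)

ranked = rerank(
    ["vector", "embeddings"],
    ["stores vector embeddings", "reranking models", "vector search"],
)
```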