For AI agents: The documentation index is at https://docs.digitalocean.com/llms.txt. Markdown versions of pages use the same URL with `index.html.md` in place of the HTML page (for example, append `index.html.md` to the directory path instead of opening the HTML document).
Available Foundation Models for DigitalOcean Gradient™ AI Inference Hub
Validated on 14 Apr 2026 • Last edited on 14 Apr 2026
DigitalOcean Gradient™ AI Inference Hub provides a single control plane for managing inference workflows. It includes a Model Catalog where you can view available foundation models, including both DigitalOcean-hosted and third-party commercial models, compare capabilities and pricing, and run inference using serverless or dedicated deployments. DigitalOcean Gradient AI Inference Hub is in private preview. You can contact support for questions or assistance.
Gradient AI Inference Hub supports both open source and commercial foundation models. Open source models are generally published by research labs and available under open licenses. Commercial models are proprietary, such as those from OpenAI and Anthropic. All models are offered using DigitalOcean API access keys, but you can also bring your own provider's API keys to access the commercial models.
You can use these models for serverless inference and the other workflows listed in each table's Use for column. You can also use these models for building agents in the Gradient AI Agent Platform.
We offer the following foundation models, subject to the AI Model Terms, our Service Terms, and the Terms of Service Agreement:
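As a quick orientation, the following is a minimal sketch of a serverless inference request body in the Chat Completions format that the usage notes below reference. The endpoint URL and key placeholder are illustrative assumptions, not documented values; check the Inference Hub API reference for the exact endpoint and authentication details for your account.

```python
import json

# Assumed endpoint and key placeholder -- replace with the values from
# your DigitalOcean account before sending a real request.
API_BASE = "https://inference.do-ai.run/v1"  # assumption for illustration
headers = {
    "Authorization": "Bearer YOUR_MODEL_ACCESS_KEY",  # DigitalOcean API access key
    "Content-Type": "application/json",
}
payload = {
    "model": "llama3.3-70b-instruct",  # any Model ID from the tables below
    "messages": [
        {"role": "user", "content": "Summarize what a foundation model is."}
    ],
    "max_tokens": 256,
}
# POST json.dumps(payload) to f"{API_BASE}/chat/completions" with `headers`.
print(json.dumps(payload, indent=2))
```

Swapping the `model` value for another Model ID from the tables is usually the only change needed to try a different model, subject to each model's supported APIs listed in its usage notes.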
Anthropic Models
Anthropic models available on the Gradient AI Platform support tool (function) calling, prompt caching, and other features. See the usage notes in the following table for details. Refer to the provider documentation for other supported features.
| Model | Model ID | Max Output Tokens | Use for | Usage Notes | Tentative End-of-Support |
|---|---|---|---|---|---|
| Claude Sonnet 4.6 | `anthropic-claude-4.6-sonnet` | 64,000 | ✔️ Serverless inference ✔️ ADK ✔️ Agents | ✔️ Input context window of up to 1M tokens ✔️ Prompt caching ✔️ Tool (function) calling | No sooner than February 2027 |
| Claude Sonnet 4.5 | `anthropic-claude-4.5-sonnet` | 64,000 | ✔️ Serverless inference ✔️ ADK ✔️ Agents | ✔️ Input context window of up to 1M tokens ✔️ Prompt caching ✔️ Tool calling | No sooner than September 2026 |
| Claude Sonnet 4 | `anthropic-claude-sonnet-4` | 64,000 | ✔️ Serverless inference ✔️ ADK ✔️ Agents | ✔️ Input context window of up to 1M tokens ✔️ Prompt caching ✔️ Tool calling | No sooner than May 2026 |
| Claude Haiku 4.5 | `anthropic-claude-haiku-4.5` | 64,000 | ✔️ Serverless inference ✔️ ADK ✔️ Agents | ✔️ Prompt caching ✔️ Tool calling | No sooner than October 2026 |
| Claude Opus 4.7 | `anthropic-claude-opus-4.7` | 128,000 | ✔️ Serverless inference ✔️ ADK ✔️ Agents | ✔️ Input context window of up to 1M tokens ✔️ Prompt caching ✔️ Tool calling | No sooner than April 16, 2027 |
| Claude Opus 4.6 | `anthropic-claude-opus-4.6` | 128,000 | ✔️ Serverless inference ✔️ ADK ✔️ Agents | ✔️ Input context window of up to 1M tokens ✔️ Prompt caching ✔️ Tool calling | No sooner than February 2027 |
| Claude Opus 4.5 | `anthropic-claude-opus-4.5` | 64,000 | ✔️ Serverless inference ✔️ ADK ✔️ Agents | ✔️ Prompt caching ✔️ Tool calling | No sooner than November 2026 |
| Claude Opus 4.1 | `anthropic-claude-4.1-opus` | 32,000 | ✔️ Serverless inference ✔️ ADK ✔️ Agents | ✔️ Prompt caching ✔️ Tool calling | No sooner than August 2026 |
| Claude Opus 4 | `anthropic-claude-opus-4` | 32,000 | ✔️ Serverless inference ✔️ ADK ✔️ Agents | ✔️ Prompt caching ✔️ Tool calling | No sooner than May 2026 |
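The tool (function) calling noted above can be sketched as a Chat Completions request body with a `tools` array. The tool itself (`get_droplet_status`) and its schema are hypothetical examples for illustration, not a real API; the field names follow the OpenAI-style function-calling shape.

```python
import json

# Hedged sketch of a tool-calling request for an Anthropic model.
# `get_droplet_status` is a made-up tool used only to show the shape.
payload = {
    "model": "anthropic-claude-4.5-sonnet",  # Model ID from the table above
    "messages": [{"role": "user", "content": "Is droplet web-01 running?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_droplet_status",  # hypothetical tool name
                "description": "Look up the status of a Droplet by name.",
                "parameters": {
                    "type": "object",
                    "properties": {"name": {"type": "string"}},
                    "required": ["name"],
                },
            },
        }
    ],
}
print(json.dumps(payload))
```

If the model decides a tool is needed, the response contains a tool call with arguments matching the declared JSON schema, which your application executes before sending the result back in a follow-up message.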
Arcee Models
| Model | Model ID | Max Output Tokens | Use for | Usage Notes |
|---|---|---|---|---|
| Trinity Large (Public Preview) | `arcee-trinity-large-thinking` | 128,000 | ✔️ Serverless inference ✔️ ADK | ✔️ Chat Completions API for sending prompts. ✔️ Prompt caching. ℹ️ Use is subject to Public Preview Terms, including the Arcee Terms & Conditions. |
fal Models
| Model | Model ID | Type | Use for | Usage Notes |
|---|---|---|---|---|
| Fast SDXL | `fal-ai/fast-sdxl` | Image generation | ✔️ Serverless inference ✔️ ADK | ℹ️ Multimodal and generative model |
| Flux Schnell | `fal-ai/flux/schnell` | Image generation | ✔️ Serverless inference ✔️ ADK | ℹ️ Multimodal and generative model |
| Stable Audio 2.5 (Text-to-Audio) | `fal-ai/stable-audio-25/text-to-audio` | Text-to-audio | ✔️ Serverless inference ✔️ ADK | ℹ️ Multimodal and generative model |
| Multilingual TTS v2 | `fal-ai/elevenlabs/tts/multilingual-v2` | Text-to-speech | ✔️ Serverless inference ✔️ ADK | ℹ️ Multimodal and generative model |
OpenAI Models
OpenAI models available on the Gradient AI Platform support tool (function) calling, prompt caching, and other features. See the usage notes in the following table for details. Refer to the provider documentation for other supported features.
| Model | Model ID | Max Output Tokens | Use for | Usage Notes |
|---|---|---|---|---|
| GPT-5.4 | `openai-gpt-5.4` | 128,000 | ✔️ Serverless inference ✔️ ADK ✔️ Agents | ✔️ Input context window of up to 1M tokens ✔️ Only the Responses API for sending prompts for serverless inference ✔️ Prompt caching ✔️ Tool calling |
| GPT-5.4 mini | `openai-gpt-5.4-mini` | 128,000 | ✔️ Serverless inference ✔️ ADK | ✔️ Only the Responses API for sending prompts for serverless inference ✔️ Prompt caching ✔️ Tool calling |
| GPT-5.4 nano | `openai-gpt-5.4-nano` | 128,000 | ✔️ Serverless inference ✔️ ADK | ✔️ Only the Responses API for sending prompts for serverless inference ✔️ Prompt caching ✔️ Tool calling |
| GPT-5.4 pro | `openai-gpt-5.4-pro` | 128,000 | ✔️ Serverless inference ✔️ ADK | ✔️ Only the Responses API for sending prompts for serverless inference ✔️ Tool calling |
| GPT-5.3-Codex | `openai-gpt-5.3-codex` | 128,000 | ✔️ Serverless inference ✔️ ADK | ✔️ Input context window of up to 400,000 tokens ✔️ Prompt caching ✔️ Tool calling |
| GPT-5.2 | `openai-gpt-5.2` | 128,000 | ✔️ Serverless inference ✔️ ADK ✔️ Agents | ✔️ Prompt caching ✔️ Tool calling |
| GPT-5.2 pro | `openai-gpt-5-2-pro` | 128,000 | ✔️ Serverless inference ✔️ ADK ✔️ Agents | ✔️ Prompt caching ✔️ Tool calling |
| GPT-5.1-Codex-Max | `openai-gpt-5.1-codex-max` | 128,000 | ✔️ Serverless inference ✔️ ADK | ✔️ Prompt caching ✔️ Tool calling |
| GPT-5 | `openai-gpt-5` | 128,000 | ✔️ Serverless inference ✔️ ADK ✔️ Agents | ✔️ Prompt caching ✔️ Tool calling |
| GPT-5 mini | `openai-gpt-5-mini` | 128,000 | ✔️ Serverless inference ✔️ ADK ✔️ Agents | ✔️ Prompt caching ✔️ Tool calling |
| GPT-5 nano | `openai-gpt-5-nano` | 128,000 | ✔️ Serverless inference ✔️ ADK ✔️ Agents | ✔️ Prompt caching ✔️ Tool calling |
| GPT-4.1 | `openai-gpt-4.1` | 32,768 | ✔️ Serverless inference ✔️ ADK ✔️ Agents | ✔️ Prompt caching ✔️ Tool calling |
| GPT-4o | `openai-gpt-4o` | 16,384 | ✔️ Serverless inference ✔️ ADK ✔️ Agents | ✔️ Prompt caching ✔️ Tool calling |
| GPT-4o mini | `openai-gpt-4o-mini` | 16,384 | ✔️ Serverless inference ✔️ ADK ✔️ Agents | ✔️ Prompt caching ✔️ Tool calling |
| o1 | `openai-o1` | Not published | ✔️ Serverless inference ✔️ ADK ✔️ Agents | ✔️ Prompt caching ✔️ Tool calling |
| o3 | `openai-o3` | Not published | ✔️ Serverless inference ✔️ ADK ✔️ Agents | ✔️ Prompt caching ✔️ Tool calling |
| o3-mini | `openai-o3-mini` | Not published | ✔️ Serverless inference ✔️ ADK ✔️ Agents | ✔️ Prompt caching ✔️ Tool calling |
| GPT Image 1 | `openai-gpt-image-1` | Not published | ✔️ Serverless inference ✔️ ADK ✔️ Agents | ✔️ Prompt caching ✔️ Tool calling |
| GPT Image 1.5 | `openai-gpt-image-1.5` | Not published | ✔️ Serverless inference ✔️ ADK | |
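Several GPT-5.4-family models accept only the Responses API for serverless inference, per their usage notes. The following is a minimal sketch of a Responses-style request body; the field names follow OpenAI's Responses API, and the `/responses` endpoint path is an assumption for illustration rather than a documented Inference Hub value.

```python
import json

# Sketch of a Responses API request body. Unlike Chat Completions, the
# prompt goes in `input` and the output limit is `max_output_tokens`.
payload = {
    "model": "openai-gpt-5.4-mini",  # Model ID from the table above
    "input": "Write a haiku about object storage.",
    "max_output_tokens": 128,
}
# POST this body to the /responses endpoint (assumed path) rather than
# /chat/completions, using your DigitalOcean API access key as the bearer token.
print(json.dumps(payload))
```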
DigitalOcean-Hosted Models
| Provider | Model | Model ID | Parameters | Max Output Tokens | Use for | Usage Notes |
|---|---|---|---|---|---|---|
| Alibaba | Qwen3-32B | `alibaba-qwen3-32b` | 32 billion | 40,960 | ✔️ Serverless inference ✔️ ADK | |
| DeepSeek | DeepSeek R1 Distill Llama 70B | `deepseek-r1-distill-llama-70b` | 70 billion | 32,768 | ✔️ Serverless inference ✔️ ADK ✔️ Agents | ℹ️ When using in a user-facing agent, we strongly recommend adding all available guardrails for a safer conversational experience. |
| MiniMax | M2.5 (Public Preview) | `minimax-m2.5` | 230 billion | 128,000 | ✔️ Serverless inference ✔️ ADK ✔️ Agents | ✔️ Chat Completions and Responses APIs for sending prompts for serverless inference. ℹ️ Use is subject to Public Preview Terms, including the MiniMax Model License. |
| Moonshot AI | Kimi K2.5 | `kimi-k2.5` | 1 trillion | 32,768 | ✔️ Serverless inference ✔️ ADK ✔️ Agents | ✔️ Chat Completions and Responses APIs for sending prompts for serverless inference. ℹ️ Use is subject to a Modified MIT license. |
| Meta | Llama 3.3 Instruct-70B | `llama3.3-70b-instruct` | 70 billion | 128,000 | ✔️ Serverless inference ✔️ ADK ✔️ Agents | |
| NVIDIA | Nemotron-3-Super-120B (Public Preview) | `nvidia-nemotron-3-super-120b` | 120 billion | Not published | ✔️ Serverless inference ✔️ ADK ✔️ Agents | ✔️ Chat Completions and Responses APIs for sending prompts for serverless inference. ℹ️ Use is subject to Public Preview Terms, including the NVIDIA Model License. |
| OpenAI | gpt-oss-120b | `openai-gpt-oss-120b` | 117 billion | 131,072 | ✔️ Serverless inference ✔️ ADK ✔️ Agents | |
| OpenAI | gpt-oss-20b | `openai-gpt-oss-20b` | 21 billion | 131,072 | ✔️ Serverless inference ✔️ ADK ✔️ Agents | |
| Z.ai | GLM 5 | `glm-5` | 744 billion | 128,000 | ✔️ Serverless inference ✔️ ADK ✔️ Agents | ✔️ Chat Completions and Responses APIs for sending prompts for serverless inference. ℹ️ Use is subject to the MIT License. |