What serverless inference is and how it differs from dedicated inference.
DigitalOcean Gradient™ AI Inference Hub How-Tos
Generated on 17 Apr 2026
DigitalOcean Gradient™ AI Inference Hub provides a single control plane for managing inference workflows. It includes a Model Catalog where you can view available foundation models, including both DigitalOcean-hosted and third-party commercial models, compare capabilities and pricing, and run inference using serverless or dedicated deployments. DigitalOcean Gradient AI Inference Hub is in private preview. You can contact support for questions or assistance.
Serverless Inference
Synchronous and asynchronous API endpoints for serverless inference.
Retrieve models available for serverless inference.
Send prompts and use reasoning with the Chat Completions API.
Send prompts with the Responses API.
Use prompt caching with the Chat Completions and Responses APIs.
Use reasoning with the Chat Completions and Responses APIs.
Generate or edit images from text prompts.
Use serverless inference after updating a model.
Generate images, audio, or text-to-speech output using fal models.
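The serverless Chat Completions endpoint listed above follows the widely used OpenAI-compatible request shape. The sketch below builds and sends such a request; the base URL (`https://inference.do-ai.run/v1`), the `GRADIENT_MODEL_ACCESS_KEY` environment variable, and the model name are illustrative assumptions, not confirmed by this page — check your account's documentation for the exact values.

```python
import json
import os
import urllib.request

# Assumed base URL for Gradient serverless inference (illustrative).
BASE_URL = "https://inference.do-ai.run/v1"


def build_chat_request(prompt: str, model: str = "llama3.3-70b-instruct"):
    """Build an OpenAI-style Chat Completions request as (url, headers, body).

    The model name is a placeholder; list available models to find real IDs.
    """
    url = f"{BASE_URL}/chat/completions"
    headers = {
        "Content-Type": "application/json",
        # Model access keys are assumed to be sent as bearer tokens.
        "Authorization": f"Bearer {os.environ.get('GRADIENT_MODEL_ACCESS_KEY', '')}",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return url, headers, body


def send(url: str, headers: dict, body: bytes) -> dict:
    """POST the request and decode the JSON response."""
    req = urllib.request.Request(url, data=body, headers=headers, method="POST")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


url, headers, body = build_chat_request("What is serverless inference?")
# Sending requires a valid model access key and network access:
# reply = send(url, headers, body)
# print(reply["choices"][0]["message"]["content"])
```

Separating request construction from sending keeps the example inspectable without a live key; the same pattern extends to the Responses API by changing the path and payload fields.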
Manage Model Catalog
Identify the right model for your use case by filtering available foundation models by capabilities and price.
Use Model Playground
Test and compare foundation models in the Model Playground.
Manage Model Access Keys
Create and edit model access keys to use serverless inference endpoints.