What serverless inference is and how it differs from dedicated inference.
DigitalOcean Gradient™ AI Inference Hub How-Tos
Generated on 17 Apr 2026
DigitalOcean Gradient™ AI Inference Hub provides a single control plane for managing inference workflows. It includes a Model Catalog where you can view available foundation models, including both DigitalOcean-hosted and third-party commercial models, compare capabilities and pricing, and run inference using serverless or dedicated deployments. DigitalOcean Gradient AI Inference Hub is in private preview. You can contact support for questions or assistance.
Serverless Inference
Synchronous and asynchronous API endpoints for serverless inference.
Retrieve models available for serverless inference.
Send prompts and use reasoning with the Chat Completions API.
Send prompts with the Responses API.
Use prompt caching with the Chat Completions and Responses APIs.
Use reasoning with the Chat Completions and Responses APIs.
Generate or edit images from text prompts.
Use serverless inference after updating a model.
Generate images, audio, or text-to-speech output using fal models.
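The serverless Chat Completions endpoint listed above follows the widely used OpenAI-compatible request shape. The sketch below builds and sends such a request; the base URL (`https://inference.do-ai.run/v1`), the `GRADIENT_MODEL_ACCESS_KEY` environment variable, and the model name are illustrative assumptions, not confirmed by this page — check your account's documentation for the exact values.

```python
import json
import os
import urllib.request

# Assumed base URL for Gradient serverless inference (illustrative).
BASE_URL = "https://inference.do-ai.run/v1"


def build_chat_request(prompt: str, model: str = "llama3.3-70b-instruct"):
    """Build an OpenAI-style Chat Completions request as (url, headers, body).

    The model name is a placeholder; list available models to find real IDs.
    """
    url = f"{BASE_URL}/chat/completions"
    headers = {
        "Content-Type": "application/json",
        # Model access keys are assumed to be sent as bearer tokens.
        "Authorization": f"Bearer {os.environ.get('GRADIENT_MODEL_ACCESS_KEY', '')}",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return url, headers, body


def send(url: str, headers: dict, body: bytes) -> dict:
    """POST the request and decode the JSON response."""
    req = urllib.request.Request(url, data=body, headers=headers, method="POST")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


url, headers, body = build_chat_request("What is serverless inference?")
# Sending requires a valid model access key and network access:
# reply = send(url, headers, body)
# print(reply["choices"][0]["message"]["content"])
```

Separating request construction from sending keeps the example inspectable without a live key; the same pattern extends to the Responses API by changing the path and payload fields.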
Manage Model Catalog
Identify the right model for your use case by filtering available foundation models by capabilities and price.
Use Model Playground
Test and compare foundation models in the Model Playground.
Manage Model Access Keys
Create and edit model access keys to use serverless inference endpoints.