Inference Reference
Validated on 20 Apr 2026 • Last edited on 27 Apr 2026
Inference provides a single control plane for managing inference workflows. It includes a Model Catalog, where you can browse available foundation models (both DigitalOcean-hosted and third-party commercial models) and compare model capabilities and pricing. You can also use routing to match each inference request to the best-fit model, and run inference using serverless or dedicated deployments.
The DigitalOcean API
The DigitalOcean API lets you manage resources programmatically with standard HTTP requests. All actions available in the control panel are also available through the API.
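For example, every API request is a standard HTTP call authenticated with a personal access token. The sketch below uses only Python's standard library to build such a request against the real `/v2/account` endpoint; the `DIGITALOCEAN_TOKEN` environment variable name is an assumption for illustration.

```python
import json
import os
import urllib.request

API_BASE = "https://api.digitalocean.com"

def build_request(path: str, token: str) -> urllib.request.Request:
    """Build an authenticated GET request for the DigitalOcean API."""
    return urllib.request.Request(
        f"{API_BASE}{path}",
        headers={
            "Authorization": f"Bearer {token}",  # personal access token
            "Content-Type": "application/json",
        },
    )

if __name__ == "__main__" and "DIGITALOCEAN_TOKEN" in os.environ:
    # /v2/account returns details about the authenticated account.
    req = build_request("/v2/account", os.environ["DIGITALOCEAN_TOKEN"])
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["account"]["email"])
```

The same pattern applies to any control-panel action exposed through the API: change the path and, for writes, attach a JSON body and the appropriate HTTP method.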
- Serverless Inference API: Interact directly with foundation models for chat completions, and for generating images, audio, and text-to-speech output.
- Dedicated Inference API: Manage your dedicated inference deployments. Dedicated Inference is available in public preview; you can opt in from the Feature Preview page.
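A serverless inference call is an OpenAI-style chat completion request. The sketch below builds one with Python's standard library; the `INFERENCE_URL` base and any model name you pass in are assumptions, so check the Serverless Inference API reference and the Model Catalog for the current endpoint and model identifiers.

```python
import json
import urllib.request

# Assumed endpoint for illustration; confirm against the current
# Serverless Inference API reference before use.
INFERENCE_URL = "https://inference.do-ai.run/v1/chat/completions"

def build_chat_request(model: str, prompt: str, access_key: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request for serverless inference."""
    body = json.dumps({
        "model": model,  # a model identifier from the Model Catalog
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        INFERENCE_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {access_key}",  # model access key
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Send the request with `urllib.request.urlopen` (or any HTTP client) and read the assistant's reply from the `choices` array of the JSON response.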
The Inference SDK
Use the official DigitalOcean Python client library to manage inference resources programmatically.
You can also use the official DigitalOcean TypeScript library or Go library.
The SDK will be deprecated in a future release.
The DigitalOcean MCP Server
The DigitalOcean MCP server lets you use natural language prompts to manage AI resources, including creating, updating, listing, and deleting Dedicated Inference endpoints and browsing models in the Model Catalog. Supported operations use argument-based input.
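MCP servers are typically registered in an MCP client's configuration file. The fragment below is a hedged sketch of what such an entry might look like; the package name, command, and environment variable are assumptions, so follow the DigitalOcean MCP server's own setup instructions for the exact values.

```json
{
  "mcpServers": {
    "digitalocean": {
      "command": "npx",
      "args": ["-y", "@digitalocean/mcp"],
      "env": {
        "DIGITALOCEAN_API_TOKEN": "<your personal access token>"
      }
    }
  }
}
```

Once registered, the client can invoke the server's tools from natural language prompts, passing arguments such as an endpoint name or model identifier.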