GenAI Inference Router
Generated on 29 Jun 2026
This content is automatically generated from https://github.com/digitalocean-labs/mcp-digitalocean/blob/main/pkg/registry/genai-inferencerouter/README.md.
What is a model router?
A model router (sometimes called an inference router) is a named GenAI configuration in your DigitalOcean account. It does not run inference by itself; it defines how your app or agent should choose models for different kinds of work. Concretely, a router has:
- Policies — Each policy ties a task to an ordered list of model ids and a selection policy (for example, prefer the fastest or cheapest model among the candidates for that task). A task is either a built-in
task_slug(e.g. code generation, summarization) or a custom task you describe with a name and shortdescription. - Fallback models — A required ordered list the API can use when primary model choices in a policy are not available, so traffic still has a path to complete.
Use cases and how routing fits together (for agents)
From DigitalOcean’s product documentation, an Inference Router is meant for production routing over serverless inference and dedicated inference. Instead of hard-coding one model per call, you define tasks (preset routes tuned by DigitalOcean or custom routes you name and describe), attach model pools and a selection policy (cost vs latency tradeoffs; the control panel also describes “optimal” presets and manual ordering for custom tasks), and fallback models so unmatched or degraded traffic still completes. The router evaluates each request against those tasks and policies. This MCP’s create/update/list/get/delete and preset-list tools map to the account-level router configuration (/v2/gen-ai/models/routers and related endpoints) that backs that behavior. After a router exists, client applications typically invoke it as a drop-in model target (for example model set to router:<your-router-name> in Chat Completions or Responses against the inference runtime—see the official how-to for exact URLs, headers such as model affinity, and response details like which route was selected).
Further reading
- How to Use Inference Router — end-to-end: concepts, control panel vs API (
POST /v2/gen-ai/models/routers), preset vs custom tasks, fallbacks, calling the router from inference APIs, playground and metrics. - DigitalOcean Inference Engine — where Inference Router sits alongside serverless, batch, and dedicated inference, evaluations, and the broader “single control plane” story.
When you use these MCP tools, you are managing that routing configuration through the typed godo.GradientAI client (same auth, base URL, and transport as the rest of this MCP server). Responses are formatted JSON aligned with the API (tasks for presets, model_routers / model_router, and config on get/create/update).
godo surface
Calls map to:
GradientAI.CreateInferenceRouter— create (POST /v2/gen-ai/models/routers)GradientAI.ListInferenceRouters— list (GET …withpage,per_page)GradientAI.GetInferenceRouter— get by UUIDGradientAI.UpdateInferenceRouter— update (PUT …/{uuid})GradientAI.DeleteInferenceRouter— delete by UUIDGradientAI.ListInferenceRouterTaskPresets— list preset tasks (GET /v2/gen-ai/models/routers/tasks/presetswithpage,per_page)
Built-in task_slug values (how to choose)
Prefer the live catalog — Call genai-inference-router-task-presets (wraps ListInferenceRouterTaskPresets) for current task_slug values, display names, suggested models, and pagination metadata. The API remains authoritative if a slug is unavailable in your account or region.
Practical ways to pick a slug:
genai-inference-router-task-presets— Returns each preset’stask_slugand related fields.- Copy from an existing router — Call
genai-inference-router-listorgenai-inference-router-getand readtask_slugundermodel_router.config.policies. - Avoid slugs for one-off work — Use a
custom_taskpolicy withnameanddescriptioninstead oftask_slug(the e2e test does this so tests do not depend on a specific catalog in every environment).
Static reference (snapshot for quick scanning; may drift—use genai-inference-router-task-presets or the docs above for current values):
- General:
brainstorming-ideation,classification-labeling,opinion-advice-recommendation,planning-task-decomposition,summarization,text-extraction-structured-output,translation - Writing:
creative-writing,email-professional-communication-drafting,long-form-article-blog-writing,rewriting-editing,social-media-short-form-content - Software engineering:
bug-fixing,code-completion-inline,code-generation,code-performance-optimization,test-writing-code-verification - Knowledge base & document intelligence:
knowledge-base-customer-support,long-context-retrieval-aggregation,long-document-qa,rag-system-quality-evaluation,retrieval-quality-cross-domain-ir,text-and-table-grounded-reasoning
Tools
genai-inference-router-create
Arguments
| Name | Type | Required | Description |
|---|---|---|---|
Name |
string | yes | Router name |
PoliciesJson |
string | no | JSON array of policies (omit or [] if the API allows); see below |
FallbackModels |
string[] | yes | At least one model id, sent as fallback_models (required by the API) |
Create request body (via godo): name, optional policies, and required non-empty fallback_models. Each policy must define a task:
- Built-in task: set
task_slug(e.g.code-generation,summarization,bug-fixing) andmodels(ordered model id strings). Includeselection_policywithpreferset tofastestorcheapest. - Custom task: use
custom_taskwithnameanddescriptioninstead oftask_slug, and still setmodelsandselection_policyas needed.
Policies that only set model plus usecase_class (with no task) are rejected by the API with an error like policy 0 task is required.
Example (equivalent JSON body sent by godo; PoliciesJson is only the policies array):
{
"name": "my-router",
"policies": [
{
"task_slug": "code-generation",
"models": ["openai-gpt-5", "anthropic-claude-4.6-sonnet"],
"selection_policy": { "prefer": "fastest" }
}
],
"fallback_models": ["openai-gpt-oss-120b"]
}Custom task policy example:
{
"custom_task": {
"name": "Code reviewer",
"description": "Review patches for correctness and style."
},
"models": ["openai-gpt-5.2"],
"selection_policy": { "prefer": "cheapest" }
}Pass the contents of policies (a JSON array) as the PoliciesJson string. List and get return model_router.config.policies in the same general shape (task_slug or custom_task, models, selection_policy). List summaries and full router payloads may include regions when the API returns them; that field is not set by this MCP on create (see current godo.InferenceRouterCreateRequest).
genai-inference-router-list
Arguments
| Name | Type | Required | Description |
|---|---|---|---|
Page |
number | no | Page (default 1) |
PerPage |
number | no | Page size (default 1000, max 1000) |
genai-inference-router-get
Arguments
| Name | Type | Required | Description |
|---|---|---|---|
UUID |
string | yes | Model router UUID |
genai-inference-router-delete
Arguments
| Name | Type | Required | Description |
|---|---|---|---|
UUID |
string | yes | Model router UUID to delete |
Returns formatted JSON (typically {"uuid":"..."}). If the API responds with an empty body on success, the tool still returns a JSON object containing the requested UUID.
genai-inference-router-task-presets
Arguments
| Name | Type | Required | Description |
|---|---|---|---|
Page |
number | no | Page (default 1) |
PerPage |
number | no | Page size (default 1000, max 1000) |
Returns JSON with a tasks array (preset task_slug, names, categories, suggested models, etc.) plus optional meta and links. Use this when choosing built-in slugs for PoliciesJson on create/update.
genai-inference-router-update
Arguments
| Name | Type | Required | Description |
|---|---|---|---|
UUID |
string | yes | Model router UUID |
Name |
string | no | New name (omit if unchanged) |
Description |
string | no | New description (omit if unchanged) |
PoliciesJson |
string | no | JSON array of policies (omit if unchanged); must decode as a JSON array when non-empty |
FallbackModels |
string[] | no | New ordered fallbacks; pass this argument only when updating fallbacks |
At least one of Name, Description, non-empty PoliciesJson, or FallbackModels must be provided (matches godo validation).
Enabling the service
Register with --services genai-inferencerouter (or include it in SERVICES). A valid DIGITALOCEAN_API_TOKEN is required.
Notes
- The GenAI model router API requires at least one
fallback_modelsentry on create, so the MCP enforces a non-emptyFallbackModelslist for create (mirroringgodoclient validation). - Preview / unreleased APIs may only work on specific API hosts or accounts.
- Response bodies are returned as formatted JSON text.
- Integration tests / tokens: If
DELETE …/v2/gen-ai/models/routers/{uuid}returns 403 (“not authorized”), your Personal Access Token can list or create routers but cannot delete them—use a token with write (not read-only) access to the account’s Inference / GenAI resources, or delete the router in the control panel under INFERENCE → Inference Router.