Give Feedback

GenAI Inference Router

Generated on 20 Jul 2026

Copy page as Markdown View page as Markdown

This content is automatically generated from https://github.com/digitalocean-labs/mcp-digitalocean/blob/main/pkg/registry/genai-inferencerouter/README.md.

What is a model router?

A model router (sometimes called an inference router) is a named GenAI configuration in your DigitalOcean account. It does not run inference by itself; it defines how your app or agent should choose models for different kinds of work. Concretely, a router has:

Policies — Each policy ties a task to an ordered list of model ids and a selection policy (for example, prefer the fastest or cheapest model among the candidates for that task). A task is either a built-in task_slug (e.g. code generation, summarization) or a custom task you describe with a name and short description.
Fallback models — A required ordered list the API can use when primary model choices in a policy are not available, so traffic still has a path to complete.

Use cases and how routing fits together (for agents)

From DigitalOcean’s product documentation, an Inference Router is meant for production routing over serverless inference and dedicated inference. Instead of hard-coding one model per call, you define tasks (preset routes tuned by DigitalOcean or custom routes you name and describe), attach model pools and a selection policy (cost vs latency tradeoffs; the control panel also describes “optimal” presets and manual ordering for custom tasks), and fallback models so unmatched or degraded traffic still completes. The router evaluates each request against those tasks and policies. This MCP’s create/update/list/get/delete and preset-list tools map to the account-level router configuration (/v2/gen-ai/models/routers and related endpoints) that backs that behavior. After a router exists, client applications typically invoke it as a drop-in model target (for example model set to router:<your-router-name> in Chat Completions or Responses against the inference runtime—see the official how-to for exact URLs, headers such as model affinity, and response details like which route was selected).

Further reading

How to Use Inference Router — end-to-end: concepts, control panel vs API (POST /v2/gen-ai/models/routers), preset vs custom tasks, fallbacks, calling the router from inference APIs, playground and metrics.
DigitalOcean Inference Engine — where Inference Router sits alongside serverless, batch, and dedicated inference, evaluations, and the broader “single control plane” story.

When you use these MCP tools, you are managing that routing configuration through the typed godo.GradientAI client (same auth, base URL, and transport as the rest of this MCP server). Responses are formatted JSON aligned with the API (tasks for presets, model_routers / model_router, and config on get/create/update).

godo surface

Calls map to:

GradientAI.CreateInferenceRouter — create (POST /v2/gen-ai/models/routers)
GradientAI.ListInferenceRouters — list (GET … with page, per_page)
GradientAI.GetInferenceRouter — get by UUID
GradientAI.UpdateInferenceRouter — update (PUT …/{uuid})
GradientAI.DeleteInferenceRouter — delete by UUID
GradientAI.ListInferenceRouterTaskPresets — list preset tasks (GET /v2/gen-ai/models/routers/tasks/presets with page, per_page)

Built-in `task_slug` values (how to choose)

Prefer the live catalog — Call genai-inference-router-task-presets (wraps ListInferenceRouterTaskPresets) for current task_slug values, display names, suggested models, and pagination metadata. The API remains authoritative if a slug is unavailable in your account or region.

Practical ways to pick a slug:

genai-inference-router-task-presets — Returns each preset’s task_slug and related fields.
Copy from an existing router — Call genai-inference-router-list or genai-inference-router-get and read task_slug under model_router.config.policies.
Avoid slugs for one-off work — Use a custom_task policy with name and description instead of task_slug (the e2e test does this so tests do not depend on a specific catalog in every environment).

Static reference (snapshot for quick scanning; may drift—use genai-inference-router-task-presets or the docs above for current values):

General: brainstorming-ideation, classification-labeling, opinion-advice-recommendation, planning-task-decomposition, summarization, text-extraction-structured-output, translation
Writing: creative-writing, email-professional-communication-drafting, long-form-article-blog-writing, rewriting-editing, social-media-short-form-content
Software engineering: bug-fixing, code-completion-inline, code-generation, code-performance-optimization, test-writing-code-verification
Knowledge base & document intelligence: knowledge-base-customer-support, long-context-retrieval-aggregation, long-document-qa, rag-system-quality-evaluation, retrieval-quality-cross-domain-ir, text-and-table-grounded-reasoning

Tools

`genai-inference-router-create`

Arguments

Name	Type	Required	Description
`Name`	string	yes	Router name
`PoliciesJson`	string	no	JSON array of policies (omit or `[]` if the API allows); see below
`FallbackModels`	string[]	yes	At least one model id, sent as `fallback_models` (required by the API)

Create request body (via godo): name, optional policies, and required non-empty fallback_models. Each policy must define a task:

Built-in task: set task_slug (e.g. code-generation, summarization, bug-fixing) and models (ordered model id strings). Include selection_policy with prefer set to fastest or cheapest.
Custom task: use custom_task with name and description instead of task_slug, and still set models and selection_policy as needed.

Policies that only set model plus usecase_class (with no task) are rejected by the API with an error like policy 0 task is required.

Example (equivalent JSON body sent by godo; PoliciesJson is only the policies array):

{
  "name": "my-router",
  "policies": [
    {
      "task_slug": "code-generation",
      "models": ["openai-gpt-5", "anthropic-claude-4.6-sonnet"],
      "selection_policy": { "prefer": "fastest" }
    }
  ],
  "fallback_models": ["openai-gpt-oss-120b"]
}

Custom task policy example:

{
  "custom_task": {
    "name": "Code reviewer",
    "description": "Review patches for correctness and style."
  },
  "models": ["openai-gpt-5.2"],
  "selection_policy": { "prefer": "cheapest" }
}

Pass the contents of policies (a JSON array) as the PoliciesJson string. List and get return model_router.config.policies in the same general shape (task_slug or custom_task, models, selection_policy). List summaries and full router payloads may include regions when the API returns them; that field is not set by this MCP on create (see current godo.InferenceRouterCreateRequest).

`genai-inference-router-list`

Arguments

Name	Type	Required	Description
`Page`	number	no	Page (default 1)
`PerPage`	number	no	Page size (default 1000, max 1000)

`genai-inference-router-get`

Arguments

Name	Type	Required	Description
`UUID`	string	yes	Model router UUID

`genai-inference-router-delete`

Arguments

Name	Type	Required	Description
`UUID`	string	yes	Model router UUID to delete

Returns formatted JSON (typically {"uuid":"..."}). If the API responds with an empty body on success, the tool still returns a JSON object containing the requested UUID.

`genai-inference-router-task-presets`

Arguments

Name	Type	Required	Description
`Page`	number	no	Page (default 1)
`PerPage`	number	no	Page size (default 1000, max 1000)

Returns JSON with a tasks array (preset task_slug, names, categories, suggested models, etc.) plus optional meta and links. Use this when choosing built-in slugs for PoliciesJson on create/update.

`genai-inference-router-update`

Arguments

Name	Type	Required	Description
`UUID`	string	yes	Model router UUID
`Name`	string	no	New name (omit if unchanged)
`Description`	string	no	New description (omit if unchanged)
`PoliciesJson`	string	no	JSON array of policies (omit if unchanged); must decode as a JSON array when non-empty
`FallbackModels`	string[]	no	New ordered fallbacks; pass this argument only when updating fallbacks

At least one of Name, Description, non-empty PoliciesJson, or FallbackModels must be provided (matches godo validation).

Enabling the service

Register with --services genai-inferencerouter (or include it in SERVICES). A valid DIGITALOCEAN_API_TOKEN is required.

Notes

The GenAI model router API requires at least one fallback_models entry on create, so the MCP enforces a non-empty FallbackModels list for create (mirroring godo client validation).
Preview / unreleased APIs may only work on specific API hosts or accounts.
Response bodies are returned as formatted JSON text.
Integration tests / tokens: If DELETE …/v2/gen-ai/models/routers/{uuid} returns 403 (“not authorized”), your Personal Access Token can list or create routers but cannot delete them—use a token with write (not read-only) access to the account’s Inference / GenAI resources, or delete the router in the control panel under INFERENCE → Inference Router.

GenAI Inference Router

What is a model router?

Use cases and how routing fits together (for agents)

godo surface

Built-in task_slug values (how to choose)

Tools

genai-inference-router-create

genai-inference-router-list

genai-inference-router-get

genai-inference-router-delete

genai-inference-router-task-presets

genai-inference-router-update

Enabling the service

Notes

We can't find any results for your search.

Built-in `task_slug` values (how to choose)

`genai-inference-router-create`

`genai-inference-router-list`

`genai-inference-router-get`

`genai-inference-router-delete`

`genai-inference-router-task-presets`

`genai-inference-router-update`