How to Use Built-in Tools

Validated on 27 Apr 2026 • Last edited on 27 Apr 2026

Inference provides a single control plane for managing inference workflows. It includes a Model Catalog where you can view available foundation models, including both DigitalOcean-hosted and third-party commercial models, compare model capabilities and pricing, use routing to match inference requests to the best-fit model, and run inference using serverless or dedicated deployments.

Built-in tools are server-side integrations that extend the model’s capabilities during inference. Instead of managing tool orchestration yourself, you add tool definitions to your API request, and the inference API handles tool discovery, execution, and response integration automatically. We provide built-in tools for knowledge base retrieval, the DigitalOcean MCP server, and web search.

Knowledge base retrieval and the DigitalOcean MCP server do not incur charges beyond the standard per-token inference costs. Web search with serverless inference is charged at $10 per 1,000 requests.

Built-in tools work with both the Chat Completions API and the Responses API.
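
Each built-in tool is declared as an object in the tools array of the request body. For quick reference, these are the three tool types covered in this guide, with illustrative placeholder values drawn from the examples below:

"tools": [
  { "type": "knowledge_base_retrieval", "knowledge_base_id": "<your-knowledge-base-id>" },
  { "type": "mcp", "server_label": "digitalocean", "server_url": "https://accounts.mcp.digitalocean.com/mcp" },
  { "type": "web_search", "max_uses": 3, "max_results": 5 }
]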

Retrieve Knowledge Base

Knowledge base retrieval lets the model query your private data sources during inference using retrieval-augmented generation (RAG). You add the knowledge_base_retrieval tool to your API request and the inference API handles retrieval and incorporates the results into the model’s response automatically.

To use knowledge base retrieval, send a POST request with your knowledge base ID. You can find the ID in the DigitalOcean Control Panel or by querying the API. Set tool_choice to auto to let the model decide when to query the knowledge base, or required to always query it before responding.

curl -X POST https://inference.do-ai.run/v1/chat/completions \
  -H "Authorization: Bearer $MODEL_ACCESS_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai-gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": "What are some features of DigitalOcean AI platform?"
      }
    ],
    "tools": [
      {
        "type": "knowledge_base_retrieval",
        "knowledge_base_id": "<your-knowledge-base-id>"
      }
    ],
    "tool_choice": "auto",
    "stream": false,
    "max_tokens": 1024
  }'
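
The request above needs a knowledge base ID. If you prefer to look it up by querying the API rather than the control panel, the following is a rough sketch; the endpoint path and response fields are assumptions based on the DigitalOcean GenAI API, so check the API reference for the exact route:

# Sketch only: list knowledge bases to find their IDs. The endpoint path is an assumption.
curl -s https://api.digitalocean.com/v2/gen-ai/knowledge_bases \
  -H "Authorization: Bearer $DIGITALOCEAN_API_TOKEN"

Each knowledge base entry in the response should include an ID that you can pass as knowledge_base_id.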

For the full set of parameters, see the Serverless Inference API reference.

Use Model Context Protocol (MCP)

MCP servers expose tools that the model can call, such as fetching account data, managing schedules, or interacting with third-party APIs. The MCP built-in tool connects the model to remote MCP servers and orchestrates calls to them.

Connect to an Authenticated MCP Server

You can connect to authenticated MCP servers using bearer token authentication. The following example sends a Chat Completions request that connects to the DigitalOcean Accounts MCP server. Replace $DIGITALOCEAN_API_TOKEN with a valid DigitalOcean personal access token.

curl -X POST https://inference.do-ai.run/v1/chat/completions \
  -H "Authorization: Bearer $MODEL_ACCESS_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai-gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": "Fetch my DigitalOcean account information and summarize it in 2 bullets."
      }
    ],
    "tools": [
      {
        "type": "mcp",
        "server_label": "digitalocean",
        "server_url": "https://accounts.mcp.digitalocean.com/mcp",
        "authorization": "Bearer $DIGITALOCEAN_API_TOKEN",
        "allowed_tools": ["account-get-information"]
      }
    ],
    "tool_choice": "required",
    "stream": false,
    "max_tokens": 512
  }'

The allowed_tools array restricts which tools from the MCP server the model can call. In this example, only the account-get-information tool is available. When omitted, the model can use any tool the server exposes. For the full set of MCP tool parameters, see the Serverless Inference API reference.
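
If you only want the assistant’s text rather than the full JSON payload, you can filter the response with jq. This is a minimal sketch that assumes the standard OpenAI-compatible Chat Completions response shape and a hypothetical response.json file holding the response from the request above:

# Assumes an OpenAI-compatible Chat Completions payload saved to response.json (hypothetical filename).
jq -r '.choices[0].message.content' response.json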

Connect to an Unauthenticated MCP Server

You can also connect to public MCP servers that do not require authentication. The following example sends a Responses API request:

curl -X POST https://inference.do-ai.run/v1/responses \
  -H "Authorization: Bearer $MODEL_ACCESS_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai-gpt-4o",
    "input": "Create a scheduling poll called Team Lunch with two time options for tomorrow at noon and the day after at noon.",
    "tools": [
      {
        "type": "mcp",
        "server_label": "timergy",
        "server_url": "https://api.timergy.com/mcp"
      }
    ],
    "tool_choice": "required",
    "stream": false,
    "max_output_tokens": 512
  }'
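
The Responses API returns generated text in a different shape than Chat Completions. As a rough sketch, assuming an OpenAI-compatible Responses payload saved to a hypothetical response.json file, you can extract the output text like this:

# Assumes an OpenAI-compatible Responses payload; field names may differ, so check the API reference.
jq -r '.output[] | select(.type == "message") | .content[] | select(.type == "output_text") | .text' response.json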

Add Web Search to Inference Request

Web search is a built-in tool that gives the model access to real-time web content during inference. When you add web search to your API request, the model decides when a search is needed and incorporates the results into its response.

To enable web search, include a tool object with type set to web_search in the tools array of your request.

The following example sends a Responses API request with web search enabled:

curl -X POST https://inference.do-ai.run/v1/responses \
  -H "Authorization: Bearer $MODEL_ACCESS_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai-gpt-4o",
    "input": "What are the latest pricing changes for DigitalOcean Droplets?",
    "tools": [
      {
        "type": "web_search",
        "max_uses": 3,
        "max_results": 5
      }
    ],
    "max_output_tokens": 1024,
    "stream": false
  }'

When the model determines that a prompt benefits from web search, it searches for relevant information and incorporates the results into its response.

You can optionally limit how many searches the model performs per request with max_uses (1-5) and how many results each search returns with max_results (1-10, default 5).

When max_uses is reached, the model produces a final response using the results collected so far. For the full set of web search parameters, see the Serverless Inference API reference.

Use Built-in Tools With Dedicated Inference

You can use built-in tools with dedicated inference. Set model to dedicated:<dedicated-inference-name>:<model_slug>, using the dedicated inference name and model slug, which you can find using the API. For example:

curl -X POST https://inference.do-ai.run/v1/chat/completions \
  -H "Authorization: Bearer $MODEL_ACCESS_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "dedicated:<dedicated-inference-name>:<model_slug>",
    "messages": [
      {
        "role": "user",
        "content": "What features does DigitalOcean AI Platform offer?"
      }
    ],
    "tools": [
      {
        "type": "web_search",
        "max_uses": 2
      }
    ],
    "stream": false,
    "max_tokens": 1024
  }'
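
To find a model slug for the model string above, one option is the model listing endpoint. This sketch assumes the inference API exposes the standard OpenAI-compatible GET /v1/models route; how dedicated inference deployments are listed may differ, so check the API reference:

# Assumption: standard OpenAI-compatible model listing endpoint.
curl -s https://inference.do-ai.run/v1/models \
  -H "Authorization: Bearer $MODEL_ACCESS_KEY" | jq -r '.data[].id'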
