Quota-Specific Response Headers For Serverless Inference

Validated on 19 May 2026 • Last edited on 12 Jun 2026

Inference provides a single control plane for managing inference workflows. It includes a Model Catalog where you can view available foundation models, including both DigitalOcean-hosted and third-party commercial models, compare model capabilities and pricing, use routing to match inference requests to the best-fit model, and run inference using serverless or dedicated deployments.

The following response headers report the quotas specific to serverless inference, which govern inference workloads:

  • x-ratelimit-limit-<limit-type>: Total capacity configured for the evaluated quota bucket, corresponding to the number of requests (x-ratelimit-limit-requests), tokens per day (x-ratelimit-limit-tokens-per-day), and tokens per minute (x-ratelimit-limit-tokens-per-minute).

  • x-ratelimit-remaining-<limit-type>: Remaining available capacity after the current request evaluation, corresponding to the number of remaining requests (x-ratelimit-remaining-requests), remaining tokens per day (x-ratelimit-remaining-tokens-per-day), and remaining tokens per minute (x-ratelimit-remaining-tokens-per-minute).

  • x-ratelimit-reset-<limit-type>: When non-zero, a Unix epoch timestamp in seconds that projects when the bucket will have refilled enough capacity to satisfy a request of the size that was just evaluated or rejected. Because the bucket refills continuously, this is a forward projection, not a window-boundary timestamp. These headers correspond to the number of requests (x-ratelimit-reset-requests), tokens per day (x-ratelimit-reset-tokens-per-day), and tokens per minute (x-ratelimit-reset-tokens-per-minute).

    A value of 0 indicates that when the request was evaluated, there was sufficient capacity, and no action is required.

    Note
    x-ratelimit-reset-<limit-type> is not a fixed-window boundary. Since inference buckets refill continuously, the value is a refill projection.
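
As an illustration of how a client might read these headers, the following Python sketch groups them by limit type and derives a wait time from the reset values. It is not an official client. The names QuotaBucket, parse_quota_headers, and seconds_until_retry are hypothetical, and the sketch assumes that each header value arrives as a plain decimal integer string and that any of the headers may be absent from a given response.

```python
import time
from dataclasses import dataclass
from typing import Dict, Mapping, Optional

# Limit types named in the header list above.
LIMIT_TYPES = ("requests", "tokens-per-day", "tokens-per-minute")


@dataclass
class QuotaBucket:
    """Hypothetical container for one quota bucket's header values."""
    limit: Optional[int]      # x-ratelimit-limit-<limit-type>
    remaining: Optional[int]  # x-ratelimit-remaining-<limit-type>
    reset: Optional[int]      # x-ratelimit-reset-<limit-type>; 0 means no wait


def _to_int(value: Optional[str]) -> Optional[int]:
    # Assumption: values are decimal integers. Anything else becomes None.
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def parse_quota_headers(headers: Mapping[str, str]) -> Dict[str, QuotaBucket]:
    """Group the quota headers by limit type, skipping types with no headers."""
    lowered = {name.lower(): value for name, value in headers.items()}
    buckets: Dict[str, QuotaBucket] = {}
    for limit_type in LIMIT_TYPES:
        bucket = QuotaBucket(
            limit=_to_int(lowered.get(f"x-ratelimit-limit-{limit_type}")),
            remaining=_to_int(lowered.get(f"x-ratelimit-remaining-{limit_type}")),
            reset=_to_int(lowered.get(f"x-ratelimit-reset-{limit_type}")),
        )
        if (bucket.limit, bucket.remaining, bucket.reset) != (None, None, None):
            buckets[limit_type] = bucket
    return buckets


def seconds_until_retry(buckets: Mapping[str, QuotaBucket],
                        now: Optional[float] = None) -> float:
    """Seconds to wait until the latest projected refill, or 0.0 if none.

    A reset value of 0 (sufficient capacity) or a missing header adds no wait.
    """
    current = time.time() if now is None else now
    waits = [bucket.reset - current for bucket in buckets.values() if bucket.reset]
    return float(max(0.0, max(waits, default=0.0)))
```

Waiting for the latest reset across all buckets is a conservative choice made for this sketch. Because each reset value is a refill projection sized to the request that was just evaluated, a larger follow-up request may need to wait longer than the value suggests.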
