Rate limiting

A rate limit policy caps how many requests a key serves within a time window. Ory Talos stores the policy on the key and returns it in verification responses. The Commercial edition enforces it server-side; the OSS edition returns the policy as metadata for your gateway to enforce. For how enforcement works in each edition, see the rate limiting concepts page.

Prerequisites

A running Ory Talos server. See the quickstart to start one locally. Attaching and reading policies works in both editions; server-side enforcement requires the Commercial edition with rate_limit.enabled set to true.

Attach a rate limit policy

Set a rate limit policy when issuing a key. The policy defines a quota (maximum requests, must be greater than 0) and a window (time window as a duration string, for example "60s"):

RESPONSE=$(talos keys issue "rate-limited-key" \
  --actor service_api \
  --rate-limit-quota 100 \
  --rate-limit-window "60s" \
  --format json \
  -e "$TALOS_URL" 2>/dev/null)

echo "$RESPONSE" | jq .

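# Assign first, then export, so a failed jq -e lookup is not masked by export.
API_SECRET=$(echo "$RESPONSE" | jq -er '.secret') && export API_SECRET
KEY_ID=$(echo "$RESPONSE" | jq -er '.issued_api_key.key_id') && export KEY_ID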

The response includes the full key metadata with the rate_limit_policy attached. For the complete request and response field reference, see the IssueAPIKey API reference.

Verify a rate-limited key

Verify the key as you would any other credential. When the key has a rate limit policy, the response includes the policy metadata:

talos keys verify "$API_SECRET" -e "$TALOS_URL"

When the Commercial edition enforces the limit, the response also includes rate_limit_remaining (approximate requests available before the limit is reached) and rate_limit_reset_time (when the limiter returns to full capacity). For the complete response field reference, see the VerifyAPIKey API reference.
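A caller that wants to surface the counter can read these two fields from the response body. The following is a minimal sketch, assuming both are top-level fields of the JSON verification response and that jq is installed; `limiter_state` is a hypothetical helper, not part of the Talos CLI. Because the fields are present only when the Commercial edition enforces the limit, the helper prints `unknown` for a missing field instead of guessing a value:

```shell
# Hypothetical helper: summarize the dynamic limiter fields of a
# verification response body passed as the first argument.
# Assumption: rate_limit_remaining and rate_limit_reset_time are
# top-level fields of the JSON body.
limiter_state() {
  # "//" falls back to "unknown" only when the field is null or absent,
  # so a remaining count of 0 is still printed as 0.
  printf '%s' "$1" | jq -r \
    '"remaining=\(.rate_limit_remaining // "unknown") reset=\(.rate_limit_reset_time // "unknown")"'
}
```

For example, `limiter_state '{"is_valid": true, "rate_limit_remaining": 42}'` prints `remaining=42 reset=unknown`. To feed it a live response, the output of `talos keys verify` would need to be JSON; whether the verify command accepts the same `--format json` flag as `talos keys issue` is an assumption to check against the CLI help.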

Exceeding the limit

When a key exhausts its quota, the Commercial edition returns is_valid: false with error code VERIFICATION_ERROR_RATE_LIMITED. The HTTP status stays 200: the verification endpoint always returns a structured response, so read is_valid and error_code from the body, not the status code.

{
  "is_valid": false,
  "error_code": "VERIFICATION_ERROR_RATE_LIMITED",
  "error_message": "rate limit exceeded"
}

The HTTP response also includes a Retry-After header with the number of seconds to wait before retrying. The OSS edition never rejects requests on quota: it returns the policy as metadata and leaves enforcement to your gateway.
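Since the status code stays 200, a caller has to branch on the body. The following is a minimal sketch of that check, assuming the response body is available as a JSON string and that jq is installed; `verification_outcome` is a hypothetical helper name, not part of the Talos CLI:

```shell
# Hypothetical helper: classify a verification response body passed as
# the first argument. Prints "ok", "rate_limited", or "rejected:<code>".
verification_outcome() {
  body=$1
  # Read both fields from the body; the HTTP status is 200 in every case.
  is_valid=$(printf '%s' "$body" | jq -r '.is_valid // false')
  error_code=$(printf '%s' "$body" | jq -r '.error_code // ""')

  if [ "$is_valid" = "true" ]; then
    echo "ok"
  elif [ "$error_code" = "VERIFICATION_ERROR_RATE_LIMITED" ]; then
    # Quota exhausted: the caller should wait for Retry-After seconds.
    echo "rate_limited"
  else
    echo "rejected:$error_code"
  fi
}
```

Called with the example body above, the helper prints `rate_limited`. A caller might retry after the `Retry-After` delay on `rate_limited` and treat any other rejection as a hard failure.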

For the complete list of verification error codes, see the error codes reference.

Update rate limit policy

Change a key's rate limit policy with PATCH, without rotating the secret. The CLI sets the policy from the flags; over HTTP, list rate_limit_policy in the update_mask:

talos keys issued update "$KEY_ID" \
  --rate-limit-quota 500 \
  --rate-limit-window "120s" \
  -e "$TALOS_URL"

Verification reads the policy from the key record, which may be cached, so the new policy applies once the cached entry refreshes. For the complete update field reference, see the UpdateIssuedAPIKey API reference.

Remove rate limit policy

To remove a rate limit policy, set the quota to 0 from the CLI, or set rate_limit_policy to null and list it in update_mask over HTTP:

talos keys issued update "$KEY_ID" \
  --rate-limit-quota 0 \
  -e "$TALOS_URL"

After removal, Talos no longer rate limits the key.

HTTP response headers

When a verified key has a rate limit policy, the HTTP gateway adds IETF draft-compliant headers to verification responses:

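| Header           | When present                  | Description                                          |
|------------------|-------------------------------|------------------------------------------------------|
| RateLimit-Policy | Any edition, with a policy    | Declares the quota and window: "default";q=100;w=60  |
| RateLimit        | Commercial, enforcement on    | Remaining requests: "default";r=42                   |
| Retry-After      | Commercial, only when limited | Seconds to wait before the next allowed request      |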

RateLimit-Policy lets your API gateway enforce the limit externally (OSS edition) or simply read the configured limit. The RateLimit and Retry-After headers carry live counter state and appear only when the Commercial edition enforces the limit; clients use them for backoff.
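A gateway or client script can pull the numbers out of these header values with standard tools. The following is a minimal sketch, assuming the values have exactly the shape shown in the table (a quoted policy name followed by `;name=integer` parameters); `ratelimit_param` is a hypothetical helper:

```shell
# Hypothetical helper: print one integer parameter (q, w, or r) from a
# header value such as '"default";q=100;w=60'.
# Prints nothing when the parameter is absent.
ratelimit_param() {
  value=$1
  name=$2
  # Split the value on ";" and keep the first item of the form name=<digits>.
  printf '%s\n' "$value" | tr ';' '\n' \
    | sed -n "s/^[[:space:]]*$name=\([0-9][0-9]*\)[[:space:]]*$/\1/p" \
    | head -n 1
}

policy='"default";q=100;w=60'   # RateLimit-Policy value from the table
state='"default";r=42'          # RateLimit value from the table

echo "quota=$(ratelimit_param "$policy" q) window=$(ratelimit_param "$policy" w) remaining=$(ratelimit_param "$state" r)"
# prints: quota=100 window=60 remaining=42
```

The helper reads only the parameters it is asked for, so extra parameters in a header value are skipped.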

Behavior notes

  • Fail-open on limiter errors — if the rate limiter backend is unavailable (for example, Redis is down), verification succeeds but Talos omits the dynamic counter fields (rate_limit_remaining, rate_limit_reset_time, and the RateLimit header). Limiter failures never block legitimate traffic.
  • Cache interaction — the limiter runs on every verification, including cache hits. Talos reads the policy from the key record (which may be cached), then consults the limiter. The counter state lives outside the verification cache, so a cache hit still decrements the counter.
  • Per-key isolation — each key keeps its own counter. Keys don't share rate limit budgets, even when they belong to the same actor.
  • Policy changes — because the policy is read from the key record, an updated policy applies once the cached entry refreshes. To read it immediately, send the verification request with the Cache-Control: no-cache header (the CLI's --no-cache flag).
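For the OSS edition, where the gateway enforces the policy itself, per-key isolation can be illustrated with a naive fixed-window counter. This is a sketch only and not the algorithm Talos uses. It assumes a single gateway process (no locking), one state file per key under a hypothetical `STATE_DIR`, key IDs that are safe to use as file names, and a window given in whole seconds:

```shell
# Naive fixed-window limiter for a gateway in front of the OSS edition.
# STATE_DIR is a hypothetical location for per-key counter files.
STATE_DIR=${STATE_DIR:-$(mktemp -d)}

# allow_request KEY_ID QUOTA WINDOW_SECONDS [NOW_EPOCH_SECONDS]
# Returns 0 if the request fits the key's budget, 1 if it is over quota.
allow_request() {
  key_id=$1
  quota=$2
  window=$3
  now=${4:-$(date +%s)}
  bucket=$((now / window))        # index of the current window
  file="$STATE_DIR/$key_id"       # each key has its own counter

  count=0
  if [ -f "$file" ]; then
    read -r saved_bucket saved_count < "$file"
    # Reuse the stored count only if it belongs to the current window.
    if [ "$saved_bucket" = "$bucket" ]; then
      count=$saved_count
    fi
  fi

  if [ "$count" -ge "$quota" ]; then
    return 1
  fi
  echo "$bucket $((count + 1))" > "$file"
}
```

With a quota of 2 and a 60-second window, two calls for the same key inside one window succeed and the third fails, while a call for a different key still succeeds because the counters are separate. The quota and window would come from the policy metadata or the RateLimit-Policy header; a production gateway would need atomic updates and shared storage across instances.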

Next steps