# Rate limits

> Per-key request ceilings and how to respect Retry-After.

Tally rate-limits every `/v1/*` endpoint per API key, using a sliding-window counter in Upstash Redis. Money-moving endpoints get a stricter ceiling because abuse there is more expensive than abuse on reads.
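
To make the term concrete, here is a minimal in-memory sketch of the general sliding-window counter technique. It is an illustration only, not Tally's Redis-backed implementation, and the name `createSlidingWindowCounter` is hypothetical. It estimates the request count in the trailing window by weighting the previous fixed window by how much of it still overlaps.

```typescript
// Illustrative only: an in-memory approximation, not Tally's implementation.
function createSlidingWindowCounter(limit: number, windowMs: number) {
  let currentWindowStart = 0; // start of the fixed window being counted
  let currentCount = 0; // requests accepted in the current fixed window
  let previousCount = 0; // requests accepted in the window before it

  // Returns true if a request arriving at `nowMs` fits within the limit.
  return function tryAcquire(nowMs: number): boolean {
    const windowStart = Math.floor(nowMs / windowMs) * windowMs;
    if (windowStart !== currentWindowStart) {
      // Carry the old count forward only if the old window is directly adjacent.
      const adjacent = windowStart - currentWindowStart === windowMs;
      previousCount = adjacent ? currentCount : 0;
      currentCount = 0;
      currentWindowStart = windowStart;
    }
    // Weight the previous window by the share of it still inside the sliding window.
    const elapsedFraction = (nowMs - windowStart) / windowMs;
    const estimated = previousCount * (1 - elapsedFraction) + currentCount;
    if (estimated >= limit) return false;
    currentCount += 1;
    return true;
  };
}
```

Calling `createSlidingWindowCounter(60, 60_000)` would mirror the shape of the `default` bucket described below. The weighting is what keeps a burst at the end of one window from being followed immediately by a full second burst at the start of the next.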

## The ceilings

| Bucket     | Endpoints                                                                           | Limit                    |
| ---------- | ----------------------------------------------------------------------------------- | ------------------------ |
| `default`  | `GET /v1/agents`, `GET /v1/agents/{id}`, `POST /v1/agents`, `GET /v1/payments/{id}` | 60 requests / 60 seconds |
| `payments` | `POST /v1/payments`                                                                 | 30 requests / 60 seconds |

Counters are per API key, not per account. If you rotate a key, the new key starts with a fresh budget.
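
If you want to pace requests on the client, the table above can be mirrored in code. The sketch below is one way to do it; `LIMITS`, `bucketFor`, and `minSpacingMs` are hypothetical names, and routing any endpoint not listed in the table to `default` is an assumption rather than documented behavior.

```typescript
type Bucket = "default" | "payments";

// Values copied from the ceilings table.
const LIMITS: Record<Bucket, { limit: number; windowSeconds: number }> = {
  default: { limit: 60, windowSeconds: 60 },
  payments: { limit: 30, windowSeconds: 60 },
};

// ASSUMPTION: anything that is not POST /v1/payments counts against `default`.
function bucketFor(method: string, path: string): Bucket {
  const isPaymentCreate =
    method.toUpperCase() === "POST" && path === "/v1/payments";
  return isPaymentCreate ? "payments" : "default";
}

// Average gap between requests that keeps a steady stream under the ceiling.
function minSpacingMs(bucket: Bucket): number {
  const { limit, windowSeconds } = LIMITS[bucket];
  return (windowSeconds * 1000) / limit;
}
```

A steady sender that waits `minSpacingMs(bucketFor(method, path))` between calls on a single key stays at or below the ceiling without ever needing to handle a 429.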

## When you hit the ceiling

A request that exceeds the limit returns `429 Too Many Requests` with the error envelope:

```json
{
  "error": {
    "type": "rate_limited",
    "message": "Rate limit exceeded. Retry in 12s.",
    "code": "rate_limit_exceeded"
  }
}
```

The response carries headers you should honor:

| Header                  | Meaning                                                         |
| ----------------------- | --------------------------------------------------------------- |
| `Retry-After`           | Whole seconds to wait before retrying. Always at least 1.       |
| `X-RateLimit-Limit`     | The configured limit for the bucket.                            |
| `X-RateLimit-Remaining` | Requests remaining in the current window (0 when rate-limited). |
| `X-RateLimit-Reset`     | Millisecond Unix timestamp when the window resets.              |
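
The units differ between headers (`Retry-After` is in seconds, `X-RateLimit-Reset` is a millisecond timestamp), so it can help to normalize them in one place. The following is a minimal sketch; `HeaderReader`, `RateLimitInfo`, and `parseRateLimitHeaders` are hypothetical names, and the `HeaderReader` interface is shaped so that a standard `Headers` object satisfies it.

```typescript
// Anything with a `get` method, such as the `headers` of a fetch Response.
interface HeaderReader {
  get(name: string): string | null;
}

interface RateLimitInfo {
  limit: number | null;
  remaining: number | null;
  resetAt: Date | null; // from X-RateLimit-Reset (millisecond Unix timestamp)
  retryAfterMs: number | null; // from Retry-After (whole seconds)
}

function parseRateLimitHeaders(headers: HeaderReader): RateLimitInfo {
  // Returns null for missing, empty, or non-numeric values.
  const readNumber = (name: string): number | null => {
    const raw = headers.get(name);
    if (raw === null || raw.trim() === "") return null;
    const value = Number(raw);
    return Number.isFinite(value) ? value : null;
  };

  const resetMs = readNumber("X-RateLimit-Reset");
  const retryAfterSeconds = readNumber("Retry-After");
  return {
    limit: readNumber("X-RateLimit-Limit"),
    remaining: readNumber("X-RateLimit-Remaining"),
    resetAt: resetMs === null ? null : new Date(resetMs),
    retryAfterMs: retryAfterSeconds === null ? null : retryAfterSeconds * 1000,
  };
}
```

Returning `null` instead of a default keeps the decision about fallbacks with the caller, which matters because a missing header and a value of 0 mean different things.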

## Idiomatic handling

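```ts
const MAX_ATTEMPTS = 5;

async function withRateLimitRetry<T>(
  fn: () => Promise<Response>,
): Promise<T> {
  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
    const res = await fn();
    // Anything other than 429 is handed back as parsed JSON; check for
    // other error statuses in the caller.
    if (res.status !== 429) return (await res.json()) as T;
    if (attempt === MAX_ATTEMPTS) break; // no point sleeping before giving up

    // Retry-After is in whole seconds; fall back to 1s if it is missing or malformed.
    const parsed = Number(res.headers.get("Retry-After"));
    const retryAfter = Number.isFinite(parsed) && parsed >= 1 ? parsed : 1;
    await new Promise((r) => setTimeout(r, retryAfter * 1000));
  }
  throw new Error("rate-limited too many times");
}
```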

When retrying writes (`POST /v1/payments`), pair this with an [idempotency key](/api/idempotency) so a retry that races with the original doesn't double-spend.
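
The important detail is that the key is chosen once per logical payment, outside the retry loop, so every attempt carries the same value. Below is a minimal sketch of that idea. The header name `Idempotency-Key` is an assumption (the idempotency page is the authority on the real name), and `Send` and `makeIdempotentPost` are hypothetical helpers, not part of any Tally SDK.

```typescript
// A minimal transport shape, so the sketch works with fetch or a test double.
type Send<R> = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<R>;

// ASSUMPTION: "Idempotency-Key" is a placeholder header name; use whatever
// the idempotency documentation specifies.
function makeIdempotentPost<R>(
  send: Send<R>,
  url: string,
  baseHeaders: Record<string, string>,
  payload: unknown,
  idempotencyKey: string,
): () => Promise<R> {
  // Build the request once so every retry carries the same key and body.
  const init = {
    method: "POST",
    headers: {
      ...baseHeaders,
      "Content-Type": "application/json",
      "Idempotency-Key": idempotencyKey,
    },
    body: JSON.stringify(payload),
  };
  return () => send(url, init);
}
```

The returned function can be passed to `withRateLimitRetry` with `fetch` as the `send` argument. Generate the key once per payment, for example with `crypto.randomUUID()`, and never inside the function that gets retried.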

## Anticipating limits

Every successful response includes `X-RateLimit-Remaining`, so you can back off before you hit the ceiling. This is useful for batch flows where you'd rather slow down early than handle 429s mid-stream.

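```ts
// baseUrl and authHeaders are assumed to be defined elsewhere in your app.
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

const res = await fetch(`${baseUrl}/v1/agents`, { headers: authHeaders });
// Treat a missing header as "unknown" rather than 0 so we don't sleep needlessly.
const header = res.headers.get("X-RateLimit-Remaining");
const remaining = header === null ? Infinity : Number(header);
if (remaining < 5) await sleep(2_000); // give the window a moment to drain
```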
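
A fixed pause is the simplest option. If you would rather spread the remaining budget evenly, one possible heuristic is to divide the time left until `X-RateLimit-Reset` by the requests remaining. This is a sketch under the assumption that the budget is fully restored at the reset timestamp, which may be only approximately true for a sliding window; `pacingDelayMs` is a hypothetical helper.

```typescript
// Suggested wait before the next request, in milliseconds.
// `resetAtMs` is the X-RateLimit-Reset value (millisecond Unix timestamp).
function pacingDelayMs(
  remaining: number,
  resetAtMs: number,
  nowMs: number,
): number {
  const untilResetMs = Math.max(0, resetAtMs - nowMs);
  // Nothing left: wait out the rest of the window.
  if (remaining <= 0) return untilResetMs;
  // Otherwise spread what is left evenly across the time until reset.
  return Math.floor(untilResetMs / remaining);
}
```

For example, with 4 requests remaining and 8 seconds until reset, `pacingDelayMs(4, now + 8000, now)` suggests waiting 2000 ms between calls.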

## What's not rate-limited

* Webhook deliveries from Tally to your endpoint. Those follow the [retry schedule](/webhooks#retries-and-timeouts) regardless of your endpoint's response time.
* Dashboard endpoints (cookie-authed). These have their own internal limits.

## Asking for a higher ceiling

The default ceilings are calibrated for a typical agent product. If you have a use case that needs more throughput — bulk payouts, batched grant inspection — reach out before launch and we'll tune the limits to your account.

## In the SDK

The TypeScript SDK throws [`RateLimitError`](/sdk/errors#ratelimiterror--429) on 429 responses. The retry pattern lives in your application code — the SDK is a thin transport and doesn't auto-retry. See [SDK errors](/sdk/errors) for the exception shape.
