
Rate limits

The cloud API caps how fast each team can call it, so one busy tenant can never slow the platform for everyone else. The cap depends on your plan, and a 429 never touches your local chain.

How the limit works

Each plan gets a sustained request rate, measured in requests per second, plus a short burst allowance so that a brief spike is not throttled. The limit is measured per organization, not per API key, so issuing more keys does not raise it. Reads and writes draw on the same budget.
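One common way to get a sustained rate with a burst allowance is a token bucket. This page does not say which algorithm the platform uses, so the sketch below only illustrates the behaviour described above. The class, the check helper, and the rate and burst numbers are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TokenBucket:
    """Illustrative token bucket; not the platform's actual implementation."""

    rate: float   # sustained refill rate in requests per second (made-up value)
    burst: float  # bucket capacity: the most requests allowed at once
    tokens: float = field(init=False)
    last: float = 0.0  # timestamp of the previous call, in seconds

    def __post_init__(self) -> None:
        self.tokens = self.burst  # start with a full burst allowance

    def allow(self, now: float) -> bool:
        """Refill for the elapsed time, then spend one token if available."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # a server would answer 429 at this point


# One bucket per organization, not per API key, so extra keys share one budget.
buckets: dict[str, TokenBucket] = {}


def check(org_id: str, now: float, rate: float = 2.0, burst: float = 3.0) -> bool:
    """Hypothetical entry point: reads and writes would both call this."""
    bucket = buckets.setdefault(org_id, TokenBucket(rate=rate, burst=burst))
    return bucket.allow(now)
```

With these made-up numbers, an organization can send three requests at once, after which further requests are refused until the bucket refills at two tokens per second. A second organization has its own bucket and is unaffected.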

The exact numbers depend on your plan and can change, so they are not restated here: the pricing page lists the rate for each tier, and your dashboard shows the rate and burst currently in effect for your team.

When you go over

Once you exceed the rate, the API replies with 429 Too Many Requests. Every response carries standard headers, so you always know where you stand:

Header                Meaning
Retry-After           Seconds to wait before trying again. Honour this first.
RateLimit-Limit       Your burst capacity: the most requests allowed at once.
RateLimit-Remaining   Requests left in the current window.
RateLimit-Reset       Seconds until the budget refills to full.

A 429 from your plan cap also carries RateLimit-Scope: tenant, which tells it apart from a coarse anti-flood limit at the edge. You may occasionally see 503 Service Unavailable with a short Retry-After when the platform is momentarily at capacity — handle it the same way.
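As a rough sketch of how a client might tell these cases apart, the helper below reads the status and headers described above and labels the throttle. The function name, the returned dictionary, and the labels "plan", "edge", and "capacity" are illustrative choices, not part of the API. It also assumes that Retry-After arrives as a number of seconds, as the table states.

```python
from typing import Mapping, Optional


def classify_throttle(status: int, headers: Mapping[str, str]) -> Optional[dict]:
    """Return throttle details for a 429 or 503, or None for any other status."""
    if status not in (429, 503):
        return None

    # HTTP header names are case-insensitive, so normalise before lookup.
    h = {name.lower(): value for name, value in headers.items()}

    def seconds(name: str) -> Optional[float]:
        try:
            return float(h[name])
        except (KeyError, ValueError):
            return None  # header missing or not a plain number

    if status == 503:
        kind = "capacity"  # platform momentarily at capacity
    elif h.get("ratelimit-scope", "").strip().lower() == "tenant":
        kind = "plan"      # your plan's cap
    else:
        kind = "edge"      # coarse anti-flood limit at the edge

    return {
        "kind": kind,
        "retry_after": seconds("retry-after"),
        "remaining": seconds("ratelimit-remaining"),
        "reset": seconds("ratelimit-reset"),
    }
```

All three kinds are handled the same way (wait, then retry), so the label is mainly useful for logging and alerting: repeated "plan" throttles suggest the team has outgrown its tier, while "edge" or "capacity" throttles do not.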

Handling it

The rule is simple: on a 429, wait for the Retry-After value, then retry — and back off exponentially if it keeps coming. The SDKs surface the HTTP status and headers on a structured error, so you can branch on a throttle without parsing messages.
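The rule above could be sketched roughly as follows. This is not SDK code: send is a hypothetical callable that performs one request and returns a (status, headers, body) tuple, and the attempt count, delays, and jitter are arbitrary defaults. The sketch waits for at least the Retry-After value and lets an exponential backoff take over when throttling persists or the header is missing.

```python
import random
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], bytes]  # (status, headers, body)


def call_with_retry(
    send: Callable[[], Response],  # hypothetical: performs one request
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    jitter: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Retry on 429/503, honouring Retry-After and backing off exponentially."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    response = send()
    for attempt in range(1, max_attempts):
        status, headers, _ = response
        if status not in (429, 503):
            break  # success or a non-throttle error: hand it back

        # Header names are case-insensitive; assume the value is in seconds.
        h = {name.lower(): value for name, value in headers.items()}
        try:
            retry_after = max(0.0, float(h.get("retry-after", 0)))
        except ValueError:
            retry_after = 0.0

        # The backoff doubles on each consecutive throttle, up to max_delay.
        backoff = min(max_delay, base_delay * 2 ** (attempt - 1))
        # Never wait less than the server asked; add jitter to spread retries.
        sleep(max(retry_after, backoff) + random.uniform(0, jitter))
        response = send()

    return response  # may still be a 429/503 if every attempt was throttled
```

Taking the larger of Retry-After and the backoff means the server's hint is always respected, while a client that keeps getting throttled still slows down progressively. The small random jitter keeps many clients from retrying at the same instant.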

A 429 never drops a receipt

Rate limiting only throttles the cloud mirror. Your agent’s local signed chain is never refused by a 429 — receipts keep being written locally and stay verifiable offline. The throttle only delays mirroring them to the cloud; it never blocks the action or breaks the chain.

Next steps