Rate limiting
Limits are per-account and per-plan. The API uses a token-bucket algorithm — small bursts above the sustained rate are allowed, and every response tells you how many requests you have left.
How it works
Each account gets a bucket of request tokens. Every request consumes one token. Tokens refill at a constant rate determined by your plan. The bucket has a maximum size that exceeds the per-minute rate so you can absorb short bursts before throttling kicks in.
Limits are tracked per email, not per API key — multiple keys for the same account share the same budget.
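To make the mechanics concrete, here is a minimal sketch of a token bucket. The class name TokenBucket, its parameters (capacity, refill_per_second), and the numbers in the usage note are illustrative assumptions; they are not the service's actual implementation or any plan's real limits.

```python
import time


class TokenBucket:
    """Illustrative token bucket; names and numbers are assumptions."""

    def __init__(self, capacity, refill_per_second, now=None):
        self.capacity = capacity                    # maximum burst size
        self.refill_per_second = refill_per_second  # sustained rate
        self.tokens = float(capacity)               # the bucket starts full
        self.updated = time.monotonic() if now is None else now

    def _refill(self, now):
        # Add tokens for the time elapsed, never exceeding the capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.refill_per_second)
        self.updated = now

    def try_consume(self, now=None):
        """Take one token if available; return False when throttled."""
        now = time.monotonic() if now is None else now
        self._refill(now)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

With a hypothetical TokenBucket(capacity=10, refill_per_second=1), ten back-to-back calls to try_consume() succeed, the eleventh is refused, and one more request becomes available after each second of waiting. The optional now argument exists only so the behavior can be checked without real sleeps.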
Per-plan limits
Response headers
Every response (success or 429) includes:
x-ratelimit-limit: your plan's burst capacity.
x-ratelimit-remaining: tokens left in your bucket.
x-ratelimit-reset: seconds until the bucket fully refills.
On a 429 you'll also see Retry-After — the integer seconds you should wait before retrying.
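As a sketch of how a client might use these headers, the helper below reads them from a response's header mapping and suggests how long to pause before the next request. The names RateLimitStatus, parse_rate_limit, and suggested_delay are hypothetical. The pacing rule (wait only when the bucket is empty, and assume a linear refill) is one possible policy and an assumption, not something the API prescribes. The sketch also assumes the header values are plain numbers.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitStatus:
    limit: int                         # x-ratelimit-limit: burst capacity
    remaining: int                     # x-ratelimit-remaining: tokens left
    reset_seconds: float               # x-ratelimit-reset: seconds to a full bucket
    retry_after: Optional[int] = None  # Retry-After: only present on a 429


def parse_rate_limit(headers: Mapping[str, str]) -> RateLimitStatus:
    # Normalize header names, since plain dicts are case-sensitive.
    h = {k.lower(): v for k, v in headers.items()}
    retry_after = h.get("retry-after")
    return RateLimitStatus(
        limit=int(h["x-ratelimit-limit"]),
        remaining=int(h["x-ratelimit-remaining"]),
        reset_seconds=float(h["x-ratelimit-reset"]),
        retry_after=int(retry_after) if retry_after is not None else None,
    )


def suggested_delay(status: RateLimitStatus) -> float:
    """Seconds to wait before the next request (assumed policy)."""
    if status.retry_after is not None:
        return float(status.retry_after)  # the server's instruction wins
    if status.remaining > 0:
        return 0.0                        # tokens available: no need to wait
    if status.limit <= 0:
        return status.reset_seconds
    # Empty bucket, no Retry-After: estimate the time to regain one token,
    # assuming the bucket refills linearly from empty to full.
    return status.reset_seconds / status.limit
```

For example, calling parse_rate_limit(r.headers) on a requests response and passing the result to suggested_delay gives a pause you could feed to time.sleep before the next call.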
Handling 429s
Honor Retry-After. The simplest robust pattern is exponential backoff with a cap:
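import time

import requests


def call_with_retry(url, max_attempts=5):
    """GET url, retrying on 429 until max_attempts is exhausted."""
    backoff = 1  # fallback delay in seconds, used when Retry-After is absent
    for attempt in range(max_attempts):
        r = requests.get(url)
        if r.status_code != 429:
            return r
        if attempt < max_attempts - 1:  # skip the sleep after the final attempt
            # Prefer the server's Retry-After; otherwise use our own backoff.
            retry_after = int(r.headers.get("Retry-After", backoff))
            time.sleep(retry_after)
            backoff = min(backoff * 2, 60)  # double the fallback, capped at 60 s
    raise RuntimeError("rate limited; gave up")

See rate_limit_exceeded for the full error response shape.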