Rate limiting

Limits are per-account and per-plan. The API uses a token-bucket algorithm — small bursts above the sustained rate are allowed, and every response tells you how many requests you have left.

How it works

Each account gets a bucket of request tokens. Every request consumes one token. Tokens refill at a constant rate determined by your plan. The bucket has a maximum size that exceeds the per-minute rate so you can absorb short bursts before throttling kicks in.

Limits are tracked per account email address, not per API key, so multiple keys for the same account share the same budget.
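The bucket behavior described above can be sketched as a small client-side model. This is an illustration of the general token-bucket technique, not the API's server code: the `TokenBucket` class and its field names are hypothetical, and the numbers are the DEMO plan's burst and refill values from the per-plan limits. The server would keep one such bucket per account, which is why all keys on an account draw from the same tokens.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Client-side model of one account's bucket (hypothetical, illustrative)."""
    capacity: float      # burst size (max tokens)
    refill_rate: float   # tokens added per second
    tokens: float        # tokens currently available
    updated_at: float    # time of the last refill, in seconds

    def allow(self, now: float) -> bool:
        """Refill for the elapsed time, then try to spend one token."""
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # a real request would receive a 429 at this point


# DEMO plan values: burst of 90 tokens, refill of 1.0 token per second.
bucket = TokenBucket(capacity=90, refill_rate=1.0, tokens=90, updated_at=0.0)

# 100 requests at the same instant: only the 90 tokens in the bucket are spent.
accepted = sum(bucket.allow(now=0.0) for _ in range(100))

# Ten seconds later, 10 tokens have refilled, so 10 more requests pass.
accepted_later = sum(bucket.allow(now=10.0) for _ in range(100))

print(accepted, accepted_later)
```

Passing `now` explicitly, instead of reading the clock inside `allow`, keeps the model deterministic and easy to reason about.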

Per-plan limits

Plan              Requests / min   Burst (max tokens)   Refill rate
DEMO              60               90                   1.0 / sec
BASIC             10               20                   0.167 / sec
STARTER           10               20                   0.167 / sec
PREMIUM           20               40                   0.333 / sec
STARTUP           25               50                   0.417 / sec
ULTIMATE          50               100                  0.833 / sec
SEED              50               100                  0.833 / sec
ULTIMATE_PLUS     250              500                  4.167 / sec
ENTERPRISE        500              1,000                8.333 / sec
ENTERPRISE_PLUS   5,000            10,000               83.333 / sec
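The refill rates listed above match the per-minute rate divided by 60. The short sketch below reproduces that arithmetic and also estimates how long an empty bucket takes to fill completely. The helper names `refill_rate` and `seconds_to_refill` are made up for this example, and only three plans are copied in to keep it short.

```python
# Values copied from the per-plan limits: plan -> (requests per minute, burst).
PLANS = {
    "DEMO": (60, 90),
    "BASIC": (10, 20),
    "ENTERPRISE_PLUS": (5000, 10000),
}


def refill_rate(requests_per_min: int) -> float:
    """Tokens added per second for a given sustained per-minute rate."""
    return requests_per_min / 60


def seconds_to_refill(requests_per_min: int, burst: int) -> float:
    """Time for an empty bucket to fill completely if no requests are sent."""
    return burst * 60 / requests_per_min


for plan, (per_min, burst) in PLANS.items():
    print(
        f"{plan}: {refill_rate(per_min):.3f} tokens/sec, "
        f"full refill in {seconds_to_refill(per_min, burst):.0f} s"
    )
```

Under these numbers, a drained DEMO bucket is full again after 90 seconds, and a drained BASIC or ENTERPRISE_PLUS bucket after 120 seconds.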

Response headers

Every response (success or 429) includes:

  • x-ratelimit-limit — your plan's burst capacity.
  • x-ratelimit-remaining — tokens left in your bucket.
  • x-ratelimit-reset — seconds until the bucket fully refills.

On a 429 you'll also see Retry-After — the integer seconds you should wait before retrying.
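One way to use these headers is to pace requests before a 429 ever happens. The sketch below is a hypothetical helper, `seconds_to_wait`, that takes a status code and a plain mapping of headers. It assumes the header values parse as numbers and that the bucket refills at a constant rate, so the wait for a single token can be estimated as the reset time divided by the number of missing tokens. Treat that estimate as an assumption, not documented API behavior.

```python
from typing import Mapping


def seconds_to_wait(status_code: int, headers: Mapping[str, str]) -> float:
    """Return how long to pause before the next request (0.0 means go ahead)."""
    if status_code == 429:
        # Assumption: Retry-After holds integer seconds; default to 1 if absent.
        return float(headers.get("Retry-After", 1))

    # Assumption: a missing header means we are not close to the limit.
    remaining = float(headers.get("x-ratelimit-remaining", 1))
    if remaining >= 1:
        return 0.0

    # Bucket is empty: estimate the time until one token is available.
    # reset = seconds until full, and (limit - remaining) tokens are missing,
    # so one token takes roughly reset / missing seconds at a constant rate.
    limit = float(headers.get("x-ratelimit-limit", 0))
    reset = float(headers.get("x-ratelimit-reset", 0))
    missing = limit - remaining
    return reset / missing if missing > 0 else 0.0


# Example with DEMO-like values: empty 90-token bucket, 90 s to a full refill.
empty = {
    "x-ratelimit-limit": "90",
    "x-ratelimit-remaining": "0",
    "x-ratelimit-reset": "90",
}
print(seconds_to_wait(200, empty))
```

The lookups here are case-sensitive because the example uses a plain dict. The headers object returned by the requests library is case-insensitive, so the same function would work with `r.headers` regardless of header casing.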

Handling 429s

Honor Retry-After whenever it is present. A simple, robust pattern is to sleep for that many seconds and fall back to capped exponential backoff if the header is missing:

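import time

import requests


def call_with_retry(url, max_attempts=5):
    """GET url, retrying on 429 and honoring Retry-After when present."""
    backoff = 1  # fallback delay in seconds, doubled after each 429
    for attempt in range(max_attempts):
        r = requests.get(url)
        if r.status_code != 429:
            return r
        if attempt == max_attempts - 1:
            break  # out of attempts; don't sleep before giving up
        # Prefer the server's Retry-After; fall back to our own backoff.
        retry_after = int(r.headers.get("Retry-After", backoff))
        time.sleep(retry_after)
        backoff = min(backoff * 2, 60)  # cap the fallback delay at 60 s
    raise RuntimeError("rate limited; gave up")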

Tip: See rate_limit_exceeded for the full error response shape.