Rate limits

Paylera applies sliding-window rate limits per tenant per route group. The limits are generous for normal use; sustained bursts and pathological patterns get a structured 429 with a Retry-After header.

Live limits

Route group	Limit (live)
Read endpoints (`GET /v1/*`)	1000 / second
Write endpoints (most `POST` / `PATCH` / `DELETE`)	200 / second
Money-moving (`POST /v1/payments/`, `/v1/refunds/`, `/v1/payouts/*`)	100 / second
Webhook deliveries (outbound, doesn’t count against your limit)	n/a
Usage ingestion (`POST /v1/usage`, `/v1/usage/batch`)	5000 events / second
Admin endpoints (`/v1/admin/*`)	50 / second

Limits are per-tenant, not per-key. Multiple tokens against the same tenant share the same buckets.

Sandbox limits

Sandbox limits are one-tenth of the live limits for every route group. That is enough to develop and test, but not enough to load-test.
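As a quick illustration of the one-tenth rule, the sketch below derives sandbox limits from the live table above. The dictionary keys are hypothetical labels chosen for this example, not identifiers from the Paylera API.

```python
# Live limits per second, copied from the table above.
# The group names are labels invented for this sketch.
LIVE_LIMITS = {
    "read": 1000,
    "write": 200,
    "money_moving": 100,
    "usage_ingestion": 5000,
    "admin": 50,
}


def sandbox_limit(live_limit: int) -> int:
    """Return the sandbox limit, documented as one-tenth of live."""
    return live_limit // 10


SANDBOX_LIMITS = {group: sandbox_limit(limit) for group, limit in LIVE_LIMITS.items()}
```

Under this rule the sandbox admin group, for example, allows only 5 requests per second, which is why sandbox is a poor target for load tests.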

Headers on every response

X-RateLimit-Limit:      1000
X-RateLimit-Remaining:  847
X-RateLimit-Reset:      1736179215

X-RateLimit-Reset is the Unix epoch time, in seconds, at which the bucket fully refills.
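A client can read these headers to see how close it is to the limit before a 429 ever arrives. The following is a minimal sketch, assuming the headers are available as a string-to-string mapping; the function name and the returned keys are hypothetical.

```python
import time
from typing import Mapping, Optional


def rate_limit_status(headers: Mapping[str, str], now: Optional[float] = None) -> dict:
    """Summarise the X-RateLimit-* headers from a single response."""
    now = time.time() if now is None else now
    limit = int(headers["X-RateLimit-Limit"])
    remaining = int(headers["X-RateLimit-Remaining"])
    reset_at = int(headers["X-RateLimit-Reset"])  # Unix epoch seconds
    return {
        "limit": limit,
        "remaining": remaining,
        # Share of the bucket already consumed, between 0.0 and 1.0.
        "used_fraction": (limit - remaining) / limit,
        # Never negative, even if the local clock runs ahead of the server.
        "seconds_until_reset": max(0.0, reset_at - now),
    }
```

With a plain dict the header lookup is case-sensitive; HTTP client libraries usually expose a case-insensitive mapping, which also works here.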

When you hit a limit

HTTP/1.1 429 Too Many Requests
Retry-After: 2
Content-Type: application/problem+json

{
  "type": "https://errors.paylera.io/rate-limit/exceeded",
  "title": "Rate limit exceeded",
  "status": 429,
  "problem": "rate_limit.exceeded",
  "detail": "You've exceeded the limit for write endpoints. Retry after 2 seconds.",
  "trace_id": "01H8…"
}

Retry-After is in seconds. It’s a suggestion based on the bucket refill rate; honour it.
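To act on a 429, a client needs the Retry-After value and, for logging, the fields of the problem document. The sketch below extracts both from a raw status, header mapping, and body. The function name, the returned keys, and the 1-second fallback for a missing header are assumptions of this example.

```python
import json
from typing import Mapping, Optional


def parse_rate_limit_error(status: int, headers: Mapping[str, str], body: str) -> Optional[dict]:
    """Return retry details for a 429 response, or None for any other status."""
    if status != 429:
        return None
    try:
        problem = json.loads(body)
    except json.JSONDecodeError:
        problem = {}
    if not isinstance(problem, dict):
        problem = {}
    # Assumption: wait 1 second if the header is unexpectedly missing.
    retry_after = int(headers.get("Retry-After", "1"))
    return {
        "retry_after_seconds": retry_after,
        "problem": problem.get("problem"),
        "detail": problem.get("detail"),
        "trace_id": problem.get("trace_id"),  # worth logging for support requests
    }
```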

Backoff

Recommended pattern:

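import random
import time

def call_with_backoff(fn, max_attempts=5):
    """Call fn() until it returns a non-429 response, sleeping between attempts."""
    for attempt in range(max_attempts):
        res = fn()
        if res.status_code != 429:
            return res
        if attempt == max_attempts - 1:
            break  # out of attempts; don't sleep before giving up
        # Prefer the server's Retry-After; otherwise back off exponentially.
        wait = int(res.headers.get("Retry-After", 2 ** attempt))
        wait += random.uniform(0, wait * 0.1)  # add up to 10% jitter
        time.sleep(wait)
    raise RuntimeError("rate limit not recoverable")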

Jitter (up to 10% of the wait) keeps clients from all retrying on the same boundary, which avoids the thundering-herd problem.

What counts toward the limit

Each billed call counts as 1, regardless of body size. Two exceptions:

POST /v1/usage/batch — counts as ceil(events / 100) toward the usage bucket. A 1000-event batch is 10 cost units.
POST /v1/exports — counts as 1, but the export job itself consumes server resources for hours. Concurrent exports from the same tenant are limited to 4.
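The batch rule above is simple arithmetic, so it is easy to estimate a batch's cost before sending it. Here is a small sketch; the function and constant names are hypothetical, and because the rule does not say how an empty batch is counted, the sketch rejects it.

```python
EVENTS_PER_COST_UNIT = 100  # from the ceil(events / 100) rule above


def usage_batch_cost(event_count: int) -> int:
    """Cost units charged to the usage bucket for one POST /v1/usage/batch call."""
    if event_count < 1:
        raise ValueError("event_count must be at least 1")
    # Integer ceiling division: rounds any partial block of 100 up to a full unit.
    return (event_count + EVENTS_PER_COST_UNIT - 1) // EVENTS_PER_COST_UNIT
```

A 1000-event batch costs 10 units, as stated above, while a 1001-event batch costs 11, so batch sizes that are multiples of 100 waste no budget.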

Increasing your limits

If your steady-state traffic justifies higher limits (a marketplace at scale, an analytics workload, a heavy migration), email support@paylera.io with:

Your tenant ID.
Which route group(s) you need raised.
The peak rate you expect, and over what window.
A short note on the use case.

We typically respond within one business day. Once approved, a limit increase is configured per tenant and takes effect within seconds.

Burst capacity

The sliding window allows short bursts above the limit without penalty: as long as your 1-second bucket and your 10-second bucket both stay within budget, you can spike. The exact algorithm is a double-leaky-bucket; the practical effect is that your average rate matters more than a single-second spike.
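To build intuition for the two-window rule, the sketch below models it on the client side with a simple timestamp log: a request is admitted only if both the 1-second and the 10-second window still have room. This is a stand-in for illustration, not Paylera's double-leaky-bucket, and both budgets are hypothetical parameters because the actual per-window budgets are not published here.

```python
from collections import deque


class TwoWindowBudget:
    """Admit a request only while both the short and the long window have room."""

    def __init__(self, short_budget: int, long_budget: int,
                 short_window: float = 1.0, long_window: float = 10.0):
        self.short_budget = short_budget  # hypothetical budget per short window
        self.long_budget = long_budget    # hypothetical budget per long window
        self.short_window = short_window
        self.long_window = long_window
        self._stamps = deque()  # timestamps of admitted requests, oldest first

    def try_acquire(self, now: float) -> bool:
        # Forget requests that have aged out of the long window.
        while self._stamps and now - self._stamps[0] >= self.long_window:
            self._stamps.popleft()
        in_short = sum(1 for t in self._stamps if now - t < self.short_window)
        if in_short >= self.short_budget or len(self._stamps) >= self.long_budget:
            return False
        self._stamps.append(now)
        return True
```

With a short budget of 3 and a long budget of 5, a client can spike to 3 requests in one second, but a second spike of the same size is cut off after 2 requests because the 10-second window is then full. That mirrors the point above: the average rate matters more than a single-second spike.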

Failure modes

All 429s — your sustained rate is over budget. Add backoff, request a raise, or shed load.
Sporadic 429s on writes — concurrency higher than expected. Add jittered backoff.
429s with Retry-After: 0 — the limit was hit by another process on the same tenant; retry immediately.
Webhook handler getting backed up — not a rate limit issue; see Retries & ordering.

Rate limits are a feature, not a punishment

They protect your tenant from a runaway script flooding your own queues, and they protect every other tenant from your bug. Treat 429 as a normal control-flow signal, not an error.