Rate limits

Paylera applies sliding-window rate limits per tenant per route group. The limits are generous for normal use; bursts and pathological patterns get a structured 429 with a Retry-After header.

Live limits

| Route group | Limit (live) |
| --- | --- |
| Read endpoints (GET /v1/*) | 1000 / second |
| Write endpoints (most POST / PATCH / DELETE) | 200 / second |
| Money-moving (POST /v1/payments/*, /v1/refunds/*, /v1/payouts/*) | 100 / second |
| Webhook deliveries (outbound, doesn’t count against your limit) | n/a |
| Usage ingestion (POST /v1/usage, /v1/usage/batch) | 5000 events / second |
| Admin endpoints (/v1/admin/*) | 50 / second |

Limits are per-tenant, not per-key. Multiple tokens against the same tenant share the same buckets.

Sandbox limits

Sandbox limits are one-tenth of the live limits for every route group: enough to develop and test, not enough to load-test.
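As a quick worked example, the sandbox figures implied by the one-tenth rule can be derived from the live limits. This is only a sketch: the dictionary keys and the `sandbox_limits` helper are illustrative names, not Paylera identifiers.

```python
# Live per-second limits copied from the "Live limits" table.
# The keys are illustrative labels, not API identifiers.
LIVE_LIMITS = {
    "read": 1000,
    "write": 200,
    "money_moving": 100,
    "usage_ingestion": 5000,  # events per second
    "admin": 50,
}


def sandbox_limits(live):
    """Apply the one-tenth rule to every route group."""
    return {group: limit // 10 for group, limit in live.items()}


SANDBOX_LIMITS = sandbox_limits(LIVE_LIMITS)
print(SANDBOX_LIMITS)
```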

Headers on every response

X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 847
X-RateLimit-Reset: 1736179215

X-RateLimit-Reset is the Unix epoch time, in seconds, at which the bucket fully refills.
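A client can read these headers to see how much of the bucket is left and how long it takes to refill. The sketch below assumes the headers arrive as a plain dict with exactly these key names (most HTTP clients expose a case-insensitive mapping instead). The helper names `seconds_until_reset` and `remaining_fraction` are hypothetical.

```python
import time


def seconds_until_reset(headers, now=None):
    """Seconds until the bucket fully refills; never negative."""
    now = time.time() if now is None else now
    reset_at = float(headers["X-RateLimit-Reset"])
    return max(0.0, reset_at - now)


def remaining_fraction(headers):
    """Fraction of the bucket still available, between 0 and 1."""
    limit = int(headers["X-RateLimit-Limit"])
    remaining = int(headers["X-RateLimit-Remaining"])
    return remaining / limit if limit else 0.0


# Header values taken from the example response above.
headers = {
    "X-RateLimit-Limit": "1000",
    "X-RateLimit-Remaining": "847",
    "X-RateLimit-Reset": "1736179215",
}
print(remaining_fraction(headers))                   # 0.847
print(seconds_until_reset(headers, now=1736179213))  # 2.0
```

Passing `now` explicitly keeps the helper testable; in production code it would normally be left to default to the current time.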

When you hit a limit

HTTP/1.1 429 Too Many Requests
Retry-After: 2
Content-Type: application/problem+json

{
  "type": "https://errors.paylera.io/rate-limit/exceeded",
  "title": "Rate limit exceeded",
  "status": 429,
  "problem": "rate_limit.exceeded",
  "detail": "You've exceeded the limit for write endpoints. Retry after 2 seconds.",
  "trace_id": "01H8…"
}

Retry-After is in seconds. It’s a suggestion based on the bucket refill rate; honour it.
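One way to handle this response is to pull out the wait time and the machine-readable problem code together. The following is a minimal sketch, assuming the status, headers, and raw body are already in hand. The function name `parse_rate_limit_response` and the 1-second fallback for a missing Retry-After header are assumptions, not documented behaviour.

```python
import json


def parse_rate_limit_response(status, headers, body):
    """Return (retry_after_seconds, problem_code) for a 429, else None."""
    if status != 429:
        return None
    # Fallback of 1 second when the header is absent is an assumption.
    retry_after = int(headers.get("Retry-After", "1"))
    problem = json.loads(body)
    return retry_after, problem.get("problem")


# Body trimmed from the example response above.
sample_body = json.dumps({
    "type": "https://errors.paylera.io/rate-limit/exceeded",
    "title": "Rate limit exceeded",
    "status": 429,
    "problem": "rate_limit.exceeded",
})
result = parse_rate_limit_response(429, {"Retry-After": "2"}, sample_body)
print(result)  # (2, 'rate_limit.exceeded')
```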

Backoff

Recommended pattern:

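import random
import time


def call_with_backoff(fn, max_attempts=5):
    """Call fn() until it returns a non-429 response, honouring Retry-After."""
    for attempt in range(max_attempts):
        res = fn()
        if res.status_code != 429:
            return res
        if attempt == max_attempts - 1:
            break  # out of attempts; don't sleep before giving up
        # Prefer the server's Retry-After (seconds); fall back to exponential backoff.
        wait = int(res.headers.get("Retry-After", 2 ** attempt))
        wait += random.uniform(0, wait * 0.1)  # jitter: up to 10% of the wait
        time.sleep(wait)
    raise RuntimeError("rate limit not recoverable")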

Jitter (up to 10% of the wait) spreads retries out, which avoids the thundering-herd problem of every client retrying on the same boundary.

What counts toward the limit

Each billed call counts as 1, regardless of body size. Two exceptions:

  • POST /v1/usage/batch — counts as ceil(events / 100) toward the usage bucket. A 1000-event batch is 10 cost units.
  • POST /v1/exports — counts as 1, but the export job itself consumes server resources for hours. Concurrent exports from the same tenant are limited to 4.
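The batch rule is easy to check with a few numbers. This sketch computes the cost of a single POST /v1/usage/batch call from its event count. The names `batch_cost` and `EVENTS_PER_COST_UNIT` are illustrative.

```python
import math

EVENTS_PER_COST_UNIT = 100  # from the ceil(events / 100) rule


def batch_cost(event_count):
    """Cost units charged to the usage bucket for one batch call."""
    return math.ceil(event_count / EVENTS_PER_COST_UNIT)


print(batch_cost(1000))  # 10
print(batch_cost(1001))  # 11
print(batch_cost(1))     # 1
```

Because the cost rounds up, a batch of 1001 events costs as much as one of 1100, so filling batches to a multiple of 100 wastes the least budget.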

Increasing your limits

If your steady-state traffic justifies higher limits (a marketplace at scale, an analytics workload, a heavy migration), email support@paylera.io with:

  • Your tenant ID.
  • Which route group(s) you need raised.
  • The peak rate you expect, and over what window.
  • A short note on the use case.

We typically respond within one business day. Once approved, a limit increase is configured per tenant and takes effect within seconds.

Burst capacity

The sliding window allows short bursts above the limit without penalty: as long as your 1-second bucket and your 10-second bucket both stay within budget, you can spike. The exact algorithm is a double-leaky-bucket; the practical effect is that your average rate matters more than a single-second spike.
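To make the two-window idea concrete, here is a client-side sketch that admits a request only when both a short and a long window have room. It is a plain sliding-window log, not the server's double-leaky-bucket. The class name, the `short_budget` and `long_budget` parameters, and the numbers in the example are all hypothetical, because the per-window budgets are not stated here.

```python
from collections import deque


class DualWindowBudget:
    """Client-side sketch: admit a request only if both windows have room."""

    def __init__(self, short_budget, long_budget,
                 short_window=1.0, long_window=10.0):
        self.short_budget = short_budget  # hypothetical 1-second budget
        self.long_budget = long_budget    # hypothetical 10-second budget
        self.short_window = short_window
        self.long_window = long_window
        self._sent = deque()  # timestamps of admitted requests, oldest first

    def try_acquire(self, now):
        # Drop timestamps that have aged out of the long window.
        while self._sent and now - self._sent[0] >= self.long_window:
            self._sent.popleft()
        in_long = len(self._sent)
        in_short = sum(1 for t in self._sent if now - t < self.short_window)
        if in_short >= self.short_budget or in_long >= self.long_budget:
            return False
        self._sent.append(now)
        return True


budget = DualWindowBudget(short_budget=3, long_budget=5)
times = [0.0, 0.25, 0.5, 0.75, 2.0, 2.25, 2.5, 10.0]
results = [budget.try_acquire(t) for t in times]
print(results)
# [True, True, True, False, True, True, False, True]
```

In this run the fourth request is refused by the short window and the seventh by the long window. That mirrors the text: a brief spike is fine as long as the longer-run average stays within budget.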

Failure modes

  • All 429s — your sustained rate is over budget. Add backoff, request a raise, or shed load.
  • Sporadic 429s on writes — concurrency higher than expected. Add jittered backoff.
  • 429s with Retry-After: 0 — the limit was hit by another process on the same tenant; retry immediately.
  • Webhook handler getting backed up — not a rate limit issue; see Retries & ordering.

Rate limits are a feature, not a punishment

They protect your tenant from a runaway script flooding your own queues, and they protect every other tenant from your bug. Treat 429 as a normal control-flow signal, not an error.