Rate limits
Paylera applies sliding-window rate limits per tenant per route group.
The limits are generous for normal use; bursts and pathological
patterns get a structured 429 with a Retry-After header.
Live limits
| Route group | Limit (live) |
|---|---|
| Read endpoints (GET /v1/*) | 1000 / second |
| Write endpoints (most POST / PATCH / DELETE) | 200 / second |
| Money-moving (POST /v1/payments/*, /v1/refunds/*, /v1/payouts/*) | 100 / second |
| Webhook deliveries (outbound, doesn’t count against your limit) | n/a |
| Usage ingestion (POST /v1/usage, /v1/usage/batch) | 5000 events / second |
| Admin endpoints (/v1/admin/*) | 50 / second |
Limits are per-tenant, not per-key. Multiple tokens against the same tenant share the same buckets.
Sandbox limits
Sandbox is one-tenth of live across the board. Enough to develop and test; not enough to load-test.
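As a quick worked example, the sandbox figures can be derived from the live table by dividing by ten. This is only a sketch: the dictionary keys are illustrative labels rather than identifiers used by the API, and the use of integer division is an assumption, since the page does not say how fractional limits would be rounded.

```python
# Live limits from the table above, per second.
# The keys are illustrative labels, not identifiers used by the API.
LIVE_LIMITS = {
    "read": 1000,
    "write": 200,
    "money_moving": 100,
    "usage_ingestion": 5000,  # events per second
    "admin": 50,
}


def sandbox_limits(live):
    """Return one-tenth of each live limit (integer division is an assumption)."""
    return {group: limit // 10 for group, limit in live.items()}


SANDBOX_LIMITS = sandbox_limits(LIVE_LIMITS)
```

Under these assumptions the sandbox read limit works out to 100 per second and the admin limit to 5 per second, which is why sandbox is unsuitable for load testing.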
Headers on every response

    X-RateLimit-Limit: 1000
    X-RateLimit-Remaining: 847
    X-RateLimit-Reset: 1736179215

X-RateLimit-Reset is the Unix epoch time, in seconds, at which the bucket fully refills.
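One way a client might read these headers is sketched below. The function name `rate_limit_status` and the shape of its return value are hypothetical, chosen for illustration; only the three header names come from this page. The `now` parameter exists so the example is deterministic.

```python
import time


def rate_limit_status(headers, now=None):
    """Summarise the X-RateLimit-* headers of one response.

    `headers` is any mapping of header name to string value. A plain dict
    is case-sensitive, so the names must match the spelling used here.
    """
    now = time.time() if now is None else now
    limit = int(headers["X-RateLimit-Limit"])
    remaining = int(headers["X-RateLimit-Remaining"])
    reset_at = int(headers["X-RateLimit-Reset"])  # Unix epoch seconds
    return {
        "limit": limit,
        "remaining": remaining,
        "used_fraction": (limit - remaining) / limit,
        # Never report a negative wait if the reset time has already passed.
        "seconds_until_reset": max(0.0, reset_at - now),
    }


status = rate_limit_status(
    {
        "X-RateLimit-Limit": "1000",
        "X-RateLimit-Remaining": "847",
        "X-RateLimit-Reset": "1736179215",
    },
    now=1736179213,  # two seconds before the reset time in the sample headers
)
```

A client could use `remaining` and `seconds_until_reset` to slow down before it ever receives a 429, rather than reacting only after the fact.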
When you hit a limit

    HTTP/1.1 429 Too Many Requests
    Retry-After: 2
    Content-Type: application/problem+json

    {
      "type": "https://errors.paylera.io/rate-limit/exceeded",
      "title": "Rate limit exceeded",
      "status": 429,
      "problem": "rate_limit.exceeded",
      "detail": "You've exceeded the limit for write endpoints. Retry after 2 seconds.",
      "trace_id": "01H8…"
    }

Retry-After is in seconds. It’s a suggestion based on the bucket refill rate; honour it.
Backoff
Recommended pattern:

    import random
    import time

    def call_with_backoff(fn, max_attempts=5):
        """Call fn() until it returns a non-429 response or attempts run out."""
        for attempt in range(max_attempts):
            res = fn()
            if res.status_code != 429:
                return res
            # Prefer the server's Retry-After; otherwise back off exponentially.
            wait = int(res.headers.get("Retry-After", 2 ** attempt))
            wait += random.uniform(0, wait * 0.1)  # jitter
            time.sleep(wait)
        raise RuntimeError("rate limit not recoverable")

Jitter (10% of the wait) avoids the thundering-herd problem of every client retrying on the same boundary.
What counts toward the limit
Each billed call counts as 1, regardless of body size. Two exceptions:
- POST /v1/usage/batch counts as ceil(events / 100) toward the usage bucket. A 1000-event batch is 10 cost units.
- POST /v1/exports counts as 1, but the export job itself consumes server resources for hours. Concurrent exports from the same tenant are limited to 4.
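The batch cost formula above can be written out as a small helper, which is handy when deciding how large to make each batch. The function name `usage_batch_cost` is hypothetical, and the treatment of an empty batch as zero cost simply falls out of the formula; the page does not say how the server handles that case.

```python
import math


def usage_batch_cost(event_count):
    """Cost units for one POST /v1/usage/batch call: ceil(events / 100)."""
    if event_count < 0:
        raise ValueError("event_count must be non-negative")
    return math.ceil(event_count / 100)


cost_full = usage_batch_cost(1000)    # the 1000-event batch from the text
cost_partial = usage_batch_cost(101)  # a partial hundred rounds up
```

Because the cost rounds up, a batch of 101 events costs the same as a batch of 200, so filling batches to a multiple of 100 makes the most of each cost unit.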
Increasing your limits
If your steady-state traffic justifies higher limits (a marketplace at
scale, an analytics workload, a heavy migration), email
support@paylera.io with:
- Your tenant ID.
- Which route group(s) you need raised.
- The peak rate you expect, and over what window.
- A short note on the use case.
We typically respond within one business day. Limit increases are configured per-tenant in seconds.
Burst capacity
The sliding window allows short bursts above the limit without penalty: as long as your 1-second bucket and your 10-second bucket both stay within budget, you can spike. The exact algorithm is a double-leaky-bucket; the practical effect is that your average rate matters more than a single-second spike.
Failure modes
- All 429s: your sustained rate is over budget. Add backoff, request a raise, or shed load.
- Sporadic 429s on writes: concurrency is higher than expected. Add jittered backoff.
- 429s with Retry-After: 0: the limit was hit by another process on the same tenant; retry immediately.
- Webhook handler getting backed up: not a rate limit issue; see Retries & ordering.
Rate limits are a feature, not a punishment
They protect your tenant from a runaway script flooding your own queues, and they protect every other tenant from your bug. Treat 429 as a normal control-flow signal, not an error.