# Rate limits (/docs/api/operational/rate-limits)



Every API key is assigned a rate-limit tier when it is created. The tier sets how many `read-light`, `write-light`, and `long-running` calls you can make per minute, plus how many jobs can run at once. Hit a limit and you get `429 RATE_LIMITED` with a `Retry-After` header - honor it before retrying.

## Tiers [#tiers]

| Tier       | Reads (rpm, `read-light`) | Writes (rpm, `write-light`) | Long-running starts (rpm, `long-running`) | Daily cap (`write-light`) |
| ---------- | ------------------------- | --------------------------- | ----------------------------------------- | ------------------------- |
| `standard` | 120                       | 60                          | 20                                        | 10,000                    |
| `pilot`    | 1,200                     | 600                         | 60                                        | 100,000                   |
| `partner`  | 6,000                     | 3,000                       | 300                                       | 500,000                   |

`standard` is the default for partner keys. `pilot` adds headroom for early integrations; `partner` is the enterprise tier for production integrations.

Check your current tier with [`GET /v1/whoami`](/docs/api/reference/organizations/whoami):

```json
{
  "organizationId": "org_2481fa5c-a404-...",
  "scopes": [],
  "rateLimitTier": "standard"
}
```
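If a client wants to pace itself locally, the tier table can be mirrored as a lookup keyed by the `rateLimitTier` field. This is a minimal sketch: `TierLimits`, `TIER_LIMITS`, and `limits_for` are hypothetical names, and the numbers are copied by hand from the table above rather than fetched from the API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierLimits:
    read_rpm: int          # read-light requests per minute
    write_rpm: int         # write-light requests per minute
    long_running_rpm: int  # long-running starts per minute
    daily_write_cap: int   # write-light requests per day


# Hand-copied from the tier table; update if the table changes.
TIER_LIMITS = {
    "standard": TierLimits(120, 60, 20, 10_000),
    "pilot": TierLimits(1_200, 600, 60, 100_000),
    "partner": TierLimits(6_000, 3_000, 300, 500_000),
}


def limits_for(whoami: dict) -> TierLimits:
    """Look up limits from a parsed GET /v1/whoami response body."""
    tier = whoami["rateLimitTier"]
    try:
        return TIER_LIMITS[tier]
    except KeyError:
        # An unrecognised tier is surfaced rather than guessed at.
        raise ValueError(f"unknown rate-limit tier: {tier!r}") from None
```

The `X-RateLimit-Limit` header on each response remains the better source at runtime; a local table like this is mainly useful for planning batch sizes before the first call.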

## Buckets are per endpoint class [#buckets-are-per-endpoint-class]

One bucket for reads, one for writes, one for long-running starts. They're separate. A burst of content-generation POSTs cannot starve out your polling of `/v1/jobs/:id`. The endpoint class is fixed per route and we don't move routes between classes silently.

* **`read-light`** - every `GET`. Polling jobs, reading metrics, listing scheduled posts.
* **`write-light`** - `PATCH`, `DELETE`, and small `POST`s that complete synchronously (approve, reject, create OAuth URL).
* **`long-running`** - the `POST` that kicks off a job: ingest, generate, clone, create influencer.

The class is exposed on every response in the `X-RateLimit-Endpoint-Class` header so you know which bucket a call drew from.

The concurrent-jobs cap is a separate dimension. It limits how many jobs can be in the `running` state for your org at once, regardless of how fast you started them. Hitting it returns `429` on the `POST` that would exceed the cap. Wait for existing jobs to finish, then retry.

## Response headers [#response-headers]

Every response - success or error - carries the current state of your bucket. Log these if you want to see the limit approaching.

```http
X-RateLimit-Endpoint-Class: read-light
X-RateLimit-Limit: 120
X-RateLimit-Remaining: 119
X-RateLimit-Reset: 1776572820
X-RateLimit-Tier: standard
```

* `X-RateLimit-Endpoint-Class` - which bucket this call drew from (`read-light`, `write-light`, `long-running`).
* `X-RateLimit-Limit` - your tier's cap for that class.
* `X-RateLimit-Remaining` - tokens left in the current bucket.
* `X-RateLimit-Reset` - when the bucket refills, as a Unix timestamp in seconds.
* `X-RateLimit-Tier` - your key's tier (`standard`, `pilot`, `partner`).

Each response reports only the bucket that call drew from. To see the state of a different bucket, issue a request that hits it.
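For logging, the headers above can be parsed into a small record. This is a sketch, not an official client: `BucketState` and `parse_bucket_state` are hypothetical names, and it assumes `X-RateLimit-Reset` is a Unix timestamp in seconds, as the example values suggest.

```python
import time
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class BucketState:
    endpoint_class: str
    limit: int
    remaining: int
    reset_epoch: int      # assumed to be Unix seconds
    tier: Optional[str]   # absent from the sample 429 headers, so optional

    def seconds_until_reset(self, now: Optional[float] = None) -> float:
        now = time.time() if now is None else now
        return max(0.0, float(self.reset_epoch) - now)


def parse_bucket_state(headers: Mapping[str, str]) -> Optional[BucketState]:
    # HTTP header names are case-insensitive, so normalise before lookup.
    h = {k.lower(): v for k, v in headers.items()}
    try:
        return BucketState(
            endpoint_class=h["x-ratelimit-endpoint-class"],
            limit=int(h["x-ratelimit-limit"]),
            remaining=int(h["x-ratelimit-remaining"]),
            reset_epoch=int(h["x-ratelimit-reset"]),
            tier=h.get("x-ratelimit-tier"),
        )
    except (KeyError, ValueError):
        # Headers missing or malformed: report "unknown" instead of guessing.
        return None
```

Because each response describes one bucket, a caller would typically keep the latest `BucketState` per `endpoint_class` (for example in a dict) rather than a single global value.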

## The 429 response [#the-429-response]

```json
{
  "error": {
    "code": "RATE_LIMITED",
    "message": "Rate limit exceeded on write-light.",
    "requestId": "req_01HXZ9G7...",
    "details": {
      "endpointClass": "write-light",
      "retryAfterMs": 12400
    }
  }
}
```

Headers:

```http
Retry-After: 13
X-RateLimit-Endpoint-Class: write-light
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1745000013
```

Honor `Retry-After`, then retry with the same `Idempotency-Key`. Don't jitter - the server staggered your bucket reset for you.
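The retry rule can be sketched as a small wrapper. Everything here is illustrative: `send_with_retry` is a hypothetical helper, `send` stands in for whatever transport you use (it receives the key and is expected to attach it as the `Idempotency-Key` header), and the attempt limit and the 1-second fallback for a missing `Retry-After` are arbitrary choices. `Retry-After` is assumed to be whole seconds, as in the example above.

```python
import time
import uuid
from typing import Callable, Mapping, Tuple

# (status_code, headers) as returned by the hypothetical transport.
Response = Tuple[int, Mapping[str, str]]


def send_with_retry(
    send: Callable[[str], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    # One key for the whole logical operation, reused on every retry.
    idempotency_key = str(uuid.uuid4())
    for attempt in range(max_attempts):
        status, headers = send(idempotency_key)
        if status != 429:
            return status, headers
        if attempt == max_attempts - 1:
            break  # out of attempts; hand the 429 back to the caller
        lowered = {k.lower(): v for k, v in headers.items()}
        retry_after = lowered.get("retry-after")
        # Wait as long as the server asked; no jitter is added.
        sleep(float(retry_after) if retry_after is not None else 1.0)
    return status, headers
```

Injecting `sleep` keeps the wrapper easy to exercise in tests without real delays.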

For high-volume integrations, batch work before calling the API and watch the response headers for each bucket.

## Kill switch [#kill-switch]

A kill switch lets us disable access without revoking your key. It can be set for a single key, a whole organization, or the entire API. Every affected request fails with `503 KILL_SWITCH` until the switch is cleared.

```json
{
  "error": {
    "code": "KILL_SWITCH",
    "message": "This API key has been temporarily disabled.",
    "requestId": "req_01HXZ9G7...",
    "details": { "scope": "key" }
  }
}
```

`details.scope` is one of:

* `key` - your specific key is off. Contact us to turn it back on.
* `organization` - your whole org is off. Same story.
* `global` - the entire API integration is off. Incident only. Watch the status page and retry when we post recovery.
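A client can branch on `details.scope` to decide what to do next. This is a minimal sketch: `kill_switch_action` and the action labels it returns are hypothetical, chosen only to mirror the three scopes listed above.

```python
from typing import Optional


def kill_switch_action(status: int, body: dict) -> Optional[str]:
    """Return a suggested action, or None if this is not a kill-switch error."""
    error = body.get("error") or {}
    if status != 503 or error.get("code") != "KILL_SWITCH":
        return None
    scope = (error.get("details") or {}).get("scope")
    if scope in ("key", "organization"):
        # Retrying will not help; the switch has to be cleared for you.
        return "contact_support"
    if scope == "global":
        # Incident in progress; retry after recovery is posted.
        return "wait_for_status_page"
    # A scope this sketch does not know about.
    return "unknown_scope"
```

In every kill-switch case the caller should stop its normal retry loop, since the response carries no `Retry-After` hint in the example above.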

We flip the kill switch for two reasons: a runaway client burning credits, or an upstream incident we need to contain. We tell you either way. It's not a silent fail.

## Rate-limiter fallback [#rate-limiter-fallback]

Our rate limiter has an in-memory fallback if its primary backend is unreachable. In that mode, the response carries `X-RateLimit-Fallback: memory` alongside the standard headers. This is intentional: we'd rather let you through than black-hole your traffic on our infrastructure problem. We flip the global kill switch if the downstream can't handle the flood.

You don't need to code for this. If the rate-limit headers are missing from a response, treat it as "we don't know" and don't change your behavior.

## Long-running jobs and rate limiting [#long-running-jobs-and-rate-limiting]

Starting a job costs one token from the `long-running` bucket. Polling `/v1/jobs/:id` costs one `read-light` token per poll. A sensible poll loop at 2s intervals over a 60s job costs 30 reads - well under the `standard` tier's 120 rpm read cap.

Poll with backoff. Job state is not emitted on every request.
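The polling arithmetic above, and the effect of backoff on it, can be checked with a short sketch. The function names and the backoff parameters (start at 2s, multiply by 1.5, cap at 15s) are illustrative assumptions, not values the API prescribes.

```python
import math
from typing import Iterator


def reads_for_fixed_polling(job_seconds: float, interval_seconds: float) -> int:
    """read-light tokens spent polling at a fixed interval until the job ends."""
    return math.ceil(job_seconds / interval_seconds)


def backoff_intervals(
    initial: float = 2.0, factor: float = 1.5, cap: float = 15.0
) -> Iterator[float]:
    """Yield poll delays that grow geometrically up to a ceiling."""
    delay = initial
    while True:
        yield delay
        delay = min(delay * factor, cap)


def reads_with_backoff(job_seconds: float, **kwargs: float) -> int:
    """read-light tokens spent when each poll waits for the next backoff delay."""
    elapsed = 0.0
    polls = 0
    for delay in backoff_intervals(**kwargs):
        elapsed += delay
        polls += 1
        if elapsed >= job_seconds:
            return polls
    return polls  # unreachable: the generator never ends


print(reads_for_fixed_polling(60, 2))  # 30, matching the figure above
print(reads_with_backoff(60))          # 8 with the assumed parameters
```

The trade-off is latency: with these assumed parameters the final poll lands up to 15 seconds after the job finishes, so pick a cap that matches how quickly you need the result.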

## Requesting an increase [#requesting-an-increase]

Hit the `standard` ceiling and need room to test? Message your Layers contact with:

* Your `organizationId` (from `GET /v1/whoami`).
* Which bucket you're saturating (reads, writes, long-running, concurrent jobs).
* A traffic shape - sustained rpm, burst ceiling, or both.

We read that and flip the tier.

## See also [#see-also]

* [Errors](/docs/api/operational/errors)
* [Jobs](/docs/api/concepts/jobs)
* [Idempotency](/docs/api/operational/idempotency)
