Rate limits
The platform caps how fast a single principal can call operations: when you exceed a cap, the request is rejected with 429 Too Many Requests and a Retry-After hint — back off and retry, and your traffic flows again.
Why limits exist
Rate limits protect shared infrastructure and, on the money path, stop a single
compromised key from draining a balance with a flood of fast reservations. They are a
normal, expected part of operating against the API at scale: design your client to
treat a 429 as a signal to slow down, not as a hard failure.
How limiting works
Limits are enforced per organization for authenticated calls and per IP for unauthenticated public calls. The window is a fixed 60-second bucket: each call increments a counter for the current minute, and the counter resets at the top of the next window.
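The fixed-window behavior described above can be sketched as a small in-memory counter. This is an illustration only, not the platform's implementation: the `FixedWindowLimiter` class, its `allow` method, and the injectable `clock` are hypothetical names, and a real distributed limiter would keep the counters in a shared store rather than in process memory.

```python
import math
import time
from collections import defaultdict

WINDOW_SECONDS = 60  # fixed 60-second bucket, as described above


class FixedWindowLimiter:
    """Illustrative fixed-window counter (hypothetical, single process)."""

    def __init__(self, cap: int, clock=time.time):
        self.cap = cap
        self.clock = clock  # injectable so the behavior can be tested
        # (principal, window index) -> calls seen; old windows are never
        # purged here, which a real limiter would need to handle.
        self.counts = defaultdict(int)

    def allow(self, principal: str) -> tuple[bool, int]:
        """Return (allowed, retry_after_seconds) for one call."""
        now = self.clock()
        window = int(now // WINDOW_SECONDS)
        key = (principal, window)
        if self.counts[key] >= self.cap:
            # Seconds until the top of the next window, rounded up.
            reset_at = (window + 1) * WINDOW_SECONDS
            return False, max(1, math.ceil(reset_at - now))
        self.counts[key] += 1
        return True, 0
```

Keying the counter on the principal (an organization ID or a client IP) is what makes several callers independent of each other: one organization hitting its cap does not affect another.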
Enforcement is distributed, so the cap holds across all server instances rather than
being multiplied by the number of machines behind the load balancer. Authentication
runs first, so the principal that a 429 is counted against is the same
organization that owns the Bearer token — whether that token is a session token or a minted gpra_ API key. See Authentication for how the
principal is resolved.
Limits
Most authenticated operations share a generous default cap; a few spend operations on the money path are held to tighter caps because a runaway loop there reserves credits, not just compute.
| Surface | Cap | Keyed on |
|---|---|---|
| Authenticated operations (default) | 600 / min | Organization |
| Order placement (e.g. orders.archive.place) | 60 / min | Organization |
| Other spend / reservation operations | 30 / min | Organization |
| Unauthenticated public operations | 60 / min | Client IP |
The higher default exists because real pipelines legitimately burst — a job firing ten calls a second is not pathological. The tighter spend caps are deliberate: they bound how quickly any single key can move money. Treat the table as the current behavior rather than a contract; the authoritative limit for a given response is always the one carried in that response’s headers.
The 429 contract
A throttled request returns 429 Too Many Requests with an application/problem+json body, exactly like every other error on the platform — see Errors for the full problem+json model.
```http
HTTP/1.1 429 Too Many Requests
Content-Type: application/problem+json
Retry-After: 42
X-RateLimit-Limit: 600
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1718900000

{
  "type": "about:blank",
  "title": "Too Many Requests",
  "status": 429,
  "detail": "Too many requests — limit 600/min/org"
}
```

The headers you may see:
| Header | Meaning |
|---|---|
| Retry-After | Seconds to wait before retrying. Always present on a 429. |
| X-RateLimit-Limit | The cap that applied to this request. |
| X-RateLimit-Remaining | Calls left in the current window. |
| X-RateLimit-Reset | Unix epoch seconds when the window resets. |
Retry-After is the value to honor. When it is present, wait at least that many
seconds before retrying; do not retry sooner. The X-RateLimit-* headers let you slow
down proactively — if Remaining is near zero, pace your next calls instead of
sprinting into the cap.
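As a rough sketch of proactive pacing, the helper below turns the X-RateLimit-* headers into a pause before the next call. The function name `pacing_delay` and the `floor` threshold (how few remaining calls count as "near zero") are assumptions made for illustration, not part of the API. It takes a plain dict, so header names must match exactly; a case-insensitive headers object such as the one httpx returns also works.

```python
import time
from typing import Mapping, Optional


def pacing_delay(headers: Mapping[str, str], *, now: Optional[float] = None,
                 floor: int = 5) -> float:
    """Seconds to pause before the next call, based on X-RateLimit-* headers."""
    try:
        remaining = int(headers["X-RateLimit-Remaining"])
        reset_at = int(headers["X-RateLimit-Reset"])
    except (KeyError, ValueError):
        return 0.0  # headers missing or malformed: no proactive pacing
    now = time.time() if now is None else now
    seconds_left = max(0.0, reset_at - now)
    if remaining <= 0:
        return seconds_left  # budget exhausted: wait for the window to reset
    if remaining > floor:
        return 0.0  # plenty of budget left: no need to slow down
    # Near the cap: spread the remaining calls over the rest of the window.
    return seconds_left / remaining
```

A caller would sleep for the returned value after each successful response. On an actual 429, Retry-After still takes precedence over anything computed here.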
Backoff and retry
Retry 429 responses with exponential backoff and jitter:
- Start from Retry-After when present; otherwise use a base delay (e.g. one second).
- Double the delay on each successive 429, up to a ceiling (e.g. 30–60 seconds).
- Add random jitter so that many clients throttled in the same window don’t retry in lockstep and re-collide.
- Cap the number of attempts, then surface the error.
Retrying a 429 is always safe: a throttled request never reached the handler, so no
side effect ran. For spend operations you should still send an Idempotency-Key so that a retry after a timeout (where the first request may have succeeded) replays the original result instead of
acting twice.
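The point about spend retries can be sketched as follows: the key is generated once per logical operation and reused on every attempt, including a retry after a timeout. This is a minimal illustration under stated assumptions. `send_with_idempotency` and the `send` callable are hypothetical stand-ins for the real HTTP call, and using a UUID as the key value is an assumption; see Idempotency for the format the platform actually expects.

```python
import uuid
from typing import Callable


def send_with_idempotency(send: Callable[[dict], int], *, max_attempts: int = 3) -> int:
    """Call `send(headers)` until it returns a status other than 429.

    `send` is a hypothetical stand-in for the HTTP request: it receives the
    request headers and returns the response status code, or raises
    TimeoutError when the outcome of the request is unknown.
    """
    # Generate the key once, outside the loop, so every retry of this
    # logical operation carries the same value.
    headers = {"Idempotency-Key": str(uuid.uuid4())}
    for _ in range(max_attempts):
        try:
            status = send(headers)
        except TimeoutError:
            continue  # first request may have succeeded: retry with the SAME key
        if status != 429:
            return status
        # Real code would wait here (Retry-After, backoff, jitter).
    raise RuntimeError(f"gave up after {max_attempts} attempts")
```

Generating a fresh key inside the loop would defeat the purpose, because the server would then see each retry as a new operation.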
Python (httpx)
```python
import random
import time

import httpx

BASE_URL = "https://api.geopera.com"
TOKEN = "gpra_your_api_key"


def invoke(operation_id: str, body: dict, *, max_attempts: int = 6) -> dict:
    url = f"{BASE_URL}/v1/op/{operation_id}"
    headers = {"Authorization": f"Bearer {TOKEN}"}
    delay = 1.0  # base backoff in seconds, doubled after each 429
    with httpx.Client(timeout=30.0) as client:
        for attempt in range(max_attempts):
            resp = client.post(url, json=body, headers=headers)
            if resp.status_code != 429:
                resp.raise_for_status()
                return resp.json()
            if attempt == max_attempts - 1:
                break  # out of attempts: don't sleep before giving up
            # Honor Retry-After if the server sent one, else back off.
            retry_after = resp.headers.get("Retry-After")
            wait = float(retry_after) if retry_after else delay
            wait += random.uniform(0, 1.0)  # jitter
            time.sleep(wait)
            delay = min(delay * 2, 60.0)
    raise RuntimeError(f"rate limited after {max_attempts} attempts")


print(invoke("catalog.search", {"host_name": "earthsearch-aws", "collections": ["sentinel-2-l2a"], "limit": 10}))
```

TypeScript (fetch)
```typescript
const BASE_URL = 'https://api.geopera.com';
const TOKEN = 'gpra_your_api_key';

const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms));

export async function invoke(
  operationId: string,
  body: unknown,
  maxAttempts = 6
): Promise<unknown> {
  const url = `${BASE_URL}/v1/op/${operationId}`;
  let delay = 1000; // base backoff in milliseconds, doubled after each 429
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const resp = await fetch(url, {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${TOKEN}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify(body)
    });
    if (resp.status !== 429) {
      if (!resp.ok) throw new Error(`request failed: ${resp.status}`);
      return resp.json();
    }
    if (attempt === maxAttempts - 1) break; // out of attempts: don't sleep before giving up
    // Honor Retry-After (seconds) if present, else back off.
    const retryAfter = resp.headers.get('Retry-After');
    const wait = (retryAfter ? Number(retryAfter) * 1000 : delay) + Math.random() * 1000;
    await sleep(wait);
    delay = Math.min(delay * 2, 60_000);
  }
  throw new Error(`rate limited after ${maxAttempts} attempts`);
}
```

The official geopera Python package and @geopera/sdk apply this backoff for
you; reach for the snippets above only when you call the HTTP API directly.
Gotchas
- Limits are per organization, not per key. Several keys minted under one organization share the same authenticated bucket. Splitting load across keys does not raise your cap.
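- The window is fixed, not sliding. Because the counter resets at the top of each window, up to twice the cap can be admitted in a short span that straddles a boundary: the tail of one window plus the head of the next. Pacing to the average rate, not the burst rate, keeps you clear.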
- Don’t ignore Retry-After. Retrying immediately just earns another 429 and wastes the attempt. Always wait at least the advertised interval.
- Add jitter. Fleets of workers throttled in the same window will otherwise retry simultaneously and re-collide; randomized backoff spreads them out.
- A 429 can also mean an egress or bandwidth limit, not just request count. Treat the detail field as the source of truth for which limit you hit.
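To make the fixed-window gotcha concrete, the sketch below counts how many calls a simple fixed-window counter would admit for two traffic shapes. The `admitted` helper is a hypothetical model written for this illustration, not the platform's code; the cap of 30 is the spend-operation figure from the limits table.

```python
def admitted(timestamps: list[float], cap: int, window: int = 60) -> int:
    """Count the calls a fixed-window counter would admit (toy model)."""
    counts: dict[int, int] = {}
    ok = 0
    for t in timestamps:
        w = int(t // window)  # index of the window this call falls into
        if counts.get(w, 0) < cap:
            counts[w] = counts.get(w, 0) + 1
            ok += 1
    return ok


CAP = 30  # spend-operation cap from the limits table

# 30 calls in the last second of one window, 30 more in the first second
# of the next: all 60 pass, although they arrive within about two seconds.
burst = [59.0 + i / 30 for i in range(30)] + [60.0 + i / 30 for i in range(30)]
print(admitted(burst, CAP))  # 60

# Pacing to the average rate: one call every 60 / CAP seconds.
interval = 60 / CAP
paced = [i * interval for i in range(90)]
print(admitted(paced, CAP))  # 90, spread over three windows, none rejected
```

The burst is admitted by the limiter but leaves no budget for the rest of the second window, whereas the paced client never comes close to a rejection.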
Related
- Errors — the full problem+json model and status codes.
- Idempotency — making spend retries safe.
- Authentication — how the principal a limit applies to is resolved.