Rate limits

The platform caps how fast a single principal can call operations. When you exceed a cap, the request is rejected with 429 Too Many Requests and a Retry-After hint. Wait at least that long, retry, and your traffic flows again.

Why limits exist

Rate limits protect shared infrastructure and, on the money path, stop a single compromised key from draining a balance with a flood of fast reservations. They are a normal, expected part of operating against the API at scale: design your client to treat a 429 as a signal to slow down, not as a hard failure.

How limiting works

Limits are enforced per organization for authenticated calls and per IP for unauthenticated public calls. The window is a fixed 60-second bucket: each call increments a counter for the current minute, and the counter resets at the top of the next window.
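
To make the fixed-window behavior concrete, here is a minimal in-memory sketch. It is not the platform's implementation: the class name FixedWindowLimiter, the allow method, and the alignment of windows to multiples of 60 seconds are assumptions made for illustration only.

```python
from collections import defaultdict


class FixedWindowLimiter:
    """Illustrative fixed-window counter (hypothetical, not the platform's code)."""

    def __init__(self, cap: int, window_seconds: int = 60) -> None:
        self.cap = cap
        self.window_seconds = window_seconds
        # (principal, window index) -> calls counted in that window
        self.counts = defaultdict(int)

    def allow(self, principal: str, now: float) -> tuple[bool, int]:
        """Return (allowed, whole seconds until the current window resets)."""
        window = int(now // self.window_seconds)
        reset_in = int((window + 1) * self.window_seconds - now)
        key = (principal, window)
        if self.counts[key] >= self.cap:
            return False, max(reset_in, 1)
        self.counts[key] += 1
        return True, reset_in


limiter = FixedWindowLimiter(cap=3)
# Three calls in the same minute are accepted; the fourth is rejected.
results = [limiter.allow("org_a", now=120.0 + i)[0] for i in range(4)]
# A call in the next window is accepted again because the counter starts fresh.
next_window_ok, _ = limiter.allow("org_a", now=180.0)
```

Each principal gets its own counter, so one organization exhausting its window does not affect another.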

Enforcement is distributed, so the cap holds across all server instances rather than being multiplied by the number of machines behind the load balancer. Authentication runs first, so the principal that a 429 is counted against is the same organization that owns the Bearer token — whether that token is a session token or a minted gpra_ API key. See Authentication for how the principal is resolved.

Limits

Most authenticated operations share a generous default cap; a few spend operations on the money path are held to tighter caps because a runaway loop there reserves credits, not just compute.

| Surface | Cap | Keyed on |
| --- | --- | --- |
| Authenticated operations (default) | 600 / min | Organization |
| Order placement (e.g. orders.archive.place) | 60 / min | Organization |
| Other spend / reservation operations | 30 / min | Organization |
| Unauthenticated public operations | 60 / min | Client IP |

The higher default exists because real pipelines legitimately burst — a job firing ten calls a second is not pathological. The tighter spend caps are deliberate: they bound how quickly any single key can move money. Treat the table as the current behavior rather than a contract; the authoritative limit for a given response is always the one carried in that response’s headers.

The 429 contract

A throttled request returns 429 Too Many Requests with an application/problem+json body, exactly like every other error on the platform — see Errors for the full problem+json model.

http
HTTP/1.1 429 Too Many Requests
Content-Type: application/problem+json
Retry-After: 42
X-RateLimit-Limit: 600
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1718900000

{
  "type": "about:blank",
  "title": "Too Many Requests",
  "status": 429,
  "detail": "Too many requests — limit 600/min/org"
}

The headers you may see:

| Header | Meaning |
| --- | --- |
| Retry-After | Seconds to wait before retrying. Always present on a 429. |
| X-RateLimit-Limit | The cap that applied to this request. |
| X-RateLimit-Remaining | Calls left in the current window. |
| X-RateLimit-Reset | Unix epoch seconds when the window resets. |

Retry-After is the value to honor. When it is present, wait at least that many seconds before retrying; do not retry sooner. The X-RateLimit-* headers let you slow down proactively — if Remaining is near zero, pace your next calls instead of sprinting into the cap.
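
One way to act on these headers is a small helper that turns them into a pause before the next call. The sketch below is an assumption-laden illustration: the function name suggested_pause, the low_water threshold, and the policy of spreading the remaining calls evenly over the rest of the window are all hypothetical. Only the header names come from this page.

```python
import time
from typing import Mapping, Optional


def suggested_pause(
    headers: Mapping[str, str],
    now: Optional[float] = None,
    low_water: int = 10,
) -> float:
    """Seconds to wait before the next call, based on rate-limit headers.

    Hypothetical helper. Pass a case-insensitive mapping (httpx response
    headers are one); a plain dict must use the exact header spelling.
    """
    now = time.time() if now is None else now

    # Retry-After wins whenever the server sent it.
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        return max(float(retry_after), 0.0)

    remaining = headers.get("X-RateLimit-Remaining")
    reset = headers.get("X-RateLimit-Reset")
    if remaining is None or reset is None:
        return 0.0  # no hints available, so do not delay

    seconds_left = max(float(reset) - now, 0.0)
    calls_left = int(remaining)
    if calls_left <= 0:
        return seconds_left  # budget spent: wait for the window to reset
    if calls_left > low_water:
        return 0.0  # plenty of budget left, no need to pace yet
    return seconds_left / calls_left  # spread the last few calls evenly
```

Calling `time.sleep(suggested_pause(resp.headers))` after each response would apply it; the threshold of 10 is arbitrary and worth tuning against your own traffic.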

Backoff and retry

Retry 429 responses with exponential backoff and jitter:

  • Start from Retry-After when present; otherwise use a base delay (e.g. one second).
  • Double the delay on each successive 429, up to a ceiling (e.g. 30–60 seconds).
  • Add random jitter so that many clients throttled in the same window don’t retry in lockstep and re-collide.
  • Cap the number of attempts, then surface the error.

Retrying a 429 is always safe: a throttled request never reached the handler, so no side effect ran. For spend operations you should still send an Idempotency-Key so that a retry after a timeout (where the first request may have succeeded) replays the original result instead of acting twice.
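
The important detail for spend retries is that every attempt carries the same Idempotency-Key. The sketch below shows that shape with a fake transport so it runs without network access. The helper name call_with_idempotency, the send callback, and the use of a UUID as the key value are assumptions; see Idempotency for the key format the platform actually expects.

```python
import uuid
from typing import Callable


def call_with_idempotency(
    send: Callable[[dict], int],
    max_attempts: int = 3,
) -> tuple[int, str]:
    """Retry a spend call, reusing one Idempotency-Key for every attempt.

    `send` is a hypothetical transport: it receives the request headers and
    returns an HTTP status code, or raises TimeoutError.
    """
    # Generate the key once, outside the loop, so all attempts share it.
    key = str(uuid.uuid4())
    headers = {"Idempotency-Key": key}

    for _ in range(max_attempts):
        try:
            status = send(headers)
        except TimeoutError:
            continue  # outcome unknown: retry with the same key
        if status != 429:
            return status, key
        # A real client would sleep here (Retry-After plus jitter).
    raise RuntimeError(f"gave up after {max_attempts} attempts")


# Fake transport for demonstration: times out once, then succeeds.
seen_keys = []
outcomes = iter([TimeoutError(), 200])


def fake_send(headers: dict) -> int:
    seen_keys.append(headers["Idempotency-Key"])
    outcome = next(outcomes)
    if isinstance(outcome, Exception):
        raise outcome
    return outcome


status, key = call_with_idempotency(fake_send)
```

Generating the key inside the loop would defeat the purpose: each attempt would look like a new request to the server.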

Python (httpx)

python
import random
import time

import httpx

BASE_URL = "https://api.geopera.com"
TOKEN = "gpra_your_api_key"


def invoke(operation_id: str, body: dict, *, max_attempts: int = 6) -> dict:
    url = f"{BASE_URL}/v1/op/{operation_id}"
    headers = {"Authorization": f"Bearer {TOKEN}"}
    delay = 1.0

    with httpx.Client(timeout=30.0) as client:
        for attempt in range(max_attempts):
            resp = client.post(url, json=body, headers=headers)
            if resp.status_code != 429:
                resp.raise_for_status()
                return resp.json()

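            if attempt == max_attempts - 1:
                break  # out of attempts; don't sleep before giving up

            # Honor Retry-After (seconds) if the server sent one, else back off.
            retry_after = resp.headers.get("Retry-After")
            try:
                wait = float(retry_after) if retry_after else delay
            except ValueError:
                wait = delay  # not a plain number of seconds; fall back
            wait += random.uniform(0, 1.0)  # jitter
            time.sleep(wait)
            delay = min(delay * 2, 60.0)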

    raise RuntimeError(f"rate limited after {max_attempts} attempts")


print(invoke("catalog.search", {"host_name": "earthsearch-aws", "collections": ["sentinel-2-l2a"], "limit": 10}))

TypeScript (fetch)

typescript
const BASE_URL = 'https://api.geopera.com';
const TOKEN = 'gpra_your_api_key';

const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms));

export async function invoke(
	operationId: string,
	body: unknown,
	maxAttempts = 6
): Promise<unknown> {
	const url = `${BASE_URL}/v1/op/${operationId}`;
	let delay = 1000;

	for (let attempt = 0; attempt < maxAttempts; attempt++) {
		const resp = await fetch(url, {
			method: 'POST',
			headers: {
				Authorization: `Bearer ${TOKEN}`,
				'Content-Type': 'application/json'
			},
			body: JSON.stringify(body)
		});

		if (resp.status !== 429) {
			if (!resp.ok) throw new Error(`request failed: ${resp.status}`);
			return resp.json();
		}

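		if (attempt === maxAttempts - 1) break; // out of attempts; don't sleep before giving up

		// Honor Retry-After (seconds) if present and numeric, else back off.
		const retryAfter = resp.headers.get('Retry-After');
		const seconds = retryAfter === null ? NaN : Number(retryAfter);
		const wait = (Number.isFinite(seconds) ? seconds * 1000 : delay) + Math.random() * 1000;
		await sleep(wait);
		delay = Math.min(delay * 2, 60_000);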
	}

	throw new Error(`rate limited after ${maxAttempts} attempts`);
}

The official geopera Python package and @geopera/sdk apply this backoff for you; reach for the snippets above only when you call the HTTP API directly.

Gotchas

  • Limits are per organization, not per key. Several keys minted under one organization share the same authenticated bucket. Splitting load across keys does not raise your cap.
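  • The window is fixed, not sliding. Because the counter resets at each boundary, a burst at the end of one window followed by a burst at the start of the next can put up to two windows’ worth of calls through in a few seconds. Don’t build on that behavior: pacing to the average rate (the cap spread over 60 seconds), not the burst rate, keeps you clear.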
  • Don’t ignore Retry-After. Retrying immediately just earns another 429 and wastes the attempt. Always wait at least the advertised interval.
  • Add jitter. Fleets of workers throttled in the same window will otherwise retry simultaneously and re-collide; randomized backoff spreads them out.
  • A 429 can also mean an egress or bandwidth limit, not just request count — treat the detail field as the source of truth for which limit you hit.
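
Pacing to the average rate can be done on the client with a small spacer that keeps calls at least 60 / cap seconds apart. The following is a minimal single-process sketch; the Pacer class and its injectable clock and sleep parameters are hypothetical names introduced here, not part of the official SDKs.

```python
import time
from typing import Callable


class Pacer:
    """Space calls at least `60 / per_minute` seconds apart (hypothetical helper)."""

    def __init__(
        self,
        per_minute: int,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.interval = 60.0 / per_minute
        self.clock = clock
        self.sleep = sleep
        self.next_slot = clock()

    def wait(self) -> float:
        """Block until the next slot is free; return the seconds slept."""
        now = self.clock()
        delay = max(self.next_slot - now, 0.0)
        if delay:
            self.sleep(delay)
        # Reserve the following slot relative to when this call actually runs.
        self.next_slot = max(self.next_slot, now) + self.interval
        return delay


# Example: stay under a 30 / min cap by calling pacer.wait() before each request.
pacer = Pacer(per_minute=30)
```

Because limits are per organization, several workers sharing one organization would each need a fraction of the cap, or a shared pacer; this sketch does not coordinate across processes.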

Related

  • Errors — the full problem+json model and status codes.
  • Idempotency — making spend retries safe.
  • Authentication — how the principal a limit applies to is resolved.