Uploads & downloads

Moving bytes in and out of Geopera is deliberately not a one-call helper: there is no client.upload_file(...) or client.download(...) convenience. The SDK reserves quota, signs URLs, and finalizes sessions, while you perform the actual byte transfer with your own HTTP client (httpx in the examples here). Uploads are a multi-step session (initiate, sign, PUT the bytes yourself, complete), and downloads return JSON descriptors, each carrying a short-lived signed URL, never raw file bytes. This page shows the exact operations, fields, return shapes, and complete worked examples for both directions, leading with the fluent Geopera client and keeping the generated operation modules as the documented escape hatch.

This page assumes a configured client. With the fluent client that is client = Geopera(token="gpra_..."); with the generated layer it is an AuthenticatedClient called client. See client setup for both, calling operations for how sync / sync_detailed work, and the uploads guide for the protocol-level walkthrough.

The two surfaces

Every operation here is reachable two ways, and both hit the same POST endpoint:

  • Fluent (recommended). client.<resource>.<action>(body) where the resource path mirrors the dotted operation id. uploads.initiate is client.uploads.initiate({...}); the three-segment items.asset.download is client.items.asset.download({...}). The body is a plain dict (converted to the typed input model for you) or the typed model itself. The return is the typed output model, a decoded JSON value, or a typed error model. Pass detailed=True to get the full typed Response with the status code and headers.
  • Generated (escape hatch). from geopera.api.operations import <module> then <module>.sync(client=client, body=Model(...)). Each module also offers sync_detailed, asyncio, and asyncio_detailed. The fluent client wraps this layer and exposes the configured AuthenticatedClient at client.client.

The operation ids, endpoints, and modules are:

| Operation id | Fluent call | Generated module | Endpoint |
| --- | --- | --- | --- |
| uploads.initiate | client.uploads.initiate(...) | uploads_initiate | POST /v1/op/uploads.initiate |
| uploads.signed_url | client.uploads.signed_url(...) | uploads_signed_url | POST /v1/op/uploads.signed_url |
| uploads.progress | client.uploads.progress(...) | uploads_progress | POST /v1/op/uploads.progress |
| uploads.complete | client.uploads.complete(...) | uploads_complete | POST /v1/op/uploads.complete |
| uploads.fail | client.uploads.fail(...) | uploads_fail | POST /v1/op/uploads.fail |
| items.asset.download | client.items.asset.download(...) | items_asset_download | POST /v1/op/items.asset.download |
| clip.job.download | client.clip.job.download(...) | clip_job_download | POST /v1/op/clip.job.download |
| clip.job.downloads | client.clip.job.downloads(...) | clip_job_downloads | POST /v1/op/clip.job.downloads |

All generated modules live under geopera.api.operations; all models under geopera.models. Uploads require the uploads:write scope; downloads require the matching read scope (see scopes).
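
The naming pattern in the table can be sketched as a small pure function. This is illustrative only: describe_operation is a hypothetical helper, not part of the SDK, and it simply restates the rule the table follows (dots become attribute access in the fluent client, underscores in the module name, and the id is appended to /v1/op/).

```python
def describe_operation(operation_id: str) -> dict[str, str]:
    """Derive the fluent call, generated module and endpoint for an operation id.

    Hypothetical helper: it only restates the naming pattern from the table.
    """
    return {
        "fluent": f"client.{operation_id}(...)",
        "module": operation_id.replace(".", "_"),
        "endpoint": f"POST /v1/op/{operation_id}",
    }


info = describe_operation("items.asset.download")
print(info["module"])    # items_asset_download
print(info["endpoint"])  # POST /v1/op/items.asset.download
```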

The upload session flow

An upload is a session you drive across several operations. The SDK never streams your file — between signing and completing, you PUT the bytes straight to object storage yourself.

| Step | Fluent call | Body model | Returns |
| --- | --- | --- | --- |
| Initiate | client.uploads.initiate(...) | UploadInitiate | UploadOutput (id) |
| Sign one file | client.uploads.signed_url(...) | SignedUrlInput | SignedUrlOutput (upload_url, object_path) |
| Transfer bytes | none (your httpx) | | |
| Report progress | client.uploads.progress(...) | UploadProgressInput | UploadOutput |
| Complete | client.uploads.complete(...) | UploadsCompleteBodyType0 (or none) | decoded JSON (Any) |
| Abandon | client.uploads.fail(...) | UploadFailInput | UploadOutput |

uploads.initiate reserves storage quota up front; uploads.fail releases it. uploads.complete is the only step whose body is optional, and the only one whose result is a decoded JSON value rather than a typed model.

The input fields

The dict you pass to each fluent call maps one-to-one onto the typed input model. Required fields have no default; everything else is optional.

python
# UploadInitiate  (client.uploads.initiate)
project_id: str                        # required — the project receiving the data
transfer_method: str = "browser"       # use "signed_url" for SDK / server uploads
file_count: int = 1                    # number of files in this session
total_bytes: int = 0                   # drives the quota reservation; sum across all files
target_collection_id: str | None = None
target_item_id: str | None = None      # set to add an asset to an existing item
asset_key: str | None = None           # the asset key when targeting an item
is_categorical: bool = False           # mark categorical (class) rasters

# SignedUrlInput  (client.uploads.signed_url)
upload_id: str                         # the id returned by uploads.initiate
file_name: str                         # required — the file's name in the session
content_type: str = "application/octet-stream"

# UploadProgressInput  (client.uploads.progress)
upload_id: str                         # required
status: str | None = None              # free-form, e.g. "uploading"
bytes_uploaded: int | None = None      # cumulative bytes transferred

# UploadFailInput  (client.uploads.fail)
upload_id: str                         # required
error_message: str                     # required
error_step: str | None = None          # e.g. "transfer", "sign", "complete"

uploads.initiate returns UploadOutput, whose only field is id — that is the upload_id every later step needs. uploads.signed_url returns SignedUrlOutput with upload_url (the URL you PUT to) and object_path (where the blob lands in storage). With dict bodies these are typed objects, so you read .id, .upload_url, and .object_path as attributes.
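
Since file_count and total_bytes must describe every file in the session, it can help to derive the initiate body from the local paths rather than typing the numbers by hand. The following is a minimal sketch under that assumption; build_initiate_body is a hypothetical helper name, and the dict keys are the UploadInitiate fields listed above.

```python
import os
from typing import Any, Optional


def build_initiate_body(
    project_id: str,
    paths: list[str],
    target_item_id: Optional[str] = None,
    asset_key: Optional[str] = None,
) -> dict[str, Any]:
    """Build the dict for client.uploads.initiate from local file paths.

    Hypothetical helper; the keys are the UploadInitiate fields listed above.
    """
    body: dict[str, Any] = {
        "project_id": project_id,
        "transfer_method": "signed_url",  # SDK / server uploads
        "file_count": len(paths),
        "total_bytes": sum(os.path.getsize(p) for p in paths),
    }
    # Only send the optional targeting fields when they are set.
    if target_item_id is not None:
        body["target_item_id"] = target_item_id
    if asset_key is not None:
        body["asset_key"] = asset_key
    return body
```

The returned dict can be passed straight to client.uploads.initiate(...), which keeps the quota reservation in step with the files you actually intend to send.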

Complete upload example (fluent)

This uploads one Cloud-Optimized GeoTIFF into a project. Note the explicit httpx.put to the signed URL in the middle — that step is yours, not the SDK’s.

python
import os

import httpx

from geopera import Geopera

client = Geopera(token="gpra_...")

PROJECT_ID = "your-project-id"
FILE_PATH = "./harbour_2024.tif"
FILE_NAME = "harbour_2024.tif"
CONTENT_TYPE = "image/tiff"

total_bytes = os.path.getsize(FILE_PATH)

# 1. Initiate — reserves storage quota up front.
session = client.uploads.initiate({
    "project_id": PROJECT_ID,
    "transfer_method": "signed_url",
    "file_count": 1,
    "total_bytes": total_bytes,
})
upload_id = session.id  # UploadOutput.id

try:
    # 2. Sign one file.
    signed = client.uploads.signed_url({
        "upload_id": upload_id,
        "file_name": FILE_NAME,
        "content_type": CONTENT_TYPE,
    })

    # 3. Transfer the bytes yourself — straight to object storage.
    #    The SDK does NOT do this for you.
    with open(FILE_PATH, "rb") as fh:
        put = httpx.put(
            signed.upload_url,
            content=fh.read(),
            headers={"Content-Type": CONTENT_TYPE},
            timeout=None,
        )
    put.raise_for_status()

    # 4. Complete — creates STAC item(s) + asset(s) and runs the pipeline.
    result = client.uploads.complete({"upload_id": upload_id})
    print(result)  # -> {"id": "u_...", "item_ids": ["it_..."]}

except Exception as exc:
    # 5. Release the reservation on any client-side failure.
    client.uploads.fail({
        "upload_id": upload_id,
        "error_message": str(exc),
        "error_step": "transfer",
    })
    raise

For very large files, stream from disk rather than reading the whole blob into memory — pass a file handle or a generator as content= to httpx.put, and set timeout=None (or a generous timeout) because you are moving bytes over the open internet, not talking to the Geopera API.
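
A minimal sketch of the generator option follows. iter_file_chunks is a hypothetical helper, and the 1 MiB default chunk size is an arbitrary choice, not a Geopera requirement.

```python
from collections.abc import Iterator


def iter_file_chunks(path: str, chunk_size: int = 1024 * 1024) -> Iterator[bytes]:
    """Yield a file's bytes in fixed-size chunks so it never sits fully in memory."""
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            yield chunk


# Usage with the upload example (needs a live signed URL, so shown as a comment):
# httpx.put(
#     signed.upload_url,
#     content=iter_file_chunks(FILE_PATH),
#     headers={"Content-Type": CONTENT_TYPE},
#     timeout=None,
# )
```

One trade-off to be aware of: httpx cannot know the length of a generator body up front, so it sends it with chunked transfer encoding. If your storage backend expects a Content-Length header (an assumption worth checking against the signed URL you receive), passing the open binary file handle as content= is the safer choice, because httpx can then determine the length from the file.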

The body of uploads.complete

uploads.complete is the one upload step whose body is optional. Its typed model, UploadsCompleteBodyType0, carries no declared fields, so the upload_id rides as an extra key. With the fluent client you simply pass the dict:

python
result = client.uploads.complete({"upload_id": upload_id})

The result is a decoded JSON object (a dict), not a typed model — there is no attribute access. Read it by key:

python
item_ids = result["item_ids"]   # e.g. ["it_3a9c..."]

A completion that produces no new item (already complete, or metadata-only) is a legitimate no-op. If you call with no body at all (client.uploads.complete()), the fluent client sends an empty object — only do this when the session already knows its upload_id; in practice always pass it.
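
Because a no-op completion is legitimate, code that reads item_ids should tolerate a missing or empty list. A small sketch, assuming the {"id": ..., "item_ids": [...]} shape printed in the example above (item_ids_from is a hypothetical helper):

```python
from typing import Any


def item_ids_from(result: Any) -> list[str]:
    """Return the item ids from an uploads.complete result, or [] for a no-op.

    Assumes the {"id": ..., "item_ids": [...]} shape shown in the example above.
    """
    if not isinstance(result, dict):
        # Not the decoded JSON object we expect (for example a typed error model).
        raise TypeError(f"unexpected uploads.complete result: {type(result).__name__}")
    return list(result.get("item_ids") or [])
```

Raising on a non-dict result is a deliberate choice in this sketch: it stops an error model from being mistaken for an empty completion.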

Multi-file uploads

Set file_count and total_bytes on uploads.initiate to cover every file in the session, then call uploads.signed_url once per file (each with its own file_name), PUT each blob, and call uploads.complete once after all transfers succeed:

python
files = [
    ("scene_b04.tif", "image/tiff"),
    ("scene_b08.tif", "image/tiff"),
]
total = sum(os.path.getsize(f"./{name}") for name, _ in files)  # same paths the loop below opens

session = client.uploads.initiate({
    "project_id": PROJECT_ID,
    "transfer_method": "signed_url",
    "file_count": len(files),
    "total_bytes": total,
})
upload_id = session.id

try:
    for file_name, content_type in files:
        signed = client.uploads.signed_url({
            "upload_id": upload_id,
            "file_name": file_name,
            "content_type": content_type,
        })
        with open(f"./{file_name}", "rb") as fh:
            httpx.put(
                signed.upload_url,
                content=fh.read(),
                headers={"Content-Type": content_type},
                timeout=None,
            ).raise_for_status()

    result = client.uploads.complete({"upload_id": upload_id})
except Exception as exc:
    client.uploads.fail({
        "upload_id": upload_id,
        "error_message": str(exc),
        "error_step": "transfer",
    })
    raise

To add an asset to an existing item rather than create new ones, set target_item_id (and usually asset_key) on uploads.initiate.

Reporting progress

For resumable clients or a progress bar, call uploads.progress during the transfer. It returns an UploadOutput and does not affect the session lifecycle — it is purely informational:

python
client.uploads.progress({
    "upload_id": upload_id,
    "status": "uploading",
    "bytes_uploaded": 131072,
})

You can call it as often as you like. httpx has no built-in upload-progress callback, so a typical pattern is to wrap the byte iterator you pass as content= and emit progress from it once per chunk or once per second.
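
One way to throttle progress reports is to wrap the iterator of chunks that feeds the PUT. The sketch below is an assumption about how you might structure this, not an SDK feature: with_progress and its report parameter are hypothetical names, and report would typically be a small function that calls client.uploads.progress with the cumulative byte count.

```python
import time
from collections.abc import Callable, Iterable, Iterator


def with_progress(
    chunks: Iterable[bytes],
    report: Callable[[int], None],
    min_interval: float = 1.0,
) -> Iterator[bytes]:
    """Yield chunks unchanged, calling report(cumulative_bytes) at most once per interval."""
    sent = 0
    reported = 0
    last_report = time.monotonic()
    for chunk in chunks:
        yield chunk
        # Execution resumes here when the consumer asks for the next chunk,
        # i.e. after the previous one has been handed to the HTTP client.
        sent += len(chunk)
        now = time.monotonic()
        if now - last_report >= min_interval:
            report(sent)
            reported, last_report = sent, now
    if sent != reported:
        report(sent)  # make sure the final cumulative total is reported


# Usage (needs a live session, so shown as a comment):
# report = lambda n: client.uploads.progress(
#     {"upload_id": upload_id, "status": "uploading", "bytes_uploaded": n}
# )
# httpx.put(signed.upload_url, content=with_progress(chunk_iter, report), timeout=None)
```

Here chunk_iter stands for any iterable of bytes you already stream from disk. The counts reflect bytes handed to the HTTP client, which is close to, but not exactly, bytes acknowledged by storage.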

The generated escape hatch (uploads)

The same flow with the generated operation modules and typed models — use this for async transfers, for explicit Response objects via sync_detailed, or to share one AuthenticatedClient across a process. The fluent client’s underlying client is available at client.client.

python
import os

import httpx

from geopera import AuthenticatedClient
from geopera.api.operations import (
    uploads_initiate,
    uploads_signed_url,
    uploads_complete,
    uploads_fail,
)
from geopera.models import (
    UploadInitiate,
    SignedUrlInput,
    UploadsCompleteBodyType0,
    UploadFailInput,
)

client = AuthenticatedClient(base_url="https://api.geopera.com", token="gpra_...")

FILE_PATH = "./harbour_2024.tif"
total_bytes = os.path.getsize(FILE_PATH)

session = uploads_initiate.sync(
    client=client,
    body=UploadInitiate(
        project_id="your-project-id",
        transfer_method="signed_url",
        file_count=1,
        total_bytes=total_bytes,
    ),
)
upload_id = session.id

try:
    signed = uploads_signed_url.sync(
        client=client,
        body=SignedUrlInput(
            upload_id=upload_id,
            file_name="harbour_2024.tif",
            content_type="image/tiff",
        ),
    )
    with open(FILE_PATH, "rb") as fh:
        httpx.put(
            signed.upload_url,
            content=fh.read(),
            headers={"Content-Type": "image/tiff"},
            timeout=None,
        ).raise_for_status()

    result = uploads_complete.sync(
        client=client,
        body=UploadsCompleteBodyType0.from_dict({"upload_id": upload_id}),
    )
    print(result)  # decoded JSON dict
except Exception as exc:
    uploads_fail.sync(
        client=client,
        body=UploadFailInput(
            upload_id=upload_id,
            error_message=str(exc),
            error_step="transfer",
        ),
    )
    raise

Because UploadsCompleteBodyType0 declares no fields, build it from a dict with UploadsCompleteBodyType0.from_dict({"upload_id": upload_id}) so the id is carried through. For async, swap .sync for .asyncio (and await it); for the full Response, use .sync_detailed / .asyncio_detailed and read .parsed, .status_code, and .headers.
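
For concurrent async transfers, a bounded fan-out is usually kinder to both the API and your uplink than launching every file at once. The sketch below shows only the concurrency pattern: upload_all is a hypothetical helper, and upload_one is a coroutine function you supply that signs and PUTs a single file (for example by awaiting uploads_signed_url.asyncio and then sending the bytes with an async HTTP client).

```python
import asyncio
from collections.abc import Awaitable, Callable
from typing import Any


async def upload_all(
    upload_one: Callable[[str, str], Awaitable[Any]],
    files: list[tuple[str, str]],
    max_concurrency: int = 4,
) -> list[Any]:
    """Run upload_one(file_name, content_type) for every file, a few at a time.

    Results come back in the same order as `files`. If any upload raises,
    the exception propagates so the caller can report uploads.fail.
    """
    limit = asyncio.Semaphore(max_concurrency)

    async def guarded(file_name: str, content_type: str) -> Any:
        async with limit:  # at most max_concurrency uploads in flight
            return await upload_one(file_name, content_type)

    return await asyncio.gather(*(guarded(name, ctype) for name, ctype in files))
```

Signing inside upload_one, rather than signing every file first, keeps each signed URL fresh at the moment its transfer starts. Call uploads_complete once after upload_all returns.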

Downloads return descriptors, not bytes

The download operations do not stream a file to disk. Each returns a decoded JSON descriptor — typically a short-lived signed URL you then fetch yourself with httpx. With the fluent client the return is the decoded JSON value (so a dict), not a typed model.

| Operation | Fluent call | Body model | Returns |
| --- | --- | --- | --- |
| items.asset.download | client.items.asset.download(...) | AssetDownloadInput | JSON descriptor |
| clip.job.download | client.clip.job.download(...) | ClipDownloadInput | JSON descriptor |
| clip.job.downloads | client.clip.job.downloads(...) | ClipJobInput | JSON descriptor (list of downloads) |

The input fields:

python
# AssetDownloadInput  (client.items.asset.download)
item_id: str                           # required — the STAC item
asset_id: str                          # required — the asset key on that item

# ClipDownloadInput  (client.clip.job.download)
job_id: str                            # required — the clip job
mosaic_type: str                       # required — which mosaic output, e.g. "visual"
user_agent: str | None = None          # optional egress-tracking metadata
requester_ip: str | None = None        # optional egress-tracking metadata

# ClipJobInput  (client.clip.job.downloads)
job_id: str                            # required — the clip job to enumerate

items.asset.download resolves a single asset on a STAC item; clip.job.download resolves one mosaic output of a clip job; clip.job.downloads lists every available download for a clip job. These endpoints are egress-tracked — the descriptor’s URL is metered when fetched.

Download example (fluent)

Fetch an asset descriptor, then do the byte transfer yourself:

python
import httpx

from geopera import Geopera

client = Geopera(token="gpra_...")

descriptor = client.items.asset.download({
    "item_id": "it_3a9c...",
    "asset_id": "visual",
})

# `descriptor` is a decoded JSON dict, not raw bytes.
# It carries a short-lived signed URL — fetch the file yourself.
href = descriptor["url"]  # "url" is illustrative; confirm the key with print(descriptor)

with httpx.stream("GET", href) as resp:
    resp.raise_for_status()
    with open("visual.tif", "wb") as fh:
        for chunk in resp.iter_bytes():
            fh.write(chunk)

The exact descriptor shape is operation- and asset-dependent. Because the return is a decoded JSON value, treat it as a dict and read the signed-URL field it returns. Inspect it once with print(descriptor) to confirm the key for your asset type rather than assuming.
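
If you would rather fail loudly than guess, a defensive lookup can make the "confirm the key" step explicit. This is a sketch only: find_signed_url is a hypothetical helper, and the candidate key names ("url", "href", "download_url") are guesses for illustration, not a documented contract.

```python
from typing import Any


def find_signed_url(
    descriptor: Any,
    keys: tuple[str, ...] = ("url", "href", "download_url"),
) -> str:
    """Return the first http(s) URL found under one of the candidate keys.

    The candidate key names are guesses, not a documented contract.
    """
    if not isinstance(descriptor, dict):
        raise TypeError(f"expected a decoded JSON object, got {type(descriptor).__name__}")
    for key in keys:
        value = descriptor.get(key)
        if isinstance(value, str) and value.startswith(("http://", "https://")):
            return value
    # Listing the keys that are present makes the mismatch easy to diagnose.
    raise KeyError(f"no signed URL under {keys}; descriptor keys: {sorted(descriptor)}")
```

When the lookup fails, the error message lists the keys the descriptor actually has, which is usually enough to adjust the keys argument for your asset type.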

Downloading a clip job

A clip job can produce several mosaics. List them with clip.job.downloads, then resolve a specific one with clip.job.download:

python
available = client.clip.job.downloads({"job_id": "clip_7f2a..."})
print(available)   # the set of mosaics/files you can fetch

one = client.clip.job.download({
    "job_id": "clip_7f2a...",
    "mosaic_type": "visual",
})
# `one` is a JSON descriptor — fetch its signed URL with httpx as above.
href = one["url"]  # "url" is illustrative; confirm the key with print(one)
with httpx.stream("GET", href) as resp:
    resp.raise_for_status()
    with open("clip_visual.tif", "wb") as fh:
        for chunk in resp.iter_bytes():
            fh.write(chunk)

The generated escape hatch (downloads)

The download operations with the generated modules and typed input models. The return is still a decoded JSON value, because the SDK passes the descriptor through untyped:

python
import httpx

from geopera import AuthenticatedClient
from geopera.api.operations import items_asset_download, clip_job_download, clip_job_downloads
from geopera.models import AssetDownloadInput, ClipDownloadInput, ClipJobInput

client = AuthenticatedClient(base_url="https://api.geopera.com", token="gpra_...")

descriptor = items_asset_download.sync(
    client=client,
    body=AssetDownloadInput(item_id="it_3a9c...", asset_id="visual"),
)

available = clip_job_downloads.sync(
    client=client,
    body=ClipJobInput(job_id="clip_7f2a..."),
)

one = clip_job_download.sync(
    client=client,
    body=ClipDownloadInput(job_id="clip_7f2a...", mosaic_type="visual"),
)

href = descriptor["url"]  # "url" is illustrative; confirm the key with print(descriptor)
with httpx.stream("GET", href) as resp:
    resp.raise_for_status()
    with open("visual.tif", "wb") as fh:
        for chunk in resp.iter_bytes():
            fh.write(chunk)

Errors and edge cases

Every operation on this page can return the typed Problem model (for 401, 403, 404, 500) or HTTPValidationError (for 422), in place of its success value. The fluent client returns these models directly; check the type before using the result. When you need the HTTP status code itself — for example to branch on a 402 quota error — pass detailed=True to the fluent call (or use sync_detailed on the generated module) and read response.status_code and response.parsed:

python
resp = client.uploads.initiate(
    {"project_id": PROJECT_ID, "transfer_method": "signed_url", "total_bytes": total_bytes},
    detailed=True,
)
if resp.status_code == 402:
    raise RuntimeError("Storage quota exceeded — top up before uploading.")
session = resp.parsed   # UploadOutput on success

See errors for the Problem and HTTPValidationError models and calling operations for parsed vs detailed.
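
Because the fluent client hands back error models in place of the success value, reading session.id on an unchecked result can fail in a confusing way. One option is a tiny guard that turns an error model into an exception. The sketch below is hypothetical: unwrap and OperationError are illustrative names, and the error types are passed in by the caller (for example Problem and HTTPValidationError, assuming they are importable from geopera.models like the other models).

```python
from typing import Any


class OperationError(RuntimeError):
    """Raised when an operation returned an error model instead of a success value."""

    def __init__(self, error: Any) -> None:
        super().__init__(f"operation failed: {error!r}")
        self.error = error  # keep the original model for inspection


def unwrap(result: Any, error_types: tuple[type, ...]) -> Any:
    """Return result unchanged, or raise OperationError if it is an error model."""
    if isinstance(result, error_types):
        raise OperationError(result)
    return result


# Usage (needs a configured client, so shown as a comment):
# session = unwrap(client.uploads.initiate({...}), (Problem, HTTPValidationError))
# upload_id = session.id
```

This guard does not expose the HTTP status code; when you need to branch on the status, detailed=True remains the right tool.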

Gotchas

  • The SDK never moves your bytes. Both directions hand you a signed URL; the PUT (upload) and GET (download) are your own httpx calls. Budget for that in timeouts (timeout=None or a long timeout) and retries.
  • Signed URLs are short-lived. Sign immediately before transferring. If a PUT or GET fails with a 403 from object storage, re-sign with uploads.signed_url (or re-resolve the download descriptor) rather than retrying the stale URL.
  • uploads.complete and the download ops return decoded JSON, not typed models. You get a dict — there is no attribute access and no IDE autocomplete on the result. Read by key, and print(...) once to confirm the field names for your case.
  • Always pair initiate with fail on the error path. uploads.initiate reserves quota; if you abandon mid-flow without calling uploads.fail, the reservation lingers. The try/except shape in the examples above is the recommended pattern.
  • Quota is enforced at initiate. If the org lacks storage, uploads.initiate returns a 402 Problem — use detailed=True (or sync_detailed) to branch on the status when you need to surface that to a user.
  • total_bytes drives the reservation. Sum it across every file in a multi-file session; an undersized value can cause complete to be rejected after you have already transferred bytes.
  • Downloads are egress-tracked. Fetching the descriptor’s signed URL is metered. ClipDownloadInput accepts optional user_agent and requester_ip if you want to attribute that egress.
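
The re-sign advice above can be sketched as a small retry loop. This is an illustration under stated assumptions, not SDK behaviour: transfer_with_resign is a hypothetical helper, sign is a callable you supply that returns a fresh signed URL (for example by calling client.uploads.signed_url and reading upload_url), and put is a callable you supply that performs the transfer and returns the HTTP status code.

```python
from collections.abc import Callable


def transfer_with_resign(
    sign: Callable[[], str],
    put: Callable[[str], int],
    max_attempts: int = 3,
) -> int:
    """PUT to a freshly signed URL, re-signing whenever storage answers 403.

    sign() returns a new signed URL; put(url) performs the transfer and
    returns the HTTP status code. Both are supplied by the caller.
    Returns the last status code seen.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    status = 0
    for _ in range(max_attempts):
        url = sign()       # sign immediately before each attempt
        status = put(url)
        if status != 403:  # anything other than a rejected URL ends the loop
            return status
    return status
```

The loop only reacts to 403; other failures (timeouts, 5xx from storage) are left to whatever retry policy you already apply around your httpx calls. The same shape works for downloads if sign re-resolves the download descriptor instead.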

Related

  • Uploads guide — the protocol-level walkthrough with raw HTTP.
  • Calling operations — fluent calls, detailed=True, and sync vs sync_detailed.
  • Async — asyncio / asyncio_detailed for concurrent transfers.
  • Authentication and scopes — what uploads:write grants.
  • Errors — the Problem and HTTPValidationError models these operations return.