<!-- Source: https://docs.geopera.com/api-reference/sdks/python/uploads-downloads · Markdown for LLMs -->

# Uploads & downloads

Moving bytes in and out of Geopera is deliberately not a one-call helper: there is no `client.upload_file(...)` or `client.download(...)` convenience. The SDK reserves quota, signs URLs, and finalizes sessions, while you perform the actual byte transfer with your own HTTP client (`httpx` in the examples here). Uploads are a multi-step **session** (initiate, sign, `PUT` the bytes yourself, complete), and downloads return JSON **descriptors** (typically a short-lived signed URL), never raw file bytes. This page covers the operations, fields, and return shapes, with complete worked examples for both directions. It leads with the fluent `Geopera` client and keeps the generated operation modules as the documented escape hatch.

This page assumes a configured client. With the fluent client that is `client = Geopera(token="gpra_...")`; with the generated layer it is an `AuthenticatedClient` called `client`. See [client setup](/api-reference/sdks/python) for both, [calling operations](/api-reference/sdks/python/operations) for how `sync` / `sync_detailed` work, and the [uploads guide](/api-reference/guides/uploads) for the protocol-level walkthrough.

## The two surfaces

Every operation here is reachable two ways, and both hit the same `POST` endpoint:

- **Fluent (recommended).** `client.<resource>.<action>(body)` where the resource path mirrors the dotted operation id. `uploads.initiate` is `client.uploads.initiate({...})`; the three-segment `items.asset.download` is `client.items.asset.download({...})`. The body is a plain dict (converted to the typed input model for you) or the typed model itself. The return is the typed output model, a decoded JSON value, or a typed error model. Pass `detailed=True` to get the full typed `Response` with the status code and headers.
- **Generated (escape hatch).** `from geopera.api.operations import <module>` then `<module>.sync(client=client, body=Model(...))`. Each module also offers `sync_detailed`, `asyncio`, and `asyncio_detailed`. The fluent client wraps this layer and exposes the configured `AuthenticatedClient` at `client.client`.

The operation ids, endpoints, and modules are:

| Operation id           | Fluent call                        | Generated module       | Endpoint                           |
| ---------------------- | ---------------------------------- | ---------------------- | ---------------------------------- |
| `uploads.initiate`     | `client.uploads.initiate(...)`     | `uploads_initiate`     | `POST /v1/op/uploads.initiate`     |
| `uploads.signed_url`   | `client.uploads.signed_url(...)`   | `uploads_signed_url`   | `POST /v1/op/uploads.signed_url`   |
| `uploads.progress`     | `client.uploads.progress(...)`     | `uploads_progress`     | `POST /v1/op/uploads.progress`     |
| `uploads.complete`     | `client.uploads.complete(...)`     | `uploads_complete`     | `POST /v1/op/uploads.complete`     |
| `uploads.fail`         | `client.uploads.fail(...)`         | `uploads_fail`         | `POST /v1/op/uploads.fail`         |
| `items.asset.download` | `client.items.asset.download(...)` | `items_asset_download` | `POST /v1/op/items.asset.download` |
| `clip.job.download`    | `client.clip.job.download(...)`    | `clip_job_download`    | `POST /v1/op/clip.job.download`    |
| `clip.job.downloads`   | `client.clip.job.downloads(...)`   | `clip_job_downloads`   | `POST /v1/op/clip.job.downloads`   |

All generated modules live under `geopera.api.operations`; all models under `geopera.models`. Uploads require the `uploads:write` scope; downloads require the matching read scope (see [scopes](/api-reference/scopes)).
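
The rows above follow a regular naming pattern, which a short sketch can make explicit. `describe_operation` is a hypothetical helper, not part of the SDK, and the pattern is inferred only from the eight operations listed in the table; treat the table as the source of truth for anything else.

```python
def describe_operation(operation_id: str) -> dict[str, str]:
    """Derive the fluent call, generated module, and endpoint for an operation id.

    Assumption: the naming pattern seen in the table holds for this id.
    """
    if not operation_id or operation_id.startswith(".") or operation_id.endswith("."):
        raise ValueError(f"not a dotted operation id: {operation_id!r}")
    return {
        # The fluent resource path mirrors the dotted id.
        "fluent_call": f"client.{operation_id}(...)",
        # The generated module name swaps dots for underscores.
        "generated_module": operation_id.replace(".", "_"),
        # Every operation is a POST under /v1/op/.
        "endpoint": f"POST /v1/op/{operation_id}",
    }
```

For example, `describe_operation("items.asset.download")` reproduces the `items.asset.download` row of the table.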

## The upload session flow

An upload is a session you drive across several operations. The SDK never streams your file — between signing and completing, you `PUT` the bytes straight to object storage yourself.

| Step            | Fluent call                      | Body model                           | Returns                                         |
| --------------- | -------------------------------- | ------------------------------------ | ----------------------------------------------- |
| Initiate        | `client.uploads.initiate(...)`   | `UploadInitiate`                     | `UploadOutput` (`id`)                           |
| Sign one file   | `client.uploads.signed_url(...)` | `SignedUrlInput`                     | `SignedUrlOutput` (`upload_url`, `object_path`) |
| Transfer bytes  | — (your `httpx`)                 | —                                    | —                                               |
| Report progress | `client.uploads.progress(...)`   | `UploadProgressInput`                | `UploadOutput`                                  |
| Complete        | `client.uploads.complete(...)`   | `UploadsCompleteBodyType0` (or none) | decoded JSON (`Any`)                            |
| Abandon         | `client.uploads.fail(...)`       | `UploadFailInput`                    | `UploadOutput`                                  |

`uploads.initiate` reserves storage quota up front; `uploads.fail` releases it. `uploads.complete` is the only step whose body is optional, and the only one whose result is a decoded JSON value rather than a typed model.

### The input fields

The dict you pass to each fluent call maps one-to-one onto the typed input model. Required fields have no default; everything else is optional.

```python
# UploadInitiate  (client.uploads.initiate)
project_id: str                        # required — the project receiving the data
transfer_method: str = "browser"       # use "signed_url" for SDK / server uploads
file_count: int = 1                    # number of files in this session
total_bytes: int = 0                   # drives the quota reservation; sum across all files
target_collection_id: str | None = None
target_item_id: str | None = None      # set to add an asset to an existing item
asset_key: str | None = None           # the asset key when targeting an item
is_categorical: bool = False           # mark categorical (class) rasters

# SignedUrlInput  (client.uploads.signed_url)
upload_id: str                         # the id returned by uploads.initiate
file_name: str                         # required — the file's name in the session
content_type: str = "application/octet-stream"

# UploadProgressInput  (client.uploads.progress)
upload_id: str                         # required
status: str | None = None              # free-form, e.g. "uploading"
bytes_uploaded: int | None = None      # cumulative bytes transferred

# UploadFailInput  (client.uploads.fail)
upload_id: str                         # required
error_message: str                     # required
error_step: str | None = None          # e.g. "transfer", "sign", "complete"
```

`uploads.initiate` returns `UploadOutput`, whose only field is `id`; that value is the `upload_id` every later step needs. `uploads.signed_url` returns `SignedUrlOutput` with `upload_url` (the URL you `PUT` to) and `object_path` (where the blob lands in storage). Both returns are typed objects even when you pass a dict body, so you read `.id`, `.upload_url`, and `.object_path` as attributes.

### Complete upload example (fluent)

This uploads one Cloud-Optimized GeoTIFF into a project. Note the explicit `httpx.put` to the signed URL in the middle — that step is yours, not the SDK's.

```python
import os

import httpx

from geopera import Geopera

client = Geopera(token="gpra_...")

PROJECT_ID = "your-project-id"
FILE_PATH = "./harbour_2024.tif"
FILE_NAME = "harbour_2024.tif"
CONTENT_TYPE = "image/tiff"

total_bytes = os.path.getsize(FILE_PATH)

# 1. Initiate — reserves storage quota up front.
session = client.uploads.initiate({
    "project_id": PROJECT_ID,
    "transfer_method": "signed_url",
    "file_count": 1,
    "total_bytes": total_bytes,
})
upload_id = session.id  # UploadOutput.id

step = "sign"  # tracks the stage in progress, reported as error_step on failure
try:
    # 2. Sign one file.
    signed = client.uploads.signed_url({
        "upload_id": upload_id,
        "file_name": FILE_NAME,
        "content_type": CONTENT_TYPE,
    })

    # 3. Transfer the bytes yourself, straight to object storage.
    #    The SDK does NOT do this for you.
    step = "transfer"
    with open(FILE_PATH, "rb") as fh:
        put = httpx.put(
            signed.upload_url,
            content=fh.read(),
            headers={"Content-Type": CONTENT_TYPE},
            timeout=None,
        )
    put.raise_for_status()

    # 4. Complete: creates STAC item(s) + asset(s) and runs the pipeline.
    step = "complete"
    result = client.uploads.complete({"upload_id": upload_id})
    print(result)  # -> {"id": "u_...", "item_ids": ["it_..."]}

except Exception as exc:
    # 5. Release the reservation on any client-side failure,
    #    recording the step that actually failed.
    client.uploads.fail({
        "upload_id": upload_id,
        "error_message": str(exc),
        "error_step": step,
    })
    raise
```

For very large files, stream from disk rather than reading the whole blob into memory — pass a file handle or a generator as `content=` to `httpx.put`, and set `timeout=None` (or a generous timeout) because you are moving bytes over the open internet, not talking to the Geopera API.
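
One way to stream is a small chunk generator. The sketch below is illustrative and not part of the SDK: `iter_file_chunks` and its 1 MiB default `chunk_size` are hypothetical choices. It reads the file in fixed-size pieces so memory use stays bounded regardless of file size.

```python
from collections.abc import Iterator


def iter_file_chunks(path: str, chunk_size: int = 1024 * 1024) -> Iterator[bytes]:
    """Yield the file at `path` in pieces of at most `chunk_size` bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    with open(path, "rb") as fh:
        # read() returns b"" at end of file, which ends the loop.
        while chunk := fh.read(chunk_size):
            yield chunk
```

Pass it as `content=iter_file_chunks(FILE_PATH)` to `httpx.put`. A generator body is sent with chunked transfer encoding unless you set `Content-Length` yourself, and some object stores reject chunked `PUT`s to signed URLs, so it is usually safer to add `"Content-Length": str(total_bytes)` to the headers.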

### The body of `uploads.complete`

`uploads.complete` is the one upload step whose body is optional. Its typed model, `UploadsCompleteBodyType0`, carries no declared fields, so the `upload_id` rides as an extra key. With the fluent client you simply pass the dict:

```python
result = client.uploads.complete({"upload_id": upload_id})
```

The result is a decoded JSON object (a `dict`), not a typed model — there is no attribute access. Read it by key:

```python
item_ids = result["item_ids"]   # e.g. ["it_3a9c..."]
```

A completion that produces no new item (already complete, or metadata-only) is a legitimate no-op. If you call with no body at all (`client.uploads.complete()`), the fluent client sends an empty object — only do this when the session already knows its `upload_id`; in practice always pass it.
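
Because a no-op completion is legitimate, code that reads `item_ids` should tolerate its absence. The helper below is a minimal sketch: `extract_item_ids` is a hypothetical name, and it assumes a no-op result either omits `item_ids` or sets it to an empty or null value, which this page does not specify.

```python
from typing import Any


def extract_item_ids(result: Any) -> list[str]:
    """Return the item ids from an uploads.complete result, or [] for a no-op."""
    if not isinstance(result, dict):
        # The fluent client can return an error model instead of a dict.
        raise TypeError(f"expected a decoded JSON object, got {type(result).__name__}")
    # Assumption: a no-op omits "item_ids" or leaves it empty/null.
    item_ids = result.get("item_ids") or []
    return [str(item_id) for item_id in item_ids]
```

Raising on a non-dict result keeps an error model from being mistaken for an empty completion.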

### Multi-file uploads

Set `file_count` and `total_bytes` on `uploads.initiate` to cover every file in the session, then call `uploads.signed_url` once per file (each with its own `file_name`), `PUT` each blob, and call `uploads.complete` once after all transfers succeed:

```python
files = [
    ("scene_b04.tif", "image/tiff"),
    ("scene_b08.tif", "image/tiff"),
]
total = sum(os.path.getsize(f"./{file_name}") for file_name, _ in files)

session = client.uploads.initiate({
    "project_id": PROJECT_ID,
    "transfer_method": "signed_url",
    "file_count": len(files),
    "total_bytes": total,
})
upload_id = session.id

step = "sign"  # tracks the stage in progress, reported as error_step on failure
try:
    for file_name, content_type in files:
        step = "sign"
        signed = client.uploads.signed_url({
            "upload_id": upload_id,
            "file_name": file_name,
            "content_type": content_type,
        })
        step = "transfer"
        with open(f"./{file_name}", "rb") as fh:
            httpx.put(
                signed.upload_url,
                content=fh.read(),
                headers={"Content-Type": content_type},
                timeout=None,
            ).raise_for_status()

    step = "complete"
    result = client.uploads.complete({"upload_id": upload_id})
except Exception as exc:
    client.uploads.fail({
        "upload_id": upload_id,
        "error_message": str(exc),
        "error_step": step,
    })
    raise
```

To add an asset to an **existing** item rather than create new ones, set `target_item_id` (and usually `asset_key`) on `uploads.initiate`.

### Reporting progress

For resumable clients or a progress bar, call `uploads.progress` during the transfer. It returns an `UploadOutput` and does not affect the session lifecycle — it is purely informational:

```python
client.uploads.progress({
    "upload_id": upload_id,
    "status": "uploading",
    "bytes_uploaded": 131072,
})
```

You can call it as often as you like; a typical pattern is to report from the chunk generator or file wrapper you pass as `content=` to `httpx.put`, once per chunk or once per second.
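
Reporting on every chunk can mean many API calls for a large file, so throttling by time is a common compromise. The class below is a minimal sketch: `ProgressReporter`, its `min_interval` default, and the injectable `clock` are all hypothetical, and `report` stands in for `client.uploads.progress`.

```python
import time
from collections.abc import Callable
from typing import Any


class ProgressReporter:
    """Forward cumulative byte counts to `report`, at most once per interval."""

    def __init__(
        self,
        report: Callable[[dict[str, Any]], Any],
        upload_id: str,
        min_interval: float = 1.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._report = report
        self._upload_id = upload_id
        self._min_interval = min_interval
        self._clock = clock
        self._last_sent: float | None = None
        self.bytes_uploaded = 0

    def add(self, byte_count: int) -> bool:
        """Record transferred bytes; return True if a report was sent."""
        self.bytes_uploaded += byte_count
        now = self._clock()
        if self._last_sent is not None and now - self._last_sent < self._min_interval:
            return False  # too soon since the last report; keep counting
        self._report({
            "upload_id": self._upload_id,
            "status": "uploading",
            "bytes_uploaded": self.bytes_uploaded,  # cumulative, not per chunk
        })
        self._last_sent = now
        return True
```

Create it with `ProgressReporter(client.uploads.progress, upload_id)` and call `reporter.add(len(chunk))` wherever your transfer loop hands a chunk to `httpx`.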

### The generated escape hatch (uploads)

The same flow with the generated operation modules and typed models — use this for async transfers, for explicit `Response` objects via `sync_detailed`, or to share one `AuthenticatedClient` across a process. The fluent client's underlying client is available at `client.client`.

```python
import os

import httpx

from geopera import AuthenticatedClient
from geopera.api.operations import (
    uploads_initiate,
    uploads_signed_url,
    uploads_complete,
    uploads_fail,
)
from geopera.models import (
    UploadInitiate,
    SignedUrlInput,
    UploadsCompleteBodyType0,
    UploadFailInput,
)

client = AuthenticatedClient(base_url="https://api.geopera.com", token="gpra_...")

FILE_PATH = "./harbour_2024.tif"
total_bytes = os.path.getsize(FILE_PATH)

session = uploads_initiate.sync(
    client=client,
    body=UploadInitiate(
        project_id="your-project-id",
        transfer_method="signed_url",
        file_count=1,
        total_bytes=total_bytes,
    ),
)
upload_id = session.id

step = "sign"  # tracks the stage in progress, reported as error_step on failure
try:
    signed = uploads_signed_url.sync(
        client=client,
        body=SignedUrlInput(
            upload_id=upload_id,
            file_name="harbour_2024.tif",
            content_type="image/tiff",
        ),
    )
    step = "transfer"
    with open(FILE_PATH, "rb") as fh:
        httpx.put(
            signed.upload_url,
            content=fh.read(),
            headers={"Content-Type": "image/tiff"},
            timeout=None,
        ).raise_for_status()

    step = "complete"
    result = uploads_complete.sync(
        client=client,
        body=UploadsCompleteBodyType0.from_dict({"upload_id": upload_id}),
    )
    print(result)  # decoded JSON dict
except Exception as exc:
    uploads_fail.sync(
        client=client,
        body=UploadFailInput(
            upload_id=upload_id,
            error_message=str(exc),
            error_step=step,
        ),
    )
    raise
```

Because `UploadsCompleteBodyType0` declares no fields, build it from a dict with `UploadsCompleteBodyType0.from_dict({"upload_id": upload_id})` so the id is carried through. For async, swap `.sync` for `.asyncio` (and `await` it); for the full `Response`, use `.sync_detailed` / `.asyncio_detailed` and read `.parsed`, `.status_code`, and `.headers`.
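
When the per-file work is async, a semaphore keeps the number of simultaneous transfers bounded. The sketch below is generic rather than SDK-specific: `transfer_all` and `max_concurrency` are hypothetical names, and `transfer_one` is a coroutine you supply, for instance one that awaits `uploads_signed_url.asyncio(...)` and then performs the `PUT` with an async HTTP client.

```python
import asyncio
from collections.abc import Awaitable, Callable


async def transfer_all(
    file_names: list[str],
    transfer_one: Callable[[str], Awaitable[None]],
    max_concurrency: int = 4,
) -> None:
    """Run `transfer_one` for every file, at most `max_concurrency` at a time."""
    if max_concurrency < 1:
        raise ValueError("max_concurrency must be at least 1")
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(file_name: str) -> None:
        async with semaphore:  # wait for a free slot before starting
            await transfer_one(file_name)

    # gather raises the first exception to the caller, so a surrounding
    # try/except can still report the failure with uploads_fail.
    await asyncio.gather(*(guarded(name) for name in file_names))
```

Call `uploads.complete` only after `transfer_all` returns, since completion should follow every successful transfer.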

## Downloads return descriptors, not bytes

The download operations do **not** stream a file to disk. Each returns a decoded JSON **descriptor** — typically a short-lived signed URL you then fetch yourself with `httpx`. With the fluent client the return is the decoded JSON value (so a `dict`), not a typed model.

| Operation              | Fluent call                        | Body model           | Returns                             |
| ---------------------- | ---------------------------------- | -------------------- | ----------------------------------- |
| `items.asset.download` | `client.items.asset.download(...)` | `AssetDownloadInput` | JSON descriptor                     |
| `clip.job.download`    | `client.clip.job.download(...)`    | `ClipDownloadInput`  | JSON descriptor                     |
| `clip.job.downloads`   | `client.clip.job.downloads(...)`   | `ClipJobInput`       | JSON descriptor (list of downloads) |

The input fields:

```python
# AssetDownloadInput  (client.items.asset.download)
item_id: str                           # required — the STAC item
asset_id: str                          # required — the asset key on that item

# ClipDownloadInput  (client.clip.job.download)
job_id: str                            # required — the clip job
mosaic_type: str                       # required — which mosaic output, e.g. "visual"
user_agent: str | None = None          # optional egress-tracking metadata
requester_ip: str | None = None        # optional egress-tracking metadata

# ClipJobInput  (client.clip.job.downloads)
job_id: str                            # required — the clip job to enumerate
```

`items.asset.download` resolves a single asset on a STAC item; `clip.job.download` resolves one mosaic output of a clip job; `clip.job.downloads` lists every available download for a clip job. These endpoints are egress-tracked — the descriptor's URL is metered when fetched.

### Download example (fluent)

Fetch an asset descriptor, then do the byte transfer yourself:

```python
import httpx

from geopera import Geopera

client = Geopera(token="gpra_...")

descriptor = client.items.asset.download({
    "item_id": "it_3a9c...",
    "asset_id": "visual",
})

# `descriptor` is a decoded JSON dict, not raw bytes.
# It carries a short-lived signed URL — fetch the file yourself.
href = descriptor["url"]

with httpx.stream("GET", href) as resp:
    resp.raise_for_status()
    with open("visual.tif", "wb") as fh:
        for chunk in resp.iter_bytes():
            fh.write(chunk)
```

The exact descriptor shape is operation- and asset-dependent. Because the return is a decoded JSON value, treat it as a dict and read the signed-URL field it returns. Inspect it once with `print(descriptor)` to confirm the key for your asset type rather than assuming.
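
If you would rather fail loudly than hit a bare `KeyError` on an unexpected shape, a small lookup can report which keys were actually present. This is a sketch only: `find_signed_url` is a hypothetical helper, and every entry in `CANDIDATE_URL_KEYS` other than `"url"` is a guess rather than a documented field name.

```python
from typing import Any

# Assumption: only "url" appears in this page's examples; the rest are guesses.
CANDIDATE_URL_KEYS = ("url", "href", "download_url", "signed_url")


def find_signed_url(descriptor: Any) -> str:
    """Return the first candidate key whose value looks like an HTTP(S) URL."""
    if not isinstance(descriptor, dict):
        raise TypeError(f"expected a decoded JSON object, got {type(descriptor).__name__}")
    for key in CANDIDATE_URL_KEYS:
        value = descriptor.get(key)
        if isinstance(value, str) and value.startswith(("http://", "https://")):
            return value
    # List the keys that were present so the mismatch is easy to diagnose.
    raise KeyError(f"no signed-URL field found; keys present: {sorted(descriptor)}")
```

Once you have confirmed the real key for your asset type, reading it directly is simpler and clearer than keeping the lookup.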

### Downloading a clip job

A clip job can produce several mosaics. List them with `clip.job.downloads`, then resolve a specific one with `clip.job.download`:

```python
available = client.clip.job.downloads({"job_id": "clip_7f2a..."})
print(available)   # the set of mosaics/files you can fetch

one = client.clip.job.download({
    "job_id": "clip_7f2a...",
    "mosaic_type": "visual",
})
# `one` is a JSON descriptor — fetch its signed URL with httpx as above.
href = one["url"]
with httpx.stream("GET", href) as resp:
    resp.raise_for_status()
    with open("clip_visual.tif", "wb") as fh:
        for chunk in resp.iter_bytes():
            fh.write(chunk)
```

### The generated escape hatch (downloads)

The download operations with the generated modules and typed input models. The return is still a decoded JSON value, because the SDK passes the descriptor through untyped:

```python
import httpx

from geopera import AuthenticatedClient
from geopera.api.operations import items_asset_download, clip_job_download, clip_job_downloads
from geopera.models import AssetDownloadInput, ClipDownloadInput, ClipJobInput

client = AuthenticatedClient(base_url="https://api.geopera.com", token="gpra_...")

descriptor = items_asset_download.sync(
    client=client,
    body=AssetDownloadInput(item_id="it_3a9c...", asset_id="visual"),
)

available = clip_job_downloads.sync(
    client=client,
    body=ClipJobInput(job_id="clip_7f2a..."),
)

one = clip_job_download.sync(
    client=client,
    body=ClipDownloadInput(job_id="clip_7f2a...", mosaic_type="visual"),
)

href = descriptor["url"]
with httpx.stream("GET", href) as resp:
    resp.raise_for_status()
    with open("visual.tif", "wb") as fh:
        for chunk in resp.iter_bytes():
            fh.write(chunk)
```

## Errors and edge cases

Every operation on this page can return the typed `Problem` model (for `401`, `403`, `404`, `500`) or `HTTPValidationError` (for `422`), in place of its success value. The fluent client returns these models directly; check the type before using the result. When you need the HTTP status code itself — for example to branch on a `402` quota error — pass `detailed=True` to the fluent call (or use `sync_detailed` on the generated module) and read `response.status_code` and `response.parsed`:

```python
resp = client.uploads.initiate(
    {"project_id": PROJECT_ID, "transfer_method": "signed_url", "total_bytes": total_bytes},
    detailed=True,
)
if resp.status_code == 402:
    raise RuntimeError("Storage quota exceeded — top up before uploading.")
session = resp.parsed   # UploadOutput on success; an error model otherwise
```

See [errors](/api-reference/errors) for the `Problem` and `HTTPValidationError` models and [calling operations](/api-reference/sdks/python/operations) for `parsed` vs `detailed`.
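
Since the fluent client returns error models in place of the success value, a type check before use avoids confusing attribute errors later. The helper below is a minimal sketch: `unwrap` and `OperationError` are hypothetical names, and the error types are passed in by the caller so the helper does not depend on the SDK.

```python
from typing import Any


class OperationError(RuntimeError):
    """Raised when an operation returned an error model instead of a result."""

    def __init__(self, error: Any) -> None:
        super().__init__(f"operation failed: {error!r}")
        self.error = error  # the original error model, kept for inspection


def unwrap(result: Any, error_types: tuple[type, ...]) -> Any:
    """Return `result` unchanged, or raise if it is one of `error_types`."""
    if isinstance(result, error_types):
        raise OperationError(result)
    return result
```

With `Problem` and `HTTPValidationError` imported from `geopera.models`, a call would look like `unwrap(client.uploads.initiate({...}), (Problem, HTTPValidationError))`.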

## Gotchas

- **The SDK never moves your bytes.** Both directions hand you a signed URL; the `PUT` (upload) and `GET` (download) are your own `httpx` calls. Budget for that in timeouts (`timeout=None` or a long timeout) and retries.
- **Signed URLs are short-lived.** Sign immediately before transferring. If a `PUT` or `GET` fails with a `403` from object storage, re-sign with `uploads.signed_url` (or re-resolve the download descriptor) rather than retrying the stale URL.
- **`uploads.complete` and the download ops return decoded JSON, not typed models.** You get a `dict` — there is no attribute access and no IDE autocomplete on the result. Read by key, and `print(...)` once to confirm the field names for your case.
- **Always pair `initiate` with `fail` on the error path.** `uploads.initiate` reserves quota; if you abandon mid-flow without calling `uploads.fail`, the reservation lingers. The `try/except` shape in the examples above is the recommended pattern.
- **Quota is enforced at initiate.** If the org lacks storage, `uploads.initiate` returns a `402` `Problem` — use `detailed=True` (or `sync_detailed`) to branch on the status when you need to surface that to a user.
- **`total_bytes` drives the reservation.** Sum it across every file in a multi-file session; an undersized value can cause `complete` to be rejected after you have already transferred bytes.
- **Downloads are egress-tracked.** Fetching the descriptor's signed URL is metered. `ClipDownloadInput` accepts optional `user_agent` and `requester_ip` if you want to attribute that egress.
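
The re-sign advice in the short-lived URL gotcha can be sketched as a small retry loop. `put_with_resign` is a hypothetical helper, not part of the SDK: `sign` and `put` are callables you provide, with `sign` returning a fresh signed URL and `put` returning the HTTP status code of the transfer.

```python
from collections.abc import Callable


def put_with_resign(
    sign: Callable[[], str],
    put: Callable[[str], int],
    max_attempts: int = 3,
) -> int:
    """Sign and PUT, re-signing on each 403, until another status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for _ in range(max_attempts):
        url = sign()       # sign immediately before each transfer attempt
        status = put(url)
        if status != 403:  # any other status is returned for the caller to handle
            return status
    raise RuntimeError(f"still receiving 403 after {max_attempts} signed URLs")
```

In practice `sign` would wrap `client.uploads.signed_url({...}).upload_url` and `put` would wrap `httpx.put(...).status_code`; reopen the file inside `put` so each attempt sends the whole body.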

## Related

- [Uploads guide](/api-reference/guides/uploads) — the protocol-level walkthrough with raw HTTP.
- [Calling operations](/api-reference/sdks/python/operations) — fluent calls, `detailed=True`, and `sync` vs `sync_detailed`.
- [Async](/api-reference/sdks/python/async) — `asyncio` / `asyncio_detailed` for concurrent transfers.
- [Authentication](/api-reference/authentication) and [scopes](/api-reference/scopes) — what `uploads:write` grants.
- [Errors](/api-reference/errors) — the `Problem` and `HTTPValidationError` models these operations return.
