Uploading files from S3 into Multi FileFeeds

Copy files directly from your S3 bucket into a Multi FileFeed import using server-side S3 copy — no data flows through your application. Supports files up to 20 GiB.

Use the upload-from-s3 endpoint to import files from your own Amazon S3 bucket into a Multi FileFeed import. OneSchema copies the file server-side (S3 to S3) using your configured IAM role, so no file data passes through the API client.

This is the recommended upload method for large files (up to 20 GiB) and for automated pipelines where files already reside in S3. Note that presigned/direct uploads are limited to 5 GiB, making upload-from-s3 the only option for files larger than 5 GiB.


How it works

sequenceDiagram
    participant Client as API Client
    participant API as OneSchema API
    participant SrcS3 as Source S3 Bucket<br/>(your account)
    participant DstS3 as Destination S3 Bucket<br/>(OneSchema account)

    Client->>API: POST /upload-from-s3<br/>{s3_account_id, object_uri}
    API->>API: Assume your IAM role via STS<br/>(two-hop role chain)
    Note right of API: All S3 calls below use<br/>your role's credentials
    API->>SrcS3: HeadObject (verify file exists, get size)
    SrcS3-->>API: 200 OK (content_length, content_type)
    alt File ≤ 5 GiB
        API->>DstS3: CopyObject (single-part)
        DstS3-->>API: 200 OK
    else File > 5 GiB
        API->>DstS3: CreateMultipartUpload
        DstS3-->>API: upload_id
        API->>DstS3: UploadPartCopy ×N (1 GiB parts, parallel)
        DstS3-->>API: ETags
        API->>DstS3: CompleteMultipartUpload
        DstS3-->>API: 200 OK
    end
    API-->>Client: 200 OK

  1. Your API client sends a single POST request with the S3 URI and the ID of the S3 account to use.
  2. The OneSchema API assumes the IAM role configured on that S3 account (via STS) and verifies the file exists.
  3. The API copies the file directly from your bucket to OneSchema's internal storage using the S3 CopyObject API. For files larger than 5 GiB, the copy is performed as a multipart upload with 1 GiB parts copied in parallel.
  4. The API returns 200 OK once the copy is complete.

No file data is downloaded to or uploaded from the API server — the copy happens entirely within AWS.
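For illustration, the sketch below shows the same server-side pattern in boto3. OneSchema performs the equivalent calls internally, so you never need to run this yourself; all names (role ARN, buckets, keys) are placeholders, and the multipart parts are shown sequentially rather than in parallel.

# Illustrative sketch of the server-side copy pattern (boto3).
# OneSchema runs the equivalent calls internally; all names are placeholders.
import boto3

FIVE_GIB = 5 * 1024**3
ONE_GIB = 1024**3

# Two-hop role chain: assume the customer's role with the shared external ID.
sts = boto3.client("sts")
creds = sts.assume_role(
    RoleArn="arn:aws:iam::SOURCE_ACCOUNT_ID:role/OneSchemaS3AccessRole",
    RoleSessionName="oneschema-copy",
    ExternalId="YOUR_EXTERNAL_ID",
)["Credentials"]

# All S3 calls below use the assumed role's credentials.
s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)

src = {"Bucket": "SOURCE_BUCKET", "Key": "path/to/orders.csv"}
size = s3.head_object(**src)["ContentLength"]  # verify the file exists, get size

if size <= FIVE_GIB:
    # Single-part server-side copy
    s3.copy_object(CopySource=src, Bucket="ONESCHEMA_IMPORTS_BUCKET", Key="dest-key")
else:
    # Multipart copy in 1 GiB parts (sequential here; parallel in practice)
    mpu = s3.create_multipart_upload(Bucket="ONESCHEMA_IMPORTS_BUCKET", Key="dest-key")
    parts = []
    for i, start in enumerate(range(0, size, ONE_GIB), start=1):
        end = min(start + ONE_GIB, size) - 1
        part = s3.upload_part_copy(
            CopySource=src,
            Bucket="ONESCHEMA_IMPORTS_BUCKET",
            Key="dest-key",
            UploadId=mpu["UploadId"],
            PartNumber=i,
            CopySourceRange=f"bytes={start}-{end}",
        )
        parts.append({"ETag": part["CopyPartResult"]["ETag"], "PartNumber": i})
    s3.complete_multipart_upload(
        Bucket="ONESCHEMA_IMPORTS_BUCKET",
        Key="dest-key",
        UploadId=mpu["UploadId"],
        MultipartUpload={"Parts": parts},
    )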


Prerequisites

This guide uses the following placeholders — replace them with your actual values:

Placeholder                   Description
SOURCE_ACCOUNT_ID             Your AWS account ID (12-digit number)
SOURCE_BUCKET                 The S3 bucket containing files to import
YOUR_EXTERNAL_ID              Shared secret for the IAM trust policy
ONESCHEMA_ACCOUNT_ID          OneSchema's AWS account ID (provided during onboarding)
ONESCHEMA_IMPORTS_BUCKET      OneSchema's internal imports bucket (provided during onboarding)

1. Create an S3 Account

Register an IAM role that OneSchema will assume to access your bucket. You need:

  • Role ARN — the ARN of an IAM role in your AWS account (e.g. arn:aws:iam::SOURCE_ACCOUNT_ID:role/OneSchemaS3AccessRole)
  • External ID — a shared secret used in the role's trust policy for secure cross-account assumption
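If you need to generate an external ID, any long high-entropy random string works. A minimal sketch using Python's standard library (run once, then store the value somewhere safe):

# Generate a high-entropy external ID for the trust policy (run once).
import secrets

external_id = secrets.token_urlsafe(32)
print(external_id)  # store this securely; it goes in the trust policy below

Then register the role and external ID with the Create S3 Account endpoint: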
curl -X POST "https://api.oneschema.co/v0/s3-accounts" \
  -H "X-API-KEY: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Production S3 Access",
    "role_arn": "arn:aws:iam::SOURCE_ACCOUNT_ID:role/OneSchemaS3AccessRole",
    "external_id": "YOUR_EXTERNAL_ID"
  }'

Save the returned id — this is the s3_account_id used in the upload call.

API reference: Create S3 Account

2. Configure your IAM role

The S3 CopyObject API requires the caller to have both read access on the source bucket and write access on the destination bucket using the same credentials. Because OneSchema uses your IAM role's credentials to perform the copy, your role needs permissions on both sides.

Trust policy

Allow OneSchema's intermediate role to assume your role:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::ONESCHEMA_ACCOUNT_ID:role/OneSchemaS3ConnectionReader"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": "YOUR_EXTERNAL_ID"
        }
      }
    }
  ]
}
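If you manage IAM with scripts, a minimal boto3 sketch for creating the role with this trust policy (the role name and file path are illustrative):

# Create the IAM role with the trust policy above (names are illustrative).
import boto3

iam = boto3.client("iam")

with open("trust-policy.json") as f:
    iam.create_role(
        RoleName="OneSchemaS3AccessRole",
        AssumeRolePolicyDocument=f.read(),
        Description="Assumed by OneSchema to copy files from S3",
    )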

Permissions policy — source bucket (read)

Grant read access to the bucket(s) you want to copy files from:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ReadSourceBucket",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": ["arn:aws:s3:::SOURCE_BUCKET", "arn:aws:s3:::SOURCE_BUCKET/*"]
    }
  ]
}

If your source objects are encrypted with a customer-managed KMS key (SSE-KMS), also grant kms:Decrypt on that key.
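For example, a statement of roughly this shape added to the same policy (the key ARN is a placeholder):

{
  "Sid": "DecryptSourceObjects",
  "Effect": "Allow",
  "Action": "kms:Decrypt",
  "Resource": "arn:aws:kms:REGION:SOURCE_ACCOUNT_ID:key/KEY_ID"
}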

Permissions policy — destination bucket (write)

Grant write access to OneSchema's internal imports bucket so that CopyObject can write the file into OneSchema's storage:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "WriteDestinationBucket",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts"
      ],
      "Resource": "arn:aws:s3:::ONESCHEMA_IMPORTS_BUCKET/*"
    }
  ]
}

s3:AbortMultipartUpload and s3:ListMultipartUploadParts are required for files larger than 5 GiB, which use multipart copy.
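If you script the setup, a minimal boto3 sketch that attaches both permissions policies to the role as inline policies (policy names and file paths are illustrative):

# Attach the source-read and destination-write policies (names illustrative).
import boto3

iam = boto3.client("iam")

for policy_name, path in [
    ("OneSchemaSourceRead", "source-read-policy.json"),
    ("OneSchemaDestinationWrite", "destination-write-policy.json"),
]:
    with open(path) as f:
        iam.put_role_policy(
            RoleName="OneSchemaS3AccessRole",
            PolicyName=policy_name,
            PolicyDocument=f.read(),
        )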

3. OneSchema-side configuration

OneSchema also configures a bucket policy on the destination imports bucket to allow your role to write to it. This is handled by the OneSchema team during onboarding — no action is required from you.

Contact your OneSchema account team to obtain the ONESCHEMA_ACCOUNT_ID and ONESCHEMA_IMPORTS_BUCKET values for your deployment region.


Step 1 — Create a Multi FileFeed Import

Create a new import on the target Multi FileFeed.

POST /v0/multi-file-feeds/{multi_file_feed_id}/imports
curl -X POST "https://api.oneschema.co/v0/multi-file-feeds/42/imports" \
  -H "X-API-KEY: YOUR_API_KEY"

Response

{
  "id": 101,
  "multi_file_feed_id": 42,
  "status": "initialized",
  "created_at": "2026-03-02T12:00:00Z"
}

Save the returned id — this is your multi_file_feed_import_id.

API reference: Create a Multi FileFeed import


Step 2 — Upload files from S3

For each file in your source bucket, call the upload-from-s3 endpoint. Provide the s3_account_id from the prerequisite step and the object_uri pointing to the file.

POST /v0/multi-file-feeds/{multi_file_feed_id}/imports/{multi_file_feed_import_id}/upload-from-s3
curl -X POST "https://api.oneschema.co/v0/multi-file-feeds/42/imports/101/upload-from-s3" \
  -H "X-API-KEY: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "s3_account_id": 1,
    "object_uri": "s3://SOURCE_BUCKET/path/to/orders.csv"
  }'

A 200 OK response indicates the file was copied successfully. Repeat for each file you want to add to the import.
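If you want to fail fast before calling the endpoint, a quick pre-flight check with boto3 can confirm that the object exists and is within the size limit (bucket and key are placeholders):

# Optional pre-flight check: confirm the object exists and is under 20 GiB.
import boto3

MAX_BYTES = 20 * 1024**3  # the endpoint's 20 GiB limit

s3 = boto3.client("s3")
head = s3.head_object(Bucket="SOURCE_BUCKET", Key="path/to/orders.csv")
if head["ContentLength"] > MAX_BYTES:
    raise ValueError("file exceeds the 20 GiB upload-from-s3 limit")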

API reference: Upload a file from S3 to a Multi FileFeed import

Error responses

Status    Meaning
400       Invalid S3 URI, file not accessible, file exceeds 20 GiB limit, or import already submitted
400       Unable to copy S3 object — your role lacks write permissions on the destination bucket (see Permissions policy — destination bucket)
409       A file with the same S3 URI has already been uploaded to this import
500       Infrastructure error (e.g. STS role assumption failure)
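When re-running an interrupted pipeline, the 409 response is usually safe to treat as success, since it means the file is already part of the import. A minimal sketch of that handling (variable names mirror the full example below):

import requests

def upload_from_s3(base_url, headers, mff_id, import_id, s3_account_id, uri):
    """Upload one file; treat a duplicate (409) as already uploaded."""
    resp = requests.post(
        f"{base_url}/v0/multi-file-feeds/{mff_id}/imports/{import_id}/upload-from-s3",
        headers=headers,
        json={"s3_account_id": s3_account_id, "object_uri": uri},
    )
    if resp.status_code == 409:
        print(f"{uri} already uploaded, skipping")
        return
    resp.raise_for_status()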

Step 3 — Submit the Import

Once all files have been uploaded, submit the import for processing.

POST /v0/multi-file-feeds/{multi_file_feed_id}/imports/{multi_file_feed_import_id}/submit
curl -X POST "https://api.oneschema.co/v0/multi-file-feeds/42/imports/101/submit" \
  -H "X-API-KEY: YOUR_API_KEY"

Response

{
  "id": 101,
  "multi_file_feed_id": 42,
  "status": "submitted",
  "created_at": "2026-03-02T12:00:00Z"
}

API reference: Submit a Multi FileFeed import


Full Example (Python)

import requests

API_KEY = "YOUR_API_KEY"
BASE_URL = "https://api.oneschema.co"
MFF_ID = 42
S3_ACCOUNT_ID = 1
HEADERS = {"X-API-KEY": API_KEY}

s3_files = [
    "s3://SOURCE_BUCKET/data/orders.csv",
    "s3://SOURCE_BUCKET/data/customers.csv",
]

# Step 1: Create the MFF import
resp = requests.post(f"{BASE_URL}/v0/multi-file-feeds/{MFF_ID}/imports", headers=HEADERS)
resp.raise_for_status()
import_id = resp.json()["id"]
print(f"Created import {import_id}")

# Step 2: Upload each file from S3
for uri in s3_files:
    resp = requests.post(
        f"{BASE_URL}/v0/multi-file-feeds/{MFF_ID}/imports/{import_id}/upload-from-s3",
        headers=HEADERS,
        json={"s3_account_id": S3_ACCOUNT_ID, "object_uri": uri},
    )
    resp.raise_for_status()
    print(f"Uploaded {uri}")

# Step 3: Submit the import
resp = requests.post(
    f"{BASE_URL}/v0/multi-file-feeds/{MFF_ID}/imports/{import_id}/submit",
    headers=HEADERS,
)
resp.raise_for_status()
print(f"Import {import_id} submitted for processing")

Limits

Constraint             Value
Maximum file size      20 GiB
Files per import       No hard limit (submit when ready)
Concurrent uploads     One upload-from-s3 call per import at a time (sequential)
Region                 Source bucket should be in the same AWS region as your OneSchema deployment for best performance

Quick Reference
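
The endpoints used in this guide, in order:

  • Register an IAM role (once): POST /v0/s3-accounts
  • Create an import: POST /v0/multi-file-feeds/{multi_file_feed_id}/imports
  • Upload each file: POST /v0/multi-file-feeds/{multi_file_feed_id}/imports/{multi_file_feed_import_id}/upload-from-s3
  • Submit the import: POST /v0/multi-file-feeds/{multi_file_feed_id}/imports/{multi_file_feed_import_id}/submit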