File handling in custom transforms

When you write or edit a Custom file transform or Custom file validation in a Multi FileFeed, your TypeScript code receives files as NamedFile objects. This page covers every read and write method available on those objects, with guidance on choosing the right one — especially for large files.

For the full catalog of transform types, see The transform library.

Choosing a read method

Method	Best for	Loads entire file into memory?
`readLines(options?)`	Text and CSV files of any size	No — streams line by line
`readTextChunks(options?)`	Text files where you need raw decoded chunks (not split by newline)	No — streams text chunks
`readChunks(options?)`	Binary files of any size (inspecting headers, streaming to output)	No — streams raw byte chunks
`readBytes()`	Small files that must be loaded whole (e.g., ZIP archives, binary formats)	Yes — limited by available memory
`sizeBytes()`	Checking file size before deciding how to read	N/A — returns the byte count only

Default to readLines() for text and CSV work. Only reach for readBytes() when the operation genuinely requires the entire file in memory at once (unzipping, binary format parsing).
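
The size check can be sketched as a small helper. This is a minimal illustration, not part of the library: `SizedFile`, `MAX_IN_MEMORY_BYTES`, and `chooseReadStrategy` are hypothetical names, and the 64 MiB threshold is an assumption to tune for your own workload.

```typescript
// Structural subset of NamedFile, declared locally so the sketch stands alone.
type SizedFile = { sizeBytes(): Promise<number> }

// Hypothetical threshold (64 MiB); not a documented limit.
const MAX_IN_MEMORY_BYTES = 64 * 1024 * 1024

type ReadStrategy = "whole" | "stream"

// Decide whether a file is small enough to load with readBytes().
async function chooseReadStrategy(
  file: SizedFile,
  limit: number = MAX_IN_MEMORY_BYTES,
): Promise<ReadStrategy> {
  const size = await file.sizeBytes()
  return size <= limit ? "whole" : "stream"
}
```

In a transform you would pass a NamedFile and call readBytes() only when the result is "whole", falling back to readLines() or readChunks() otherwise.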

Reading lines — `readLines`

readLines(options?: LineReadOptions): AsyncIterable<string>

Streams a file line by line. Lines are split on \n, and the line terminator (\n or \r\n) is stripped from each yielded string by default. This is the most common entry point for text and CSV processing.

Options

Option	Type	Default	Description
`encoding`	`"utf-8"`	`"utf-8"`	Text encoding used to decode the file
`fatal`	`boolean`	`false`	If `true`, throws on invalid byte sequences instead of replacing them
`maxLineBytes`	`number`	8 MiB	Maximum bytes per line before throwing an error
`preserveLineEndings`	`boolean`	`false`	If `true`, keeps `\n` or `\r\n` at the end of each yielded string

Example: filter rows from a pipe-delimited file

import NamedFile, { createNewFile } from "oneschema/namedFile"

export default async function func(files: NamedFile[]): Promise<NamedFile[]> {
  const outputs: NamedFile[] = []

  for (const file of files) {
    const output = createNewFile(file.name)

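    // Stream lines through an async generator so the input is never held in memory whole.
    await output.writeLines(
      (async function* () {
        for await (const line of file.readLines()) {
          // Keep only lines that contain the pipe delimiter.
          if (line.includes("|")) {
            yield line
          }
        }
      })(),
    )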

    outputs.push(output)
  }

  return outputs
}
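
The preserveLineEndings option is useful when the output should keep the input's line-ending convention. A minimal sketch, assuming the lines come from readLines({ preserveLineEndings: true }); `detectLineEnding` and `LineEnding` are hypothetical names, not part of the library.

```typescript
type LineEnding = "\r\n" | "\n" | "none"

// Expects lines that still carry their terminators.
// Returns as soon as the first terminated line is seen, so the rest
// of the stream is not consumed.
async function detectLineEnding(lines: AsyncIterable<string>): Promise<LineEnding> {
  for await (const line of lines) {
    if (line.endsWith("\r\n")) return "\r\n"
    if (line.endsWith("\n")) return "\n"
  }
  // Empty file, or a single line with no terminator.
  return "none"
}
```

A transform could call `detectLineEnding(file.readLines({ preserveLineEndings: true }))` and, when the result is not "none", pass it as the lineEnding option of writeLines.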

Reading chunks — `readChunks` and `readTextChunks`

`readChunks`

readChunks(options?: ReadChunksOptions): AsyncIterable<Uint8Array>

Streams raw bytes in fixed-size chunks. Use this for binary inspection, custom framing, or when you need byte-level control.

Option	Type	Default	Description
`chunkSize`	`number`	64 KiB	Bytes per chunk
`offset`	`number`	`0`	Byte offset to start reading from
`length`	`number`	entire file	Total bytes to read
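
The offset and length options make it possible to inspect a file header without reading the rest of the file. The sketch below checks for the ZIP local-file-header signature (the bytes "PK\x03\x04"). `ChunkReadable` and `looksLikeZip` are hypothetical names, and the type is a local structural subset of NamedFile so the example stands alone.

```typescript
// Structural subset of NamedFile used by this sketch.
type ChunkReadable = {
  readChunks(options?: {
    chunkSize?: number
    offset?: number
    length?: number
  }): AsyncIterable<Uint8Array>
}

// Signature at the start of a non-empty ZIP archive: "PK\x03\x04".
const ZIP_MAGIC = [0x50, 0x4b, 0x03, 0x04]

async function looksLikeZip(file: ChunkReadable): Promise<boolean> {
  const head: number[] = []
  // Request only the first four bytes.
  for await (const chunk of file.readChunks({ offset: 0, length: ZIP_MAGIC.length })) {
    for (let i = 0; i < chunk.length; i++) head.push(chunk[i])
  }
  return head.length === ZIP_MAGIC.length && ZIP_MAGIC.every((b, i) => head[i] === b)
}
```

Only after a check like this passes would it make sense to fall back to readBytes() for the actual unzip step.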

`readTextChunks`

readTextChunks(options?: TextReadOptions): AsyncIterable<string>

Streams decoded text in chunks (not split by newline). Useful when you want to pipe text into a streaming parser like CsvParseStream.

Option	Type	Default	Description
`encoding`	`"utf-8"`	`"utf-8"`	Text encoding
`fatal`	`boolean`	`false`	Throw on invalid byte sequences
`chunkSize`	`number`	64 KiB	Underlying byte chunk size

Parsing CSV with `oneschema/csv`

Import the CSV utilities:

import { CsvParseStream, parse, stringify } from "oneschema/csv"

Large CSV files — use `CsvParseStream`

Pipe readTextChunks() through CsvParseStream to parse a CSV without loading the whole file into memory:

const rows = ReadableStream.from(file.readTextChunks()).pipeThrough(
  new CsvParseStream({ separator: "|" }),
)

for await (const row of rows) {
  // row is string[] (one element per column)
}
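
Streamed rows can be fed straight into writeLines through an async generator. The sketch below keeps a subset of columns from each row. `toDelimitedLine` and `selectColumns` are hypothetical helpers written for this example; the quoting rule is a simplified assumption (wrap a field in double quotes when it contains the separator, a quote, or a line break), not the library's own serializer.

```typescript
// Quote a field only when it contains the separator, a quote, or a line break.
function toDelimitedLine(fields: string[], separator = ","): string {
  return fields
    .map((f) =>
      f.includes(separator) || /["\r\n]/.test(f) ? `"${f.replace(/"/g, '""')}"` : f,
    )
    .join(separator)
}

// Keep only the selected column indexes from each streamed row.
// Missing cells become empty strings.
async function* selectColumns(
  rows: AsyncIterable<string[]>,
  indexes: number[],
  separator = ",",
): AsyncGenerator<string> {
  for await (const row of rows) {
    yield toDelimitedLine(indexes.map((i) => row[i] ?? ""), separator)
  }
}
```

With the `rows` stream from the example above, a call such as `await output.writeLines(selectColumns(rows, [0, 2], "|"))` would write the first and third columns without buffering the file.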

Small CSV files — use `parse` and `stringify`

const text = (await Array.fromAsync(file.readTextChunks())).join("")

// Array of arrays:
const rows = parse(text)

// Array of objects keyed by header names:
const records = parse(text, { skipFirstRow: true })

Custom delimiters

Pass a separator option to use a delimiter other than comma:

// Pipe-delimited input
new CsvParseStream({ separator: "|" })
parse(text, { separator: "|" })

// Tab-delimited output
stringify(rows, { separator: "\t", columns: ["name", "email"] })

Common values: "," (default), "|" (pipe), "\t" (tab), ";" (semicolon).

Writing CSV

stringify() requires a columns array when the input rows are objects:

const csv = stringify(records, { columns: Object.keys(records[0]) })

Gotcha: parse(text, { skipFirstRow: true }) returns Record<string, string>[] — each row is an object keyed by header name, not an array. Access fields by name (e.g., row["email"]), not by index.
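
Deriving columns from `records[0]` throws a TypeError when the input has no data rows, because `Object.keys(undefined)` is not allowed. A minimal sketch of a safer derivation follows; `collectColumns` and `CsvRecord` are hypothetical names, not part of `oneschema/csv`.

```typescript
type CsvRecord = Record<string, string>

// Union of keys across all records, in first-seen order.
// Returns an empty array for empty input instead of throwing.
function collectColumns(records: CsvRecord[]): string[] {
  const seen = new Set<string>()
  for (const record of records) {
    for (const key of Object.keys(record)) seen.add(key)
  }
  return Array.from(seen)
}
```

The result can be passed as `stringify(records, { columns: collectColumns(records) })`. What to emit when the column list is empty is a separate decision for your transform.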

Encodings

readLines() and readTextChunks() accept an encoding option. The only supported value is "utf-8", which is also the default.

Set fatal: true to reject files with invalid byte sequences rather than silently replacing them with the Unicode replacement character (U+FFFD):

for await (const line of file.readLines({ fatal: true })) {
  // throws if the file contains invalid UTF-8
}

For files that arrive in non-UTF-8 encodings, use the built-in Transcode file encoding transform upstream of your custom transform to normalize to UTF-8 first.
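
The difference between the two modes can be seen with the standard TextDecoder, which has a fatal option of the same name. That readLines() and readTextChunks() behave the same way internally is an assumption based on the matching option names; `decodeLenient` and `isValidUtf8` are hypothetical helpers.

```typescript
// Non-fatal decoding: invalid bytes become U+FFFD.
function decodeLenient(bytes: Uint8Array): string {
  return new TextDecoder("utf-8").decode(bytes)
}

// Fatal decoding: TextDecoder throws a TypeError on invalid bytes,
// which this helper turns into a boolean.
function isValidUtf8(bytes: Uint8Array): boolean {
  try {
    new TextDecoder("utf-8", { fatal: true }).decode(bytes)
    return true
  } catch {
    return false
  }
}

// "f", then 0xFF (never valid in UTF-8), then "o".
const sample = new Uint8Array([0x66, 0xff, 0x6f])
const lenientText = decodeLenient(sample) // "f\uFFFDo"
const sampleIsValid = isValidUtf8(sample) // false
```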

When to avoid `readBytes()`

readBytes() loads the entire file into memory as a Uint8Array. It works for any file that fits in available memory, but a large file can exhaust that memory and cause an out-of-memory failure. For files larger than a few hundred megabytes, prefer a streaming method: readLines(), readChunks(), or readTextChunks().

readBytes() remains the right choice for binary operations that need the whole file at once, on files you know are small (e.g., unzipping an archive). To inspect only a header, readChunks() with a length option reads just those bytes.

Writing output

Method	Use case
`writeLines(lines, options?)`	Write text line by line (pair with `readLines`)
`writeChunks(chunks, options?)`	Write binary chunks
`writeTextChunks(chunks, options?)`	Write decoded text chunks
`writeBytes(data, options?)`	Write a complete `Uint8Array`
`copyTo(output, options?)`	Copy one file to another without reading into memory
`updateLines(transform, options?)`	Read → transform → write back to the same file
`updateChunks(transform, options?)`	Read → transform → write back to the same file (binary)
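
The transform passed to updateLines is a function from an async iterable of lines to another iterable of lines, so an async generator fits naturally. A minimal sketch with a hypothetical `dropBlankLines` transform:

```typescript
// Yields every line that contains something other than whitespace.
async function* dropBlankLines(lines: AsyncIterable<string>): AsyncGenerator<string> {
  for await (const line of lines) {
    if (line.trim() !== "") yield line
  }
}
```

Inside a transform this would be used as `await file.updateLines(dropBlankLines)`, which rewrites the file in place without a separate output file.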

Creating new files: call createNewFile("output.csv") to produce additional output files. Your function returns all NamedFile objects that should be passed downstream.

Renaming files — `rename`

rename(newName: string): void

Changes a file's logical name without re-reading or copying its contents. This is the fastest way to change a file extension or normalize file names.

import NamedFile from "oneschema/namedFile"

export default async function func(inputFiles: NamedFile[]): Promise<NamedFile[]> {
  for (const file of inputFiles) {
    file.rename(file.name.replace(/\.txt$/i, ".csv"))
  }
  return inputFiles
}

rename operates in-place — it mutates the existing NamedFile object and returns nothing. You do not need to create a new file or copy bytes. The file's on-disk content is unchanged; only the name that downstream transforms and destinations see is updated.

Use rename when:

You need to change a file extension (e.g., .txt → .csv).
You want to normalize or sanitize file names before passing them downstream.
The file content does not need to change — only the name does.

Use copyTo instead when you need a new file with a different name while keeping the original intact.
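
For the name-normalization case, the new name can be computed by a pure function and then passed to rename. The rules below (lowercase, runs of non-alphanumeric characters collapsed to underscores, extension lowercased, "file" as a fallback stem) are illustrative assumptions, and `normalizeFileName` is a hypothetical helper.

```typescript
function normalizeFileName(name: string): string {
  // Treat the text after the last dot as the extension, unless the dot is leading.
  const dot = name.lastIndexOf(".")
  const hasExt = dot > 0
  const stem = hasExt ? name.slice(0, dot) : name
  const ext = hasExt ? name.slice(dot).toLowerCase() : ""

  const cleaned = stem
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "_") // collapse anything else into one underscore
    .replace(/^_+|_+$/g, "") // strip leading and trailing underscores

  // Fall back to a placeholder stem if nothing usable is left.
  return (cleaned || "file") + ext
}
```

In a transform: `file.rename(normalizeFileName(file.name))`.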

Write options

Option	Type	Default	Description
`create`	`boolean`	`true`	Create the file if it doesn't exist
`append`	`boolean`	`false`	Append instead of overwriting
`createNew`	`boolean`	`false`	Error if the file already exists

writeLines also accepts:

Option	Type	Default	Description
`lineEnding`	`"\n"` | `"\r\n"`	`"\n"`	Line ending to append
`preserveLineEndings`	`boolean`	`false`	If `true`, don't append a line ending (assumes lines already include one)
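
The append and lineEnding options combine when a file is written in more than one step. The sketch below writes a header line and then appends the body with CRLF endings. `LineWritable` is a local structural subset of NamedFile and `writeWithHeader` is a hypothetical helper; the behavior relies on append: true keeping existing content, as described in the table above.

```typescript
// Structural subset of NamedFile used by this sketch.
type LineWritable = {
  writeLines(
    lines: AsyncIterable<string> | Iterable<string>,
    options?: { append?: boolean; lineEnding?: "\n" | "\r\n" },
  ): Promise<void>
}

async function writeWithHeader(
  output: LineWritable,
  header: string,
  body: AsyncIterable<string> | Iterable<string>,
): Promise<void> {
  // First call uses the default append: false, so any earlier content is replaced.
  await output.writeLines([header], { lineEnding: "\r\n" })
  // Second call appends, so the header written above is kept.
  await output.writeLines(body, { append: true, lineEnding: "\r\n" })
}
```

The body can be any line source, including `file.readLines()`, so the input is still streamed.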

Complete type reference

import NamedFile, { createNewFile } from "oneschema/namedFile"

export type ReadChunksOptions = {
  chunkSize?: number
  offset?: number
  length?: number
}

export type TextReadOptions = {
  encoding?: "utf-8"
  fatal?: boolean
  chunkSize?: number
}

export type LineReadOptions = TextReadOptions & {
  maxLineBytes?: number
  preserveLineEndings?: boolean
}

export type WriteFileOptions = {
  create?: boolean
  append?: boolean
  createNew?: boolean
}

export type WriteTextOptions = WriteFileOptions & {
  encoding?: "utf-8"
}

export type WriteLinesOptions = WriteFileOptions & {
  lineEnding?: "\n" | "\r\n"
  preserveLineEndings?: boolean
}

export function createNewFile(name: string): NamedFile

export default interface NamedFile {
  name: string
  readonly relativePath: string

  // Read methods
  readBytes(): Promise<Uint8Array>
  sizeBytes(): Promise<number>
  readChunks(options?: ReadChunksOptions): AsyncIterable<Uint8Array>
  readTextChunks(options?: TextReadOptions): AsyncIterable<string>
  readLines(options?: LineReadOptions): AsyncIterable<string>

  // Write methods
  writeBytes(data: Uint8Array, options?: WriteFileOptions): Promise<void>
  writeChunks(
    chunks: AsyncIterable<Uint8Array> | ReadableStream<Uint8Array>,
    options?: WriteFileOptions,
  ): Promise<void>
  writeTextChunks(
    chunks: AsyncIterable<string> | Iterable<string> | ReadableStream<string>,
    options?: WriteTextOptions,
  ): Promise<void>
  writeLines(
    lines: AsyncIterable<string> | Iterable<string>,
    options?: WriteLinesOptions,
  ): Promise<void>
  copyTo(output: NamedFile, options?: WriteFileOptions): Promise<void>
  updateChunks(
    transform: (
      chunks: AsyncIterable<Uint8Array>,
    ) => AsyncIterable<Uint8Array> | ReadableStream<Uint8Array>,
    options?: WriteFileOptions,
  ): Promise<void>
  updateLines(
    transform: (lines: AsyncIterable<string>) => AsyncIterable<string> | Iterable<string>,
    options?: WriteLinesOptions & LineReadOptions,
  ): Promise<void>

  // Rename
  rename(newName: string): void
}
