File handling in custom transforms

When you write or edit a Custom file transform or Custom file validation in a Multi FileFeed, your TypeScript code receives files as NamedFile objects. This page covers every read and write method available on those objects, with guidance on choosing the right one — especially for large files.

For the full catalog of transform types, see The transform library.

Choosing a read method

| Method | Best for | Loads entire file into memory? |
| --- | --- | --- |
| readLines(options?) | Text and CSV files of any size | No (streams line by line) |
| readTextChunks(options?) | Text files where you need raw decoded chunks (not split by newline) | No (streams text chunks) |
| readChunks(options?) | Binary files of any size (inspecting headers, streaming to output) | No (streams raw byte chunks) |
| readBytes() | Small files that must be loaded whole (e.g., ZIP archives, binary formats) | Yes (has a built-in size guard) |
| sizeBytes() | Checking file size before deciding how to read | N/A (returns the byte count only) |

Default to readLines() for text and CSV work. Only reach for readBytes() when the operation genuinely requires the entire file in memory at once (unzipping, binary format parsing).
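As a minimal sketch of checking the size before deciding how to read, the helper below calls sizeBytes() first and only calls readBytes() when the file is under a cutoff. The helper name, the local SizedFile type, and the 8 MiB cutoff are assumptions made for illustration; this page does not state the platform's actual whole-file limit.

```typescript
// Minimal structural type: only the two NamedFile methods this sketch calls.
type SizedFile = {
  sizeBytes(): Promise<number>
  readBytes(): Promise<Uint8Array>
}

// Hypothetical cutoff for illustration; NOT the platform's real read limit.
const MAX_WHOLE_FILE_BYTES = 8 * 1024 * 1024

// Returns the file's bytes when it is small enough to load whole,
// or null so the caller can fall back to a streaming method.
async function readWholeIfSmall(
  file: SizedFile,
  maxBytes: number = MAX_WHOLE_FILE_BYTES,
): Promise<Uint8Array | null> {
  const size = await file.sizeBytes()
  if (size > maxBytes) {
    return null
  }
  return await file.readBytes()
}
```

A caller that gets null back would switch to readChunks() or readLines() instead of loading the file whole.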

Reading lines — readLines

readLines(options?: LineReadOptions): AsyncIterable<string>

Streams a file line by line. Lines are split on \n, and by default the line ending (\n or \r\n) is stripped from each yielded string; set preserveLineEndings: true to keep it. This is the most common entry point for text and CSV processing.

Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| encoding | "utf-8" | "utf-8" | Text encoding used to decode the file |
| fatal | boolean | false | If true, throws on invalid byte sequences instead of replacing them |
| maxLineBytes | number | 8 MiB | Maximum bytes per line before throwing an error |
| preserveLineEndings | boolean | false | If true, keeps \n or \r\n at the end of each yielded string |
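To show how these options are passed, here is a small sketch that reads only the first line of a file with strict decoding and a tighter line cap. The helper name, the local LineSource type, and the 64 KiB cap are assumptions for illustration; the option names come from the table above.

```typescript
// Local copy of the documented option shape, so the sketch is self-contained.
type LineReadOptions = {
  encoding?: "utf-8"
  fatal?: boolean
  chunkSize?: number
  maxLineBytes?: number
  preserveLineEndings?: boolean
}

// Minimal structural type: only the NamedFile method this sketch calls.
type LineSource = {
  readLines(options?: LineReadOptions): AsyncIterable<string>
}

// Returns the first line of the file, or null when the file has no lines.
async function readHeaderLine(file: LineSource): Promise<string | null> {
  const options: LineReadOptions = {
    fatal: true, // reject invalid byte sequences instead of replacing them
    maxLineBytes: 64 * 1024, // hypothetical cap for a header row
  }
  for await (const line of file.readLines(options)) {
    return line // iteration stops after the first yielded line
  }
  return null
}
```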

Example: filter rows from a pipe-delimited file

import NamedFile, { createNewFile } from "oneschema/namedFile"

export default async function func(files: NamedFile[]): Promise<NamedFile[]> {
  const outputs: NamedFile[] = []

  for (const file of files) {
    const output = createNewFile(file.name)

    await output.writeLines(
      (async function* () {
        for await (const line of file.readLines()) {
          if (line.includes("|")) {
            yield line
          }
        }
      })(),
    )

    outputs.push(output)
  }

  return outputs
}

Reading chunks — readChunks and readTextChunks

readChunks

readChunks(options?: ReadChunksOptions): AsyncIterable<Uint8Array>

Streams raw bytes in fixed-size chunks. Use this for binary inspection, custom framing, or when you need byte-level control.

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| chunkSize | number | 64 KiB | Bytes per chunk |
| offset | number | 0 | Byte offset to start reading from |
| length | number | entire file | Total bytes to read |
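As an illustration of binary header inspection with offset and length, the sketch below reads only the first four bytes and compares them with the ZIP local file header signature ("PK" followed by 0x03 0x04). The helper name and the local ByteSource type are assumptions; note that an empty ZIP archive starts with a different signature, so this check targets non-empty archives only.

```typescript
// Local copy of the documented option shape, so the sketch is self-contained.
type ReadChunksOptions = {
  chunkSize?: number
  offset?: number
  length?: number
}

// Minimal structural type: only the NamedFile method this sketch calls.
type ByteSource = {
  readChunks(options?: ReadChunksOptions): AsyncIterable<Uint8Array>
}

// ZIP local file header signature: "P", "K", 0x03, 0x04.
const ZIP_MAGIC = [0x50, 0x4b, 0x03, 0x04]

// Returns true when the file starts with the ZIP local file header signature.
async function looksLikeZip(file: ByteSource): Promise<boolean> {
  const header: number[] = []
  // Ask for just the leading bytes instead of streaming the whole file.
  for await (const chunk of file.readChunks({ offset: 0, length: ZIP_MAGIC.length })) {
    header.push(...Array.from(chunk))
  }
  return (
    header.length >= ZIP_MAGIC.length &&
    ZIP_MAGIC.every((byte, i) => header[i] === byte)
  )
}
```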

readTextChunks

readTextChunks(options?: TextReadOptions): AsyncIterable<string>

Streams decoded text in chunks (not split by newline). Useful when you want to pipe text into a streaming parser like CsvParseStream.

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| encoding | "utf-8" | "utf-8" | Text encoding |
| fatal | boolean | false | Throw on invalid byte sequences |
| chunkSize | number | 64 KiB | Underlying byte chunk size |

Parsing CSV with oneschema/csv

Import the CSV utilities:

import { CsvParseStream, parse, stringify } from "oneschema/csv"

Large CSV files — use CsvParseStream

Pipe readTextChunks() through CsvParseStream to parse a CSV without loading the whole file into memory:

const rows = ReadableStream.from(file.readTextChunks()).pipeThrough(
  new CsvParseStream({ separator: "|" }),
)

for await (const row of rows) {
  // row is string[] (one element per column)
}

Small CSV files — use parse and stringify

const text = (await Array.fromAsync(file.readTextChunks())).join("")

// Array of arrays:
const rows = parse(text)

// Array of objects keyed by header names:
const records = parse(text, { skipFirstRow: true })

Custom delimiters

Pass a separator option to use a delimiter other than comma:

// Pipe-delimited input
new CsvParseStream({ separator: "|" })
parse(text, { separator: "|" })

// Tab-delimited output
stringify(rows, { separator: "\t", columns: ["name", "email"] })

Common values: "," (default), "|" (pipe), "\t" (tab), ";" (semicolon).

Writing CSV

stringify() requires a columns array when the input rows are objects:

const csv = stringify(records, { columns: Object.keys(records[0]) })

Gotcha: parse(text, { skipFirstRow: true }) returns Record<string, string>[] — each row is an object keyed by header name, not an array. Access fields by name (e.g., row["email"]), not by index.

Encodings

readLines() and readTextChunks() accept an encoding option. The only supported encoding is "utf-8" (the default).

Set fatal: true to reject files with invalid byte sequences rather than silently replacing them with the Unicode replacement character (U+FFFD):

for await (const line of file.readLines({ fatal: true })) {
  // throws if the file contains invalid UTF-8
}

For files that arrive in non-UTF-8 encodings, use the built-in Transcode file encoding transform upstream of your custom transform to normalize to UTF-8 first.
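One way to use fatal: true in a validation step is to wrap the read in try/catch and report the failure instead of letting it propagate. The sketch below assumes a helper name, a local StrictLineSource type, and a result shape that are all illustrative rather than part of the documented API.

```typescript
// Minimal structural type: only the NamedFile method this sketch calls.
type StrictLineSource = {
  readLines(options?: { fatal?: boolean }): AsyncIterable<string>
}

// Illustrative result shape, not a documented type.
type Utf8Check =
  | { ok: true; lines: number }
  | { ok: false; message: string }

// Streams the whole file with strict decoding and reports whether it succeeded.
async function checkUtf8(file: StrictLineSource): Promise<Utf8Check> {
  let lines = 0
  try {
    for await (const _line of file.readLines({ fatal: true })) {
      lines++
    }
    return { ok: true, lines }
  } catch (err) {
    // Any read error lands here, not only invalid byte sequences.
    const message = err instanceof Error ? err.message : String(err)
    return { ok: false, message }
  }
}
```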

The readBytes() size guard

readBytes() loads the entire file into memory as a Uint8Array. If the file is larger than the whole-file read limit, it throws an error of this form:

readBytes() cannot read <name> because it is <size>. The whole-file read limit is <limit>. Use readChunks(), readTextChunks(), or readLines() for large files.

If you see this error, switch to a streaming method. For most text and CSV workloads, readLines() is the simplest replacement, though the calling code must iterate over lines rather than work with a single byte array.

readBytes() remains the right choice for binary-only operations on files you know are small (e.g., unzipping an archive, parsing an image header).

Writing output

| Method | Use case |
| --- | --- |
| writeLines(lines, options?) | Write text line by line (pair with readLines) |
| writeChunks(chunks, options?) | Write binary chunks |
| writeTextChunks(chunks, options?) | Write decoded text chunks |
| writeBytes(data, options?) | Write a complete Uint8Array |
| copyTo(output, options?) | Copy one file to another without reading into memory |
| updateLines(transform, options?) | Read → transform → write back to the same file |
| updateChunks(transform, options?) | Read → transform → write back to the same file (binary) |

Creating new files: call createNewFile("output.csv") to produce additional output files. Your function returns all NamedFile objects that should be passed downstream.

Renaming: call file.rename("new-name.csv") to change a file's name without re-reading its contents.
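The read, transform, write-back pattern of updateLines can be sketched with an async generator as the transform. The function names and the local LineUpdatable type below are assumptions for illustration; the transform's shape follows the updateLines signature in the type reference.

```typescript
// Minimal structural type: only the NamedFile method this sketch calls.
type LineUpdatable = {
  updateLines(
    transform: (lines: AsyncIterable<string>) => AsyncIterable<string>,
  ): Promise<void>
}

// Transform: passes through every line that contains non-whitespace text.
async function* dropBlankLines(
  lines: AsyncIterable<string>,
): AsyncIterable<string> {
  for await (const line of lines) {
    if (line.trim() !== "") {
      yield line
    }
  }
}

// Rewrites the file in place without blank lines.
async function removeBlankLinesInPlace(file: LineUpdatable): Promise<void> {
  await file.updateLines(dropBlankLines)
}
```

Because the transform is a plain function over AsyncIterable<string>, it can be exercised against an in-memory generator without touching a real file.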

Write options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| create | boolean | true | Create the file if it doesn't exist |
| append | boolean | false | Append instead of overwriting |
| createNew | boolean | false | Error if the file already exists |

writeLines also accepts:

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| lineEnding | "\n" or "\r\n" | "\n" | Line ending to append |
| preserveLineEndings | boolean | false | If true, don't append a line ending (assumes lines already include one) |
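As a sketch of combining these options, the helper below writes a header with one writeLines call and then appends the data rows with a second call, using \r\n line endings for both. The helper name and the local LineSink type are assumptions for illustration; the option names come from the tables above.

```typescript
// Local copy of the documented option shape, so the sketch is self-contained.
type WriteLinesOptions = {
  create?: boolean
  append?: boolean
  createNew?: boolean
  lineEnding?: "\n" | "\r\n"
  preserveLineEndings?: boolean
}

// Minimal structural type: only the NamedFile method this sketch calls.
type LineSink = {
  writeLines(
    lines: AsyncIterable<string> | Iterable<string>,
    options?: WriteLinesOptions,
  ): Promise<void>
}

// Writes a header, then appends rows, all with Windows-style line endings.
async function writeCrlfReport(
  output: LineSink,
  header: string,
  rows: Iterable<string>,
): Promise<void> {
  // First call overwrites any existing content (append defaults to false).
  await output.writeLines([header], { lineEnding: "\r\n" })
  // Second call adds to the end instead of replacing the header.
  await output.writeLines(rows, { append: true, lineEnding: "\r\n" })
}
```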

Complete type reference

import NamedFile, { createNewFile } from "oneschema/namedFile"

export type ReadChunksOptions = {
  chunkSize?: number
  offset?: number
  length?: number
}

export type TextReadOptions = {
  encoding?: "utf-8"
  fatal?: boolean
  chunkSize?: number
}

export type LineReadOptions = TextReadOptions & {
  maxLineBytes?: number
  preserveLineEndings?: boolean
}

export type WriteFileOptions = {
  create?: boolean
  append?: boolean
  createNew?: boolean
}

export type WriteTextOptions = WriteFileOptions & {
  encoding?: "utf-8"
}

export type WriteLinesOptions = WriteFileOptions & {
  lineEnding?: "\n" | "\r\n"
  preserveLineEndings?: boolean
}

export function createNewFile(name: string): NamedFile

export default interface NamedFile {
  name: string
  readonly relativePath: string

  // Read methods
  readBytes(): Promise<Uint8Array>
  sizeBytes(): Promise<number>
  readChunks(options?: ReadChunksOptions): AsyncIterable<Uint8Array>
  readTextChunks(options?: TextReadOptions): AsyncIterable<string>
  readLines(options?: LineReadOptions): AsyncIterable<string>

  // Write methods
  writeBytes(data: Uint8Array, options?: WriteFileOptions): Promise<void>
  writeChunks(
    chunks: AsyncIterable<Uint8Array> | ReadableStream<Uint8Array>,
    options?: WriteFileOptions,
  ): Promise<void>
  writeTextChunks(
    chunks: AsyncIterable<string> | Iterable<string> | ReadableStream<string>,
    options?: WriteTextOptions,
  ): Promise<void>
  writeLines(
    lines: AsyncIterable<string> | Iterable<string>,
    options?: WriteLinesOptions,
  ): Promise<void>
  copyTo(output: NamedFile, options?: WriteFileOptions): Promise<void>
  updateChunks(
    transform: (
      chunks: AsyncIterable<Uint8Array>,
    ) => AsyncIterable<Uint8Array> | ReadableStream<Uint8Array>,
    options?: WriteFileOptions,
  ): Promise<void>
  updateLines(
    transform: (lines: AsyncIterable<string>) => AsyncIterable<string> | Iterable<string>,
    options?: WriteLinesOptions & LineReadOptions,
  ): Promise<void>

  // Rename
  rename(newName: string): void
}