File handling in custom transforms
When you write or edit a Custom file transform or Custom file validation in a Multi FileFeed, your TypeScript code receives files as NamedFile objects. This page covers every read and write method available on those objects, with guidance on choosing the right one — especially for large files.
For the full catalog of transform types, see The transform library.
Choosing a read method
| Method | Best for | Loads entire file into memory? |
|---|---|---|
| readLines(options?) | Text and CSV files of any size | No; streams line by line |
| readTextChunks(options?) | Text files where you need raw decoded chunks (not split by newline) | No; streams text chunks |
| readChunks(options?) | Binary files of any size (inspecting headers, streaming to output) | No; streams raw byte chunks |
| readBytes() | Small files that must be loaded whole (e.g., ZIP archives, binary formats) | Yes; has a built-in size guard |
| sizeBytes() | Checking file size before deciding how to read | N/A (returns the byte count only) |
Default to readLines() for text and CSV work. Only reach for readBytes() when the operation genuinely requires the entire file in memory at once (unzipping, binary format parsing).
Reading lines — readLines
readLines(options?: LineReadOptions): AsyncIterable<string>
Streams a file line by line. Lines are split on \n, and by default the line ending (\n or \r\n) is stripped from each yielded string. This is the most common entry point for text and CSV processing.
Options
| Option | Type | Default | Description |
|---|---|---|---|
| encoding | "utf-8" | "utf-8" | Text encoding used to decode the file |
| fatal | boolean | false | If true, throws on invalid byte sequences instead of replacing them |
| maxLineBytes | number | 8 MiB | Maximum bytes per line before throwing an error |
| preserveLineEndings | boolean | false | If true, keeps \n or \r\n at the end of each yielded string |
Example: filter rows from a pipe-delimited file
import NamedFile, { createNewFile } from "oneschema/namedFile"

export default async function func(files: NamedFile[]): Promise<NamedFile[]> {
  const outputs: NamedFile[] = []
  for (const file of files) {
    const output = createNewFile(file.name)
    // Stream the input through an async generator so only one line is held at a time.
    await output.writeLines(
      (async function* () {
        for await (const line of file.readLines()) {
          // Keep only lines that contain the pipe delimiter.
          if (line.includes("|")) {
            yield line
          }
        }
      })(),
    )
    outputs.push(output)
  }
  return outputs
}
Reading chunks — readChunks and readTextChunks
readChunks
readChunks(options?: ReadChunksOptions): AsyncIterable<Uint8Array>
Streams raw bytes in fixed-size chunks. Use this for binary inspection, custom framing, or when you need byte-level control.
| Option | Type | Default | Description |
|---|---|---|---|
| chunkSize | number | 64 KiB | Bytes per chunk |
| offset | number | 0 | Byte offset to start reading from |
| length | number | entire file | Total bytes to read |
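The offset and length options make it possible to inspect a file header without reading the rest of the file. The following is a minimal sketch, not part of the OneSchema API: looksLikeZip and the ChunkSource type are hypothetical names, and ChunkSource is a structural stand-in for the relevant part of NamedFile so the snippet can run outside a transform. The signature bytes are the standard ZIP local file header, "PK\x03\x04".

```typescript
// Structural stand-in for the part of NamedFile used here (an assumption,
// so the sketch runs without the OneSchema runtime).
type ChunkSource = {
  readChunks(options?: {
    chunkSize?: number
    offset?: number
    length?: number
  }): AsyncIterable<Uint8Array>
}

// ZIP local file header signature: "PK\x03\x04".
const ZIP_SIGNATURE = [0x50, 0x4b, 0x03, 0x04]

// Hypothetical helper: read only the first four bytes and compare them.
async function looksLikeZip(file: ChunkSource): Promise<boolean> {
  const head: number[] = []
  for await (const chunk of file.readChunks({ offset: 0, length: ZIP_SIGNATURE.length })) {
    // The requested range may arrive in more than one chunk, so accumulate.
    head.push(...chunk)
  }
  return (
    head.length === ZIP_SIGNATURE.length &&
    ZIP_SIGNATURE.every((byte, i) => head[i] === byte)
  )
}
```

Inside a transform, a NamedFile should satisfy ChunkSource structurally, so it could be passed to the helper directly. An empty ZIP archive starts with a different signature, so treat this as a check for typical, non-empty archives.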
readTextChunks
readTextChunks(options?: TextReadOptions): AsyncIterable<string>
Streams decoded text in chunks (not split by newline). Useful when you want to pipe text into a streaming parser like CsvParseStream.
| Option | Type | Default | Description |
|---|---|---|---|
| encoding | "utf-8" | "utf-8" | Text encoding |
| fatal | boolean | false | Throw on invalid byte sequences |
| chunkSize | number | 64 KiB | Underlying byte chunk size |
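One reason to prefer readTextChunks() over decoding readChunks() output yourself is that a multi-byte UTF-8 character can straddle a chunk boundary. The sketch below shows how a streaming decode handles that with the standard TextDecoder. It illustrates the problem only; whether readTextChunks() is implemented this way is an assumption, and decodeChunks is a hypothetical name.

```typescript
// Decode byte chunks to text without corrupting characters that straddle
// a chunk boundary. `stream: true` tells TextDecoder to hold back an
// incomplete trailing sequence until the next call.
async function* decodeChunks(
  chunks: AsyncIterable<Uint8Array> | Iterable<Uint8Array>,
  fatal = false,
): AsyncGenerator<string> {
  const decoder = new TextDecoder("utf-8", { fatal })
  for await (const chunk of chunks) {
    const text = decoder.decode(chunk, { stream: true })
    if (text.length > 0) yield text
  }
  // Flush whatever is still buffered at the end of the input.
  const tail = decoder.decode()
  if (tail.length > 0) yield tail
}
```

For example, "café" split as the bytes 63 61 66 C3 followed by A9 decodes to "caf" and then "é". Decoding each chunk with a fresh decoder would instead produce replacement characters at the boundary.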
Parsing CSV with oneschema/csv
Import the CSV utilities:
import { CsvParseStream, parse, stringify } from "oneschema/csv"
Large CSV files — use CsvParseStream
Pipe readTextChunks() through CsvParseStream to parse a CSV without loading the whole file into memory:
const rows = ReadableStream.from(file.readTextChunks()).pipeThrough(
  new CsvParseStream({ separator: "|" }),
)
for await (const row of rows) {
  // row is string[] (one element per column)
}
Small CSV files — use parse and stringify
const text = (await Array.fromAsync(file.readTextChunks())).join("")
// Array of arrays:
const rows = parse(text)
// Array of objects keyed by header names:
const records = parse(text, { skipFirstRow: true })
Custom delimiters
Pass a separator option to use a delimiter other than comma:
// Pipe-delimited input
new CsvParseStream({ separator: "|" })
parse(text, { separator: "|" })
// Tab-delimited output
stringify(rows, { separator: "\t", columns: ["name", "email"] })
Common values: "," (default), "|" (pipe), "\t" (tab), ";" (semicolon).
Writing CSV
stringify() requires a columns array when the input rows are objects:
const csv = stringify(records, { columns: Object.keys(records[0]) })
Gotcha: parse(text, { skipFirstRow: true }) returns Record<string, string>[]. Each row is an object keyed by header name, not an array. Access fields by name (e.g., row["email"]), not by index.
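Two small pitfalls follow from working with records. Object.keys(records[0]) throws when the file has a header but no data rows, and a misspelled header name silently yields undefined. The helpers below are a hedged sketch of one way to guard against both. columnsOf, field, and CsvRecord are hypothetical names, not part of oneschema/csv.

```typescript
type CsvRecord = Record<string, string>

// Hypothetical helper: derive a column list for stringify() without
// throwing when there are no data rows.
function columnsOf(records: CsvRecord[], fallback: string[] = []): string[] {
  return records.length > 0 ? Object.keys(records[0]) : fallback
}

// Hypothetical helper: read one field by header name, failing loudly if
// the header is missing instead of returning undefined.
function field(record: CsvRecord, name: string): string {
  if (!(name in record)) {
    const available = Object.keys(record).join(", ")
    throw new Error(`Column "${name}" not found; available: ${available}`)
  }
  return record[name]
}
```

With these in place, the stringify call above could pass columns: columnsOf(records) and stay safe on empty input.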
Encodings
readLines() and readTextChunks() accept an encoding option. The only supported value is "utf-8", which is also the default.
Set fatal: true to reject files with invalid byte sequences rather than silently replacing them with the Unicode replacement character (U+FFFD):
for await (const line of file.readLines({ fatal: true })) {
  // throws if the file contains invalid UTF-8
}
For files that arrive in non-UTF-8 encodings, use the built-in Transcode file encoding transform upstream of your custom transform to normalize to UTF-8 first.
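The difference between the default behaviour and fatal: true can be seen with the standard TextDecoder, which exposes an option of the same name. This is an illustration of the two modes, on the assumption that the read methods behave the same way. decodeStrict is a hypothetical helper.

```typescript
// 0xFF never occurs in well-formed UTF-8, so this input is invalid.
const invalidUtf8 = new Uint8Array([0x61, 0xff, 0x62])

// Default behaviour (fatal: false): the bad byte becomes U+FFFD and
// decoding continues.
const lenient = new TextDecoder("utf-8").decode(invalidUtf8)
console.log(lenient === "a\uFFFDb") // true

// Hypothetical helper: strict decode that reports failure as null.
function decodeStrict(input: Uint8Array): string | null {
  try {
    return new TextDecoder("utf-8", { fatal: true }).decode(input)
  } catch {
    return null
  }
}
console.log(decodeStrict(invalidUtf8)) // null
```

The lenient mode keeps a transform running but can silently alter data, so fatal: true is usually the safer choice when the output feeds a system of record.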
The readBytes() size guard
readBytes() loads the entire file into memory as a Uint8Array. For files above the whole-file read limit, it throws an error:
readBytes() cannot read <name> because it is <size>. The whole-file read limit is <limit>. Use readChunks(), readTextChunks(), or readLines() for large files.
If you see this error, switch to a streaming method. For most text and CSV workloads, readLines() is the natural replacement, although the calling code changes from handling one Uint8Array to iterating over strings.
readBytes() remains the right choice for binary-only operations on files you know are small (e.g., unzipping an archive, parsing an image header).
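When a file's size is not known in advance, sizeBytes() can be checked before calling readBytes(), so the transform can choose a fallback instead of hitting the guard. A minimal sketch follows. readWholeIfSmall and SizedFile are hypothetical names, and MAX_WHOLE_FILE_BYTES is an assumed budget for illustration, since this page does not state the platform's actual limit.

```typescript
// Structural stand-in for the part of NamedFile used here.
type SizedFile = {
  sizeBytes(): Promise<number>
  readBytes(): Promise<Uint8Array>
}

// Assumed budget for illustration only; the real whole-file read limit
// is not stated on this page.
const MAX_WHOLE_FILE_BYTES = 16 * 1024 * 1024

// Hypothetical helper: returns the file's bytes when it fits the budget,
// or null so the caller can fall back to a streaming method.
async function readWholeIfSmall(
  file: SizedFile,
  maxBytes: number = MAX_WHOLE_FILE_BYTES,
): Promise<Uint8Array | null> {
  const size = await file.sizeBytes()
  if (size > maxBytes) return null
  return file.readBytes()
}
```

A caller would treat a null result as the signal to use readChunks() or readLines() instead.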
Writing output
| Method | Use case |
|---|---|
| writeLines(lines, options?) | Write text line by line (pair with readLines) |
| writeChunks(chunks, options?) | Write binary chunks |
| writeTextChunks(chunks, options?) | Write decoded text chunks |
| writeBytes(data, options?) | Write a complete Uint8Array |
| copyTo(output, options?) | Copy one file to another without reading into memory |
| updateLines(transform, options?) | Read → transform → write back to the same file |
| updateChunks(transform, options?) | Read → transform → write back to the same file (binary) |
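updateLines takes a function that maps one stream of lines to another. The sketch below shows a function of that shape, assuming the goal is to drop blank lines and trim trailing whitespace. dropBlankLines is a hypothetical name chosen for this example.

```typescript
// A transform in the shape updateLines() expects:
// (lines: AsyncIterable<string>) => AsyncIterable<string>.
async function* dropBlankLines(lines: AsyncIterable<string>): AsyncGenerator<string> {
  for await (const line of lines) {
    // Skip lines that are empty or whitespace-only; trim the rest on the right.
    if (line.trim().length > 0) yield line.trimEnd()
  }
}

// Inside a transform this would be applied in place, for example:
//   await file.updateLines(dropBlankLines)
```

Because the function is an ordinary async generator, it can be unit tested with any async iterable of strings, without a NamedFile.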
Creating new files: call createNewFile("output.csv") to produce additional output files. Your function returns all NamedFile objects that should be passed downstream.
Renaming: call file.rename("new-name.csv") to change a file's name without re-reading its contents.
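Returning several output files from one input can be sketched as follows. This is an illustration under assumptions: splitByLineCount, LineSource, and LineSink are hypothetical names, the part-N- naming scheme is invented for the example, and the create parameter stands in for createNewFile so the snippet runs outside a transform.

```typescript
// Structural stand-ins for the parts of NamedFile used here.
type LineSource = { name: string; readLines(): AsyncIterable<string> }
type LineSink = { name: string; writeLines(lines: Iterable<string>): Promise<void> }

// Hypothetical helper: split `input` into files of at most `maxLines` lines.
// Memory use is bounded by one batch, not by the input size.
async function splitByLineCount<T extends LineSink>(
  input: LineSource,
  maxLines: number,
  create: (name: string) => T,
): Promise<T[]> {
  if (!Number.isInteger(maxLines) || maxLines < 1) {
    throw new RangeError("maxLines must be a positive integer")
  }
  const outputs: T[] = []
  let batch: string[] = []
  const flush = async () => {
    if (batch.length === 0) return
    const part = create(`part-${outputs.length + 1}-${input.name}`)
    await part.writeLines(batch)
    outputs.push(part)
    batch = []
  }
  for await (const line of input.readLines()) {
    batch.push(line)
    if (batch.length === maxLines) await flush()
  }
  await flush() // write the final partial batch, if any
  return outputs
}
```

In a transform, the call might look like splitByLineCount(file, 100000, createNewFile), with the returned array passed downstream. For CSV input you would likely also repeat the header line in each part, which this sketch does not do.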
Write options
| Option | Type | Default | Description |
|---|---|---|---|
| create | boolean | true | Create the file if it doesn't exist |
| append | boolean | false | Append instead of overwriting |
| createNew | boolean | false | Error if the file already exists |
writeLines also accepts:
| Option | Type | Default | Description |
|---|---|---|---|
| lineEnding | "\n" or "\r\n" | "\n" | Line ending to append |
| preserveLineEndings | boolean | false | If true, don't append a line ending (assumes lines already include one) |
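The append and lineEnding options combine naturally when output is produced in more than one pass. The sketch below writes a header first and appends data rows in a second call, using CRLF line endings. writeWithHeader and AppendableLines are hypothetical names, and the type is a structural stand-in for the relevant part of NamedFile.

```typescript
// Structural stand-in for the part of NamedFile used here.
type AppendableLines = {
  writeLines(
    lines: Iterable<string>,
    options?: { append?: boolean; lineEnding?: "\n" | "\r\n" },
  ): Promise<void>
}

// Hypothetical helper: write a header once, then append rows after it.
async function writeWithHeader(
  output: AppendableLines,
  header: string,
  rows: Iterable<string>,
): Promise<void> {
  // append defaults to false, so this call replaces any previous content.
  await output.writeLines([header], { lineEnding: "\r\n" })
  // append: true keeps the header and adds the rows after it.
  await output.writeLines(rows, { append: true, lineEnding: "\r\n" })
}
```

Omitting append: true on the second call would overwrite the header, which is the most common mistake with multi-pass writes.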
Complete type reference
import NamedFile, { createNewFile } from "oneschema/namedFile"

export type ReadChunksOptions = {
  chunkSize?: number
  offset?: number
  length?: number
}

export type TextReadOptions = {
  encoding?: "utf-8"
  fatal?: boolean
  chunkSize?: number
}

export type LineReadOptions = TextReadOptions & {
  maxLineBytes?: number
  preserveLineEndings?: boolean
}

export type WriteFileOptions = {
  create?: boolean
  append?: boolean
  createNew?: boolean
}

export type WriteTextOptions = WriteFileOptions & {
  encoding?: "utf-8"
}

export type WriteLinesOptions = WriteFileOptions & {
  lineEnding?: "\n" | "\r\n"
  preserveLineEndings?: boolean
}

export function createNewFile(name: string): NamedFile

export default interface NamedFile {
  name: string
  readonly relativePath: string

  // Read methods
  readBytes(): Promise<Uint8Array>
  sizeBytes(): Promise<number>
  readChunks(options?: ReadChunksOptions): AsyncIterable<Uint8Array>
  readTextChunks(options?: TextReadOptions): AsyncIterable<string>
  readLines(options?: LineReadOptions): AsyncIterable<string>

  // Write methods
  writeBytes(data: Uint8Array, options?: WriteFileOptions): Promise<void>
  writeChunks(
    chunks: AsyncIterable<Uint8Array> | ReadableStream<Uint8Array>,
    options?: WriteFileOptions,
  ): Promise<void>
  writeTextChunks(
    chunks: AsyncIterable<string> | Iterable<string> | ReadableStream<string>,
    options?: WriteTextOptions,
  ): Promise<void>
  writeLines(
    lines: AsyncIterable<string> | Iterable<string>,
    options?: WriteLinesOptions,
  ): Promise<void>
  copyTo(output: NamedFile, options?: WriteFileOptions): Promise<void>
  updateChunks(
    transform: (
      chunks: AsyncIterable<Uint8Array>,
    ) => AsyncIterable<Uint8Array> | ReadableStream<Uint8Array>,
    options?: WriteFileOptions,
  ): Promise<void>
  updateLines(
    transform: (lines: AsyncIterable<string>) => AsyncIterable<string> | Iterable<string>,
    options?: WriteLinesOptions & LineReadOptions,
  ): Promise<void>

  // Rename
  rename(newName: string): void
}