public class TextIO
extends java.lang.Object

To read a PCollection from one or more text files, use TextIO.Read. Call TextIO.Read.from(java.lang.String) to specify the path of the file(s) to read from (e.g., a local filename or filename pattern if running locally, or a Google Cloud Storage filename or filename pattern of the form "gs://<bucket>/<filepath>"). Optionally, call TextIO.Read.named(java.lang.String) to specify the name of the pipeline step and/or TextIO.Read.withCoder(com.google.cloud.dataflow.sdk.coders.Coder<T>) to specify the Coder used to decode the text lines into Java values. For example:

    Pipeline p = ...;

    // A simple Read of a local file (only runs locally):
    PCollection<String> lines =
        p.apply(TextIO.Read.from("/path/to/file.txt"));

    // A fully-specified Read from a GCS file (runs locally and via the
    // Google Cloud Dataflow service):
    PCollection<Integer> numbers =
        p.apply(TextIO.Read.named("ReadNumbers")
                           .from("gs://my_bucket/path/to/numbers-*.txt")
                           .withCoder(TextualIntegerCoder.of()));

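
To make the role of the Coder in the second read more concrete, the following is a conceptual sketch in plain Java, without the SDK, of decoding text lines into Integer values. The class `TextLineDecodingSketch` and its `decodeLines` method are hypothetical names introduced for illustration; this is not how TextualIntegerCoder is implemented, only the general idea of turning each line into one value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not part of the Dataflow SDK.
final class TextLineDecodingSketch {

    // Decodes each text line into an Integer, one value per line.
    // Throws NumberFormatException if a line is not a decimal integer.
    static List<Integer> decodeLines(List<String> lines) {
        List<Integer> values = new ArrayList<>();
        for (String line : lines) {
            values.add(Integer.parseInt(line));
        }
        return values;
    }

    public static void main(String[] args) {
        List<Integer> numbers = decodeLines(Arrays.asList("1", "42", "-7"));
        System.out.println(numbers); // prints [1, 42, -7]
    }
}
```

In a real pipeline this per-line conversion is delegated to the Coder passed to withCoder, so the PCollection's element type follows from the Coder rather than from hand-written parsing.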
To write a PCollection to one or more text files, use TextIO.Write. Call TextIO.Write.to(java.lang.String) to specify the path of the file to write to (e.g., a local filename or sharded filename pattern if running locally, or a Google Cloud Storage filename or sharded filename pattern of the form "gs://<bucket>/<filepath>"). Optionally, call TextIO.Write.named(java.lang.String) to specify the name of the pipeline step and/or TextIO.Write.withCoder(com.google.cloud.dataflow.sdk.coders.Coder<T>) to specify the Coder used to encode the Java values into text lines. For example:

    // A simple Write to a local file (only runs locally):
    PCollection<String> lines = ...;
    lines.apply(TextIO.Write.to("/path/to/file.txt"));

    // A fully-specified Write to a sharded GCS file (runs locally and via the
    // Google Cloud Dataflow service):
    PCollection<Integer> numbers = ...;
    numbers.apply(TextIO.Write.named("WriteNumbers")
                              .to("gs://my_bucket/path/to/numbers")
                              .withSuffix(".txt")
                              .withCoder(TextualIntegerCoder.of()));

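
As a rough illustration of what a sharded write produces, the plain-Java sketch below encodes values as text lines and builds shard file names from a prefix and suffix. It assumes a `-SSSSS-of-NNNNN` style shard naming convention (zero-padded shard index and shard count); that convention, the class `ShardedOutputSketch`, and its methods are assumptions made for illustration and are not taken from this page, so check the TextIO.Write documentation for the actual naming behavior.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration; not part of the Dataflow SDK.
final class ShardedOutputSketch {

    // Encodes one element as the text of its own output line.
    static String encodeLine(Integer value) {
        return Integer.toString(value);
    }

    // Builds shard file names of the assumed form
    // <prefix>-<index, 5 digits>-of-<count, 5 digits><suffix>.
    static List<String> shardFileNames(String prefix, String suffix, int numShards) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < numShards; i++) {
            names.add(String.format(Locale.ROOT, "%s-%05d-of-%05d%s",
                                    prefix, i, numShards, suffix));
        }
        return names;
    }

    public static void main(String[] args) {
        // Mirrors the prefix and suffix used in the WriteNumbers example.
        for (String name : shardFileNames("gs://my_bucket/path/to/numbers", ".txt", 3)) {
            System.out.println(name);
        }
        System.out.println(encodeLine(42)); // prints 42
    }
}
```

The point of the sketch is that the string passed to `to` acts as a filename prefix rather than a complete filename when the output is sharded, with the suffix appended after the shard portion.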
Modifier and Type | Class and Description |
---|---|
static class | TextIO.CompressionType: Possible text file compression types. |
static class | TextIO.Read: A root PTransform that reads from a text file (or multiple text files matching a pattern) and returns a PCollection containing the decoding of each of the lines of the text file(s). |
static class | TextIO.Write: A PTransform that writes a PCollection to a text file (or multiple text files matching a sharding pattern), with each PCollection element being encoded into its own line. |

Modifier and Type | Field and Description |
---|---|
static Coder<java.lang.String> | DEFAULT_TEXT_CODER |

Constructor and Description |
---|
TextIO() |

public static final Coder<java.lang.String> DEFAULT_TEXT_CODER