public static class TextIO.Read extends Object
A PTransform that reads from a text file (or multiple text
files matching a pattern) and returns a PCollection containing
the decoding of each of the lines of the text file(s). The
default decoding just returns the lines.

| Modifier and Type | Class and Description |
|---|---|
| static class | TextIO.Read.Bound<T>: A PTransform that reads from a text file (or multiple text files matching a pattern) and returns a bounded PCollection containing the decoding of each of the lines of the text file(s). |

| Constructor and Description |
|---|
| Read() |

| Modifier and Type | Method and Description |
|---|---|
| static TextIO.Read.Bound<String> | from(String filepattern): Returns a TextIO.Read PTransform that reads from the file(s) with the given name or pattern. |
| static TextIO.Read.Bound<String> | named(String name): Returns a TextIO.Read PTransform with the given step name. |
| static <T> TextIO.Read.Bound<T> | withCoder(Coder<T> coder): Returns a TextIO.Read PTransform that uses the given Coder<T> to decode each of the lines of the file into a value of type T. |
| static TextIO.Read.Bound<String> | withCompressionType(TextIO.CompressionType compressionType): Returns a TextIO.Read PTransform that reads from a file with the specified compression type. |
| static TextIO.Read.Bound<String> | withoutValidation(): Returns a TextIO.Read PTransform that has GCS path validation on pipeline creation disabled. |

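The filepattern accepted by from follows standard Java filesystem glob syntax. As a rough illustration of how "*", "?", and "[..]" behave, the sketch below uses the JDK's own `PathMatcher` rather than TextIO itself. The class name `GlobSketch` and the file names are made up for the example, and matching against Cloud Storage paths may differ in details.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

// Hypothetical helper for illustration only; not part of TextIO.
class GlobSketch {
    // Returns true if fileName matches the glob under the JDK's default file system rules.
    static boolean matches(String glob, String fileName) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Paths.get(fileName));
    }

    public static void main(String[] args) {
        // "*" matches any run of characters within a single name.
        System.out.println(matches("part-*.txt", "part-00001.txt"));  // true
        // "?" matches exactly one character.
        System.out.println(matches("part-?.txt", "part-1.txt"));      // true
        System.out.println(matches("part-?.txt", "part-10.txt"));     // false
        // "[..]" matches one character from the bracketed set or range.
        System.out.println(matches("part-[0-9].txt", "part-7.txt"));  // true
        System.out.println(matches("part-[0-9].txt", "part-a.txt"));  // false
    }
}
```

Trying a pattern against a few sample names this way can help catch an overly broad or overly narrow filepattern before it is passed to from.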
public static TextIO.Read.Bound<String> named(String name)
Returns a TextIO.Read PTransform with the given step name.

public static TextIO.Read.Bound<String> from(String filepattern)
Returns a TextIO.Read PTransform that reads from the file(s)
with the given name or pattern. This can be a local filename
or filename pattern (if running locally), or a Google Cloud
Storage filename or filename pattern of the form
"gs://<bucket>/<filepath>" (if running locally or via
the Google Cloud Dataflow service). Standard
Java Filesystem glob patterns ("*", "?", "[..]") are supported.

public static <T> TextIO.Read.Bound<T> withCoder(Coder<T> coder)
Returns a TextIO.Read PTransform that uses the given
Coder<T> to decode each of the lines of the file into a
value of type T.

By default, uses StringUtf8Coder, which just
returns the text lines as Java strings.

Type Parameters:
T - the type of the decoded elements, and the elements
of the resulting PCollection

public static TextIO.Read.Bound<String> withoutValidation()
Returns a TextIO.Read PTransform that has GCS path validation on
pipeline creation disabled.

This can be useful when the GCS input does not exist at pipeline creation time but is expected to be available at execution time.

public static TextIO.Read.Bound<String> withCompressionType(TextIO.CompressionType compressionType)
Returns a TextIO.Read PTransform that reads from a file with the
specified compression type.

If no compression type is specified, the default is AUTO. In this mode, the compression type of the file is determined by its extension (e.g., *.gz is gzipped, *.bz2 is bzipped, all other extensions are uncompressed).
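
The extension rule described for AUTO can be sketched in plain Java. This is only an illustration of the stated behavior, not the SDK's implementation: the class `CompressionSketch`, the enum `Kind`, and the method `detect` are hypothetical names, and the case-sensitive suffix check is an assumption.

```java
// Hypothetical illustration of extension-based detection; not TextIO code.
class CompressionSketch {
    // Stand-in for the outcomes described in the text; not the SDK's CompressionType enum.
    enum Kind { GZIP, BZIP2, UNCOMPRESSED }

    // Picks a compression kind from the file name's extension.
    // Assumption: the check is a case-sensitive suffix match.
    static Kind detect(String fileName) {
        if (fileName.endsWith(".gz")) {
            return Kind.GZIP;
        }
        if (fileName.endsWith(".bz2")) {
            return Kind.BZIP2;
        }
        // Every other extension is treated as uncompressed.
        return Kind.UNCOMPRESSED;
    }

    public static void main(String[] args) {
        System.out.println(detect("events.gz"));   // GZIP
        System.out.println(detect("events.bz2"));  // BZIP2
        System.out.println(detect("events.txt"));  // UNCOMPRESSED
    }
}
```

Under this rule, a compressed file whose name lacks the usual extension would be read as plain text, which is the case where passing an explicit compression type to withCompressionType matters.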