T - the type of each of the elements of the resulting
PCollection, decoded from the lines of the text file

public static class TextIO.Read.Bound<T> extends PTransform<PInput,PCollection<T>>
A PTransform that reads from a text file (or multiple text files
matching a pattern) and returns a bounded PCollection containing the
decoding of each of the lines of the text file(s). The default
decoding just returns the lines.

Fields inherited from class PTransform:
name

| Modifier and Type | Method and Description |
|---|---|
| PCollection<T> | apply(PInput input) - Applies this PTransform on the given InputT, and returns its Output. |
| TextIO.Read.Bound<T> | from(String filepattern) - Returns a new TextIO.Read PTransform that's like this one but that reads from the file(s) with the given name or pattern. |
| TextIO.CompressionType | getCompressionType() |
| protected Coder<T> | getDefaultOutputCoder() - Returns the default Coder to use for the output of this single-output PTransform. |
| String | getFilepattern() |
| TextIO.Read.Bound<T> | named(String name) - Returns a new TextIO.Read PTransform that's like this one but with the given step name. |
| boolean | needsValidation() |
| <X> TextIO.Read.Bound<X> | withCoder(Coder<X> coder) - Returns a new TextIO.Read PTransform that's like this one but that uses the given Coder<X> to decode each of the lines of the file into a value of type X. |
| TextIO.Read.Bound<T> | withCompressionType(TextIO.CompressionType compressionType) - Returns a new TextIO.Read PTransform that's like this one but reads from input sources using the specified compression type. |
| TextIO.Read.Bound<T> | withoutValidation() - Returns a new TextIO.Read PTransform that's like this one but that has GCS path validation on pipeline creation disabled. |
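Methods inherited from class PTransform:
getDefaultOutputCoder, getDefaultOutputCoder, getKindString, getName, toString, validate

public TextIO.Read.Bound<T> named(String name)
Returns a new TextIO.Read PTransform that's like this one but
with the given step name.
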
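public TextIO.Read.Bound<T> from(String filepattern)
Returns a new TextIO.Read PTransform that's like this one but
that reads from the file(s) with the given name or pattern.
(See TextIO.Read.from(java.lang.String) for a description of
filepatterns.) Does not modify this object.

public <X> TextIO.Read.Bound<X> withCoder(Coder<X> coder)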
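Returns a new TextIO.Read PTransform that's like this one but
that uses the given Coder<X> to decode each of the
lines of the file into a value of type X. Does not
modify this object.
X - the type of the decoded elements, and the
elements of the resulting PCollection

public TextIO.Read.Bound<T> withoutValidation()
Returns a new TextIO.Read PTransform that's like this one but
that has GCS path validation on pipeline creation disabled.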
This can be useful in the case where the GCS input does not exist at the pipeline creation time, but is expected to be available at execution time.
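
The "returns a new transform, does not modify this object" contract described for named, from, withCoder and withoutValidation can be illustrated with a small standalone sketch. ReadSpec and ReadSpecDemo are hypothetical names invented for this illustration, as is the example path; this is not the SDK's implementation, only one common way to write such copy-on-change configuration objects in Java.

```java
import java.util.Objects;

/**
 * Hypothetical stand-in for a read configuration; NOT the SDK's TextIO.Read.Bound.
 * Every configuration method returns a fresh copy and leaves the receiver untouched.
 */
final class ReadSpec {
    private final String name;         // step name, null until named() is called
    private final String filepattern;  // file name or pattern, null until from() is called
    private final boolean validate;    // whether the path should be validated up front

    ReadSpec() {
        this(null, null, true);
    }

    private ReadSpec(String name, String filepattern, boolean validate) {
        this.name = name;
        this.filepattern = filepattern;
        this.validate = validate;
    }

    ReadSpec named(String name) {
        return new ReadSpec(Objects.requireNonNull(name), filepattern, validate);
    }

    ReadSpec from(String filepattern) {
        return new ReadSpec(name, Objects.requireNonNull(filepattern), validate);
    }

    ReadSpec withoutValidation() {
        return new ReadSpec(name, filepattern, false);
    }

    String getName() { return name; }
    String getFilepattern() { return filepattern; }
    boolean needsValidation() { return validate; }
}

final class ReadSpecDemo {
    public static void main(String[] args) {
        // The bucket path is a made-up example value.
        ReadSpec base = new ReadSpec().from("gs://my-bucket/input-*.txt");
        ReadSpec lenient = base.named("ReadLines").withoutValidation();

        System.out.println(base.needsValidation());    // true: base was not modified
        System.out.println(lenient.needsValidation()); // false
        System.out.println(base.getName());            // null: named() did not touch base
    }
}
```

Because each call produces a new object, a partially configured value such as base can be shared and specialized in several directions without one use affecting another.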
public TextIO.Read.Bound<T> withCompressionType(TextIO.CompressionType compressionType)
Returns a new TextIO.Read PTransform that's like this one but
reads from input sources using the specified compression type.
If the AUTO compression type is specified, a compression type is selected on a per-file basis, based on the file's extension (e.g., .gz will be processed as a gzipped file, .bz will be processed as a bzipped file, and other extensions will be treated as uncompressed input).
If no compression type is specified, the default is AUTO.
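
The per-file selection that AUTO performs can be sketched as a simple extension check. The enum SketchCompression and the class AutoCompressionSketch below are hypothetical names for this illustration; the extensions are taken from the description above, and the case-sensitive suffix match is an assumption rather than documented SDK behavior.

```java
/** Hypothetical enum for this sketch; it mirrors only the three cases named in the text. */
enum SketchCompression { GZIP, BZIP, UNCOMPRESSED }

final class AutoCompressionSketch {
    /**
     * Picks a compression type for a single file from its extension.
     * Assumption: a plain, case-sensitive suffix match is enough for illustration.
     */
    static SketchCompression detect(String filename) {
        if (filename.endsWith(".gz")) {
            return SketchCompression.GZIP;
        }
        if (filename.endsWith(".bz")) {
            return SketchCompression.BZIP;
        }
        // Any other extension is treated as uncompressed input.
        return SketchCompression.UNCOMPRESSED;
    }

    public static void main(String[] args) {
        // File names are made-up example values.
        System.out.println(detect("logs-0001.gz"));  // GZIP
        System.out.println(detect("archive.bz"));    // BZIP
        System.out.println(detect("notes.txt"));     // UNCOMPRESSED
    }
}
```

Since the decision is made per file, a single filepattern can match a mix of compressed and uncompressed files and each one is handled according to its own extension.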
public PCollection<T> apply(PInput input)
Description copied from class: PTransform
Applies this PTransform on the given InputT, and returns its
Output.
Composite transforms, which are defined in terms of other transforms, should return the output of one of the composed transforms. Non-composite transforms, which do not apply any transforms internally, should return a new unbound output and register evaluators (via backend-specific registration methods).
The default implementation throws an exception. A derived class must
either implement apply, or else each runner must supply a custom
implementation via
PipelineRunner.apply(com.google.cloud.dataflow.sdk.transforms.PTransform<InputT, OutputT>, InputT).
Overrides:
apply in class PTransform<PInput,PCollection<T>>

protected Coder<T> getDefaultOutputCoder()
Description copied from class: PTransform
Returns the default Coder to use for the output of this
single-output PTransform.
By default, always throws
Overrides:
getDefaultOutputCoder in class PTransform<PInput,PCollection<T>>

public String getFilepattern()
public boolean needsValidation()
public TextIO.CompressionType getCompressionType()