Type Parameters:
T - the type of each of the elements of the resulting PCollection
public static class AvroIO.Read.Bound<T> extends PTransform<PInput,PCollection<T>>
A PTransform that reads from an Avro file (or multiple Avro files matching a pattern) and returns a bounded PCollection containing the decoding of each record.
Fields inherited from class PTransform: name
Modifier and Type | Method and Description |
---|---|
PCollection<T> | apply(PInput input): Applies this PTransform on the given InputT, and returns its Output. |
AvroIO.Read.Bound<T> | from(String filepattern): Returns a new PTransform that's like this one but that reads from the file(s) with the given name or pattern. |
protected Coder<T> | getDefaultOutputCoder(): Returns the default Coder to use for the output of this single-output PTransform. |
String | getFilepattern() |
Schema | getSchema() |
AvroIO.Read.Bound<T> | named(String name): Returns a new PTransform that's like this one but with the given step name. |
boolean | needsValidation() |
void | populateDisplayData(DisplayData.Builder builder): Register display data for the given transform or component. |
AvroIO.Read.Bound<T> | withoutValidation(): Returns a new PTransform that's like this one but that has GCS input path validation on pipeline creation disabled. |
<X> AvroIO.Read.Bound<X> | withSchema(Class<X> type): Returns a new PTransform that's like this one but that reads Avro file(s) containing records whose type is the specified Avro-generated class. |
AvroIO.Read.Bound<GenericRecord> | withSchema(Schema schema): Returns a new PTransform that's like this one but that reads Avro file(s) containing records of the specified schema. |
AvroIO.Read.Bound<GenericRecord> | withSchema(String schema): Returns a new PTransform that's like this one but that reads Avro file(s) containing records of the specified schema in a JSON-encoded string form. |
Methods inherited from class PTransform: getDefaultOutputCoder, getDefaultOutputCoder, getKindString, getName, toString, validate
public AvroIO.Read.Bound<T> named(String name)
Returns a new PTransform that's like this one but with the given step name.
Does not modify this object.
public AvroIO.Read.Bound<T> from(String filepattern)
Returns a new PTransform that's like this one but that reads from the file(s) with the given name or pattern. (See AvroIO.Read.from(java.lang.String) for a description of filepatterns.)
Does not modify this object.
public <X> AvroIO.Read.Bound<X> withSchema(Class<X> type)
Returns a new PTransform that's like this one but that reads Avro file(s) containing records whose type is the specified Avro-generated class.
Does not modify this object.
Type Parameters:
X - the type of the decoded elements and the elements of the resulting PCollection
public AvroIO.Read.Bound<GenericRecord> withSchema(Schema schema)
Returns a new PTransform that's like this one but that reads Avro file(s) containing records of the specified schema.
Does not modify this object.
public AvroIO.Read.Bound<GenericRecord> withSchema(String schema)
Returns a new PTransform that's like this one but that reads Avro file(s) containing records of the specified schema in a JSON-encoded string form.
Does not modify this object.
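To make "a JSON-encoded string form" concrete, the following is a minimal sketch of a record schema written as a Java string. The record name `User`, its namespace, and its fields are hypothetical and chosen only for illustration; the block uses only the JDK and does not parse the schema itself.

```java
// Illustration only: the schema contents below are hypothetical.
class SchemaJsonExample {
    // An Avro record schema with one required string field and one nullable int field.
    static final String USER_SCHEMA_JSON =
        "{"
        + "\"type\": \"record\", "
        + "\"name\": \"User\", "
        + "\"namespace\": \"example.avro\", "
        + "\"fields\": ["
        + "{\"name\": \"name\", \"type\": \"string\"}, "
        + "{\"name\": \"favorite_number\", \"type\": [\"null\", \"int\"]}"
        + "]"
        + "}";

    public static void main(String[] args) {
        System.out.println(USER_SCHEMA_JSON);
        // With the SDK on the classpath, a string like this would presumably be
        // passed as: AvroIO.Read.from("<filepattern>").withSchema(USER_SCHEMA_JSON)
    }
}
```

Because this overload yields AvroIO.Read.Bound<GenericRecord>, the decoded elements would be generic records rather than instances of an Avro-generated class.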
public AvroIO.Read.Bound<T> withoutValidation()
Returns a new PTransform that's like this one but that has GCS input path validation on pipeline creation disabled.
Does not modify this object.
This is useful when the GCS input location does not exist at pipeline creation time but is expected to be available at execution time.
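The configuration methods above each return a new object and leave the receiver unchanged. The sketch below illustrates that pattern in plain Java. `ReadSpec` is a hypothetical stand-in, not an SDK class, and the file pattern and step name are made up for the example.

```java
// Hypothetical stand-in for a read transform's configuration; not an SDK class.
final class ReadSpec {
    private final String name;         // step name, null until set
    private final String filepattern;  // file name or pattern, null until set
    private final boolean validate;    // whether input path validation is enabled

    ReadSpec() {
        this(null, null, true);
    }

    private ReadSpec(String name, String filepattern, boolean validate) {
        this.name = name;
        this.filepattern = filepattern;
        this.validate = validate;
    }

    // Each method copies the current settings into a new object and changes one field.
    ReadSpec named(String newName) {
        return new ReadSpec(newName, filepattern, validate);
    }

    ReadSpec from(String newFilepattern) {
        return new ReadSpec(name, newFilepattern, validate);
    }

    ReadSpec withoutValidation() {
        return new ReadSpec(name, filepattern, false);
    }

    String getName() { return name; }
    String getFilepattern() { return filepattern; }
    boolean needsValidation() { return validate; }
}

class ReadSpecDemo {
    public static void main(String[] args) {
        ReadSpec base = new ReadSpec().from("/data/input-*.avro");
        ReadSpec relaxed = base.named("ReadEvents").withoutValidation();
        // base keeps its original settings; relaxed carries the changes.
        System.out.println(base.needsValidation());
        System.out.println(relaxed.needsValidation());
        System.out.println(relaxed.getFilepattern());
    }
}
```

Since no call mutates its receiver, a partially configured object such as `base` can be reused safely as the starting point for several variants.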
public PCollection<T> apply(PInput input)
Description copied from class: PTransform
Applies this PTransform on the given InputT, and returns its Output.
Composite transforms, which are defined in terms of other transforms, should return the output of one of the composed transforms. Non-composite transforms, which do not apply any transforms internally, should return a new unbound output and register evaluators (via backend-specific registration methods).
The default implementation throws an exception. A derived class must either implement apply, or else each runner must supply a custom implementation via PipelineRunner.apply(com.google.cloud.dataflow.sdk.transforms.PTransform<InputT, OutputT>, InputT).
Overrides:
apply in class PTransform<PInput,PCollection<T>>
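The contract described above (a default apply that throws, derived classes that override it, and composites that return the output of a composed transform) can be sketched in plain Java. All classes below are hypothetical stand-ins rather than SDK types, and the choice of `UnsupportedOperationException` is an assumption; the text above only says that an exception is thrown.

```java
import java.util.Locale;

// Hypothetical base class: apply throws unless a subclass overrides it.
abstract class SketchTransform<I, O> {
    O apply(I input) {
        // Assumption: the concrete exception type is not stated in the text above.
        throw new UnsupportedOperationException(
            getClass().getSimpleName() + " does not implement apply");
    }
}

// A derived transform that implements apply itself.
class UpperCase extends SketchTransform<String, String> {
    @Override
    String apply(String input) {
        return input.toUpperCase(Locale.ROOT);
    }
}

// A composite transform: defined via another transform, returning that transform's output.
class Shout extends SketchTransform<String, String> {
    @Override
    String apply(String input) {
        return new UpperCase().apply(input + "!");
    }
}

// A derived transform that relies on the default and therefore fails when applied.
class NotImplemented extends SketchTransform<String, String> {}

class SketchTransformDemo {
    public static void main(String[] args) {
        System.out.println(new Shout().apply("avro"));
        try {
            new NotImplemented().apply("avro");
        } catch (UnsupportedOperationException e) {
            System.out.println("default apply threw: " + e.getMessage());
        }
    }
}
```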
public void populateDisplayData(DisplayData.Builder builder)
Description copied from class: PTransform
Register display data for the given transform or component.
populateDisplayData(DisplayData.Builder) is invoked by Pipeline runners to collect display data via DisplayData.from(HasDisplayData). Implementations may call super.populateDisplayData(builder) in order to register display data in the current namespace, but should otherwise use subcomponent.populateDisplayData(builder) to use the namespace of the subcomponent.
By default, does not register any display data. Implementors may override this method to provide their own display data.
Specified by:
populateDisplayData in interface HasDisplayData
Overrides:
populateDisplayData in class PTransform<PInput,PCollection<T>>
Parameters:
builder - The builder to populate with display data.
See Also:
HasDisplayData
protected Coder<T> getDefaultOutputCoder()
Description copied from class: PTransform
Returns the default Coder to use for the output of this single-output PTransform.
By default, always throws.
Overrides:
getDefaultOutputCoder in class PTransform<PInput,PCollection<T>>
public String getFilepattern()
public Schema getSchema()
public boolean needsValidation()