T - the type of each of the elements of the input PCollection
public static class AvroIO.Write.Bound<T> extends PTransform<PCollection<T>,PDone>
A PTransform that writes a bounded PCollection to an Avro file (or multiple Avro files matching a sharding pattern).
Fields inherited from class PTransform: name
Modifier and Type | Method and Description |
---|---|
PDone | apply(PCollection<T> input): Applies this PTransform on the given InputT, and returns its Output. |
protected Coder<Void> | getDefaultOutputCoder(): Returns the default Coder to use for the output of this single-output PTransform. |
String | getFilenamePrefix() |
String | getFilenameSuffix() |
int | getNumShards() |
Schema | getSchema() |
String | getShardNameTemplate(): Returns the current shard name template string. |
String | getShardTemplate() |
Class<T> | getType() |
AvroIO.Write.Bound<T> | named(String name): Returns a new PTransform that's like this one but with the given step name. |
boolean | needsValidation() |
void | populateDisplayData(DisplayData.Builder builder): Registers display data for the given transform or component. |
AvroIO.Write.Bound<T> | to(String filenamePrefix): Returns a new PTransform that's like this one but that writes to the file(s) with the given filename prefix. |
AvroIO.Write.Bound<T> | withNumShards(int numShards): Returns a new PTransform that's like this one but that uses the provided shard count. |
AvroIO.Write.Bound<T> | withoutSharding(): Returns a new PTransform that's like this one but that forces a single file as output. |
AvroIO.Write.Bound<T> | withoutValidation(): Returns a new PTransform that's like this one but that has GCS output path validation on pipeline creation disabled. |
<X> AvroIO.Write.Bound<X> | withSchema(Class<X> type): Returns a new PTransform that's like this one but that writes to Avro file(s) containing records whose type is the specified Avro-generated class. |
AvroIO.Write.Bound<GenericRecord> | withSchema(Schema schema): Returns a new PTransform that's like this one but that writes to Avro file(s) containing records of the specified schema. |
AvroIO.Write.Bound<GenericRecord> | withSchema(String schema): Returns a new PTransform that's like this one but that writes to Avro file(s) containing records of the specified schema in a JSON-encoded string form. |
AvroIO.Write.Bound<T> | withShardNameTemplate(String shardTemplate): Returns a new PTransform that's like this one but that uses the given shard name template. |
AvroIO.Write.Bound<T> | withSuffix(String filenameSuffix): Returns a new PTransform that's like this one but that writes to the file(s) with the given filename suffix. |
Methods inherited from class PTransform: getDefaultOutputCoder, getDefaultOutputCoder, getKindString, getName, toString, validate
public AvroIO.Write.Bound<T> named(String name)
Returns a new PTransform that's like this one but with the given step name.
Does not modify this object.
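The "returns a new PTransform ... Does not modify this object" contract used throughout this class can be pictured with a small immutable-configuration sketch. The class WriteConfigSketch below is a hypothetical stand-in written for illustration only; it is not the SDK's AvroIO.Write.Bound and makes no claim about how the SDK implements these methods.

```java
// Hypothetical stand-in for illustration; NOT the SDK's AvroIO.Write.Bound.
final class WriteConfigSketch {
  private final String name;
  private final String filenamePrefix;
  private final String filenameSuffix;

  WriteConfigSketch(String name, String filenamePrefix, String filenameSuffix) {
    this.name = name;
    this.filenamePrefix = filenamePrefix;
    this.filenameSuffix = filenameSuffix;
  }

  // Each "with"-style method copies the current settings into a new
  // instance and changes exactly one of them; the receiver is untouched.
  WriteConfigSketch named(String newName) {
    return new WriteConfigSketch(newName, filenamePrefix, filenameSuffix);
  }

  WriteConfigSketch to(String newPrefix) {
    return new WriteConfigSketch(name, newPrefix, filenameSuffix);
  }

  WriteConfigSketch withSuffix(String newSuffix) {
    return new WriteConfigSketch(name, filenamePrefix, newSuffix);
  }

  String getName() { return name; }
  String getFilenamePrefix() { return filenamePrefix; }
  String getFilenameSuffix() { return filenameSuffix; }

  public static void main(String[] args) {
    WriteConfigSketch base = new WriteConfigSketch("Write", null, "");
    WriteConfigSketch derived = base.named("WriteUsers").to("/tmp/users");
    // The base object keeps its original settings after the chained calls.
    System.out.println(base.getName() + " / " + base.getFilenamePrefix());
    System.out.println(derived.getName() + " / " + derived.getFilenamePrefix());
  }
}
```

Because every call yields a fresh object, a partially configured transform can be shared and specialized in several places without one use affecting another.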
public AvroIO.Write.Bound<T> to(String filenamePrefix)
Returns a new PTransform that's like this one but that writes to the file(s) with the given filename prefix.
See AvroIO.Write.to(String) for more information about filenames.
Does not modify this object.
public AvroIO.Write.Bound<T> withSuffix(String filenameSuffix)
Returns a new PTransform that's like this one but that writes to the file(s) with the given filename suffix.
See ShardNameTemplate for a description of shard templates.
Does not modify this object.
public AvroIO.Write.Bound<T> withNumShards(int numShards)
Returns a new PTransform that's like this one but that uses the provided shard count.
Constraining the number of shards is likely to reduce the performance of a pipeline. Setting this value is not recommended unless you require a specific number of output files.
Does not modify this object.
Parameters: numShards - the number of shards to use, or 0 to let the system decide.
See also: ShardNameTemplate
public AvroIO.Write.Bound<T> withShardNameTemplate(String shardTemplate)
Returns a new PTransform that's like this one but that uses the given shard name template.
Does not modify this object.
See also: ShardNameTemplate
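As a rough illustration of how a filename prefix, a shard name template, and a filename suffix might combine into output file names, the sketch below assumes that each run of S in the template is replaced by the zero-padded shard index and each run of N by the zero-padded shard count. The class ShardNameSketch and its constructName method are hypothetical names introduced here, and the template "-SSSSS-of-NNNNN" is only sample input; ShardNameTemplate remains the authoritative description of the rules.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the SDK.
final class ShardNameSketch {
  // A run of 'S' stands for the shard index, a run of 'N' for the shard count
  // (assumed convention; see ShardNameTemplate for the real rules).
  private static final Pattern RUN = Pattern.compile("S+|N+");

  static String constructName(
      String prefix, String shardTemplate, String suffix, int shardNum, int numShards) {
    StringBuffer sb = new StringBuffer(prefix);
    Matcher m = RUN.matcher(shardTemplate);
    while (m.find()) {
      int value = m.group().charAt(0) == 'S' ? shardNum : numShards;
      // Zero-pad the number to the length of the placeholder run.
      String padded = String.format("%0" + m.group().length() + "d", value);
      m.appendReplacement(sb, Matcher.quoteReplacement(padded));
    }
    m.appendTail(sb); // copy any literal text after the last placeholder
    return sb.append(suffix).toString();
  }

  public static void main(String[] args) {
    for (int shard = 0; shard < 3; shard++) {
      System.out.println(constructName("output", "-SSSSS-of-NNNNN", ".avro", shard, 3));
    }
  }
}
```

Under the same assumption, an empty template contributes nothing to the name, so a single shard is written to the prefix followed directly by the suffix, which is consistent with withoutSharding() pairing a shard count of 1 with an empty template.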
public AvroIO.Write.Bound<T> withoutSharding()
Returns a new PTransform that's like this one but that forces a single file as output.
This is a shortcut for .withNumShards(1).withShardNameTemplate("").
Does not modify this object.
public <X> AvroIO.Write.Bound<X> withSchema(Class<X> type)
Returns a new PTransform that's like this one but that writes to Avro file(s) containing records whose type is the specified Avro-generated class.
Does not modify this object.
Type Parameters: X - the type of the elements of the input PCollection
public AvroIO.Write.Bound<GenericRecord> withSchema(Schema schema)
Returns a new PTransform that's like this one but that writes to Avro file(s) containing records of the specified schema.
Does not modify this object.
public AvroIO.Write.Bound<GenericRecord> withSchema(String schema)
Returns a new PTransform that's like this one but that writes to Avro file(s) containing records of the specified schema in a JSON-encoded string form.
Does not modify this object.
public AvroIO.Write.Bound<T> withoutValidation()
Returns a new PTransform that's like this one but that has GCS output path validation on pipeline creation disabled.
Does not modify this object.
This can be useful in the case where the GCS output location does not exist at the pipeline creation time, but is expected to be available at execution time.
public PDone apply(PCollection<T> input)
Description copied from class: PTransform
Applies this PTransform on the given InputT, and returns its Output.
Composite transforms, which are defined in terms of other transforms, should return the output of one of the composed transforms. Non-composite transforms, which do not apply any transforms internally, should return a new unbound output and register evaluators (via backend-specific registration methods).
The default implementation throws an exception. A derived class must either implement apply, or else each runner must supply a custom implementation via PipelineRunner.apply(com.google.cloud.dataflow.sdk.transforms.PTransform<InputT, OutputT>, InputT).
Overrides: apply in class PTransform<PCollection<T>,PDone>
public void populateDisplayData(DisplayData.Builder builder)
Description copied from class: PTransform
Registers display data for the given transform or component.
populateDisplayData(DisplayData.Builder) is invoked by Pipeline runners to collect display data via DisplayData.from(HasDisplayData). Implementations may call super.populateDisplayData(builder) in order to register display data in the current namespace, but should otherwise use subcomponent.populateDisplayData(builder) to use the namespace of the subcomponent.
By default, does not register any display data. Implementors may override this method to provide their own display data.
Specified by: populateDisplayData in interface HasDisplayData
Overrides: populateDisplayData in class PTransform<PCollection<T>,PDone>
Parameters: builder - The builder to populate with display data.
See also: HasDisplayData
public String getShardNameTemplate()
protected Coder<Void> getDefaultOutputCoder()
Description copied from class: PTransform
Returns the default Coder to use for the output of this single-output PTransform.
By default, always throws
Overrides: getDefaultOutputCoder in class PTransform<PCollection<T>,PDone>
public String getFilenamePrefix()
public String getShardTemplate()
public int getNumShards()
public String getFilenameSuffix()
public Schema getSchema()
public boolean needsValidation()