public abstract static class AvroIO.TypedWrite<UserT,DestinationT,OutputT>
extends org.apache.beam.sdk.transforms.PTransform<org.apache.beam.sdk.values.PCollection<UserT>,org.apache.beam.sdk.io.WriteFilesResult<DestinationT>>
Implementation of AvroIO.write(java.lang.Class<T>).
Constructor and Description |
---|
TypedWrite() |
Modifier and Type | Method and Description |
---|---|
org.apache.beam.sdk.io.WriteFilesResult<DestinationT> | expand(org.apache.beam.sdk.values.PCollection<UserT> input) |
void | populateDisplayData(org.apache.beam.sdk.transforms.display.DisplayData.Builder builder) |
<NewDestinationT> AvroIO.TypedWrite<UserT,NewDestinationT,OutputT> | to(DynamicAvroDestinations<UserT,NewDestinationT,OutputT> dynamicDestinations) Deprecated. Use FileIO.write() or FileIO.writeDynamic() instead. |
AvroIO.TypedWrite<UserT,DestinationT,OutputT> | to(org.apache.beam.sdk.io.FileBasedSink.FilenamePolicy filenamePolicy) Writes to files named according to the given FileBasedSink.FilenamePolicy. |
AvroIO.TypedWrite<UserT,DestinationT,OutputT> | to(org.apache.beam.sdk.io.fs.ResourceId outputPrefix) Writes to file(s) with the given output prefix. |
AvroIO.TypedWrite<UserT,DestinationT,OutputT> | to(java.lang.String outputPrefix) Writes to file(s) with the given output prefix. |
AvroIO.TypedWrite<UserT,DestinationT,OutputT> | to(org.apache.beam.sdk.options.ValueProvider<java.lang.String> outputPrefix) Like to(String). |
AvroIO.TypedWrite<UserT,DestinationT,OutputT> | toResource(org.apache.beam.sdk.options.ValueProvider<org.apache.beam.sdk.io.fs.ResourceId> outputPrefix) Like to(ResourceId). |
AvroIO.TypedWrite<UserT,DestinationT,OutputT> | withBadRecordErrorHandler(org.apache.beam.sdk.transforms.errorhandling.ErrorHandler<org.apache.beam.sdk.transforms.errorhandling.BadRecord,?> errorHandler) See FileIO.Write.withBadRecordErrorHandler(ErrorHandler) for details on usage. |
AvroIO.TypedWrite<UserT,DestinationT,OutputT> | withCodec(org.apache.avro.file.CodecFactory codec) Writes to Avro file(s) compressed using the specified codec. |
AvroIO.TypedWrite<UserT,DestinationT,OutputT> | withDatumWriterFactory(AvroSink.DatumWriterFactory<OutputT> datumWriterFactory) Specifies an AvroSink.DatumWriterFactory to use for creating DatumWriter instances. |
AvroIO.TypedWrite<UserT,DestinationT,OutputT> | withFormatFunction(@Nullable org.apache.beam.sdk.transforms.SerializableFunction<UserT,OutputT> formatFunction) Specifies a format function to convert UserT to the output type. |
AvroIO.TypedWrite<UserT,DestinationT,OutputT> | withMetadata(java.util.Map<java.lang.String,java.lang.Object> metadata) Writes to Avro file(s) with the specified metadata. |
AvroIO.TypedWrite<UserT,DestinationT,OutputT> | withNoSpilling() See WriteFiles.withNoSpilling(). |
AvroIO.TypedWrite<UserT,DestinationT,OutputT> | withNumShards(int numShards) Configures the number of output shards produced overall (when using unwindowed writes) or per-window (when using windowed writes). |
AvroIO.TypedWrite<UserT,DestinationT,OutputT> | withoutSharding() Forces a single file as output and empty shard name template. |
AvroIO.TypedWrite<UserT,DestinationT,OutputT> | withSchema(org.apache.avro.Schema schema) Sets the output schema. |
AvroIO.TypedWrite<UserT,DestinationT,OutputT> | withShardNameTemplate(java.lang.String shardTemplate) Uses the given ShardNameTemplate for naming output files. |
AvroIO.TypedWrite<UserT,DestinationT,OutputT> | withSuffix(java.lang.String filenameSuffix) Configures the filename suffix for written files. |
AvroIO.TypedWrite<UserT,DestinationT,OutputT> | withSyncInterval(int syncInterval) Sets the approximate number of uncompressed bytes to write in each block for the AVRO container format. |
AvroIO.TypedWrite<UserT,DestinationT,OutputT> | withTempDirectory(org.apache.beam.sdk.io.fs.ResourceId tempDirectory) Sets the base directory used to generate temporary files. |
AvroIO.TypedWrite<UserT,DestinationT,OutputT> | withTempDirectory(org.apache.beam.sdk.options.ValueProvider<org.apache.beam.sdk.io.fs.ResourceId> tempDirectory) Sets the base directory used to generate temporary files. |
AvroIO.TypedWrite<UserT,DestinationT,OutputT> | withWindowedWrites() Preserves windowing of input elements and writes them to files based on the element's window. |
Methods inherited from class org.apache.beam.sdk.transforms.PTransform:
addAnnotation, compose, compose, getAdditionalInputs, getAnnotations, getDefaultOutputCoder, getDefaultOutputCoder, getDefaultOutputCoder, getKindString, getName, getResourceHints, setDisplayData, setResourceHints, toString, validate, validate
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> to(java.lang.String outputPrefix)
Writes to file(s) with the given output prefix. See FileSystems for information on supported file systems.
The name of the output files will be determined by the FileBasedSink.FilenamePolicy used.
By default, a DefaultFilenamePolicy will build output filenames using the specified prefix, a shard name template (see withShardNameTemplate(String)), and a common suffix (if supplied using withSuffix(String)). This default can be overridden using to(FilenamePolicy).
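The prefix-based configuration described above can be sketched as follows. This is only an illustration under stated assumptions, not a reference pipeline: the class name `WriteWithPrefixSketch`, the `Event` schema, the sample elements, and the output prefix `/tmp/avro-out/events` are all hypothetical, and the import path `org.apache.beam.sdk.extensions.avro.io.AvroIO` assumes a Beam release that ships AvroIO in the Avro extension module. The `AvroIO.writeCustomTypeToGenericRecords()` factory is not mentioned on this page; it is used here because it returns a TypedWrite.

```java
import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.generic.GenericRecordBuilder;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.extensions.avro.io.AvroIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;

public class WriteWithPrefixSketch {
  // Hypothetical one-field schema, kept in a static field so the format
  // function does not need to capture (and serialize) a Schema instance.
  private static final Schema SCHEMA =
      SchemaBuilder.record("Event").fields().requiredString("id").endRecord();

  // Format function: converts the user type (String) to the output type (GenericRecord).
  static GenericRecord toRecord(String id) {
    return new GenericRecordBuilder(SCHEMA).set("id", id).build();
  }

  public static void main(String[] args) {
    Pipeline pipeline = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    pipeline
        .apply(Create.of("a1", "b2", "c3"))
        .apply(
            AvroIO.<String>writeCustomTypeToGenericRecords()
                .withFormatFunction(WriteWithPrefixSketch::toRecord)
                .withSchema(SCHEMA)
                .to("/tmp/avro-out/events") // output prefix (hypothetical path)
                .withSuffix(".avro"));      // common suffix appended by the default policy

    pipeline.run().waitUntilFinish();
  }
}
```

Because no FilenamePolicy is supplied, the default policy combines the prefix, the shard name template, and the suffix; the exact file names depend on the number of shards the runner chooses.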
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> to(org.apache.beam.sdk.io.fs.ResourceId outputPrefix)
Writes to file(s) with the given output prefix. See FileSystems for information on supported file systems. This prefix is used by the DefaultFilenamePolicy to generate filenames.
By default, a DefaultFilenamePolicy will build output filenames using the specified prefix, a shard name template (see withShardNameTemplate(String)), and a common suffix (if supplied using withSuffix(String)).
This default policy can be overridden using to(FilenamePolicy), in which case withShardNameTemplate(String) and withSuffix(String) should not be set. Custom filename policies do not automatically see this prefix; explicitly pass the prefix into your FileBasedSink.FilenamePolicy object if you need it.
If withTempDirectory(org.apache.beam.sdk.options.ValueProvider<org.apache.beam.sdk.io.fs.ResourceId>) has not been called, this filename prefix will be used to infer a directory for temporary files.
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> to(org.apache.beam.sdk.options.ValueProvider<java.lang.String> outputPrefix)
Like to(String).
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> toResource(org.apache.beam.sdk.options.ValueProvider<org.apache.beam.sdk.io.fs.ResourceId> outputPrefix)
Like to(ResourceId).
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> to(org.apache.beam.sdk.io.FileBasedSink.FilenamePolicy filenamePolicy)
Writes to files named according to the given FileBasedSink.FilenamePolicy. A directory for temporary files must be specified using withTempDirectory(org.apache.beam.sdk.options.ValueProvider<org.apache.beam.sdk.io.fs.ResourceId>).
@Deprecated public <NewDestinationT> AvroIO.TypedWrite<UserT,NewDestinationT,OutputT> to(DynamicAvroDestinations<UserT,NewDestinationT,OutputT> dynamicDestinations)
Deprecated. Use FileIO.write() or FileIO.writeDynamic() instead.
Use a DynamicAvroDestinations object to vend FileBasedSink.FilenamePolicy objects. These objects can examine the input record when creating a FileBasedSink.FilenamePolicy. A directory for temporary files must be specified using withTempDirectory(org.apache.beam.sdk.options.ValueProvider<org.apache.beam.sdk.io.fs.ResourceId>).
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> withSyncInterval(int syncInterval)
Sets the approximate number of uncompressed bytes to write in each block for the AVRO container format.
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> withSchema(org.apache.avro.Schema schema)
Sets the output schema. Can only be used when the output type is GenericRecord and when not using to(DynamicAvroDestinations).
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> withFormatFunction(@Nullable org.apache.beam.sdk.transforms.SerializableFunction<UserT,OutputT> formatFunction)
Specifies a format function to convert UserT to the output type. If to(DynamicAvroDestinations) is used, FileBasedSink.DynamicDestinations.formatRecord(UserT) must be used instead.
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> withTempDirectory(org.apache.beam.sdk.options.ValueProvider<org.apache.beam.sdk.io.fs.ResourceId> tempDirectory)
Sets the base directory used to generate temporary files.
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> withTempDirectory(org.apache.beam.sdk.io.fs.ResourceId tempDirectory)
Sets the base directory used to generate temporary files.
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> withShardNameTemplate(java.lang.String shardTemplate)
Uses the given ShardNameTemplate for naming output files. This option may only be used with one of the default filename-prefix to() overloads.
See DefaultFilenamePolicy for how the prefix, shard name template, and suffix are used.
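To make the interplay of prefix, shard name template, and suffix concrete, here is a minimal JDK-only sketch. It imitates the placeholder convention in which a run of `S` characters stands for the zero-padded shard index and a run of `N` characters for the zero-padded shard count. It is not Beam's DefaultFilenamePolicy code: the class `ShardNameSketch` and the method `buildName` are hypothetical names, and window and pane placeholders are deliberately ignored.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ShardNameSketch {
  // Runs of 'S' stand for the shard index, runs of 'N' for the total shard count.
  private static final Pattern PLACEHOLDER = Pattern.compile("S+|N+");

  /** Builds prefix + expanded template + suffix; each placeholder is zero-padded to its run length. */
  public static String buildName(
      String prefix, String template, String suffix, int shard, int numShards) {
    Matcher matcher = PLACEHOLDER.matcher(template);
    StringBuilder name = new StringBuilder(prefix);
    int last = 0;
    while (matcher.find()) {
      // Copy the literal text between placeholders unchanged.
      name.append(template, last, matcher.start());
      int value = matcher.group().charAt(0) == 'S' ? shard : numShards;
      name.append(String.format("%0" + matcher.group().length() + "d", value));
      last = matcher.end();
    }
    name.append(template, last, template.length());
    return name.append(suffix).toString();
  }

  public static void main(String[] args) {
    // prints "out-00003-of-00010.avro"
    System.out.println(buildName("out", "-SSSSS-of-NNNNN", ".avro", 3, 10));
    // An empty template leaves only prefix and suffix: prints "out.avro"
    System.out.println(buildName("out", "", ".avro", 0, 1));
  }
}
```

The second call mirrors the single-file case: with one shard and an empty template, the file name is just the prefix followed by the suffix.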
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> withSuffix(java.lang.String filenameSuffix)
Configures the filename suffix for written files.
See DefaultFilenamePolicy for how the prefix, shard name template, and suffix are used.
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> withNumShards(int numShards)
Configures the number of output shards produced overall (when using unwindowed writes) or per-window (when using windowed writes).
For unwindowed writes, constraining the number of shards is likely to reduce the performance of a pipeline. Setting this value is not recommended unless you require a specific number of output files.
Parameters: numShards - the number of shards to use, or 0 to let the system decide.
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> withoutSharding()
Forces a single file as output and empty shard name template.
For unwindowed writes, constraining the number of shards is likely to reduce the performance of a pipeline. Setting this value is not recommended unless you require a specific number of output files.
This is equivalent to .withNumShards(1).withShardNameTemplate("").
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> withWindowedWrites()
Preserves windowing of input elements and writes them to files based on the element's window.
If using to(FilenamePolicy), filenames will be generated using FileBasedSink.FilenamePolicy.windowedFilename(int, int, org.apache.beam.sdk.transforms.windowing.BoundedWindow, org.apache.beam.sdk.transforms.windowing.PaneInfo, org.apache.beam.sdk.io.FileBasedSink.OutputFileHints). See also WriteFiles.withWindowedWrites().
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> withNoSpilling()
See WriteFiles.withNoSpilling().
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> withCodec(org.apache.avro.file.CodecFactory codec)
Writes to Avro file(s) compressed using the specified codec.
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> withDatumWriterFactory(AvroSink.DatumWriterFactory<OutputT> datumWriterFactory)
Specifies an AvroSink.DatumWriterFactory to use for creating DatumWriter instances.
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> withMetadata(java.util.Map<java.lang.String,java.lang.Object> metadata)
Writes to Avro file(s) with the specified metadata.
Supported value types are String, Long, and byte[].
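The value-type restriction can be checked before a metadata map is handed to withMetadata. The following JDK-only sketch shows one way to do that; `MetadataSketch`, `isSupported`, and `check` are hypothetical helper names, the sample keys and values are invented, and Beam's own validation logic and error messages are not reproduced here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MetadataSketch {
  /** Returns true when the value is a String, a Long, or a byte[]. */
  public static boolean isSupported(Object value) {
    return value instanceof String || value instanceof Long || value instanceof byte[];
  }

  /** Throws IllegalArgumentException naming the first key whose value type is not supported. */
  public static void check(Map<String, Object> metadata) {
    for (Map.Entry<String, Object> entry : metadata.entrySet()) {
      if (!isSupported(entry.getValue())) {
        throw new IllegalArgumentException(
            "Unsupported metadata value type for key: " + entry.getKey());
      }
    }
  }

  public static void main(String[] args) {
    Map<String, Object> metadata = new LinkedHashMap<>();
    metadata.put("pipeline", "nightly-export");        // String
    metadata.put("recordCount", 42L);                   // Long (note the L suffix)
    metadata.put("checksum", new byte[] {0x01, 0x02});  // byte[]
    check(metadata); // passes: all three value types are in the supported set

    metadata.put("version", 7); // autoboxes to Integer, which is not in the supported set
    try {
      check(metadata);
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

A common pitfall this highlights is an int literal autoboxing to Integer rather than Long; writing `7L` instead keeps the value within the supported types.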
public AvroIO.TypedWrite<UserT,DestinationT,OutputT> withBadRecordErrorHandler(org.apache.beam.sdk.transforms.errorhandling.ErrorHandler<org.apache.beam.sdk.transforms.errorhandling.BadRecord,?> errorHandler)
See FileIO.Write.withBadRecordErrorHandler(ErrorHandler) for details on usage.
public org.apache.beam.sdk.io.WriteFilesResult<DestinationT> expand(org.apache.beam.sdk.values.PCollection<UserT> input)
Specified by: expand in class org.apache.beam.sdk.transforms.PTransform<org.apache.beam.sdk.values.PCollection<UserT>,org.apache.beam.sdk.io.WriteFilesResult<DestinationT>>
public void populateDisplayData(org.apache.beam.sdk.transforms.display.DisplayData.Builder builder)
Specified by: populateDisplayData in interface org.apache.beam.sdk.transforms.display.HasDisplayData
Overrides: populateDisplayData in class org.apache.beam.sdk.transforms.PTransform<org.apache.beam.sdk.values.PCollection<UserT>,org.apache.beam.sdk.io.WriteFilesResult<DestinationT>>