public static class TextIO.Write extends Object
A PTransform that writes a PCollection to a text file (or
multiple text files matching a sharding pattern), with each
element of the input collection encoded into its own line.

Modifier and Type | Class and Description |
---|---|
static class |
TextIO.Write.Bound<T>
A PTransform that writes a bounded PCollection to a text file (or
multiple text files matching a sharding pattern), with each
PCollection element being encoded into its own line.
|
Constructor and Description |
---|
Write() |
Modifier and Type | Method and Description |
---|---|
static TextIO.Write.Bound<String> |
named(String name)
Returns a transform for writing to text files with the given step name.
|
static TextIO.Write.Bound<String> |
to(String prefix)
Returns a transform for writing to text files that writes to the file(s)
with the given prefix.
|
static <T> TextIO.Write.Bound<T> |
withCoder(Coder<T> coder)
Returns a transform for writing to text files that uses the given
Coder to encode each of the elements of the input
PCollection into an output text line.
|
static TextIO.Write.Bound<String> |
withFooter(String footer)
Returns a transform for writing to text files that adds a footer string to the files
it writes.
|
static TextIO.Write.Bound<String> |
withHeader(String header)
Returns a transform for writing to text files that adds a header string to the files
it writes.
|
static TextIO.Write.Bound<String> |
withNumShards(int numShards)
Returns a transform for writing to text files that uses the provided shard count.
|
static TextIO.Write.Bound<String> |
withoutSharding()
Returns a transform for writing to text files that forces a single file as
output.
|
static TextIO.Write.Bound<String> |
withoutValidation()
Returns a transform for writing to text files that has GCS path validation on
pipeline creation disabled.
|
static TextIO.Write.Bound<String> |
withShardNameTemplate(String shardTemplate)
Returns a transform for writing to text files that uses the given shard name
template.
|
static TextIO.Write.Bound<String> |
withSuffix(String nameExtension)
Returns a transform for writing to text files that appends the specified suffix
to the created files.
|
public static TextIO.Write.Bound<String> named(String name)
public static TextIO.Write.Bound<String> to(String prefix)
Returns a transform for writing to text files that writes to the file(s)
with the given prefix. The prefix can be a Google Cloud Storage path of the form
"gs://<bucket>/<filepath>"
(if running locally or via the Google Cloud Dataflow service).
The files written will begin with this prefix, followed by
a shard identifier (see TextIO.Write.Bound.withNumShards(int)), and end
in a common extension, if given by TextIO.Write.Bound.withSuffix(String).
public static TextIO.Write.Bound<String> withSuffix(String nameExtension)
public static TextIO.Write.Bound<String> withNumShards(int numShards)
Constraining the number of shards is likely to reduce the performance of a pipeline. Setting this value is not recommended unless you require a specific number of output files.
numShards - the number of shards to use, or 0 to let the system decide.
public static TextIO.Write.Bound<String> withShardNameTemplate(String shardTemplate)
See ShardNameTemplate for a description of shard templates.
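As a rough illustration of how a prefix, a shard identifier, and a suffix could combine into output file names, the sketch below expands a shard template in plain Java. It does not call the SDK. The class ShardNameSketch, the template string "-SSSSS-of-NNNNN", the reading of runs of S and N as the zero-padded shard index and shard count, and the bucket name are all assumptions made for illustration; ShardNameTemplate remains the authoritative description.

```java
public final class ShardNameSketch {
    // Assumed example template; consult ShardNameTemplate for the real syntax and default.
    public static final String ASSUMED_TEMPLATE = "-SSSSS-of-NNNNN";

    /**
     * Replaces each run of 'S' with the zero-padded shard index and each run
     * of 'N' with the zero-padded shard count. All other characters are copied.
     */
    public static String expandTemplate(String template, int shardIndex, int numShards) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < template.length()) {
            char c = template.charAt(i);
            if (c == 'S' || c == 'N') {
                // Measure the run length; it determines the padding width.
                int j = i;
                while (j < template.length() && template.charAt(j) == c) {
                    j++;
                }
                int value = (c == 'S') ? shardIndex : numShards;
                out.append(String.format("%0" + (j - i) + "d", value));
                i = j;
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    /** Builds a file name as prefix + expanded template + suffix. */
    public static String fileName(String prefix, String template, String suffix,
                                  int shardIndex, int numShards) {
        return prefix + expandTemplate(template, shardIndex, numShards) + suffix;
    }

    public static void main(String[] args) {
        // "gs://my-bucket/output" is a hypothetical prefix.
        for (int shard = 0; shard < 3; shard++) {
            System.out.println(
                fileName("gs://my-bucket/output", ASSUMED_TEMPLATE, ".txt", shard, 3));
        }
    }
}
```

Under these assumptions, the first of three shards would be named gs://my-bucket/output-00000-of-00003.txt.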
public static TextIO.Write.Bound<String> withoutSharding()
public static <T> TextIO.Write.Bound<T> withCoder(Coder<T> coder)
Returns a transform for writing to text files that uses the given Coder to encode each of the elements of the input PCollection into an output text line.
By default, uses StringUtf8Coder, which writes input Java strings directly as output lines.
T - the type of the elements of the input PCollection
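To make the "one element per line" encoding concrete, here is a minimal plain-Java sketch. It is not the SDK implementation: the class LineEncodingSketch is hypothetical, a java.util.function.Function stands in for a Coder, and terminating each element with a single '\n' is an assumption.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public final class LineEncodingSketch {
    /**
     * Encodes every element with the given encoder (a stand-in for a Coder)
     * and terminates each encoded element with a newline byte.
     */
    public static <T> byte[] encodeLines(List<T> elements,
                                         Function<? super T, byte[]> encoder) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (T element : elements) {
            byte[] encoded = encoder.apply(element);
            out.write(encoded, 0, encoded.length);
            out.write('\n');
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // UTF-8 string encoding, analogous in spirit to the default described above.
        byte[] bytes = encodeLines(
            Arrays.asList("alpha", "beta"),
            s -> s.getBytes(StandardCharsets.UTF_8));
        System.out.print(new String(bytes, StandardCharsets.UTF_8));
    }
}
```

A non-string element type would supply its own encoder in the same position, which is the role withCoder(Coder<T>) plays for the real transform.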
public static TextIO.Write.Bound<String> withoutValidation()
This can be useful in the case where the GCS output location does not exist at the pipeline creation time, but is expected to be available at execution time.
public static TextIO.Write.Bound<String> withHeader(@Nullable String header)
A null value will clear any previously configured header.
header - the string to be added as file header
public static TextIO.Write.Bound<String> withFooter(@Nullable String footer)
A null value will clear any previously configured footer.
footer - the string to be added as file footer
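The following plain-Java sketch shows one plausible layout of a written file when a header and footer are configured. It is an illustration only: the class HeaderFooterSketch is hypothetical, and it assumes the header occupies the first line, the footer the last line, and that a null value means "no header" or "no footer".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public final class HeaderFooterSketch {
    /**
     * Returns the lines of one output file: optional header, then one line
     * per element, then optional footer. A null header or footer is skipped.
     */
    public static List<String> fileLines(String header, List<String> elements, String footer) {
        List<String> lines = new ArrayList<>();
        if (header != null) {
            lines.add(header);
        }
        lines.addAll(elements);
        if (footer != null) {
            lines.add(footer);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Header and footer strings here are made-up sample values.
        List<String> lines = fileLines("id,name", Arrays.asList("1,ada", "2,grace"), "# end");
        for (String line : lines) {
            System.out.println(line);
        }
    }
}
```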