public class PCollectionTuple extends java.lang.Object implements PInput, POutput
PCollectionTuple is an immutable tuple of heterogeneously-typed PCollections, "keyed" by TupleTags. A PCollectionTuple can be used as the input or output of a PTransform that takes or produces multiple PCollections of possibly different types, for instance a ParDo with side outputs.
PCollectionTuples can be created and accessed as follows:
PCollection<String> pc1 = ...;
PCollection<Integer> pc2 = ...;
PCollection<Iterable<String>> pc3 = ...;
// Create TupleTags for each of the PCollections to put in the
// PCollectionTuple (the type of the TupleTag enables tracking the
// static type of each of the PCollections in the PCollectionTuple):
TupleTag<String> tag1 = new TupleTag<>();
TupleTag<Integer> tag2 = new TupleTag<>();
TupleTag<Iterable<String>> tag3 = new TupleTag<>();
// Create a PCollectionTuple with three PCollections:
PCollectionTuple pcs =
    PCollectionTuple.of(tag1, pc1)
                    .and(tag2, pc2)
                    .and(tag3, pc3);
// Create an empty PCollectionTuple:
Pipeline p = ...;
PCollectionTuple pcs2 = PCollectionTuple.empty(p);
// Get PCollections out of a PCollectionTuple, using the same tags
// that were used to put them in:
PCollection<Integer> pcX = pcs.get(tag2);
PCollection<String> pcY = pcs.get(tag1);
PCollection<Iterable<String>> pcZ = pcs.get(tag3);
// Get a map of all PCollections in a PCollectionTuple:
Map<TupleTag<?>, PCollection<?>> allPcs = pcs.getAll();
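The comments in the example above note that the type parameter of each TupleTag is what lets get return a correctly typed PCollection. A minimal, SDK-free sketch of that idea (a typesafe heterogeneous container) might look like the following. The names Tag and TypedTuple are hypothetical stand-ins, not Dataflow SDK classes, and this model stores plain values rather than PCollections, so it only illustrates the typing and immutability, not the SDK's implementation.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for TupleTag<T>: the type parameter records the value type.
final class Tag<T> {
    final String id;
    Tag(String id) { this.id = id; }
}

// Hypothetical stand-in for PCollectionTuple: immutable, keyed by typed tags.
final class TypedTuple {
    private final Map<Tag<?>, Object> values;

    private TypedTuple(Map<Tag<?>, Object> values) {
        this.values = Collections.unmodifiableMap(values);
    }

    static TypedTuple empty() {
        return new TypedTuple(new LinkedHashMap<Tag<?>, Object>());
    }

    static <T> TypedTuple of(Tag<T> tag, T value) {
        return empty().and(tag, value);
    }

    // Returns a new tuple; this one is left unchanged.
    <T> TypedTuple and(Tag<T> tag, T value) {
        if (has(tag)) {
            throw new IllegalArgumentException("tag already present: " + tag.id);
        }
        Map<Tag<?>, Object> copy = new LinkedHashMap<>(values);
        copy.put(tag, value);
        return new TypedTuple(copy);
    }

    <T> boolean has(Tag<T> tag) {
        return values.containsKey(tag);
    }

    // The unchecked cast is safe because and() only stores a T under a Tag<T>.
    @SuppressWarnings("unchecked")
    <T> T get(Tag<T> tag) {
        if (!has(tag)) {
            throw new IllegalArgumentException("no value for tag: " + tag.id);
        }
        return (T) values.get(tag);
    }

    // Read-only view of every tag and its value.
    Map<Tag<?>, Object> getAll() {
        return values;
    }
}
```

In this model, a call such as tuple.get(intTag) with a Tag<Integer> is typed as Integer at compile time, so no cast is needed at the call site, and and() never modifies the tuple it is called on.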
| Modifier and Type | Method and Description |
|---|---|
| `<T> PCollectionTuple` | `and(TupleTag<T> tag, PCollection<T> pc)` Returns a new PCollectionTuple that has all the PCollections and tags of this PCollectionTuple plus the given PCollection and tag. |
| `<Output extends POutput> Output` | `apply(PTransform<PCollectionTuple,Output> t)` Applies the given PTransform to this input PCollectionTuple, and returns the PTransform's Output. |
| `static PCollectionTuple` | `empty(Pipeline pipeline)` Returns an empty PCollectionTuple that is part of the given Pipeline. |
| `java.util.Collection<? extends PValue>` | `expand()` Expands this PInput into a list of its component input PValues. |
| `void` | `finishSpecifying()` After building, finalizes this PInput to make it ready for being used as an input to a PTransform. |
| `void` | `finishSpecifyingOutput()` As part of finishing the producing PTransform, finalizes this PTransform output to make it ready for being used as an input and for running. |
| `<T> PCollection<T>` | `get(TupleTag<T> tag)` Returns the PCollection with the given tag in this PCollectionTuple. |
| `java.util.Map<TupleTag<?>,PCollection<?>>` | `getAll()` Returns an immutable Map from TupleTag to corresponding PCollection, for all the members of this PCollectionTuple. |
| `Pipeline` | `getPipeline()` Returns the owning Pipeline of this PInput. |
| `<T> boolean` | `has(TupleTag<T> tag)` Returns whether this PCollectionTuple contains a PCollection with the given tag. |
| `static <T> PCollectionTuple` | `of(TupleTag<T> tag, PCollection<T> pc)` Returns a singleton PCollectionTuple containing the given PCollection keyed by the given TupleTag. |
| `static PCollectionTuple` | `ofPrimitiveOutputsInternal(TupleTagList outputTags, WindowFn<?,?> windowFn)` Returns a PCollectionTuple with each of the given tags mapping to a new output PCollection. |
| `void` | `recordAsOutput(Pipeline pipeline, PTransform<?,?> transform)` Records that this POutput is an output of the given PTransform in the given Pipeline. |
public static PCollectionTuple empty(Pipeline pipeline)
Returns an empty PCollectionTuple that is part of the given Pipeline.
Longer PCollectionTuples can be created by calling and(TupleTag<T>, PCollection<T>) on the result.
public static <T> PCollectionTuple of(TupleTag<T> tag, PCollection<T> pc)
Returns a singleton PCollectionTuple containing the given PCollection keyed by the given TupleTag.
Longer PCollectionTuples can be created by calling and(TupleTag<T>, PCollection<T>) on the result.
public <T> PCollectionTuple and(TupleTag<T> tag, PCollection<T> pc)
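Returns a new PCollectionTuple that has all the PCollections and tags of this PCollectionTuple plus the given PCollection and tag.
The given TupleTag must not already be mapped to a PCollection in this PCollectionTuple.
All the PCollections in the resulting PCollectionTuple must be part of the same Pipeline.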
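public <T> boolean has(TupleTag<T> tag)
Returns whether this PCollectionTuple contains a PCollection with the given tag.
public <T> PCollection<T> get(TupleTag<T> tag)
Returns the PCollection with the given tag in this PCollectionTuple.
Throws: java.lang.IllegalArgumentException - if there is no such PCollection, i.e., !has(tag).
public java.util.Map<TupleTag<?>,PCollection<?>> getAll()
Returns an immutable Map from TupleTag to corresponding PCollection, for all the members of this PCollectionTuple.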
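Because getAll() is documented as returning an immutable Map, callers should treat the result as read-only. The sketch below shows what that means in practice, assuming the returned map behaves like a java.util.Collections.unmodifiableMap view; that behavior is an assumption, since the documentation only states that the map is immutable. ReadOnlyMapDemo and its methods are hypothetical names used for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class ReadOnlyMapDemo {
    // Wraps a private copy of the source map in a read-only view.
    static Map<String, Integer> readOnlyCopy(Map<String, Integer> source) {
        return Collections.unmodifiableMap(new HashMap<>(source));
    }

    // Returns true if the map rejects modification, as an unmodifiable view does.
    // Note: if the map is mutable, this adds the key "extra" to it.
    static boolean rejectsPut(Map<String, Integer> map) {
        try {
            map.put("extra", 0);
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }
}
```

Code that needs a different set of entries should build a new tuple with and(...) rather than trying to modify the returned map.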
public <Output extends POutput> Output apply(PTransform<PCollectionTuple,Output> t)
Applies the given PTransform to this input PCollectionTuple, and returns the PTransform's Output.
public static PCollectionTuple ofPrimitiveOutputsInternal(TupleTagList outputTags, WindowFn<?,?> windowFn)
Returns a PCollectionTuple with each of the given tags mapping to a new output PCollection.
For use by primitive transformations only.
public Pipeline getPipeline()
Returns the owning Pipeline of this PInput.
Specified by: getPipeline in interface PInput
public java.util.Collection<? extends PValue> expand()
Expands this PInput into a list of its component input PValues.
A PValue expands to itself.
A tuple or list of PValues (e.g., PCollectionTuple and PCollectionList) expands to its component PValues.
Not intended to be invoked directly by user code.
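The expansion rule described for expand() (a single value expands to itself, a tuple or list expands to its components) can be sketched without the SDK as follows. Expandable, Leaf, and Group are hypothetical names standing in for PInput, PValue, and a tuple or list of PValues; this is an illustration of the rule, not the SDK's implementation.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for PInput: anything that can list its component values.
interface Expandable {
    Collection<? extends Leaf> expand();
}

// Hypothetical stand-in for a single PValue: it expands to itself.
final class Leaf implements Expandable {
    final String name;
    Leaf(String name) { this.name = name; }

    @Override
    public Collection<Leaf> expand() {
        return Collections.singletonList(this);
    }
}

// Hypothetical stand-in for a tuple or list of PValues: it expands to its components.
final class Group implements Expandable {
    private final List<Leaf> members;

    Group(List<Leaf> members) {
        // Defensive copy so later changes to the caller's list are not visible here.
        this.members = Collections.unmodifiableList(new ArrayList<>(members));
    }

    @Override
    public Collection<Leaf> expand() {
        return members;
    }
}
```

A caller that only knows it holds an Expandable can treat single values and groups uniformly by iterating over expand().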
public void recordAsOutput(Pipeline pipeline, PTransform<?,?> transform)
Records that this POutput is an output of the given PTransform in the given Pipeline.
Implementations should expand this POutput and invoke PValue.recordAsOutput(Pipeline, PTransform, String) on each component output PValue.
Automatically invoked as part of applying a PTransform. Not to be invoked directly by user code.
Specified by: recordAsOutput in interface POutput
public void finishSpecifying()
After building, finalizes this PInput to make it ready for being used as an input to a PTransform.
Automatically invoked whenever apply() is invoked on this PInput, so users do not normally call this explicitly.
Specified by: finishSpecifying in interface PInput
public void finishSpecifyingOutput()
As part of finishing the producing PTransform, finalizes this PTransform output to make it ready for being used as an input and for running.
This includes ensuring that all PCollections have Coders specified or defaulted.
Automatically invoked whenever this POutput is used as a PInput to another PTransform or, if it is never used as a PInput, when Pipeline.run() is called, so users do not normally call this explicitly.
Specified by: finishSpecifyingOutput in interface POutput