Class Combine.GroupedValues<K,InputT,OutputT>
- java.lang.Object
-
- org.apache.beam.sdk.transforms.PTransform<PCollection<? extends KV<K,? extends java.lang.Iterable<InputT>>>,PCollection<KV<K,OutputT>>>
-
- org.apache.beam.sdk.transforms.Combine.GroupedValues<K,InputT,OutputT>
-
- Type Parameters:
K
- type of input and output keysInputT
- type of input valuesOutputT
- type of output values
- All Implemented Interfaces:
java.io.Serializable
,HasDisplayData
- Enclosing class:
- Combine
public static class Combine.GroupedValues<K,InputT,OutputT> extends PTransform<PCollection<? extends KV<K,? extends java.lang.Iterable<InputT>>>,PCollection<KV<K,OutputT>>>
GroupedValues<K, InputT, OutputT>
takes aPCollection<KV<K, Iterable<InputT>>>
, such as the result ofGroupByKey
, applies a specifiedCombineFn<InputT, AccumT, OutputT>
to each of the inputKV<K, Iterable<InputT>>
elements to produce a combined outputKV<K, OutputT>
element, and returns aPCollection<KV<K, OutputT>>
containing all the combined output elements. It is common forInputT == OutputT
, but not required. Common combining functions include sums, mins, maxes, and averages of numbers, conjunctions and disjunctions of booleans, statistical aggregations, etc.Example of use:
PCollection<KV<String, Integer>> pc = ...; PCollection<KV<String, Iterable<Integer>>> groupedByKey = pc.apply( GroupByKey.create()); PCollection<KV<String, Integer>> sumByKey = groupedByKey.apply(Combine.groupedValues( Sum.ofIntegers()));
See also
Combine.perKey(org.apache.beam.sdk.transforms.SerializableFunction<java.lang.Iterable<V>, V>)
/Combine.PerKey
, which captures the common pattern of "combining by key" in a single easy-to-usePTransform
.Combining for different keys can happen in parallel. Moreover, combining of the
Iterable<InputT>
values associated a single key can happen in parallel, with different subsets of the values being combined separately, and their intermediate results combined further, in an arbitrary tree reduction pattern, until a single result value is produced for each key.By default, the
Coder
of the keys of the outputPCollection<KV<K, OutputT>>
is that of the keys of the inputPCollection<KV<K, InputT>>
, and theCoder
of the values of the outputPCollection<KV<K, OutputT>>
is inferred from the concrete type of theCombineFn<InputT, AccumT, OutputT>
's output typeOutputT
.Each output element has the same timestamp and is in the same window as its corresponding input element, and the output
PCollection
has the sameWindowFn
associated with it as the input.See also
Combine.globally(org.apache.beam.sdk.transforms.SerializableFunction<java.lang.Iterable<V>, V>)
/Combine.Globally
, which combines all the values in aPCollection
into a single value in aPCollection
.- See Also:
- Serialized Form
-
-
Field Summary
-
Fields inherited from class org.apache.beam.sdk.transforms.PTransform
name, resourceHints
-
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description PCollection<KV<K,OutputT>>
expand(PCollection<? extends KV<K,? extends java.lang.Iterable<InputT>>> input)
Override this method to specify how thisPTransform
should be expanded on the givenInputT
.AppliedCombineFn<? super K,? super InputT,?,OutputT>
getAppliedFn(CoderRegistry registry, Coder<? extends KV<K,? extends java.lang.Iterable<InputT>>> inputCoder, WindowingStrategy<?,?> windowingStrategy)
Returns theCombine.CombineFn
bound to its coders.CombineFnBase.GlobalCombineFn<? super InputT,?,OutputT>
getFn()
Returns theCombineFnBase.GlobalCombineFn
used by this Combine operation.java.util.List<PCollectionView<?>>
getSideInputs()
void
populateDisplayData(DisplayData.Builder builder)
Register display data for the given transform or component.Combine.GroupedValues<K,InputT,OutputT>
withSideInputs(java.lang.Iterable<? extends PCollectionView<?>> sideInputs)
Combine.GroupedValues<K,InputT,OutputT>
withSideInputs(PCollectionView<?>... sideInputs)
-
Methods inherited from class org.apache.beam.sdk.transforms.PTransform
compose, compose, getAdditionalInputs, getDefaultOutputCoder, getDefaultOutputCoder, getDefaultOutputCoder, getKindString, getName, getResourceHints, setResourceHints, toString, validate, validate
-
-
-
-
Method Detail
-
withSideInputs
public Combine.GroupedValues<K,InputT,OutputT> withSideInputs(PCollectionView<?>... sideInputs)
-
withSideInputs
public Combine.GroupedValues<K,InputT,OutputT> withSideInputs(java.lang.Iterable<? extends PCollectionView<?>> sideInputs)
-
getFn
public CombineFnBase.GlobalCombineFn<? super InputT,?,OutputT> getFn()
Returns theCombineFnBase.GlobalCombineFn
used by this Combine operation.
-
getSideInputs
public java.util.List<PCollectionView<?>> getSideInputs()
-
expand
public PCollection<KV<K,OutputT>> expand(PCollection<? extends KV<K,? extends java.lang.Iterable<InputT>>> input)
Description copied from class:PTransform
Override this method to specify how thisPTransform
should be expanded on the givenInputT
.NOTE: This method should not be called directly. Instead apply the
PTransform
should be applied to theInputT
using theapply
method.Composite transforms, which are defined in terms of other transforms, should return the output of one of the composed transforms. Non-composite transforms, which do not apply any transforms internally, should return a new unbound output and register evaluators (via backend-specific registration methods).
- Specified by:
expand
in classPTransform<PCollection<? extends KV<K,? extends java.lang.Iterable<InputT>>>,PCollection<KV<K,OutputT>>>
-
getAppliedFn
public AppliedCombineFn<? super K,? super InputT,?,OutputT> getAppliedFn(CoderRegistry registry, Coder<? extends KV<K,? extends java.lang.Iterable<InputT>>> inputCoder, WindowingStrategy<?,?> windowingStrategy)
Returns theCombine.CombineFn
bound to its coders.For internal use.
-
populateDisplayData
public void populateDisplayData(DisplayData.Builder builder)
Description copied from class:PTransform
Register display data for the given transform or component.populateDisplayData(DisplayData.Builder)
is invoked by Pipeline runners to collect display data viaDisplayData.from(HasDisplayData)
. Implementations may callsuper.populateDisplayData(builder)
in order to register display data in the current namespace, but should otherwise usesubcomponent.populateDisplayData(builder)
to use the namespace of the subcomponent.By default, does not register any display data. Implementors may override this method to provide their own display data.
- Specified by:
populateDisplayData
in interfaceHasDisplayData
- Overrides:
populateDisplayData
in classPTransform<PCollection<? extends KV<K,? extends java.lang.Iterable<InputT>>>,PCollection<KV<K,OutputT>>>
- Parameters:
builder
- The builder to populate with display data.- See Also:
HasDisplayData
-
-