K - the type of the keys of the input and output PCollections
VI - the type of the values of the input PCollection
VO - the type of the values of the output PCollection

public static class Combine.PerKey<K,VI,VO> extends PTransform<PCollection<KV<K,VI>>,PCollection<KV<K,VO>>>
PerKey<K, VI, VO> takes a
PCollection<KV<K, VI>>, groups it by key, applies a
combining function to the VI values associated with each
key to produce a combined VO value, and returns a
PCollection<KV<K, VO>> representing a map from each
distinct key of the input PCollection to the corresponding
combined value. VI and VO are often the same.
This is a concise shorthand for an application of
GroupByKey followed by an application of
Combine.GroupedValues. See those
operations for more details on how keys are compared for equality
and on the default Coder for the output.
Example of use:
PCollection<KV<String, Double>> salesRecords = ...;
PCollection<KV<String, Double>> totalSalesPerPerson =
    salesRecords.apply(Combine.<String, Double>perKey(
        new Sum.SumDoubleFn()));
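To make the "GroupByKey followed by a combine" shape concrete, here is a minimal plain-Java sketch. It does not use the Dataflow SDK: `PerKeySketch` and `combinePerKey` are hypothetical names, `Map.Entry` stands in for `KV`, and an in-memory `List` stands in for a `PCollection`. It only illustrates the per-key semantics, not how a runner executes the transform.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BinaryOperator;

// Plain-Java illustration only; none of these names come from the SDK.
class PerKeySketch {

    // Groups the entries by key and folds each key's values with `combiner`,
    // mirroring the shape of GroupByKey followed by a per-key combine.
    static <K extends Comparable<K>, V> Map<K, V> combinePerKey(
            List<Map.Entry<K, V>> input, BinaryOperator<V> combiner) {
        Map<K, V> result = new TreeMap<>();
        for (Map.Entry<K, V> kv : input) {
            // The first value for a key is stored as-is; later ones are combined.
            result.merge(kv.getKey(), kv.getValue(), combiner);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Double>> salesRecords = List.of(
                Map.entry("alice", 10.0),
                Map.entry("bob", 4.5),
                Map.entry("alice", 2.5));
        Map<String, Double> totalSalesPerPerson =
                combinePerKey(salesRecords, Double::sum);
        System.out.println(totalSalesPerPerson); // prints {alice=12.5, bob=4.5}
    }
}
```

Here VI and VO are both `Double`, matching the common case where the input and output value types are the same.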
Each output element is in the window by which its corresponding input
was grouped, and has the timestamp of the end of that window. The output
PCollection has the same
WindowFn
as the input.
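The windowing rule above can be illustrated with a small plain-Java sketch. It assumes fixed-size windows and takes "the end of the window" to mean the last millisecond inside it; both are assumptions of this sketch rather than statements about the SDK. `WindowedCombineSketch`, `Event`, and `sumPerKeyAndWindow` are hypothetical names.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration only; none of these names come from the SDK.
class WindowedCombineSketch {

    // Hypothetical timestamped input element.
    static final class Event {
        final String key;
        final double value;
        final long timestampMillis;

        Event(String key, double value, long timestampMillis) {
            this.key = key;
            this.value = value;
            this.timestampMillis = timestampMillis;
        }
    }

    // Sums values per key and per fixed window. The result maps
    // key -> (output timestamp -> combined value), where the output timestamp
    // is assumed to be the last millisecond of the element's window.
    static Map<String, Map<Long, Double>> sumPerKeyAndWindow(
            List<Event> events, long windowSizeMillis) {
        Map<String, Map<Long, Double>> out = new TreeMap<>();
        for (Event e : events) {
            long windowStart =
                    e.timestampMillis - Math.floorMod(e.timestampMillis, windowSizeMillis);
            long outputTimestamp = windowStart + windowSizeMillis - 1;
            out.computeIfAbsent(e.key, k -> new TreeMap<>())
               .merge(outputTimestamp, e.value, Double::sum);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Event> events = List.of(
                new Event("alice", 10.0, 1_000L),
                new Event("alice", 2.5, 59_000L),
                new Event("alice", 1.0, 61_000L));
        // With one-minute windows, the first two events share a window.
        System.out.println(sumPerKeyAndWindow(events, 60_000L));
        // prints {alice={59999=12.5, 119999=1.0}}
    }
}
```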
| Modifier and Type | Method and Description |
|---|---|
| PCollection<KV<K,VO>> | apply(PCollection<KV<K,VI>> input): Applies this PTransform on the given Input, and returns its Output. |
| Combine.KeyedCombineFn<? super K,? super VI,?,VO> | getFn(): Returns the KeyedCombineFn used by this Combine operation. |
| protected java.lang.String | getKindString(): Returns a string describing what kind of PTransform this is. |
| Combine.PerKeyWithHotKeys<K,VI,VO> | withHotKeys(int hotKeySpread): Like withHotKeys(SerializableFunction), but returning the given constant value for every key. |
| Combine.PerKeyWithHotKeys<K,VI,VO> | withHotKeys(SerializableFunction<? super K,java.lang.Integer> hotKeySpread): If a single key has disproportionately many values, it may become a bottleneck, especially in streaming mode. |
| Combine.PerKey<K,VI,VO> | withName(java.lang.String name): Sets the base name of this PTransform and returns itself. |
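The idea behind withHotKeys, spreading a hot key's values over N intermediate nodes for partial combining before a final combine, can be sketched in plain Java. This is not the SDK's implementation: `HotKeySpreadSketch` and `combineWithSpread` are hypothetical names, shards stand in for intermediate nodes, and the round-robin shard assignment is an assumption of this sketch. The two-stage result matches a direct combine only when the combining function is associative and commutative.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;

// Plain-Java illustration only; none of these names come from the SDK.
class HotKeySpreadSketch {

    static <K, V> Map<K, V> combineWithSpread(
            List<Map.Entry<K, V>> input,
            BinaryOperator<V> combiner,
            Function<K, Integer> hotKeySpread) {
        // Stage 1: partially combine values per (key, shard).
        Map<K, Map<Integer, V>> partials = new HashMap<>();
        Map<K, Integer> seen = new HashMap<>();
        for (Map.Entry<K, V> kv : input) {
            K key = kv.getKey();
            int n = hotKeySpread.apply(key);
            int shard = 0; // n <= 1: no intermediate spreading for this key
            if (n > 1) {
                int count = seen.merge(key, 1, Integer::sum);
                shard = count % n; // round-robin choice, an assumption here
            }
            partials.computeIfAbsent(key, k -> new HashMap<>())
                    .merge(shard, kv.getValue(), combiner);
        }
        // Stage 2: combine the per-shard partial results for each key.
        Map<K, V> result = new HashMap<>();
        for (Map.Entry<K, Map<Integer, V>> e : partials.entrySet()) {
            for (V partial : e.getValue().values()) {
                result.merge(e.getKey(), partial, combiner);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> input = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            input.add(Map.entry("hot", 1));
        }
        input.add(Map.entry("cold", 5));
        // Spread the "hot" key over 4 shards; leave every other key alone.
        Map<String, Integer> totals = combineWithSpread(
                input, Integer::sum, key -> key.equals("hot") ? 4 : 1);
        System.out.println(totals.get("hot") + " " + totals.get("cold")); // prints 1000 5
    }
}
```

Passing a constant function such as `key -> 4` corresponds to the withHotKeys(int) overload, which returns the same spread for every key.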
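Methods inherited from class PTransform: finishSpecifying, getCoderRegistry, getDefaultName, getDefaultOutputCoder, getDefaultOutputCoder, getInput, getName, getOutput, getPipeline, setName, setPipeline, toString

public Combine.PerKeyWithHotKeys<K,VI,VO> withHotKeys(SerializableFunction<? super K,java.lang.Integer> hotKeySpread)
If a single key has disproportionately many values, it may become a
bottleneck, especially in streaming mode.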
hotKeySpread - a function from keys to an integer N, where the key
will be spread among N intermediate nodes for partial combining.
If N is less than or equal to 1, this key will not be sent through an
intermediate node.

public Combine.PerKeyWithHotKeys<K,VI,VO> withHotKeys(int hotKeySpread)
Like withHotKeys(SerializableFunction), but returning the given
constant value for every key.

public Combine.PerKey<K,VI,VO> withName(java.lang.String name)
Sets the base name of this PTransform and returns itself.
This is a shortcut for calling PTransform.setName(java.lang.String), which allows method
chaining.
Overrides: withName in class PTransform<PCollection<KV<K,VI>>,PCollection<KV<K,VO>>>

public Combine.KeyedCombineFn<? super K,? super VI,?,VO> getFn()
Returns the KeyedCombineFn used by this Combine operation.

public PCollection<KV<K,VO>> apply(PCollection<KV<K,VI>> input)
Applies this PTransform on the given Input, and returns its
Output.
Composite transforms, which are defined in terms of other transforms, should return the output of one of the composed transforms. Non-composite transforms, which do not apply any transforms internally, should return a new unbound output and register evaluators (via backend-specific registration methods).
The default implementation throws an exception. A derived class must
either implement apply, or else each runner must supply a custom
implementation via
PipelineRunner.apply(com.google.cloud.dataflow.sdk.transforms.PTransform<Input, Output>, Input).
Overrides: apply in class PTransform<PCollection<KV<K,VI>>,PCollection<KV<K,VO>>>

protected java.lang.String getKindString()
Returns a string describing what kind of PTransform this is.
By default, returns the base name of this
PTransform's class.
Overrides: getKindString in class PTransform<PCollection<KV<K,VI>>,PCollection<KV<K,VO>>>