GROUP_KEY - group key type. Must be a supported type
GROUP_VALUE - group value type. Must be a supported type
AGG_VALUE - agg value type
OUT - output object type

@Beta public abstract class BatchReducibleAggregator<GROUP_KEY,GROUP_VALUE,AGG_VALUE,OUT> extends BatchConfigurable<BatchAggregatorContext> implements ReducibleAggregator<GROUP_KEY,GROUP_VALUE,AGG_VALUE,OUT>, PipelineConfigurable, StageLifecycle<BatchRuntimeContext>
A ReducibleAggregator used in batch programs. Because it is used in batch programs, a
BatchReducibleAggregator must be parameterized with supported group key and value classes. Group
keys and values can be a byte[], Boolean, Integer, Long, Float, Double, String, or
StructuredRecord. If the group key is not one of those types and is being used in MapReduce, it
must implement Hadoop's org.apache.hadoop.io.WritableComparable interface. If the group value is
not one of those types and is being used in MapReduce, it must implement Hadoop's
org.apache.hadoop.io.Writable interface. If the aggregator is being used in Spark, both the group
key and value must implement the Serializable interface.

Modifier and Type | Field and Description |
---|---|
static String | PLUGIN_TYPE |
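
The Spark requirement in the class description (group key and value must implement Serializable) can be illustrated with a small, self-contained sketch. The `RegionDayKey` class and the `roundTrip` helper are hypothetical names invented for this example and are not part of CDAP. The sketch uses only standard Java serialization and makes no claim about how Spark serializes objects internally. It also overrides `equals` and `hashCode`, on the assumption that grouping compares keys by value rather than by identity.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Objects;

public class SerializableKeyExample {

  /** Hypothetical composite group key (not a CDAP class). */
  static final class RegionDayKey implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String region;
    private final int dayOfYear;

    RegionDayKey(String region, int dayOfYear) {
      this.region = region;
      this.dayOfYear = dayOfYear;
    }

    // Value-based equality so that two copies of the same key land in the same group.
    @Override
    public boolean equals(Object o) {
      if (this == o) {
        return true;
      }
      if (!(o instanceof RegionDayKey)) {
        return false;
      }
      RegionDayKey other = (RegionDayKey) o;
      return dayOfYear == other.dayOfYear && Objects.equals(region, other.region);
    }

    @Override
    public int hashCode() {
      return Objects.hash(region, dayOfYear);
    }
  }

  /** Hypothetical helper: writes a value with Java serialization and reads it back. */
  static <T extends Serializable> T roundTrip(T value) throws IOException, ClassNotFoundException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(value);
    }
    try (ObjectInputStream in =
             new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
      @SuppressWarnings("unchecked")
      T copy = (T) in.readObject();
      return copy;
    }
  }

  public static void main(String[] args) throws Exception {
    RegionDayKey key = new RegionDayKey("emea", 42);
    RegionDayKey copy = roundTrip(key);
    // The copy is a different object but must still compare equal to the original.
    System.out.println(key.equals(copy) && key.hashCode() == copy.hashCode()); // prints "true"
  }
}
```

A key type used with MapReduce would instead need Hadoop's WritableComparable, as the description states; that variant is not shown here because it depends on the Hadoop libraries.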
Constructor and Description |
---|
BatchReducibleAggregator() |
Modifier and Type | Method and Description |
---|---|
void | configurePipeline(PipelineConfigurer pipelineConfigurer) Configure the pipeline. |
void | destroy() Destroy the Batch Aggregator. |
void | initialize(BatchRuntimeContext context) Initialize the Batch Reduce Aggregator. |
void | prepareRun(BatchAggregatorContext context) Prepare a pipeline run. |
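
Methods inherited from class BatchConfigurable: onRunFinish

Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface ReducibleAggregator: finalize, groupBy, initializeAggregateValue, mergePartitions, mergeValues
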
public static final String PLUGIN_TYPE

public void configurePipeline(PipelineConfigurer pipelineConfigurer)
configurePipeline in interface PipelineConfigurable
configurePipeline in class BatchConfigurable<BatchAggregatorContext>
pipelineConfigurer - the configurer used to add required datasets and streams
ValidationException - if the given config is invalid

public void prepareRun(BatchAggregatorContext context) throws Exception
prepareRun in interface SubmitterLifecycle<BatchAggregatorContext>
prepareRun in class BatchConfigurable<BatchAggregatorContext>
context - batch execution context
Exception - if there's an error during this method invocation

public void initialize(BatchRuntimeContext context) throws Exception
Initialize the Batch Reduce Aggregator. This method is invoked before any calls to
ReducibleAggregator.groupBy(Object, Emitter) and ReducibleAggregator.finalize(Object, Object, Emitter)
are made.
initialize in interface StageLifecycle<BatchRuntimeContext>
context - BatchRuntimeContext
Exception - if there is any error during initialization

public void destroy()
destroy in interface Destroyable
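
The inherited method names (groupBy, initializeAggregateValue, mergeValues, mergePartitions, finalize) suggest a partition-wise reduction, which the following self-contained sketch tries to make concrete. It does not use the CDAP classes: `Reducible` is a hypothetical stand-in interface modelled only on those method names, `java.util.function.Consumer` stands in for the Emitter, and `WordCount`, `reducePartition`, and `run` are invented for illustration. The order in which a real execution engine calls these methods is an assumption here, not something this page specifies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

public class ReduceFlowSketch {

  /** Hypothetical stand-in modelled on the inherited method names; NOT the CDAP interface. */
  interface Reducible<K, V, A, O> {
    void groupBy(V value, Consumer<K> emitter);
    A initializeAggregateValue(V value);
    A mergeValues(A agg, V value);
    A mergePartitions(A left, A right);
    void finalize(K key, A agg, Consumer<O> emitter);
  }

  /** Counts words: key = the word, aggregate = running count, output = "word=count". */
  static final class WordCount implements Reducible<String, String, Long, String> {
    @Override public void groupBy(String value, Consumer<String> emitter) { emitter.accept(value); }
    @Override public Long initializeAggregateValue(String value) { return 1L; }
    @Override public Long mergeValues(Long agg, String value) { return agg + 1; }
    @Override public Long mergePartitions(Long left, Long right) { return left + right; }
    @Override public void finalize(String key, Long agg, Consumer<String> emitter) {
      emitter.accept(key + "=" + agg);
    }
  }

  /** Reduces one partition: the first value per key seeds the aggregate, later values merge in. */
  static <K, V, A, O> Map<K, A> reducePartition(Reducible<K, V, A, O> agg, List<V> partition) {
    Map<K, A> partial = new LinkedHashMap<>();
    for (V value : partition) {
      List<K> keys = new ArrayList<>();
      agg.groupBy(value, keys::add);
      for (K key : keys) {
        A current = partial.get(key);
        partial.put(key, current == null
            ? agg.initializeAggregateValue(value)
            : agg.mergeValues(current, value));
      }
    }
    return partial;
  }

  /** Reduces every partition, merges the partial aggregates per key, then emits the output. */
  static <K, V, A, O> List<O> run(Reducible<K, V, A, O> agg, List<List<V>> partitions) {
    Map<K, A> merged = new LinkedHashMap<>();
    for (List<V> partition : partitions) {
      for (Map.Entry<K, A> e : reducePartition(agg, partition).entrySet()) {
        merged.merge(e.getKey(), e.getValue(), agg::mergePartitions);
      }
    }
    List<O> out = new ArrayList<>();
    for (Map.Entry<K, A> e : merged.entrySet()) {
      agg.finalize(e.getKey(), e.getValue(), out::add);
    }
    return out;
  }

  public static void main(String[] args) {
    List<List<String>> partitions = Arrays.asList(
        Arrays.asList("a", "b", "a"),
        Arrays.asList("b", "c"));
    System.out.println(run(new WordCount(), partitions)); // prints "[a=2, b=2, c=1]"
  }
}
```

The aggregate type (`Long` here) is kept separate from the input value type so that partial results from different partitions can be combined without revisiting the original values, which appears to be the purpose of the AGG_VALUE type parameter.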
Copyright © 2023 Cask Data, Inc. Licensed under the Apache License, Version 2.0.