Type Parameters:
GROUP_KEY - group key type. Must be a supported type
GROUP_VALUE - group value type. Must be a supported type
OUT - output object type

@Beta public abstract class BatchAggregator<GROUP_KEY,GROUP_VALUE,OUT> extends BatchConfigurable<BatchAggregatorContext> implements Aggregator<GROUP_KEY,GROUP_VALUE,OUT>, PipelineConfigurable, StageLifecycle<BatchRuntimeContext>

An Aggregator used in batch programs.
Because it is used in batch programs, a BatchAggregator must be parameterized
with supported group key and value classes. A group key or value can be a
byte[], Boolean, Integer, Long, Float, Double, String, or StructuredRecord.
If the group key is not one of those types and is being used in MapReduce,
it must implement Hadoop's org.apache.hadoop.io.WritableComparable interface.
If the group value is not one of those types and is being used in MapReduce,
it must implement Hadoop's org.apache.hadoop.io.Writable interface.
If the aggregator is being used in Spark, both the group key and value must implement the Serializable interface.

Modifier and Type | Field and Description |
---|---|
static String | PLUGIN_TYPE |

Constructor and Description |
---|
BatchAggregator() |
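
The Serializable requirement in the class description can be checked for a custom key type with a plain Java serialization round trip. The sketch below uses only the JDK; `RegionDayKey` and `SerializableKeyCheck` are hypothetical names introduced for illustration, and the `equals`/`hashCode` overrides rest on the assumption that grouping compares keys by value.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Objects;

// Hypothetical composite group key; not part of the documented API.
class RegionDayKey implements Serializable {
  private static final long serialVersionUID = 1L;
  final String region;
  final int day;

  RegionDayKey(String region, int day) {
    this.region = region;
    this.day = day;
  }

  // Value equality, so that two copies of the same key fall into the same group.
  @Override
  public boolean equals(Object other) {
    if (!(other instanceof RegionDayKey)) {
      return false;
    }
    RegionDayKey that = (RegionDayKey) other;
    return day == that.day && Objects.equals(region, that.region);
  }

  @Override
  public int hashCode() {
    return Objects.hash(region, day);
  }
}

class SerializableKeyCheck {
  // Serializes the value to bytes and reads it back, as a shuffle between workers might.
  static Object roundTrip(Serializable value) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(value);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("value could not be serialized", e);
    }
  }

  public static void main(String[] args) {
    RegionDayKey key = new RegionDayKey("eu", 3);
    System.out.println(roundTrip(key).equals(key));
  }
}
```

A key that holds a non-serializable field would fail in `roundTrip` with an `IllegalStateException`, which makes this kind of check useful in a unit test before the type is used as a group key or value.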

Modifier and Type | Method and Description |
---|---|
void | configurePipeline(PipelineConfigurer pipelineConfigurer): Configure the pipeline. |
void | destroy(): Destroy the Batch Aggregator. |
void | initialize(BatchRuntimeContext context): Initialize the Batch Aggregator. |
void | prepareRun(BatchAggregatorContext context): Prepare a pipeline run. |
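
Methods inherited from class BatchConfigurable: onRunFinish
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface Aggregator: aggregate, groupBy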
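
The two inherited methods, groupBy and aggregate, can be pictured with a small in-memory sketch. It does not use the real CDAP types: `SketchEmitter`, `SketchAggregator`, `WordLengthCounter`, and `AggregatorSketch` are hypothetical stand-ins that only mirror the method shapes groupBy(Object, Emitter) and aggregate(Object, Iterator, Emitter). It assumes that groupBy receives one value and emits the key(s) it belongs to, and it omits checked exceptions for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Local stand-ins, NOT the CDAP interfaces; they only mirror the method shapes.
interface SketchEmitter<T> {
  void emit(T value);
}

interface SketchAggregator<K, V, O> {
  void groupBy(V groupValue, SketchEmitter<K> emitter);

  void aggregate(K groupKey, Iterator<V> groupValues, SketchEmitter<O> emitter);
}

// Hypothetical aggregator: groups words by length and counts the words in each group.
class WordLengthCounter implements SketchAggregator<Integer, String, String> {
  @Override
  public void groupBy(String word, SketchEmitter<Integer> emitter) {
    emitter.emit(word.length());
  }

  @Override
  public void aggregate(Integer length, Iterator<String> words, SketchEmitter<String> emitter) {
    int count = 0;
    while (words.hasNext()) {
      words.next();
      count++;
    }
    emitter.emit(length + ":" + count);
  }
}

// Hypothetical in-memory driver standing in for the batch engine.
class AggregatorSketch {
  static <K, V, O> List<O> run(SketchAggregator<K, V, O> aggregator, List<V> input) {
    // Phase 1: ask the aggregator which group(s) each value belongs to.
    Map<K, List<V>> groups = new LinkedHashMap<>();
    for (V value : input) {
      aggregator.groupBy(value, key -> groups.computeIfAbsent(key, k -> new ArrayList<>()).add(value));
    }
    // Phase 2: hand each group to aggregate and collect whatever it emits.
    List<O> output = new ArrayList<>();
    for (Map.Entry<K, List<V>> group : groups.entrySet()) {
      aggregator.aggregate(group.getKey(), group.getValue().iterator(), output::add);
    }
    return output;
  }

  public static void main(String[] args) {
    // prints [1:2, 2:2]
    System.out.println(run(new WordLengthCounter(), Arrays.asList("a", "bb", "cc", "d")));
  }
}
```

In a real batch run the grouping step is performed by the execution engine across workers, which is why the key and value types must meet the requirements listed in the class description.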
public static final String PLUGIN_TYPE

public void configurePipeline(PipelineConfigurer pipelineConfigurer)
Configure the pipeline.
configurePipeline in interface PipelineConfigurable
configurePipeline in class BatchConfigurable<BatchAggregatorContext>
Parameters:
pipelineConfigurer - the configurer used to add required datasets and streams
Throws:
ValidationException - if the given config is invalid

public void prepareRun(BatchAggregatorContext context) throws Exception
Prepare a pipeline run.
prepareRun in interface SubmitterLifecycle<BatchAggregatorContext>
prepareRun in class BatchConfigurable<BatchAggregatorContext>
Parameters:
context - batch execution context
Throws:
Exception

public void initialize(BatchRuntimeContext context) throws Exception
Initialize the Batch Aggregator. Invoked before any calls to Aggregator.groupBy(Object, Emitter) and Aggregator.aggregate(Object, Iterator, Emitter) are made.
initialize in interface StageLifecycle<BatchRuntimeContext>
Parameters:
context - BatchRuntimeContext
Throws:
Exception - if there is any error during initialization

public void destroy()
Destroy the Batch Aggregator.
destroy in interface Destroyable
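
The ordering described for initialize (it runs before any groupBy or aggregate call) can be sketched with a stage that records its own calls. `RecordingStage` and `LifecycleSketch` are hypothetical names, the context parameters of the real methods are omitted, and calling destroy last in a finally block is an assumption of this sketch rather than something the page states.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stage that records the order in which its methods are called.
// The real initialize(BatchRuntimeContext) takes a context; it is omitted here.
class RecordingStage {
  final List<String> calls = new ArrayList<>();
  private boolean initialized;

  void initialize() {
    initialized = true;
    calls.add("initialize");
  }

  void groupBy(String value) {
    requireInitialized();
    calls.add("groupBy(" + value + ")");
  }

  void aggregate(String key) {
    requireInitialized();
    calls.add("aggregate(" + key + ")");
  }

  void destroy() {
    initialized = false;
    calls.add("destroy");
  }

  // Guards against use before initialize, mirroring the documented ordering.
  private void requireInitialized() {
    if (!initialized) {
      throw new IllegalStateException("initialize was not called first");
    }
  }
}

class LifecycleSketch {
  // Hypothetical driver: initialize, then the per-record calls, then destroy (assumed).
  static List<String> run(RecordingStage stage, List<String> values) {
    stage.initialize();
    try {
      for (String value : values) {
        stage.groupBy(value);
      }
      stage.aggregate("all");
    } finally {
      stage.destroy();
    }
    return stage.calls;
  }

  public static void main(String[] args) {
    // prints [initialize, groupBy(x), groupBy(y), aggregate(all), destroy]
    System.out.println(run(new RecordingStage(), Arrays.asList("x", "y")));
  }
}
```

State that groupBy or aggregate depends on, such as a parsed configuration or a lookup table, is therefore safe to build in initialize and release in destroy.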
Copyright © 2020 Cask Data, Inc. Licensed under the Apache License, Version 2.0.