BatchReducibleAggregator (CDAP ETL App Template API 6.9.1 API)

java.lang.Object
- io.cdap.cdap.etl.api.batch.BatchConfigurable<BatchAggregatorContext>
- - io.cdap.cdap.etl.api.batch.BatchReducibleAggregator<GROUP_KEY,GROUP_VALUE,AGG_VALUE,OUT>

Type Parameters:

GROUP_KEY - group key type. Must be a supported type

GROUP_VALUE - group value type. Must be a supported type

AGG_VALUE - aggregate value type

OUT - output object type

All Implemented Interfaces:

Destroyable, PipelineConfigurable, ReducibleAggregator<GROUP_KEY,GROUP_VALUE,AGG_VALUE,OUT>, StageLifecycle<BatchRuntimeContext>, SubmitterLifecycle<BatchAggregatorContext>
```
@Beta
public abstract class BatchReducibleAggregator<GROUP_KEY,GROUP_VALUE,AGG_VALUE,OUT>
extends BatchConfigurable<BatchAggregatorContext>
implements ReducibleAggregator<GROUP_KEY,GROUP_VALUE,AGG_VALUE,OUT>, PipelineConfigurable, StageLifecycle<BatchRuntimeContext>
```
A ReducibleAggregator used in batch programs. Because it runs in batch programs, a BatchReducibleAggregator must be parameterized with supported group key and group value classes. Group keys and values can be byte[], Boolean, Integer, Long, Float, Double, String, or StructuredRecord. If the group key is not one of those types and is used in MapReduce, it must implement Hadoop's org.apache.hadoop.io.WritableComparable interface. If the group value is not one of those types and is used in MapReduce, it must implement Hadoop's org.apache.hadoop.io.Writable interface. If the aggregator is used in Spark, both the group key and the group value must implement the Serializable interface.
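
To make the Spark requirement above concrete, here is a minimal sketch of a custom group key that implements java.io.Serializable. The class name RegionDayKey, its fields, and the roundTrip helper are hypothetical and are not part of the CDAP API. The sketch also assumes that a key used for grouping needs value-based equals and hashCode, which this page does not state. A MapReduce key would additionally have to implement Hadoop's WritableComparable, which is not shown here because it requires the Hadoop library.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Objects;

// Hypothetical composite group key; the name and fields are illustrative only.
final class RegionDayKey implements Serializable {
  private static final long serialVersionUID = 1L;

  private final String region;
  private final int dayOfYear;

  RegionDayKey(String region, int dayOfYear) {
    this.region = region;
    this.dayOfYear = dayOfYear;
  }

  // Assumption: grouping compares keys by value, so equals and hashCode agree.
  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof RegionDayKey)) {
      return false;
    }
    RegionDayKey other = (RegionDayKey) o;
    return dayOfYear == other.dayOfYear && Objects.equals(region, other.region);
  }

  @Override
  public int hashCode() {
    return Objects.hash(region, dayOfYear);
  }

  // Serializes the key to bytes and reads it back with plain Java serialization.
  static RegionDayKey roundTrip(RegionDayKey key) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(key);
      }
      try (ObjectInputStream in =
               new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (RegionDayKey) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("key is not serializable", e);
    }
  }

  public static void main(String[] args) {
    RegionDayKey key = new RegionDayKey("emea", 42);
    System.out.println("equal after round trip: " + key.equals(roundTrip(key)));
  }
}
```

A round trip like this is a cheap unit test for any custom key or value class before it is used in a pipeline.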

Field Summary

Fields
Modifier and Type	Field and Description
`static String`	`PLUGIN_TYPE`

Constructor Summary

Constructors
Constructor and Description
`BatchReducibleAggregator()`

Method Summary

Modifier and Type	Method and Description
`void`	`configurePipeline(PipelineConfigurer pipelineConfigurer)` Configure the pipeline.
`void`	`destroy()` Destroy the Batch Aggregator.
`void`	`initialize(BatchRuntimeContext context)` Initialize the Batch Reduce Aggregator.
`void`	`prepareRun(BatchAggregatorContext context)` Prepare a pipeline run.

Methods inherited from class io.cdap.cdap.etl.api.batch.BatchConfigurable
onRunFinish

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface io.cdap.cdap.etl.api.ReducibleAggregator
finalize, groupBy, initializeAggregateValue, mergePartitions, mergeValues
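
The five inherited methods named above could fit together roughly as in the following sketch. It does not use the CDAP classes: ReducibleAggregatorSketch is a local stand-in whose parameter shapes are assumptions, java.util.function.Consumer stands in for the CDAP Emitter, and the reducePartition and run drivers are hypothetical helpers that imitate an engine reducing each partition separately and then merging the partial results.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Consumer;

// Local stand-in mirroring the five method names; NOT the CDAP interface.
interface ReducibleAggregatorSketch<K, V, A, O> {
  void groupBy(V value, Consumer<K> keys);
  A initializeAggregateValue(V value);
  A mergeValues(A agg, V value);
  A mergePartitions(A left, A right);
  void finalize(K key, A agg, Consumer<O> out);
}

// Counts words case-insensitively: key = lowercased word, aggregate = count.
final class WordCountSketch implements ReducibleAggregatorSketch<String, String, Long, String> {
  public void groupBy(String word, Consumer<String> keys) {
    keys.accept(word.toLowerCase(Locale.ROOT));
  }

  public Long initializeAggregateValue(String word) {
    return 1L; // first value seen for a key within a partition
  }

  public Long mergeValues(Long agg, String word) {
    return agg + 1; // fold one more value into the partial count
  }

  public Long mergePartitions(Long left, Long right) {
    return left + right; // combine partial counts from two partitions
  }

  public void finalize(String key, Long agg, Consumer<String> out) {
    out.accept(key + "=" + agg);
  }

  // Hypothetical driver: reduces one partition to a partial aggregate per key.
  static <K, V, A> Map<K, A> reducePartition(ReducibleAggregatorSketch<K, V, A, ?> agg,
                                             List<V> values) {
    Map<K, A> partial = new TreeMap<>();
    for (V value : values) {
      List<K> keys = new ArrayList<>();
      agg.groupBy(value, keys::add);
      for (K key : keys) {
        A current = partial.get(key);
        partial.put(key, current == null
            ? agg.initializeAggregateValue(value)
            : agg.mergeValues(current, value));
      }
    }
    return partial;
  }

  // Hypothetical driver: merges the partial results, then finalizes each key.
  static List<String> run(List<List<String>> partitions) {
    WordCountSketch agg = new WordCountSketch();
    Map<String, Long> merged = new TreeMap<>();
    for (List<String> partition : partitions) {
      for (Map.Entry<String, Long> e : reducePartition(agg, partition).entrySet()) {
        merged.merge(e.getKey(), e.getValue(), agg::mergePartitions);
      }
    }
    List<String> out = new ArrayList<>();
    for (Map.Entry<String, Long> e : merged.entrySet()) {
      agg.finalize(e.getKey(), e.getValue(), out::add);
    }
    return out;
  }

  public static void main(String[] args) {
    System.out.println(run(Arrays.asList(
        Arrays.asList("a", "B", "a"),
        Arrays.asList("b", "c"))));
  }
}
```

The aggregate type here is a single Long per key, so each partition can be reduced to a small partial result before partitions are combined with mergePartitions.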

- Field Detail
  - PLUGIN_TYPE
```
public static final String PLUGIN_TYPE
```
    See Also:
    
    Constant Field Values
- Constructor Detail
  - BatchReducibleAggregator
```
public BatchReducibleAggregator()
```
- Method Detail
  - configurePipeline
```
public void configurePipeline(PipelineConfigurer pipelineConfigurer)
```
    Configure the pipeline. This is run once when the pipeline is being published. This is where you perform any static logic, such as creating required datasets, validating schemas, and setting the output schema.
    
    Specified by:
    
    configurePipeline in interface PipelineConfigurable
    
    Overrides:
    
    configurePipeline in class BatchConfigurable<BatchAggregatorContext>
    
    Parameters:
    
    pipelineConfigurer - the configurer used to add required datasets and streams
    
    Throws:
    
    ValidationException - if the given config is invalid
  - prepareRun
```
public void prepareRun(BatchAggregatorContext context)
                throws Exception
```
    Prepare a pipeline run. This is run every time before a pipeline runs in order to help set up the run. This is where you would set the number of partitions to use when grouping, and set the group key and value classes if they are not known at compile time.
    
    Specified by:
    
    prepareRun in interface SubmitterLifecycle<BatchAggregatorContext>
    
    Specified by:
    
    prepareRun in class BatchConfigurable<BatchAggregatorContext>
    
    Parameters:
    
    context - batch execution context
    
    Throws:
    
    Exception - if there's an error during this method invocation
  - initialize
```
public void initialize(BatchRuntimeContext context)
                throws Exception
```
    Initialize the Batch Reduce Aggregator. Executed inside the Batch Run. This method is guaranteed to be invoked before any calls to ReducibleAggregator.groupBy(Object, Emitter) and ReducibleAggregator.finalize(Object, Object, Emitter) are made.
    
    Specified by:
    
    initialize in interface StageLifecycle<BatchRuntimeContext>
    
    Parameters:
    
    context - BatchRuntimeContext
    
    Throws:
    
    Exception - if there is any error during initialization
  - destroy
```
public void destroy()
```
    Destroy the Batch Aggregator. Executed at the end of the Batch Run.
    
    Specified by:
    
    destroy in interface Destroyable
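
The method descriptions above imply an ordering: configurePipeline once at publish time, then prepareRun, initialize, the aggregation calls, and destroy for every run. The following sketch only records that ordering. LifecycleSketch and its no-argument methods are hypothetical stand-ins, not CDAP classes, and the sketch uses one object for every hook for simplicity, which a real deployment need not do.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a stage; each hook only records that it ran.
final class LifecycleSketch {
  private final List<String> calls = new ArrayList<>();
  private boolean initialized;

  void configurePipeline() {
    calls.add("configurePipeline"); // once, when the pipeline is published
  }

  void prepareRun() {
    calls.add("prepareRun"); // before every run
  }

  void initialize() {
    initialized = true;
    calls.add("initialize"); // inside the run, before any groupBy or finalize
  }

  void groupBy() {
    if (!initialized) {
      throw new IllegalStateException("initialize must run before groupBy");
    }
    calls.add("groupBy");
  }

  void destroy() {
    initialized = false;
    calls.add("destroy"); // at the end of the run
  }

  // Simulates one publish followed by the given number of runs.
  static List<String> simulate(int runs) {
    LifecycleSketch stage = new LifecycleSketch();
    stage.configurePipeline();
    for (int i = 0; i < runs; i++) {
      stage.prepareRun();
      stage.initialize();
      stage.groupBy();
      stage.destroy();
    }
    return stage.calls;
  }

  public static void main(String[] args) {
    System.out.println(simulate(2));
  }
}
```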
