public abstract class AggregatorFactory extends Object implements Cacheable
CardinalityAggregatorFactory).
Implementations of AggregatorFactory which need to support nullable aggregations are encouraged
to extend NullableNumericAggregatorFactory.
Implementations are also expected to handle single-value and multi-value string columns in whatever way makes sense
for them. For example, the doubleSum aggregator tries to parse the string value as a double and treats it as zero if
parsing fails.
If the column is multi-value, each individual value should be taken into account for aggregation. For example, if a
row had the value ["1","1","1"], the doubleSum aggregation would take each of them and sum them to 3.

| Constructor and Description |
|---|
| AggregatorFactory() |

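
The string-handling rule described in the class overview (parse each value as a double, count unparseable values as zero, and include every value of a multi-value row) can be sketched as follows. This is only an illustration of the stated rule: `StringDoubleSumSketch`, `parseOrZero`, and `sumRow` are hypothetical names and are not part of Druid.

```java
import java.util.List;

// Hypothetical helper, not part of Druid: it mirrors the string-handling rule
// described in the class overview for a doubleSum-style aggregation.
final class StringDoubleSumSketch {
    private StringDoubleSumSketch() {}

    // Parses one string as a double; null or unparseable input contributes zero.
    static double parseOrZero(String value) {
        if (value == null) {
            return 0.0;
        }
        try {
            return Double.parseDouble(value.trim());
        } catch (NumberFormatException e) {
            return 0.0;
        }
    }

    // Sums every individual value of a (possibly multi-value) string row.
    static double sumRow(List<String> rowValues) {
        double total = 0.0;
        for (String value : rowValues) {
            total += parseOrZero(value);
        }
        return total;
    }
}
```

Under these assumptions, a row holding `["1","1","1"]` sums to 3, and a row such as `["1","abc"]` sums to 1 because the unparseable value counts as zero.
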

| Modifier and Type | Method and Description |
|---|---|
| `boolean` | `canVectorize(ColumnInspector columnInspector)` Returns whether or not this aggregation class supports vectorization. |
| `abstract Object` | `combine(Object lhs, Object rhs)` A method that knows how to combine the outputs of Aggregator.get() produced via factorize(org.apache.druid.segment.ColumnSelectorFactory) or BufferAggregator.get(java.nio.ByteBuffer, int) produced via factorizeBuffered(org.apache.druid.segment.ColumnSelectorFactory). |
| `abstract Object` | `deserialize(Object object)` A method that knows how to "deserialize" the object from whatever form it might have been put into in order to transfer via JSON. |
| `abstract Aggregator` | `factorize(ColumnSelectorFactory metricFactory)` |
| `abstract BufferAggregator` | `factorizeBuffered(ColumnSelectorFactory metricFactory)` |
| `VectorAggregator` | `factorizeVector(VectorColumnSelectorFactory selectorFactory)` Create a VectorAggregator based on the provided column selector factory. |
| `AggregatorAndSize` | `factorizeWithSize(ColumnSelectorFactory metricFactory)` Creates an Aggregator based on the provided column selector factory. |
| `abstract Object` | `finalizeComputation(Object object)` "Finalizes" the computation of an object. |
| `abstract AggregatorFactory` | `getCombiningFactory()` Returns an AggregatorFactory that can be used to combine the output of aggregators from this factory. |
| `abstract Comparator` | `getComparator()` |
| `String` | `getComplexTypeName()` Deprecated. |
| `ValueType` | `getFinalizedType()` Deprecated. |
| `ColumnType` | `getIntermediateType()` Get the "intermediate" ColumnType for this aggregator. |
| `abstract int` | `getMaxIntermediateSize()` Returns the maximum size that this aggregator will require in bytes for intermediate storage of results. |
| `int` | `getMaxIntermediateSizeWithNulls()` Returns the maximum size that this aggregator will require in bytes for intermediate storage of results. |
| `AggregatorFactory` | `getMergingFactory(AggregatorFactory other)` Returns an AggregatorFactory that can be used to combine the output of aggregators from this factory and another factory. |
| `abstract String` | `getName()` |
| `abstract List<AggregatorFactory>` | `getRequiredColumns()` Used by GroupByStrategyV1 when running nested groupBys, to "transfer" values from this aggregator to an incremental index that the outer query will run on. |
| `ColumnType` | `getResultType()` Get the ColumnType for the final form of this aggregator, i.e. the type of the value returned by finalizeComputation(java.lang.Object). |
| `ValueType` | `getType()` Deprecated. |
| `int` | `guessAggregatorHeapFootprint(long rows)` Returns a best guess as to how much memory the on-heap Aggregator returned by factorize(org.apache.druid.segment.ColumnSelectorFactory) will require when a certain number of rows have been aggregated into it. |
| `AggregateCombiner` | `makeAggregateCombiner()` Creates an AggregateCombiner to fold rollup aggregation results from several "rows" of different indexes during index merging. |
| `AggregateCombiner` | `makeNullableAggregateCombiner()` Creates an AggregateCombiner which supports nullability. |
| `static AggregatorFactory[]` | `mergeAggregators(List<AggregatorFactory[]> aggregatorsList)` Merges the list of AggregatorFactory[] (presumably from metadata of some segments being merged) and returns merged AggregatorFactory[] (for the metadata for merged segment). |
| `AggregatorFactory` | `optimizeForSegment(PerSegmentQueryOptimizationContext optimizationContext)` Return a potentially optimized form of this AggregatorFactory for per-segment queries. |
| `abstract List<String>` | `requiredFields()` Get a list of fields that aggregators built by this factory will need to read. |
| `AggregatorFactory` | `withName(String newName)` Used in cases where we want to change the output name of the aggregator to something else. |

clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
getCacheKey
public abstract Aggregator factorize(ColumnSelectorFactory metricFactory)
public abstract BufferAggregator factorizeBuffered(ColumnSelectorFactory metricFactory)
public VectorAggregator factorizeVector(VectorColumnSelectorFactory selectorFactory)
public AggregatorAndSize factorizeWithSize(ColumnSelectorFactory metricFactory)
Creates an Aggregator based on the provided column selector factory.
The returned value is a holder object which contains both the aggregator
and its initial size in bytes. The callers can then invoke
Aggregator.aggregateWithSize() to perform aggregation and get back
the incremental memory required in each aggregate call. Combined with the
initial size, this gives the total on-heap memory required by the aggregator.
This method must include JVM object overheads in the estimated size and must ensure not to underestimate required memory as that might lead to OOM errors.
This flow does not require invoking guessAggregatorHeapFootprint(long)
which tends to over-estimate the required memory.
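
The accounting flow described for factorizeWithSize (an initial size from the holder plus the increment reported by each aggregate call) can be sketched as below. The types here are simplified stand-ins written for this illustration and are not Druid's classes; the byte constants are invented, and the toy `aggregateWithSize(long)` takes its input directly, which is an assumption made to keep the sketch self-contained.

```java
import java.util.HashSet;
import java.util.Set;

// Simplified stand-ins for illustration only; these are NOT Druid's classes.
final class SizeTrackingSketch {
    private SizeTrackingSketch() {}

    // Assumed costs in bytes; real values depend on the JVM and the aggregator.
    static final long INITIAL_SIZE_BYTES = 64;
    static final long BYTES_PER_NEW_ENTRY = 32;

    // Toy aggregator that collects distinct longs and reports its own growth.
    static final class DistinctAggregator {
        private final Set<Long> seen = new HashSet<>();

        // Aggregates one value and returns the additional bytes this call required.
        long aggregateWithSize(long value) {
            return seen.add(value) ? BYTES_PER_NEW_ENTRY : 0L;
        }

        int get() {
            return seen.size();
        }
    }

    // Holder pairing an aggregator with its initial size in bytes.
    static final class AggregatorAndInitialSize {
        final DistinctAggregator aggregator;
        final long initialSizeBytes;

        AggregatorAndInitialSize(DistinctAggregator aggregator, long initialSizeBytes) {
            this.aggregator = aggregator;
            this.initialSizeBytes = initialSizeBytes;
        }
    }

    // Hypothetical factory method returning the aggregator together with its initial size.
    static AggregatorAndInitialSize factorizeWithSize() {
        return new AggregatorAndInitialSize(new DistinctAggregator(), INITIAL_SIZE_BYTES);
    }

    // Caller-side accounting: initial size plus the increment reported by each call.
    static long totalBytesAfter(long[] values) {
        AggregatorAndInitialSize holder = factorizeWithSize();
        long total = holder.initialSizeBytes;
        for (long value : values) {
            total += holder.aggregator.aggregateWithSize(value);
        }
        return total;
    }
}
```

With these made-up constants, aggregating the values 1, 2, 2, 3 yields 64 + 3 × 32 = 160 bytes, since only the three distinct values grow the aggregator. The caller never needs a row-count-based guess; it simply adds up what each call reports.
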
public boolean canVectorize(ColumnInspector columnInspector)
public abstract Comparator getComparator()
@Nullable public abstract Object combine(@Nullable Object lhs, @Nullable Object rhs)
A method that knows how to combine the outputs of Aggregator.get() produced via factorize(org.apache.druid.segment.ColumnSelectorFactory) or BufferAggregator.get(java.nio.ByteBuffer, int) produced via factorizeBuffered(org.apache.druid.segment.ColumnSelectorFactory). Note that even though this method is called "combine",
its contract *does* allow for mutation of the input objects. Thus, any use of lhs or rhs after calling
this method is highly discouraged.
lhs - The left hand side of the combine
rhs - The right hand side of the combine
public AggregateCombiner makeAggregateCombiner()
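Creates an AggregateCombiner to fold rollup aggregation results from several "rows" of different indexes during
index merging. It implements the same logic as combine(java.lang.Object, java.lang.Object), with the difference that it uses
ColumnValueSelector and its subinterfaces to get inputs and implements ColumnValueSelector to provide output.
See also: AggregateCombiner, IndexMerger
public AggregateCombiner makeNullableAggregateCombiner()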
Creates an AggregateCombiner which supports nullability.
Implementations of AggregatorFactory which need to support nullable aggregations are encouraged
to extend NullableNumericAggregatorFactory instead of overriding this method.
The default implementation calls makeAggregateCombiner() for backwards compatibility.
public abstract AggregatorFactory getCombiningFactory()
Returns an AggregatorFactory that can be used to combine the output of aggregators from this factory.
For example, the CountAggregatorFactory getCombiningFactory method will return a
LongSumAggregatorFactory, because counts are combined by summing.
No matter what, `foo.getCombiningFactory()` and `foo.getCombiningFactory().getCombiningFactory()` should return
the same result.
public AggregatorFactory getMergingFactory(AggregatorFactory other) throws AggregatorFactoryNotMergeableException
Returns an AggregatorFactory that can be used to combine the output of aggregators from this factory and
another factory. This method may throw AggregatorFactoryNotMergeableException, meaning that "this" and "other" are not
compatible and values from one cannot sensibly be combined with values from the other.
Throws: AggregatorFactoryNotMergeableException
See also: getCombiningFactory(), which is equivalent to `foo.getMergingFactory(foo)` (when "this" and "other"
are the same instance).
public abstract List<AggregatorFactory> getRequiredColumns()
Used by GroupByStrategyV1 when running nested groupBys, to
"transfer" values from this aggregator to an incremental index that the outer query will run on. This method
only exists due to the design of GroupByStrategyV1, and should probably not be used for anything else. If you are
here because you are looking for a way to get the input fields required by this aggregator, and thought
"getRequiredColumns" sounded right, please use requiredFields() instead.
See also: requiredFields(), a similarly-named method that is perhaps the one you want instead.
public abstract Object deserialize(Object object)
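A method that knows how to "deserialize" the object from whatever form it might have been put into
in order to transfer via JSON.
object - the object to deserialize
@Nullable public abstract Object finalizeComputation(@Nullable Object object)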
"Finalizes" the computation of an object.
object - the object to be finalized
public abstract String getName()
public abstract List<String> requiredFields()
public ColumnType getIntermediateType()
Get the "intermediate" ColumnType for this aggregator. This is the same as the type returned by
deserialize(java.lang.Object) and the type accepted by combine(java.lang.Object, java.lang.Object). However, it is *not* necessarily the same type
returned by finalizeComputation(java.lang.Object).
Refer to the ColumnType javadocs for details on the implications of choosing a type.
public ColumnType getResultType()
Get the ColumnType for the final form of this aggregator, i.e. the type of the value returned by
finalizeComputation(java.lang.Object). This may be the same as or different than the types expected in deserialize(java.lang.Object)
and combine(java.lang.Object, java.lang.Object).
Refer to the ColumnType javadocs for details on the implications of choosing a type.
@Deprecated public ValueType getType()
Deprecated. Use getIntermediateType() instead. Do not call this
method; it will likely produce incorrect results. It exists only for backwards compatibility.
@Deprecated public ValueType getFinalizedType()
Deprecated. Use getResultType() instead. Do not call this
method; it will likely produce incorrect results. It exists only for backwards compatibility.
@Nullable @Deprecated public String getComplexTypeName()
Deprecated. Use getIntermediateType() instead. Do not call this
method; it will likely produce incorrect results. It exists only for backwards compatibility.
public abstract int getMaxIntermediateSize()
public int getMaxIntermediateSizeWithNulls()
Returns the maximum size that this aggregator will require in bytes for intermediate storage of results.
Implementations of AggregatorFactory which need to support nullable aggregations are encouraged
to extend NullableNumericAggregatorFactory instead of overriding this method.
The default implementation calls getMaxIntermediateSize() for backwards compatibility.
public int guessAggregatorHeapFootprint(long rows)
Returns a best guess as to how much memory the on-heap
Aggregator returned by factorize(org.apache.druid.segment.ColumnSelectorFactory) will
require when a certain number of rows have been aggregated into it.
The main user of this method is OnheapIncrementalIndex, which
uses it to determine when to persist the current in-memory data to disk.
Important note for callers! In nearly all cases, callers that wish to constrain memory would be better off
using factorizeBuffered(org.apache.druid.segment.ColumnSelectorFactory) or factorizeVector(org.apache.druid.segment.vector.VectorColumnSelectorFactory), which offer precise control over how much memory
is being used.
public AggregatorFactory optimizeForSegment(PerSegmentQueryOptimizationContext optimizationContext)
public AggregatorFactory withName(String newName)
Used in cases where we want to change the output name of the aggregator to something else, for example an output name
assigned in org.apache.druid.sql.calcite.rel.DruidQuery#computeAggregations. We can use withName("total") to set the output name
of the aggregator to "total".
Implementations of this method may not exist for every aggregator factory, so callers are advised to handle such a case.
newName - new name of the output for the aggregator factory
@Nullable public static AggregatorFactory[] mergeAggregators(List<AggregatorFactory[]> aggregatorsList)
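Merges the list of AggregatorFactory[] (presumably from metadata of some segments being merged) and
returns merged AggregatorFactory[] (for the metadata for merged segment).
aggregatorsList -
Copyright © 2011–2023 The Apache Software Foundation. All rights reserved.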