Package org.apache.druid.query.groupby
Class GroupingEngine
java.lang.Object
    org.apache.druid.query.groupby.GroupingEngine
public class GroupingEngine extends Object
-
-
Field Summary
Fields
static String CTX_KEY_FUDGE_TIMESTAMP
static String CTX_KEY_OUTERMOST
-
Constructor Summary
Constructors
GroupingEngine(DruidProcessingConfig processingConfig, com.google.common.base.Supplier<GroupByQueryConfig> configSupplier, NonBlockingPool<ByteBuffer> bufferPool, BlockingPool<ByteBuffer> mergeBufferPool, com.fasterxml.jackson.databind.ObjectMapper jsonMapper, com.fasterxml.jackson.databind.ObjectMapper spillMapper, QueryWatcher queryWatcher)
-
Method Summary
Methods
Sequence<ResultRow> applyPostProcessing(Sequence<ResultRow> results, GroupByQuery query)
    Apply the GroupByQuery "postProcessingFn", which is responsible for HavingSpec and LimitSpec.
BinaryOperator<ResultRow> createMergeFn(Query<ResultRow> queryParam)
    See QueryToolChest.createMergeFn(Query) for details; allows GroupByQueryQueryToolChest to delegate implementation to the strategy.
Comparator<ResultRow> createResultComparator(Query<ResultRow> queryParam)
    See QueryToolChest.createResultComparator(Query); allows GroupByQueryQueryToolChest to delegate implementation to the strategy.
Sequence<ResultRow> mergeResults(QueryRunner<ResultRow> baseRunner, GroupByQuery query, ResponseContext responseContext)
    Runs a provided QueryRunner on a provided GroupByQuery, which is assumed to return rows that are properly sorted (by timestamp and dimensions) but not necessarily fully merged (that is, there may be adjacent rows with the same timestamp and dimensions) and without PostAggregators computed.
QueryRunner<ResultRow> mergeRunners(QueryProcessingPool queryProcessingPool, Iterable<QueryRunner<ResultRow>> queryRunners)
    Merge a variety of single-segment query runners into a combined runner.
GroupByQuery prepareGroupByQuery(GroupByQuery query)
GroupByQueryResources prepareResource(GroupByQuery query)
    Initializes resources required to run GroupByQueryQueryToolChest.mergeResults(QueryRunner) for a particular query.
Sequence<ResultRow> process(GroupByQuery query, StorageAdapter storageAdapter, GroupByQueryMetrics groupByQueryMetrics)
    Process a groupBy query on a single StorageAdapter.
Sequence<ResultRow> processSubqueryResult(GroupByQuery subquery, GroupByQuery query, GroupByQueryResources resource, Sequence<ResultRow> subqueryResult, boolean wasQueryPushedDown)
    Called by GroupByQueryQueryToolChest.mergeResults(QueryRunner) when it needs to process a subquery.
Sequence<ResultRow> processSubtotalsSpec(GroupByQuery query, GroupByQueryResources resource, Sequence<ResultRow> queryResult)
    Called by GroupByQueryQueryToolChest.mergeResults(QueryRunner) when it needs to generate subtotals.
static Sequence<ResultRow> wrapSummaryRowIfNeeded(GroupByQuery query, Sequence<ResultRow> process)
    Wraps the sequence so that a summary row can be produced if this query might need one when the input turns out to be empty.
-
-
-
Field Detail
-
CTX_KEY_FUDGE_TIMESTAMP
public static final String CTX_KEY_FUDGE_TIMESTAMP
See Also: Constant Field Values
-
CTX_KEY_OUTERMOST
public static final String CTX_KEY_OUTERMOST
See Also: Constant Field Values
-
-
Constructor Detail
-
GroupingEngine
@Inject public GroupingEngine(DruidProcessingConfig processingConfig, com.google.common.base.Supplier<GroupByQueryConfig> configSupplier, NonBlockingPool<ByteBuffer> bufferPool, BlockingPool<ByteBuffer> mergeBufferPool, com.fasterxml.jackson.databind.ObjectMapper jsonMapper, com.fasterxml.jackson.databind.ObjectMapper spillMapper, QueryWatcher queryWatcher)
-
-
Method Detail
-
prepareResource
public GroupByQueryResources prepareResource(GroupByQuery query)
Initializes resources required to run GroupByQueryQueryToolChest.mergeResults(QueryRunner) for a particular query. That method is also the primary caller of this method.
Parameters:
query - a groupBy query to be processed
Returns:
broker resource
-
createResultComparator
public Comparator<ResultRow> createResultComparator(Query<ResultRow> queryParam)
See QueryToolChest.createResultComparator(Query); allows GroupByQueryQueryToolChest to delegate implementation to the strategy.
-
createMergeFn
public BinaryOperator<ResultRow> createMergeFn(Query<ResultRow> queryParam)
See QueryToolChest.createMergeFn(Query) for details; allows GroupByQueryQueryToolChest to delegate implementation to the strategy.
-
prepareGroupByQuery
public GroupByQuery prepareGroupByQuery(GroupByQuery query)
-
mergeResults
public Sequence<ResultRow> mergeResults(QueryRunner<ResultRow> baseRunner, GroupByQuery query, ResponseContext responseContext)
Runs a provided QueryRunner on a provided GroupByQuery, which is assumed to return rows that are properly sorted (by timestamp and dimensions) but not necessarily fully merged (that is, there may be adjacent rows with the same timestamp and dimensions) and without PostAggregators computed. This method will fully merge the rows, apply PostAggregators, and return the resulting Sequence.
The query will be modified using prepareGroupByQuery(GroupByQuery) before passing it down to the base runner. For example, "having" clauses will be removed and various context parameters will be adjusted.
Despite the similar name, this method is much reduced in scope compared to GroupByQueryQueryToolChest.mergeResults(QueryRunner). That method does delegate to this one at some points, but has a truckload of other responsibilities, including computing outer query results (if there are subqueries), computing subtotals (like GROUPING SETS), and computing the havingSpec and limitSpec.
Parameters:
baseRunner - base query runner
query - the groupBy query to run inside the base query runner
responseContext - the response context to pass to the base query runner
Returns:
merged result sequence
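The merge step described for mergeResults can be pictured with a small, self-contained sketch. It does not use Druid's classes: `Row`, `mergeSorted`, and the single summed metric are hypothetical stand-ins, and it works on an in-memory list rather than a lazy Sequence. Assuming the input is already sorted by timestamp and dimension, adjacent rows with equal keys are combined into one:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: not Druid's implementation, and Row is a hypothetical type.
final class MergeAdjacentSketch {
    static final class Row {
        final long timestamp;
        final String dim;
        final long sum;

        Row(long timestamp, String dim, long sum) {
            this.timestamp = timestamp;
            this.dim = dim;
            this.sum = sum;
        }

        // Two rows belong to the same group when timestamp and dimension match.
        boolean sameKey(Row other) {
            return timestamp == other.timestamp && dim.equals(other.dim);
        }
    }

    // Assumes `sorted` is ordered by (timestamp, dim), so equal keys are adjacent.
    static List<Row> mergeSorted(List<Row> sorted) {
        List<Row> out = new ArrayList<>();
        for (Row row : sorted) {
            int last = out.size() - 1;
            if (last >= 0 && out.get(last).sameKey(row)) {
                // Same group as the previous output row: fold the metric into it.
                Row prev = out.get(last);
                out.set(last, new Row(prev.timestamp, prev.dim, prev.sum + row.sum));
            } else {
                out.add(row);
            }
        }
        return out;
    }
}
```

Because the input is sorted, a single pass that only compares each row with the previous output row is enough; no hash table over all groups is needed in this simplified picture.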
-
mergeRunners
public QueryRunner<ResultRow> mergeRunners(QueryProcessingPool queryProcessingPool, Iterable<QueryRunner<ResultRow>> queryRunners)
Merge a variety of single-segment query runners into a combined runner. Used by GroupByQueryRunnerFactory.mergeRunners(QueryProcessingPool, Iterable). In that sense, it is intended to go along with process(GroupByQuery, StorageAdapter, GroupByQueryMetrics) (the runners created by that method will be fed into this method).
This method is only called on data servers, like Historicals (not the Broker).
Parameters:
queryProcessingPool - QueryProcessingPool service used for parallel execution of the query runners
queryRunners - collection of query runners to merge
Returns:
merged query runner
-
process
public Sequence<ResultRow> process(GroupByQuery query, StorageAdapter storageAdapter, @Nullable GroupByQueryMetrics groupByQueryMetrics)
Process a groupBy query on a single StorageAdapter. This is used by GroupByQueryRunnerFactory.createRunner(org.apache.druid.segment.Segment) to create per-segment QueryRunners. This method is only called on data servers, like Historicals (not the Broker).
Parameters:
query - the groupBy query
storageAdapter - storage adapter for the segment in question
Returns:
result sequence for the storage adapter
-
applyPostProcessing
public Sequence<ResultRow> applyPostProcessing(Sequence<ResultRow> results, GroupByQuery query)
Apply the GroupByQuery "postProcessingFn", which is responsible for HavingSpec and LimitSpec.
Parameters:
results - sequence of results
query - the groupBy query
Returns:
post-processed results, with HavingSpec and LimitSpec applied
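As a rough illustration of what a post-processing function of this kind does, the sketch below filters rows with a "having" predicate and then truncates to a limit. It is not Druid code: `applyHavingThenLimit` is a hypothetical helper, the order (having first, then limit) is an assumption that follows SQL semantics, and any ordering that a real LimitSpec may carry is left out.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Illustrative only: a generic stand-in for a having-then-limit post-processing step.
final class PostProcessingSketch {
    // Keeps rows that satisfy `having`, then returns at most `limit` of them,
    // preserving the input order.
    static <T> List<T> applyHavingThenLimit(List<T> rows, Predicate<? super T> having, int limit) {
        return rows.stream()
                .filter(having)
                .limit(limit)
                .collect(Collectors.toList());
    }
}
```

Applying the filter before the limit matters: limiting first could discard rows that pass the predicate and return fewer results than requested.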
-
processSubqueryResult
public Sequence<ResultRow> processSubqueryResult(GroupByQuery subquery, GroupByQuery query, GroupByQueryResources resource, Sequence<ResultRow> subqueryResult, boolean wasQueryPushedDown)
Called by GroupByQueryQueryToolChest.mergeResults(QueryRunner) when it needs to process a subquery.
Parameters:
subquery - inner query
query - outer query
resource - resources returned by prepareResource(GroupByQuery)
subqueryResult - result rows from the subquery
wasQueryPushedDown - true if the outer query was pushed down (so we only need to merge the outer query's results, not run it from scratch like a normal outer query)
Returns:
results of the outer query
-
processSubtotalsSpec
public Sequence<ResultRow> processSubtotalsSpec(GroupByQuery query, GroupByQueryResources resource, Sequence<ResultRow> queryResult)
Called by GroupByQueryQueryToolChest.mergeResults(QueryRunner) when it needs to generate subtotals.
Parameters:
query - query that has a "subtotalsSpec"
resource - resources returned by prepareResource(GroupByQuery)
queryResult - result rows from the main query
Returns:
results for each list of subtotals in the query, concatenated together
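The "one result set per subtotal list, concatenated" shape can be sketched without Druid's classes. Everything below is hypothetical: `Row`, `row`, and `subtotals` are illustrative names, the only metric is a summed count, and dimensions outside a subtotal list are simply dropped from the output row, which is a simplification of this sketch. For each list of dimensions, the main result rows are regrouped on just those dimensions, and the per-list results are appended in order:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: not Druid's implementation.
final class SubtotalsSketch {
    static final class Row {
        final Map<String, String> dims;
        final long count;

        Row(Map<String, String> dims, long count) {
            this.dims = dims;
            this.count = count;
        }
    }

    // Convenience factory: alternating dimension names and values.
    static Row row(long count, String... keyValues) {
        Map<String, String> dims = new LinkedHashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            dims.put(keyValues[i], keyValues[i + 1]);
        }
        return new Row(dims, count);
    }

    // For each dimension list in the spec, regroup the rows on those dimensions
    // and sum the counts; results for all lists are concatenated in spec order.
    static List<Row> subtotals(List<Row> rows, List<List<String>> subtotalsSpec) {
        List<Row> out = new ArrayList<>();
        for (List<String> keep : subtotalsSpec) {
            Map<Map<String, String>, Long> grouped = new LinkedHashMap<>();
            for (Row r : rows) {
                Map<String, String> key = new LinkedHashMap<>();
                for (String d : keep) {
                    key.put(d, r.dims.get(d));
                }
                grouped.merge(key, r.count, Long::sum);
            }
            for (Map.Entry<Map<String, String>, Long> e : grouped.entrySet()) {
                out.add(new Row(e.getKey(), e.getValue()));
            }
        }
        return out;
    }
}
```

An empty dimension list in the spec yields a single grand-total row, which matches the way GROUPING SETS with an empty set behaves in SQL.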
-
wrapSummaryRowIfNeeded
public static Sequence<ResultRow> wrapSummaryRowIfNeeded(GroupByQuery query, Sequence<ResultRow> process)
Wraps the sequence so that a summary row can be produced if this query might need one when the input turns out to be empty.
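One way to picture this kind of wrapper is an iterator that passes rows through unchanged and supplies a single fallback row only when the input produced nothing. This is a sketch under assumptions, not Druid's code: `orSummaryRow` and the `Supplier` argument are hypothetical, and it uses `java.util.Iterator` in place of Druid's Sequence.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Supplier;

// Illustrative only: a lazy "emit a summary row if the input is empty" wrapper.
final class SummaryRowSketch {
    static <T> Iterator<T> orSummaryRow(Iterator<T> input, Supplier<T> summaryRow) {
        return new Iterator<T>() {
            private boolean sawInput = false;
            private boolean summaryEmitted = false;

            @Override
            public boolean hasNext() {
                if (input.hasNext()) {
                    return true;
                }
                // Input is exhausted: a summary row is due only if no row was ever returned.
                return !sawInput && !summaryEmitted;
            }

            @Override
            public T next() {
                if (input.hasNext()) {
                    sawInput = true;
                    return input.next();
                }
                if (!sawInput && !summaryEmitted) {
                    summaryEmitted = true;
                    return summaryRow.get();
                }
                throw new NoSuchElementException();
            }
        };
    }
}
```

The decision is deferred until the input is actually consumed, which is why the wrapper is applied up front for any query that might need a summary row rather than after checking for emptiness.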
-
-