public abstract class QueryToolChest<ResultType,QueryType extends Query<ResultType>> extends Object
| Modifier | Constructor and Description |
|---|---|
| protected | QueryToolChest() |
| Modifier and Type | Method and Description |
|---|---|
| boolean | canPerformSubquery(Query<?> subquery): Returns whether this toolchest is able to handle the provided subquery. |
| BinaryOperator<ResultType> | createMergeFn(Query<ResultType> query): Creates a merge function that is used to merge intermediate aggregates from historicals in broker. |
| Comparator<ResultType> | createResultComparator(Query<ResultType> query): Creates an ordering comparator that is used to order results. |
| com.fasterxml.jackson.databind.ObjectMapper | decorateObjectMapper(com.fasterxml.jackson.databind.ObjectMapper objectMapper, QueryType query): Perform any per-query decoration of an ObjectMapper that enables it to read and write objects of the query's ResultType. |
| <T extends LogicalSegment> List<T> | filterSegments(QueryType query, List<T> segments): This method is called to allow the query to prune segments that it does not believe need to actually be queried. |
| com.fasterxml.jackson.databind.JavaType | getBaseResultType() |
| com.fasterxml.jackson.databind.JavaType | getBySegmentResultType() |
| <T> CacheStrategy<ResultType,T,QueryType> | getCacheStrategy(QueryType query): Returns a CacheStrategy to be used to load data into the cache and remove it from the cache. |
| abstract com.fasterxml.jackson.core.type.TypeReference<ResultType> | getResultTypeReference(): Returns a TypeReference object that is just passed through to Jackson in order to deserialize the results of this type of query. |
| abstract QueryMetrics<? super QueryType> | makeMetrics(QueryType query): Creates a QueryMetrics object that is used to generate metrics for this specific query type. |
| com.google.common.base.Function<ResultType,ResultType> | makePostComputeManipulatorFn(QueryType query, MetricManipulationFn fn): Generally speaking this is the exact same thing as makePreComputeManipulatorFn. |
| abstract com.google.common.base.Function<ResultType,ResultType> | makePreComputeManipulatorFn(QueryType query, MetricManipulationFn fn): Creates a Function that can take in a ResultType and return a new ResultType having applied the MetricManipulationFn to each of the metrics. |
| QueryRunner<ResultType> | mergeResults(QueryRunner<ResultType> runner): This method wraps a QueryRunner. |
| QueryRunner<ResultType> | postMergeQueryDecoration(QueryRunner<ResultType> runner): Wraps a QueryRunner. |
| QueryRunner<ResultType> | preMergeQueryDecoration(QueryRunner<ResultType> runner): Wraps a QueryRunner. |
| RowSignature | resultArraySignature(QueryType query): Returns a RowSignature for the arrays returned by resultsAsArrays(QueryType, org.apache.druid.java.util.common.guava.Sequence<ResultType>). |
| Sequence<Object[]> | resultsAsArrays(QueryType query, Sequence<ResultType> resultSequence): Converts a sequence of this query's ResultType into arrays. |
| Optional<Sequence<FrameSignaturePair>> | resultsAsFrames(QueryType query, Sequence<ResultType> resultSequence, MemoryAllocatorFactory memoryAllocatorFactory, boolean useNestedForUnknownTypes): Converts a sequence of this query's ResultType into a sequence of FrameSignaturePair. |
public final com.fasterxml.jackson.databind.JavaType getBaseResultType()
public final com.fasterxml.jackson.databind.JavaType getBySegmentResultType()
public com.fasterxml.jackson.databind.ObjectMapper decorateObjectMapper(com.fasterxml.jackson.databind.ObjectMapper objectMapper, QueryType query)
Perform any per-query decoration of an ObjectMapper that enables it to read and write objects of the
query's ResultType. It is used by QueryResource on the write side, and DirectDruidClient on the read side.
For most queries, this is a no-op, but it can be useful for query types that support more than one result
serialization format. Queries that implement this method must not modify the provided ObjectMapper, but instead
must return a copy.

public QueryRunner<ResultType> mergeResults(QueryRunner<ResultType> runner)
This method wraps a QueryRunner. A default implementation constructs a ResultMergeQueryRunner which creates a
CombiningSequence using the supplied QueryRunner with
createResultComparator(Query) and createMergeFn(Query) supplied by this
toolchest.
Parameters:
runner - A QueryRunner that provides a series of ResultType objects in time order (ascending or descending)

@Nullable public BinaryOperator<ResultType> createMergeFn(Query<ResultType> query)
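Creates a merge function that is used to merge intermediate aggregates from Historicals on the Broker.
This merge function is used in the default ResultMergeQueryRunner provided by
mergeResults(QueryRunner) and is also used in
ParallelMergeCombiningSequence by 'CachingClusteredClient' if it
does not return null.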
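As a rough illustration of how a merge function and a result comparator cooperate, the sketch below folds an already-ordered list and merges each result into its predecessor whenever the comparator reports the two as equal. It uses only JDK types: Row, combine, and the sample data are hypothetical stand-ins rather than Druid classes, and the real merge operates on lazy sequences rather than lists.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.BinaryOperator;

class MergeFnSketch {
    // Hypothetical stand-in for a query's ResultType: a time bucket plus a partial count.
    static final class Row {
        final long timestamp;
        final long count;

        Row(long timestamp, long count) {
            this.timestamp = timestamp;
            this.count = count;
        }

        @Override
        public String toString() {
            return "Row(" + timestamp + ", " + count + ")";
        }
    }

    // Folds an already-ordered list, merging each result into its predecessor when the
    // comparator reports them as equal. A null mergeFn is treated as "no merging" in this sketch.
    static <T> List<T> combine(List<T> ordered, Comparator<T> comparator, BinaryOperator<T> mergeFn) {
        List<T> out = new ArrayList<>();
        for (T next : ordered) {
            int last = out.size() - 1;
            if (mergeFn != null && last >= 0 && comparator.compare(out.get(last), next) == 0) {
                out.set(last, mergeFn.apply(out.get(last), next));
            } else {
                out.add(next);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Comparator<Row> byTime = Comparator.comparingLong(row -> row.timestamp);
        BinaryOperator<Row> sumCounts = (a, b) -> new Row(a.timestamp, a.count + b.count);
        List<Row> partials = List.of(new Row(1000L, 2), new Row(1000L, 3), new Row(2000L, 7));
        System.out.println(combine(partials, byTime, sumCounts));
    }
}
```

The two partial rows for the same time bucket collapse into one, which is the kind of work a Broker-side merge performs on intermediate aggregates.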
Returning null from this function means that a query does not support result merging, at
least via the mechanisms that utilize this function.

public Comparator<ResultType> createResultComparator(Query<ResultType> query)
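Creates an ordering comparator that is used to order results. This comparator is used in the default
ResultMergeQueryRunner provided by mergeResults(QueryRunner).

public abstract QueryMetrics<? super QueryType> makeMetrics(QueryType query)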
Creates a QueryMetrics object that is used to generate metrics for this specific query type. This exists
to allow for query-specific dimensions and metrics. That is, the ToolChest is expected to set some
meaningful dimensions for metrics given this query type. Examples might be the topN threshold for
a TopN query or the number of dimensions included for a groupBy query.
QueryToolChests for query types in core (druid-processing) and public extensions (belonging to the Druid source
tree) should delegate this method to GenericQueryMetricsFactory.makeMetrics(Query) on an injected
instance of GenericQueryMetricsFactory, as long as they don't need to emit custom dimensions and/or
metrics.
If some custom dimensions and/or metrics should be emitted for a query type, the plan described in the
"Making subinterfaces of QueryMetrics" section of QueryMetrics's class-level Javadocs should be followed.
One way or another, this method should ensure that QueryMetrics.query(Query) is called with the given
query passed on the created QueryMetrics object before returning.
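The delegation case can be pictured with the following minimal sketch. Every type in it (Query, Metrics, GenericFactory, Toolchest) is a hypothetical, heavily simplified stand-in for the Druid class of similar name, and the "type" dimension is only an assumed example; the point is the shape of the contract, namely that query(...) has been called on the metrics object before it is returned.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class MakeMetricsSketch {
    // Hypothetical stand-in for a query; the real Query interface is much richer.
    interface Query {
        String getType();
    }

    // Hypothetical stand-in for QueryMetrics that only collects dimensions.
    static final class Metrics {
        final Map<String, String> dimensions = new LinkedHashMap<>();

        // Plays the role of QueryMetrics.query(Query): records dimensions derived from the query.
        void query(Query query) {
            dimensions.put("type", query.getType());
        }
    }

    // Stand-in for an injected generic factory that a toolchest can delegate to.
    static final class GenericFactory {
        Metrics makeMetrics(Query query) {
            Metrics metrics = new Metrics();
            metrics.query(query); // the contract: query(...) is called before returning
            return metrics;
        }
    }

    // Stand-in toolchest with no custom dimensions: it simply delegates.
    static final class Toolchest {
        private final GenericFactory factory;

        Toolchest(GenericFactory factory) {
            this.factory = factory;
        }

        Metrics makeMetrics(Query query) {
            return factory.makeMetrics(query);
        }
    }

    public static void main(String[] args) {
        Toolchest toolchest = new Toolchest(new GenericFactory());
        Metrics metrics = toolchest.makeMetrics(() -> "timeseries");
        System.out.println(metrics.dimensions);
    }
}
```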
Parameters:
query - The query that is being processed

public abstract com.google.common.base.Function<ResultType,ResultType> makePreComputeManipulatorFn(QueryType query, MetricManipulationFn fn)
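Creates a Function that can take in a ResultType and return a new ResultType having applied the MetricManipulationFn to each of the metrics.
This exists because the QueryToolChest is the only thing that understands the internal serialization format of ResultType, so its primary responsibility is to "decompose" that structure and apply the given function to all metrics.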
This function is called very early in the processing pipeline on the Broker.
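A minimal sketch of the "decompose and apply" idea follows. It assumes a hypothetical Result type that keeps its metrics in a map and a hypothetical MetricFn interface that receives a metric name and value; the real MetricManipulationFn has a different signature, and this sketch uses java.util.function.Function instead of the Guava Function in the method signature.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

class ManipulatorFnSketch {
    // Hypothetical simplification of MetricManipulationFn: takes a metric name and its value.
    interface MetricFn {
        Object manipulate(String metricName, Object value);
    }

    // Hypothetical ResultType: a timestamp plus metric values keyed by name.
    static final class Result {
        final long timestamp;
        final Map<String, Object> metrics;

        Result(long timestamp, Map<String, Object> metrics) {
            this.timestamp = timestamp;
            this.metrics = metrics;
        }
    }

    // Decomposes a Result and applies fn to every metric, returning a new Result.
    static Function<Result, Result> makeManipulatorFn(MetricFn fn) {
        return input -> {
            Map<String, Object> manipulated = new LinkedHashMap<>();
            input.metrics.forEach((name, value) -> manipulated.put(name, fn.manipulate(name, value)));
            return new Result(input.timestamp, manipulated);
        };
    }

    public static void main(String[] args) {
        Map<String, Object> metrics = new LinkedHashMap<>();
        metrics.put("rows", 10L);
        metrics.put("added", 25L);
        Function<Result, Result> doubler = makeManipulatorFn((name, value) -> ((Long) value) * 2);
        System.out.println(doubler.apply(new Result(1000L, metrics)).metrics);
    }
}
```

The input Result is left untouched; only the toolchest-like code needs to know where the metrics live inside the result structure.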
Parameters:
query - The Query that is currently being processed
fn - The function that should be applied to all metrics in the results

public com.google.common.base.Function<ResultType,ResultType> makePostComputeManipulatorFn(QueryType query, MetricManipulationFn fn)
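Generally speaking this is the exact same thing as makePreComputeManipulatorFn.
Parameters:
query - The Query that is currently being processed
fn - The function that should be applied to all metrics in the results

public abstract com.fasterxml.jackson.core.type.TypeReference<ResultType> getResultTypeReference()
Returns a TypeReference object that is just passed through to Jackson in order to deserialize
the results of this type of query.

@Nullable public <T> CacheStrategy<ResultType,T,QueryType> getCacheStrategy(QueryType query)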
Returns a CacheStrategy to be used to load data into the cache and remove it from the cache.
This is optional. If it returns null, caching is effectively disabled for the query.
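From the caller's side, a nullable strategy might be handled along these lines. This is only a sketch under stated assumptions: Strategy, keyFor, and the in-memory map are hypothetical stand-ins and do not reflect the real CacheStrategy interface or Druid's cache implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

class NullableCacheStrategySketch {
    // Hypothetical stand-in for a CacheStrategy: it only knows how to derive a cache key.
    interface Strategy {
        String keyFor(String query);
    }

    private final Map<String, String> cache = new HashMap<>();

    // Runs the query through the cache when a strategy exists; otherwise always computes.
    String run(String query, Strategy strategyOrNull, Supplier<String> compute) {
        if (strategyOrNull == null) {
            return compute.get(); // no strategy: caching is effectively disabled
        }
        return cache.computeIfAbsent(strategyOrNull.keyFor(query), key -> compute.get());
    }

    public static void main(String[] args) {
        NullableCacheStrategySketch caller = new NullableCacheStrategySketch();
        int[] computations = {0};
        Supplier<String> compute = () -> "result#" + (++computations[0]);

        caller.run("q", query -> "key:" + query, compute);
        caller.run("q", query -> "key:" + query, compute); // served from the cache
        caller.run("q", null, compute);                    // bypasses the cache
        System.out.println(computations[0]);
    }
}
```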
Type Parameters:
T - The type of object that will be stored in the cache
Parameters:
query - The query whose results might be cached

public QueryRunner<ResultType> preMergeQueryDecoration(QueryRunner<ResultType> runner)
Wraps a QueryRunner. In fact, the return value of this method is always passed to mergeResults, so it is equivalent to implementing this functionality as extra decoration on the QueryRunner during mergeResults().
In the interests of potentially simplifying these interfaces, the recommendation is to actually not override this method and instead apply anything that might be needed here in the mergeResults() call.
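The equivalence can be checked with a small sketch. Runner, tag, and decorate are hypothetical stand-ins (a Runner here just maps a query string to a list of rows), and the composition order preMerge, then merge, then postMerge is an assumption drawn from the descriptions of these methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

class DecorationOrderSketch {
    // Hypothetical stand-in for QueryRunner: turns a query string into a list of result rows.
    interface Runner {
        List<String> run(String query);
    }

    // Wraps a runner so that it appends a label, making the order of decoration visible.
    static Runner tag(Runner delegate, String label) {
        return query -> {
            List<String> rows = new ArrayList<>(delegate.run(query));
            rows.add(label);
            return rows;
        };
    }

    // Applies preMerge first, then merge, then postMerge.
    static Runner decorate(Runner base, UnaryOperator<Runner> preMerge, UnaryOperator<Runner> merge,
                           UnaryOperator<Runner> postMerge) {
        return postMerge.apply(merge.apply(preMerge.apply(base)));
    }

    public static void main(String[] args) {
        Runner base = query -> List.of("row");

        // Variant A: a separate pre-merge decoration.
        Runner separate = decorate(base, r -> tag(r, "pre"), r -> tag(r, "merge"), UnaryOperator.identity());
        // Variant B: the same decoration folded into the merge step.
        Runner folded = decorate(base, UnaryOperator.identity(), r -> tag(tag(r, "pre"), "merge"),
                                 UnaryOperator.identity());

        System.out.println(separate.run("q").equals(folded.run("q")));
    }
}
```

Both variants produce the same rows, which is why leaving the decoration hooks alone and doing the work inside the merge step loses nothing.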
Parameters:
runner - The runner to be wrapped

public QueryRunner<ResultType> postMergeQueryDecoration(QueryRunner<ResultType> runner)
Wraps a QueryRunner. In fact, the input value of this method is always the return value from mergeResults, so it is equivalent to implementing this functionality as extra decoration on the QueryRunner during mergeResults().
In the interests of potentially simplifying these interfaces, the recommendation is to actually not override this method and instead apply anything that might be needed here in the mergeResults() call.
Parameters:
runner - The runner to be wrapped

public <T extends LogicalSegment> List<T> filterSegments(QueryType query, List<T> segments)
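This method is called to allow the query to prune segments that it does not believe need to actually
be queried.
Type Parameters:
T - A Generic parameter because Java is cool
Parameters:
query - The query being processed
segments - The list of candidate segments to be queried

public boolean canPerformSubquery(Query<?> subquery)
Returns whether this toolchest is able to handle the provided subquery.

public RowSignature resultArraySignature(QueryType query)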
Returns a RowSignature for the arrays returned by resultsAsArrays(QueryType, org.apache.druid.java.util.common.guava.Sequence<ResultType>). The returned signature will
be the same length as each array returned by resultsAsArrays(QueryType, org.apache.druid.java.util.common.guava.Sequence<ResultType>).
Parameters:
query - same query passed to resultsAsArrays(QueryType, org.apache.druid.java.util.common.guava.Sequence<ResultType>)
Throws:
UnsupportedOperationException - if this query type does not support returning results as arrays

public Sequence<Object[]> resultsAsArrays(QueryType query, Sequence<ResultType> resultSequence)
Converts a sequence of this query's ResultType into arrays. The array signature is given by
resultArraySignature(QueryType). This functionality is useful because it allows higher-level processors to operate on
the results of any query in a consistent way. This is useful for the SQL layer and for any algorithm that might
operate on the results of an inner query.
Not all query types support this method. They will throw UnsupportedOperationException, and they cannot
be used by the SQL layer or by generic higher-level algorithms.
Some query types return less information after translating their results into arrays, especially in situations
where there is no clear way to translate fully rich results into flat arrays. For example, the scan query does not
include the segmentId in its array-based results, because it could potentially conflict with a 'segmentId' field
in the actual datasource being scanned.
It is possible that there will be multiple arrays returned for a single result object. For example, in the topN
query, each TopNResultValue will generate a separate array for each of its
values.
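The one-to-many case can be sketched as follows. TopNLikeResult and the assumed signature [__time, dim, metric] are hypothetical stand-ins for illustration, and plain lists are used instead of Sequence; the timestamp is carried as a long in the first position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ResultsAsArraysSketch {
    // Hypothetical stand-in for a topN-style result: one timestamp with several ranked values.
    static final class TopNLikeResult {
        final long timestampMillis;
        final List<Object[]> values; // each entry is {dimensionValue, metricValue}

        TopNLikeResult(long timestampMillis, List<Object[]> values) {
            this.timestampMillis = timestampMillis;
            this.values = values;
        }
    }

    // Emits one flat array per value; the assumed signature is [__time, dim, metric].
    static List<Object[]> resultsAsArrays(List<TopNLikeResult> results) {
        List<Object[]> rows = new ArrayList<>();
        for (TopNLikeResult result : results) {
            for (Object[] value : result.values) {
                rows.add(new Object[]{result.timestampMillis, value[0], value[1]});
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        TopNLikeResult result = new TopNLikeResult(
            1000L,
            List.of(new Object[]{"US", 42L}, new Object[]{"CA", 17L})
        );
        for (Object[] row : resultsAsArrays(List.of(result))) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

Every emitted array has the same length as the assumed signature, which is what lets a generic consumer treat the rows uniformly.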
By convention, the array form should include the __time column, if present, as a long (milliseconds since epoch).
Parameters:
resultSequence - results of the form returned by mergeResults(org.apache.druid.query.QueryRunner<ResultType>)
Throws:
UnsupportedOperationException - if this query type does not support returning results as arrays

public Optional<Sequence<FrameSignaturePair>> resultsAsFrames(QueryType query, Sequence<ResultType> resultSequence, MemoryAllocatorFactory memoryAllocatorFactory, boolean useNestedForUnknownTypes)
Converts a sequence of this query's ResultType into a sequence of FrameSignaturePair. The array signature
is the one given by resultArraySignature(Query). If the toolchest doesn't support this method, then it can
return an empty optional. It is the duty of the callers to throw an appropriate exception in that case or use an
alternative fallback approach.
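A caller-side fallback might look like the sketch below. Conversions and its two factories are hypothetical stand-ins: "frames" are plain strings here, lists replace Sequence, and falling back to the array form is just one possible choice alongside throwing an exception.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

class FramesFallbackSketch {
    // Hypothetical stand-in for the two conversion methods; "frames" are plain strings here.
    interface Conversions {
        Optional<List<String>> resultsAsFrames(List<String> results);

        List<Object[]> resultsAsArrays(List<String> results);
    }

    // Builds a stand-in that either supports frames or returns an empty Optional.
    static Conversions conversions(boolean supportsFrames) {
        return new Conversions() {
            @Override
            public Optional<List<String>> resultsAsFrames(List<String> results) {
                return supportsFrames ? Optional.of(List.of("frame" + results)) : Optional.empty();
            }

            @Override
            public List<Object[]> resultsAsArrays(List<String> results) {
                List<Object[]> rows = new ArrayList<>();
                for (String result : results) {
                    rows.add(new Object[]{result});
                }
                return rows;
            }
        };
    }

    // Caller side: prefer frames, fall back to arrays when the Optional is empty.
    static String describe(Conversions conversions, List<String> results) {
        return conversions.resultsAsFrames(results)
            .map(frames -> "frames:" + frames.size())
            .orElseGet(() -> "arrays:" + conversions.resultsAsArrays(results).size());
    }

    public static void main(String[] args) {
        List<String> results = List.of("a", "b");
        System.out.println(describe(conversions(true), results));
        System.out.println(describe(conversions(false), results));
    }
}
```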
Check documentation of resultsAsArrays(Query, Sequence) as the behaviour of the rows represented by the
frame sequence is identical.
Each Frame has a separate RowSignature because for some query types like the Scan query, every
column in the final result might not be present in the individual ResultType (and subsequently Frame). Therefore,
this is done to save space by not populating the column in that particular Frame and omitting it from its
signature.
Parameters:
query - Query being executed by the toolchest. Used to determine the rowSignature of the Frames
resultSequence - results of the form returned by mergeResults(QueryRunner)
memoryAllocatorFactory -
useNestedForUnknownTypes - true if the unknown types in the results can be serialized and deserialized using complex types

Copyright © 2011–2023 The Apache Software Foundation. All rights reserved.