Class DruidProcessingConfig
- java.lang.Object
-
- org.apache.druid.query.DruidProcessingConfig
-
- All Implemented Interfaces:
ColumnConfig
public class DruidProcessingConfig extends Object implements ColumnConfig
-
-
Field Summary
-
Fields inherited from interface org.apache.druid.segment.column.ColumnConfig
ALWAYS_USE_INDEXES, DEFAULT, DEFAULT_SKIP_VALUE_PREDICATE_INDEX_SCALE, DEFAULT_SKIP_VALUE_RANGE_INDEX_SCALE
-
-
Constructor Summary
Constructor Description
DruidProcessingConfig()
DruidProcessingConfig(String formatString, Integer numThreads, Integer numMergeBuffers, Boolean fifo, String tmpDir, DruidProcessingBufferConfig buffer, DruidProcessingIndexesConfig indexes)
-
Method Summary
Modifier and Type Method Description
String
getFormatString()
int
getNumInitalBuffersForIntermediatePool()
int
getNumMergeBuffers()
int
getNumThreads()
String
getTmpDir()
int
intermediateComputeSizeBytes()
boolean
isFifo()
boolean
isNumMergeBuffersConfigured()
boolean
isNumThreadsConfigured()
int
poolCacheMaxCount()
double
skipValuePredicateIndexScale()
If the total number of rows in a column multiplied by this value is smaller than the total number of bitmap index operations required to use DruidPredicateIndexes, then any ColumnIndexSupplier which chooses to participate in this config will skip computing the index, in favor of doing a full scan and using a ValueMatcher instead.
double
skipValueRangeIndexScale()
If the total number of rows in a column multiplied by this value is smaller than the total number of bitmap index operations required to use LexicographicalRangeIndexes or NumericRangeIndexes, then any ColumnIndexSupplier which chooses to participate in this config will skip computing the index, indicated by a return value of null from the 'forRange' methods, to force the filter to be processed with a scan using a ValueMatcher instead.
-
-
-
Constructor Detail
-
DruidProcessingConfig
public DruidProcessingConfig(@Nullable String formatString, @Nullable Integer numThreads, @Nullable Integer numMergeBuffers, @Nullable Boolean fifo, @Nullable String tmpDir, DruidProcessingBufferConfig buffer, DruidProcessingIndexesConfig indexes)
-
DruidProcessingConfig
public DruidProcessingConfig()
-
-
Method Detail
-
getFormatString
public String getFormatString()
-
getNumThreads
public int getNumThreads()
-
getNumMergeBuffers
public int getNumMergeBuffers()
-
isFifo
public boolean isFifo()
-
getTmpDir
public String getTmpDir()
-
intermediateComputeSizeBytes
public int intermediateComputeSizeBytes()
-
poolCacheMaxCount
public int poolCacheMaxCount()
-
getNumInitalBuffersForIntermediatePool
public int getNumInitalBuffersForIntermediatePool()
-
skipValueRangeIndexScale
public double skipValueRangeIndexScale()
Description copied from interface:ColumnConfig
If the total number of rows in a column multiplied by this value is smaller than the total number of bitmap index operations required to use LexicographicalRangeIndexes or NumericRangeIndexes, then any ColumnIndexSupplier which chooses to participate in this config will skip computing the index, indicated by a return value of null from the 'forRange' methods, to force the filter to be processed with a scan using a ValueMatcher instead.
For range indexes on columns where every value has an index, the number of bitmap operations is determined by how many individual values fall in the range, a subset of the column's total cardinality.
Currently only the NestedCommonFormatColumn implementations of ColumnIndexSupplier support this behavior.
This can make some standalone filters faster in cases where the overhead of walking the value dictionary and combining bitmaps to construct a BitmapOffset or BitmapVectorOffset exceeds the cost of doing a full scan and using a ValueMatcher.
This is especially useful when the range index is used as part of an AndFilter, which segment processing partitions into groups of 'pre' filters, composed of those which should use indexes, and 'post' filters, which should use a matcher on the offset created by the indexes to filter the remaining results. This value moves what would have been expensive index computations in the 'pre' group into the 'post' group, where they use a value matcher instead, sometimes providing an order of magnitude or higher performance increase.
- Specified by:
skipValueRangeIndexScale in interface ColumnConfig
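The skip rule described for this method comes down to a single comparison. The following is a minimal, standalone sketch of that comparison and not Druid's implementation: the class name RangeIndexSkipSketch, the method shouldSkipIndex, and the scale value 0.08 are assumptions made for illustration only.

```java
/**
 * Hypothetical, standalone illustration of the skip rule described above.
 * Not part of Druid; all names here are invented for this sketch.
 */
final class RangeIndexSkipSketch {
    private RangeIndexSkipSketch() {}

    /**
     * Returns true when numRows * scale is smaller than the number of bitmap
     * operations, which is the condition under which the index is skipped
     * and the filter falls back to a scan with a value matcher.
     */
    static boolean shouldSkipIndex(long numRows, double scale, long bitmapOperations) {
        return numRows * scale < bitmapOperations;
    }

    public static void main(String[] args) {
        long numRows = 1_000_000L;
        double scale = 0.08; // assumed example value for this sketch

        // Narrow range covering 500 distinct values: 80,000 is not smaller
        // than 500, so the index is still used.
        System.out.println(shouldSkipIndex(numRows, scale, 500L));

        // Wide range covering 200,000 distinct values: 80,000 is smaller
        // than 200,000, so the index is skipped.
        System.out.println(shouldSkipIndex(numRows, scale, 200_000L));
    }
}
```

Under this reading, a larger scale keeps indexes in use for wider ranges, while a smaller scale makes the scan fallback kick in sooner.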
-
skipValuePredicateIndexScale
public double skipValuePredicateIndexScale()
Description copied from interface:ColumnConfig
If the total number of rows in a column multiplied by this value is smaller than the total number of bitmap index operations required to use DruidPredicateIndexes, then any ColumnIndexSupplier which chooses to participate in this config will skip computing the index, in favor of doing a full scan and using a ValueMatcher instead. This is indicated by returning null from ColumnIndexSupplier.as(Class) even though it would otherwise have been able to create a BitmapColumnIndex. For predicate indexes, the number of operations is determined by the total value cardinality of the column, for columns with an index for every value.
Currently only the NestedCommonFormatColumn implementations of ColumnIndexSupplier support this behavior.
This can make some standalone filters faster in cases where the overhead of walking the value dictionary and combining bitmaps to construct a BitmapOffset or BitmapVectorOffset exceeds the cost of doing a full scan and using a ValueMatcher.
This is especially useful when the predicate index is used as part of an AndFilter, which segment processing partitions into groups of 'pre' filters, composed of those which should use indexes, and 'post' filters, which should use a matcher on the offset created by the indexes to filter the remaining results. This value moves what would have been expensive index computations in the 'pre' group into the 'post' group, where they use a value matcher instead, sometimes providing an order of magnitude or higher performance increase.
This value is separate from ColumnConfig.skipValueRangeIndexScale() since the dynamics of computing predicate indexes are potentially different from the much cheaper range calculations (especially for numeric values), so having a separate control knob allows predicate behavior to be tuned separately from ranges.
- Specified by:
skipValuePredicateIndexScale in interface ColumnConfig
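The 'pre' and 'post' grouping described for AndFilter can be pictured with a small standalone sketch. It is a simplification under stated assumptions and not Druid's segment processing code: the class AndFilterPartitionSketch, its Clause type, and the partition method are hypothetical, every clause is given an operation count for simplicity, and the scale 0.08 is an assumed example value.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical illustration of splitting AND clauses into 'pre' and 'post' groups. */
final class AndFilterPartitionSketch {
    private AndFilterPartitionSketch() {}

    /** Invented stand-in for one clause of an AND filter. */
    static final class Clause {
        final String name;
        final long bitmapOperations; // for a predicate clause: the column's value cardinality

        Clause(String name, long bitmapOperations) {
            this.name = name;
            this.bitmapOperations = bitmapOperations;
        }
    }

    /**
     * Groups clause names by whether the index would be skipped.
     * Key false: 'pre' group (use the index).
     * Key true: 'post' group (use a value matcher on the remaining rows).
     */
    static Map<Boolean, List<String>> partition(List<Clause> clauses, long numRows, double scale) {
        return clauses.stream().collect(
            Collectors.partitioningBy(
                c -> numRows * scale < c.bitmapOperations,
                Collectors.mapping(c -> c.name, Collectors.toList())));
    }

    public static void main(String[] args) {
        List<Clause> clauses = Arrays.asList(
            new Clause("country = 'US'", 1L),          // cheap: a single bitmap
            new Clause("url LIKE '%foo%'", 500_000L)); // expensive: high-cardinality predicate

        Map<Boolean, List<String>> groups = partition(clauses, 1_000_000L, 0.08);
        System.out.println("pre:  " + groups.get(false));
        System.out.println("post: " + groups.get(true));
    }
}
```

In this sketch the cheap clause builds the offset from its index, and the expensive predicate is only evaluated by a matcher against the rows that survive it, which is the situation where the description reports the largest gains.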
-
isNumThreadsConfigured
public boolean isNumThreadsConfigured()
-
isNumMergeBuffersConfigured
public boolean isNumMergeBuffersConfigured()
-
-