Interface ColumnConfig
-
- All Known Implementing Classes:
DruidProcessingConfig
public interface ColumnConfig
-
-
Field Summary
Fields (Modifier and Type, Field, Description):
static ColumnConfig ALWAYS_USE_INDEXES
static ColumnConfig DEFAULT
static double DEFAULT_SKIP_VALUE_PREDICATE_INDEX_SCALE
static double DEFAULT_SKIP_VALUE_RANGE_INDEX_SCALE
This value was chosen by testing bound filters on double columns with a variety of ranges. At this ratio of number of bitmaps to total number of rows, indexes appeared to stop performing consistently faster than a full scan plus value matcher.
-
Method Summary
Methods (Modifier and Type, Method, Description):
default double skipValuePredicateIndexScale()
If the total number of rows in a column multiplied by this value is smaller than the total number of bitmap index operations required to use DruidPredicateIndexes, then any ColumnIndexSupplier that chooses to participate in this config will skip computing the index in favor of a full scan using a ValueMatcher instead.
default double skipValueRangeIndexScale()
If the total number of rows in a column multiplied by this value is smaller than the total number of bitmap index operations required to use LexicographicalRangeIndexes or NumericRangeIndexes, then any ColumnIndexSupplier that chooses to participate in this config will skip computing the index, indicated by a null return value from the 'forRange' methods, forcing the filter to be processed with a scan using a ValueMatcher instead.
-
-
-
Field Detail
-
DEFAULT_SKIP_VALUE_RANGE_INDEX_SCALE
static final double DEFAULT_SKIP_VALUE_RANGE_INDEX_SCALE
This value was chosen by testing bound filters on double columns with a variety of ranges. At this ratio of number of bitmaps to total number of rows, indexes appeared to stop performing consistently faster than a full scan plus value matcher.
- See Also:
- Constant Field Values
-
DEFAULT_SKIP_VALUE_PREDICATE_INDEX_SCALE
static final double DEFAULT_SKIP_VALUE_PREDICATE_INDEX_SCALE
- See Also:
- Constant Field Values
-
DEFAULT
static final ColumnConfig DEFAULT
-
ALWAYS_USE_INDEXES
static final ColumnConfig ALWAYS_USE_INDEXES
-
-
Method Detail
-
skipValueRangeIndexScale
default double skipValueRangeIndexScale()
If the total number of rows in a column multiplied by this value is smaller than the total number of bitmap index operations required to use LexicographicalRangeIndexes or NumericRangeIndexes, then any ColumnIndexSupplier that chooses to participate in this config will skip computing the index, indicated by a null return value from the 'forRange' methods, forcing the filter to be processed with a scan using a ValueMatcher instead.
For range indexes on columns where every value has an index, the number of bitmap operations is determined by how many individual values fall in the range, a subset of the column's total cardinality.
Currently only the NestedCommonFormatColumn implementations of ColumnIndexSupplier support this behavior.
This can make some standalone filters faster in cases where the overhead of walking the value dictionary and combining bitmaps to construct a BitmapOffset or BitmapVectorOffset exceeds the cost of a full scan using a ValueMatcher.
This is especially useful when the range index is used as part of an AndFilter, which segment processing partitions into 'pre' filters, which should use indexes, and 'post' filters, which should use a matcher on the offset created by the indexes to filter the remaining results. This value moves what would have been expensive index computations in the 'pre' group into the 'post' group, where they use a value matcher instead, sometimes providing a performance increase of an order of magnitude or more.
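The comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not Druid's implementation: the class RangeIndexSkipSketch, the method shouldSkipIndex, and the scale of 0.1 used in main are hypothetical names and values chosen for the example.

```java
// Hypothetical sketch of the skip decision; not Druid source code.
final class RangeIndexSkipSketch {
    private RangeIndexSkipSketch() {}

    /**
     * Returns true when numRows * scale is smaller than the number of bitmap
     * operations the index would need, which is the condition under which a
     * participating supplier would skip the index and leave the filter to a
     * scan with a value matcher.
     */
    static boolean shouldSkipIndex(long numRows, double scale, long bitmapOperations) {
        return numRows * scale < bitmapOperations;
    }

    public static void main(String[] args) {
        long numRows = 1_000_000L;
        double scale = 0.1; // illustrative value only, not a Druid default

        // Narrow range covering 500 distinct values: the threshold is not crossed,
        // so the index would be used.
        System.out.println(shouldSkipIndex(numRows, scale, 500L));

        // Wide range covering 250,000 distinct values: the threshold is crossed,
        // so the index would be skipped.
        System.out.println(shouldSkipIndex(numRows, scale, 250_000L));
    }
}
```

In this sketch the bitmap operation count stands in for the number of individual values that fall in the range, as described for columns where every value has an index.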
-
skipValuePredicateIndexScale
default double skipValuePredicateIndexScale()
If the total number of rows in a column multiplied by this value is smaller than the total number of bitmap index operations required to use DruidPredicateIndexes, then any ColumnIndexSupplier that chooses to participate in this config will skip computing the index in favor of a full scan using a ValueMatcher instead. This is indicated by returning null from ColumnIndexSupplier.as(Class) even though it would otherwise have been able to create a BitmapColumnIndex. For predicate indexes on columns with an index for every value, the number of bitmap operations is determined by the total value cardinality of the column.
Currently only the NestedCommonFormatColumn implementations of ColumnIndexSupplier support this behavior.
This can make some standalone filters faster in cases where the overhead of walking the value dictionary and combining bitmaps to construct a BitmapOffset or BitmapVectorOffset exceeds the cost of a full scan using a ValueMatcher.
This is especially useful when the predicate index is used as part of an AndFilter, which segment processing partitions into 'pre' filters, which should use indexes, and 'post' filters, which should use a matcher on the offset created by the indexes to filter the remaining results. This value moves what would have been expensive index computations in the 'pre' group into the 'post' group, where they use a value matcher instead, sometimes providing a performance increase of an order of magnitude or more.
This value is separate from skipValueRangeIndexScale() because the dynamics of computing predicate indexes are potentially different from the much cheaper range calculations (especially for numeric values). A separate control knob allows predicate behavior to be tuned independently of ranges.
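To illustrate why two separate knobs can be useful, here is a small sketch that overrides only the predicate scale. It does not use the real ColumnConfig type: ColumnConfigSketch is a hypothetical stand-in that mirrors the two default methods documented on this page, and every numeric value in it is a placeholder, not one of Druid's constants.

```java
// Hypothetical stand-in for the interface documented here; not Druid source code.
interface ColumnConfigSketch {
    // Placeholder values for illustration only; not Druid's actual defaults.
    double PLACEHOLDER_RANGE_SCALE = 0.1;
    double PLACEHOLDER_PREDICATE_SCALE = 0.1;

    default double skipValueRangeIndexScale() {
        return PLACEHOLDER_RANGE_SCALE;
    }

    default double skipValuePredicateIndexScale() {
        return PLACEHOLDER_PREDICATE_SCALE;
    }
}

final class PredicateTuningSketch {
    private PredicateTuningSketch() {}

    // Overrides only the predicate knob; the range knob keeps its default.
    static final ColumnConfigSketch TUNED = new ColumnConfigSketch() {
        @Override
        public double skipValuePredicateIndexScale() {
            return 0.02; // illustrative value only
        }
    };

    // For predicate indexes on a column with an index for every value, the
    // operation count is taken to be the column's total value cardinality.
    static boolean skipPredicateIndex(ColumnConfigSketch config, long numRows, long cardinality) {
        return numRows * config.skipValuePredicateIndexScale() < cardinality;
    }

    // For range indexes, the operation count is the number of values in the range.
    static boolean skipRangeIndex(ColumnConfigSketch config, long numRows, long valuesInRange) {
        return numRows * config.skipValueRangeIndexScale() < valuesInRange;
    }
}
```

With the tuned config in this sketch, a predicate filter over a column of 1,000,000 rows and 50,000 distinct values would be skipped in favor of a scan, while a range filter covering the same 50,000 values would still use its index.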
-
-