Class DruidProcessingConfig

    • Method Detail

      • getFormatString

        public String getFormatString()
      • getNumThreads

        public int getNumThreads()
      • getNumMergeBuffers

        public int getNumMergeBuffers()
      • isFifo

        public boolean isFifo()
      • getTmpDir

        public String getTmpDir()
      • intermediateComputeSizeBytes

        public int intermediateComputeSizeBytes()
      • poolCacheMaxCount

        public int poolCacheMaxCount()
      • getNumInitalBuffersForIntermediatePool

        public int getNumInitalBuffersForIntermediatePool()
      • skipValueRangeIndexScale

        public double skipValueRangeIndexScale()
        Description copied from interface: ColumnConfig
        If the total number of rows in a column multiplied by this value is smaller than the total number of bitmap index operations required to use LexicographicalRangeIndexes or NumericRangeIndexes, then any ColumnIndexSupplier which chooses to participate in this config will skip computing the index, indicated by a return value of null from the 'forRange' methods, to force the filter to be processed with a scan using a ValueMatcher instead.

        For range indexes on columns where every value has an index, the number of bitmap operations is determined by how many individual values fall in the range, a subset of the column's total cardinality.
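
        The comparison described above can be sketched as follows. This is a minimal illustration of the stated rule only, not Druid's implementation: the class RangeIndexSkipSketch, the method shouldSkipRangeIndex, and its parameter names are hypothetical.

```java
// Hypothetical sketch of the skip rule; not Druid's actual code.
final class RangeIndexSkipSketch {
    /**
     * Returns true when the range index should be skipped, that is, when
     * numRows * skipValueRangeIndexScale is smaller than the number of
     * bitmap operations the range would require.
     */
    static boolean shouldSkipRangeIndex(int numRows, double skipValueRangeIndexScale, int valuesInRange) {
        // For a column with an index for every value, the number of bitmap
        // operations is taken to be the number of distinct values in the range.
        return numRows * skipValueRangeIndexScale < valuesInRange;
    }
}
```

        With 1000 rows and an illustrative scale of 0.5, a range covering 600 distinct values would be skipped in favor of a ValueMatcher scan, while a range covering 400 values would still use the index.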

        Currently only the NestedCommonFormatColumn implementations of ColumnIndexSupplier support this behavior.

        This can make some standalone filters faster in cases where the overhead of walking the value dictionary and combining bitmaps to construct a BitmapOffset or BitmapVectorOffset can exceed the cost of just doing a full scan and using a ValueMatcher.

        Where this is especially useful is in cases where the range index is used as part of some AndFilter, which segment processing partitions into groups of 'pre' filters, composed of those which should use indexes, and 'post' filters, which should use a matcher on the offset created by the indexes to filter the remaining results. This value moves what would have been expensive index computations in the 'pre' group into the 'post' group, where they are evaluated with a value matcher instead, sometimes providing an order of magnitude or higher performance increase.
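
        The 'pre' and 'post' partitioning can be pictured with the sketch below. It is a simplified model under stated assumptions, not the segment processing code: Clause, Partition, and partition are hypothetical stand-ins, and the indexAvailable flag models whether the index supplier returned an index or null.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of AndFilter partitioning; these are not Druid classes.
final class AndFilterPartitionSketch {
    /** One clause of an AND filter. */
    static final class Clause {
        final String name;
        // false models an index supplier that returned null (index skipped)
        final boolean indexAvailable;

        Clause(String name, boolean indexAvailable) {
            this.name = name;
            this.indexAvailable = indexAvailable;
        }
    }

    /** Result of splitting the clauses into the two groups. */
    static final class Partition {
        final List<Clause> pre = new ArrayList<>();  // evaluated with bitmap indexes
        final List<Clause> post = new ArrayList<>(); // evaluated with a value matcher
    }

    static Partition partition(List<Clause> clauses) {
        Partition result = new Partition();
        for (Clause clause : clauses) {
            if (clause.indexAvailable) {
                result.pre.add(clause);
            } else {
                result.post.add(clause);
            }
        }
        return result;
    }
}
```

        In this model, a clause whose range index was skipped lands in the 'post' group and is only evaluated on the rows that survive the 'pre' group.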

        Specified by:
        skipValueRangeIndexScale in interface ColumnConfig
      • skipValuePredicateIndexScale

        public double skipValuePredicateIndexScale()
        Description copied from interface: ColumnConfig
        If the total number of rows in a column multiplied by this value is smaller than the total number of bitmap index operations required to use DruidPredicateIndexes, then any ColumnIndexSupplier which chooses to participate in this config will skip computing the index, in favor of doing a full scan and using a ValueMatcher instead. This is indicated by returning null from ColumnIndexSupplier.as(Class) even though it would otherwise have been able to create a BitmapColumnIndex. For predicate indexes on columns with an index for every value, the number of bitmap operations is determined by the total value cardinality of the column.

        Currently only the NestedCommonFormatColumn implementations of ColumnIndexSupplier support this behavior.

        This can make some standalone filters faster in cases where the overhead of walking the value dictionary and combining bitmaps to construct a BitmapOffset or BitmapVectorOffset can exceed the cost of just doing a full scan and using a ValueMatcher.

        Where this is especially useful is in cases where the predicate index is used as part of some AndFilter, which segment processing partitions into groups of 'pre' filters, composed of those which should use indexes, and 'post' filters, which should use a matcher on the offset created by the indexes to filter the remaining results. This value moves what would have been expensive index computations in the 'pre' group into the 'post' group, where they are evaluated with a value matcher instead, sometimes providing an order of magnitude or higher performance increase.

        This value is separate from ColumnConfig.skipValueRangeIndexScale() since the dynamics of computing predicate indexes are potentially different from the much cheaper range calculations (especially for numeric values), so having a separate control knob allows predicate indexes to be tuned independently of ranges.
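
        To illustrate why two separate knobs can be useful, the sketch below applies the same comparison with independent scale values. Everything here is an assumption for illustration: ScaleConfig is a local stand-in that mirrors only the two method names documented on this page, and the helper methods and the scale values are hypothetical, not Druid defaults.

```java
// Hypothetical sketch; ScaleConfig is a local stand-in, not Druid's ColumnConfig.
final class SeparateScalesSketch {
    /** Mirrors the two scale methods documented above. */
    interface ScaleConfig {
        double skipValueRangeIndexScale();
        double skipValuePredicateIndexScale();
    }

    /** Builds a config with independently chosen scales (illustrative values only). */
    static ScaleConfig of(final double rangeScale, final double predicateScale) {
        return new ScaleConfig() {
            @Override
            public double skipValueRangeIndexScale() {
                return rangeScale;
            }

            @Override
            public double skipValuePredicateIndexScale() {
                return predicateScale;
            }
        };
    }

    /** Range indexes: the operation count is the number of values in the range. */
    static boolean skipRangeIndex(ScaleConfig config, int numRows, int valuesInRange) {
        return numRows * config.skipValueRangeIndexScale() < valuesInRange;
    }

    /** Predicate indexes: the operation count is the column's total value cardinality. */
    static boolean skipPredicateIndex(ScaleConfig config, int numRows, int cardinality) {
        return numRows * config.skipValuePredicateIndexScale() < cardinality;
    }
}
```

        With a range scale of 0.5 and a predicate scale of 0.25 on a 1000-row column, an operation count of 400 keeps the range index but skips the predicate index, which is the kind of separate tuning the two values allow.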

        Specified by:
        skipValuePredicateIndexScale in interface ColumnConfig
      • isNumThreadsConfigured

        public boolean isNumThreadsConfigured()
      • isNumMergeBuffersConfigured

        public boolean isNumMergeBuffersConfigured()