public interface VectorColumnProcessorFactory<T>
Used by ColumnProcessors.makeVectorProcessor(java.lang.String, org.apache.druid.segment.VectorColumnProcessorFactory<T>, org.apache.druid.segment.vector.VectorColumnSelectorFactory).
Column processors can be any type "T". The idea is that a ColumnProcessorFactory embodies the logic for wrapping
and processing selectors of various types, and so enables nice code design, where type-dependent code is not
sprinkled throughout.
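As a rough illustration of this design, the sketch below implements a factory whose processor type T is a Supplier<String>. To stay self-contained it uses simplified stand-in types (Capabilities, ValueSelector, ObjectSelector, SketchFactory) instead of the real Druid classes. Those names and the reduced two-method interface are assumptions made for the example, not Druid's API.

```java
import java.util.function.Supplier;

// Illustrative only: simplified stand-ins, not the real Druid classes.
final class FactorySketch {
    // Hypothetical stand-in for ColumnCapabilities, reduced to a type name.
    static final class Capabilities {
        final String typeName;
        Capabilities(String typeName) { this.typeName = typeName; }
    }

    // Hypothetical stand-ins for VectorValueSelector and VectorObjectSelector.
    interface ValueSelector { long[] getLongVector(); }
    interface ObjectSelector { Object[] getObjectVector(); }

    // Reduced analogue of VectorColumnProcessorFactory<T>: one method per column type.
    interface SketchFactory<T> {
        T makeLongProcessor(Capabilities capabilities, ValueSelector selector);
        T makeObjectProcessor(Capabilities capabilities, ObjectSelector selector);
    }

    // All type-dependent logic lives here; callers only ever see Supplier<String>.
    static final class DescribingFactory implements SketchFactory<Supplier<String>> {
        @Override
        public Supplier<String> makeLongProcessor(Capabilities capabilities, ValueSelector selector) {
            return () -> {
                long sum = 0;
                for (long v : selector.getLongVector()) {
                    sum += v;
                }
                return capabilities.typeName + " sum=" + sum;
            };
        }

        @Override
        public Supplier<String> makeObjectProcessor(Capabilities capabilities, ObjectSelector selector) {
            return () -> {
                int nonNull = 0;
                for (Object o : selector.getObjectVector()) {
                    if (o != null) {
                        nonNull++;
                    }
                }
                return capabilities.typeName + " nonNull=" + nonNull;
            };
        }
    }

    public static void main(String[] args) {
        DescribingFactory factory = new DescribingFactory();
        Supplier<String> longs =
            factory.makeLongProcessor(new Capabilities("LONG"), () -> new long[] {1, 2, 3});
        Supplier<String> objects =
            factory.makeObjectProcessor(new Capabilities("COMPLEX"), () -> new Object[] {"a", null});
        System.out.println(longs.get());   // LONG sum=6
        System.out.println(objects.get()); // COMPLEX nonNull=1
    }
}
```

The code that consumes the processors never branches on column type; that knowledge stays inside the factory.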
Unlike ColumnProcessorFactory, this interface does not have a "defaultType" method, because vector
column types are always known, so it isn't necessary.

See also: ColumnProcessorFactory, the non-vectorized version

| Modifier and Type | Method and Description |
|---|---|
| T | makeArrayProcessor(ColumnCapabilities capabilities, VectorObjectSelector selector) |
| T | makeDoubleProcessor(ColumnCapabilities capabilities, VectorValueSelector selector) Called when TypeSignature.getType() is DOUBLE. |
| T | makeFloatProcessor(ColumnCapabilities capabilities, VectorValueSelector selector) Called when TypeSignature.getType() is FLOAT. |
| T | makeLongProcessor(ColumnCapabilities capabilities, VectorValueSelector selector) Called when TypeSignature.getType() is LONG. |
| T | makeMultiValueDimensionProcessor(ColumnCapabilities capabilities, MultiValueDimensionVectorSelector selector) Called only if TypeSignature.getType() is STRING and the underlying column may have multiple values per row. |
| T | makeObjectProcessor(ColumnCapabilities capabilities, VectorObjectSelector selector) Called when TypeSignature.getType() is COMPLEX. |
| T | makeSingleValueDimensionProcessor(ColumnCapabilities capabilities, SingleValueDimensionVectorSelector selector) Called only if TypeSignature.getType() is STRING and the underlying column always has a single value per row. |
| default boolean | useDictionaryEncodedSelector(ColumnCapabilities capabilities) The processor factory can influence the decision on whether or not to prefer a dictionary encoded column value selector over an object selector by examining the ColumnCapabilities. |
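
The "Called when" conditions in the method summary amount to a dispatch on the column type. The following is a simplified sketch of that dispatch, not the actual ColumnProcessors.makeVectorProcessor implementation. The Type enum and the dispatch helper are hypothetical, selectors are left out, and makeArrayProcessor is omitted because the summary does not state when it is called.

```java
// Illustrative only: returns the name of the factory method that the summary
// says is called for each case, instead of invoking a real factory.
final class DispatchSketch {
    // Hypothetical stand-in for the value of TypeSignature.getType().
    enum Type { LONG, FLOAT, DOUBLE, STRING, COMPLEX }

    static String dispatch(Type type, boolean mayHaveMultipleValues, boolean useDictionaryEncodedSelector) {
        switch (type) {
            case LONG:
                return "makeLongProcessor";
            case FLOAT:
                return "makeFloatProcessor";
            case DOUBLE:
                return "makeDoubleProcessor";
            case COMPLEX:
                return "makeObjectProcessor";
            case STRING:
                // STRING columns fall back to the object processor when the
                // dictionary does not exist or is not expected to be useful.
                if (!useDictionaryEncodedSelector) {
                    return "makeObjectProcessor";
                }
                return mayHaveMultipleValues
                    ? "makeMultiValueDimensionProcessor"
                    : "makeSingleValueDimensionProcessor";
            default:
                throw new IllegalArgumentException("Unhandled type: " + type);
        }
    }

    public static void main(String[] args) {
        System.out.println(dispatch(Type.STRING, false, true));  // makeSingleValueDimensionProcessor
        System.out.println(dispatch(Type.STRING, true, true));   // makeMultiValueDimensionProcessor
        System.out.println(dispatch(Type.STRING, true, false));  // makeObjectProcessor
    }
}
```

The three STRING outcomes show why an implementation that wants to handle every string input needs the single-value, multi-value, and object methods.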
T makeSingleValueDimensionProcessor(ColumnCapabilities capabilities, SingleValueDimensionVectorSelector selector)
Called only if TypeSignature.getType() is STRING and the underlying column always has a single value
per row.
Note that for STRING-typed columns where the dictionary does not exist or is not expected to be useful,
makeObjectProcessor(org.apache.druid.segment.column.ColumnCapabilities, org.apache.druid.segment.vector.VectorObjectSelector) may be called instead. To handle all string inputs properly, processors must implement
all three methods (single-value, multi-value, object).

T makeMultiValueDimensionProcessor(ColumnCapabilities capabilities, MultiValueDimensionVectorSelector selector)
Called only if TypeSignature.getType() is STRING and the underlying column may have multiple values
per row.
Note that for STRING-typed columns where the dictionary does not exist or is not expected to be useful,
makeObjectProcessor(org.apache.druid.segment.column.ColumnCapabilities, org.apache.druid.segment.vector.VectorObjectSelector) may be called instead. To handle all string inputs properly, processors must implement
all three methods (single-value, multi-value, object).

T makeFloatProcessor(ColumnCapabilities capabilities, VectorValueSelector selector)
Called when TypeSignature.getType() is FLOAT.

T makeDoubleProcessor(ColumnCapabilities capabilities, VectorValueSelector selector)
Called when TypeSignature.getType() is DOUBLE.

T makeLongProcessor(ColumnCapabilities capabilities, VectorValueSelector selector)
Called when TypeSignature.getType() is LONG.

T makeArrayProcessor(ColumnCapabilities capabilities, VectorObjectSelector selector)

T makeObjectProcessor(ColumnCapabilities capabilities, VectorObjectSelector selector)
Called when TypeSignature.getType() is COMPLEX. May also be called for STRING typed columns in
cases where the dictionary does not exist or is not expected to be useful.

default boolean useDictionaryEncodedSelector(ColumnCapabilities capabilities)
The processor factory can influence the decision on whether or not to prefer a dictionary encoded column value
selector over an object selector by examining the ColumnCapabilities.
By default, all processor factories prefer to use a dictionary encoded selector if the column has a dictionary
available (ColumnCapabilities.isDictionaryEncoded() is true), and there is a unique mapping of dictionary
id to value (ColumnCapabilities.areDictionaryValuesUnique() is true), but this can be overridden
if there is more appropriate behavior for a given processor.
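
The default rule and an override can be sketched as follows. This is a minimal illustration under stated assumptions: Capabilities is a hypothetical stand-in for ColumnCapabilities that reduces isDictionaryEncoded() and areDictionaryValuesUnique() to plain boolean fields, and SketchFactory, DefaultFactory, and ObjectOnlyFactory are invented names.

```java
// Illustrative only: simplified stand-ins, not the real Druid classes.
final class DictionaryPreferenceSketch {
    // Hypothetical stand-in for ColumnCapabilities, reduced to two flags.
    static final class Capabilities {
        final boolean dictionaryEncoded;
        final boolean dictionaryValuesUnique;

        Capabilities(boolean dictionaryEncoded, boolean dictionaryValuesUnique) {
            this.dictionaryEncoded = dictionaryEncoded;
            this.dictionaryValuesUnique = dictionaryValuesUnique;
        }
    }

    interface SketchFactory {
        // Mirrors the documented default: prefer the dictionary encoded selector
        // only if a dictionary exists and maps ids to values uniquely.
        default boolean useDictionaryEncodedSelector(Capabilities capabilities) {
            return capabilities.dictionaryEncoded && capabilities.dictionaryValuesUnique;
        }
    }

    // Keeps the default behavior.
    static final class DefaultFactory implements SketchFactory {}

    // Overrides the default: this factory always wants the object selector.
    static final class ObjectOnlyFactory implements SketchFactory {
        @Override
        public boolean useDictionaryEncodedSelector(Capabilities capabilities) {
            return false;
        }
    }

    public static void main(String[] args) {
        Capabilities segmentColumn = new Capabilities(true, true);
        Capabilities nonUniqueDictionary = new Capabilities(true, false);
        System.out.println(new DefaultFactory().useDictionaryEncodedSelector(segmentColumn));       // true
        System.out.println(new DefaultFactory().useDictionaryEncodedSelector(nonUniqueDictionary)); // false
        System.out.println(new ObjectOnlyFactory().useDictionaryEncodedSelector(segmentColumn));    // false
    }
}
```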
For processors, this means that by default only actual dictionary encoded string columns (likely from real segments)
will use SingleValueDimensionVectorSelector and MultiValueDimensionVectorSelector, while
processors on things like string expression virtual columns will prefer to use VectorObjectSelector. In
other words, it is geared towards use cases where there is a clear opportunity to benefit from deferring having to
deal with the actual string value in exchange for the increased complexity of dealing with dictionary encoded
selectors.

Copyright © 2011–2023 The Apache Software Foundation. All rights reserved.