Interface TypeSignature<Type extends TypeDescriptor>

  • All Known Subinterfaces:
    ColumnCapabilities
    All Known Implementing Classes:
    BaseTypeSignature, ColumnCapabilitiesImpl, ColumnType, ExpressionType

    public interface TypeSignature<Type extends TypeDescriptor>
    This interface serves as a common foundation for Druid's native type system and provides common methods for reasoning about and handling types. Additional common type handling methods are provided by the Types utility. Druid uses this information to decide how to correctly process inputs and determine output types at all layers of the engine, from how to group, filter, aggregate, and transform columns up to how best to plan SQL into native Druid queries.

    At a high level, the native Druid type system can currently be broken down into 'primitive' types, 'array' types, and 'complex' types. This classification is defined by an enumeration that implements TypeDescriptor, such as ValueType for the general query engines and ExprType for low-level expression processing. It is exposed via getType(), which will be most callers' first point of contact with a TypeSignature when deciding how to handle a given input.

    Druid 'primitive' types include strings and numeric types. Note: multi-value string columns are still considered 'primitive' string types, because they do not behave as traditional arrays (unless explicitly converted to an array) and are always serialized as opportunistically single-valued, so whether any particular string column is multi-valued might vary from segment to segment. The concept of multi-valued strings exists only at a very low engine level and is modeled only by the ColumnCapabilities implementation of TypeSignature.

    'array' types contain additional nested type information about the elements of the array: a reference to another TypeSignature, available through the getElementType() method. If TypeDescriptor.isArray() is true, then getElementType() should never return null.

    'complex' types are Druid's extensible types, which have a registry that allows these types to be defined and associated with a name, which is available as getComplexTypeName().
    These type names are unique, so the name can be used to confirm which 'complex' type is being handled.

    TypeSignature is currently manifested in three forms. The first is ColumnType, the high-level 'native' Druid type definition using ValueType; it is used by row signatures and SQL schemas, by callers as input to various API methods, and for most general-purpose type handling. In 'druid-processing' there is an additional type ... type, ColumnCapabilities, which is effectively a ColumnType with additional information for low-level query processing, such as whether a column has indexes, dictionaries, or null values, whether it is a multi-value string column, and more. The third is ExpressionType, which uses ExprType instead of ValueType and is used exclusively for Druid native expression evaluation.

    ExpressionType exists because the Druid expression system does not natively handle float types, so it is essentially a mapping of ColumnType in which floats are coerced to double-typed values. Ideally, at some point Druid expressions will handle floats directly and these two TypeSignature implementations can be merged, which would simplify this interface so that it no longer needs to be generic, allow ColumnType to be collapsed into BaseTypeSignature, and finally unify the type system.
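
    The primitive / array / complex classification described above can be illustrated with a small, self-contained sketch. It does not use Druid's classes: Descriptor, Kind, Signature, SimpleType, describe, and expressionKind are hypothetical stand-ins invented here for TypeDescriptor, ValueType, TypeSignature, and a ColumnType-like implementation, and "hyperUnique" is only an example complex type name. The float-to-double mapping is likewise a simplified assumption about how ExpressionType relates to ColumnType, not the actual implementation.

```java
public class TypeSignatureSketch {
    /** Hypothetical stand-in for TypeDescriptor: classifies a type at a high level. */
    public interface Descriptor {
        boolean isPrimitive();
        boolean isArray();
    }

    /** Hypothetical stand-in for a ValueType-like enumeration. */
    public enum Kind implements Descriptor {
        STRING, LONG, FLOAT, DOUBLE, ARRAY, COMPLEX;

        @Override
        public boolean isPrimitive() {
            return this != ARRAY && this != COMPLEX;
        }

        @Override
        public boolean isArray() {
            return this == ARRAY;
        }
    }

    /** Hypothetical stand-in for TypeSignature, generic over its descriptor. */
    public interface Signature<T extends Descriptor> {
        T getType();
        String getComplexTypeName();      // non-null only for complex types in this sketch
        Signature<T> getElementType();    // non-null only for array types in this sketch
    }

    /** Minimal ColumnType-like implementation (an assumption, not Druid code). */
    public static final class SimpleType implements Signature<Kind> {
        private final Kind kind;
        private final String complexTypeName;
        private final SimpleType elementType;

        private SimpleType(Kind kind, String complexTypeName, SimpleType elementType) {
            // Mirrors the stated contract: array types always carry an element type.
            if (kind.isArray() && elementType == null) {
                throw new IllegalArgumentException("array types need an element type");
            }
            this.kind = kind;
            this.complexTypeName = complexTypeName;
            this.elementType = elementType;
        }

        public static SimpleType primitive(Kind kind) {
            return new SimpleType(kind, null, null);
        }

        public static SimpleType arrayOf(SimpleType element) {
            return new SimpleType(Kind.ARRAY, null, element);
        }

        public static SimpleType complex(String name) {
            return new SimpleType(Kind.COMPLEX, name, null);
        }

        @Override public Kind getType() { return kind; }
        @Override public String getComplexTypeName() { return complexTypeName; }
        @Override public Signature<Kind> getElementType() { return elementType; }
    }

    /** Branches on getType() first, then reads the array or complex details. */
    public static String describe(Signature<? extends Descriptor> signature) {
        Descriptor type = signature.getType();
        if (type.isArray()) {
            return "ARRAY<" + describe(signature.getElementType()) + ">";
        }
        if (type.isPrimitive()) {
            return type.toString();
        }
        return "COMPLEX<" + signature.getComplexTypeName() + ">";
    }

    /** Simplified assumption: expression handling treats FLOAT as DOUBLE. */
    public static Kind expressionKind(Kind kind) {
        return kind == Kind.FLOAT ? Kind.DOUBLE : kind;
    }

    public static void main(String[] args) {
        System.out.println(describe(SimpleType.primitive(Kind.STRING)));                 // STRING
        System.out.println(describe(SimpleType.arrayOf(SimpleType.primitive(Kind.LONG)))); // ARRAY<LONG>
        System.out.println(describe(SimpleType.complex("hyperUnique")));                 // COMPLEX<hyperUnique>
        System.out.println(expressionKind(Kind.FLOAT));                                  // DOUBLE
    }
}
```

    In this sketch a caller checks the descriptor returned by getType() before touching getElementType() or getComplexTypeName(), which matches the order of inspection the description suggests; real code would use Druid's own TypeSignature implementations instead of these stand-ins.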