Interface DimensionDictionarySelector

    • Method Detail

      • getValueCardinality

        int getValueCardinality()
        Value cardinality is the number of distinct values that occur in the dimension. For example, given four rows:

            A,B
            A
            B
            A

        the value cardinality is 2.

        Cardinality may be unknown (e.g. for the selector used by IncrementalIndex while reading input rows), in which case this method returns -1 (CARDINALITY_UNKNOWN). If cardinality is unknown, assume this dimension selector has no dictionary, and avoid storing ids, calling "lookupId", or calling "lookupName" outside the context of operating on a single row.

        If cardinality is known, the underlying dictionary is assumed to be lexicographically sorted by the encoded value. For example, if a column with cardinality 3 contains the values "A", "B", "C", it is assumed that id("A") < id("B") < id("C").
        Returns:
        the value cardinality, or CARDINALITY_UNKNOWN if unknown.
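
        The following is a minimal sketch of how a caller might act on these rules. It does not use the real interface: Selector, sorted(), canCacheIds() and idsAreSorted() are hypothetical names introduced for illustration, and CARDINALITY_UNKNOWN is assumed to equal -1 as stated above.

```java
import java.util.Arrays;

public class CardinalityExample {
    // Assumption: mirrors the documented sentinel; the text above says unknown cardinality is -1.
    static final int CARDINALITY_UNKNOWN = -1;

    // Hypothetical stand-in exposing only the two methods this sketch needs.
    interface Selector {
        int getValueCardinality();
        String lookupName(int id);
    }

    // Toy selector whose dictionary is the sorted set of the given (distinct, non-null) values.
    static Selector sorted(String... values) {
        final String[] dict = values.clone();
        Arrays.sort(dict);
        return new Selector() {
            public int getValueCardinality() { return dict.length; }
            public String lookupName(int id) { return dict[id]; }
        };
    }

    // Ids should only be stored across rows when the cardinality is known.
    static boolean canCacheIds(Selector s) {
        return s.getValueCardinality() != CARDINALITY_UNKNOWN;
    }

    // Checks that ids increase with the value. String.compareTo compares UTF-16 code units,
    // which matches the encoded order for ASCII values such as the ones used here.
    static boolean idsAreSorted(Selector s) {
        int n = s.getValueCardinality();
        for (int id = 1; id < n; id++) {
            if (s.lookupName(id - 1).compareTo(s.lookupName(id)) >= 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Selector s = sorted("C", "A", "B");
        System.out.println("cardinality = " + s.getValueCardinality());
        System.out.println("can cache ids = " + canCacheIds(s));
        System.out.println("ids sorted = " + idsAreSorted(s));
    }
}
```

        In this toy selector the ordering holds by construction; for a real selector it is a documented assumption rather than something callers normally verify.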
      • lookupName

        @CalledFromHotLoop
        @Nullable
        String lookupName(int id)
        Returns the value for a particular dictionary id as a Java String.

        For example, if a column has four rows:

            A,B
            A
            A,B
            B

        getRow() would return:

            getRow(0) => [0 1]
            getRow(1) => [0]
            getRow(2) => [0 1]
            getRow(3) => [1]

        and lookupName would return:

            lookupName(0) => A
            lookupName(1) => B

        Performance note: if you want a java.lang.String, always use this method. It will be at least as fast as calling lookupNameUtf8(int) and decoding the bytes. However, if you want UTF-8 bytes, check whether supportsLookupNameUtf8() returns true, and if it does, use lookupNameUtf8(int) instead.
        Parameters:
        id - id to look up the dictionary value for
        Returns:
        dictionary value for the given id, or null if the value is itself null
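
        The four-row example above can be sketched with a toy dictionary. The arrays DICTIONARY and ROWS and the helper decodeRow() are hypothetical and stand in for a real selector's dictionary and getRow() results; only the id-to-name mapping follows the description.

```java
import java.util.ArrayList;
import java.util.List;

public class LookupNameExample {
    // Hypothetical dictionary: id 0 => "A", id 1 => "B".
    static final String[] DICTIONARY = {"A", "B"};

    // Hypothetical per-row dictionary ids for the rows A,B / A / A,B / B.
    static final int[][] ROWS = {{0, 1}, {0}, {0, 1}, {1}};

    // Stand-in for lookupName(int): maps a dictionary id to its value.
    static String lookupName(int id) {
        return DICTIONARY[id];
    }

    // Decodes every id of one row into its dictionary value.
    static List<String> decodeRow(int rowIndex) {
        List<String> names = new ArrayList<>();
        for (int id : ROWS[rowIndex]) {
            names.add(lookupName(id));
        }
        return names;
    }

    public static void main(String[] args) {
        for (int i = 0; i < ROWS.length; i++) {
            System.out.println("row " + i + " => " + decodeRow(i));
        }
    }
}
```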
      • lookupNameUtf8

        @Nullable
        default ByteBuffer lookupNameUtf8(int id)
        Returns the value for a particular dictionary id as UTF-8 bytes.

        The returned buffer is in big-endian order. It is not reused, so callers may modify the position, limit, byte order, etc. of the buffer.

        The returned buffer may point to the original data, so callers must take care not to use it outside the valid lifetime of this selector. In particular, if the original data came from a reference-counted segment, callers must not use the returned ByteBuffer after releasing their reference to the relevant ReferenceCountingSegment.

        Performance note: if you want UTF-8 bytes, and supportsLookupNameUtf8() returns true, always use this method. It will be at least as fast as calling lookupName(int) and encoding the bytes. However, if you want a java.lang.String, use lookupName(int) instead of this method.
        Parameters:
        id - id to look up the dictionary value for
        Returns:
        dictionary value for the given id, or null if the value is itself null
        Throws:
        UnsupportedOperationException - if supportsLookupNameUtf8() is false
      • supportsLookupNameUtf8

        default boolean supportsLookupNameUtf8()
        Returns whether this selector supports lookupNameUtf8(int).
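
        One way to combine supportsLookupNameUtf8() with lookupNameUtf8(int) is sketched below. The nested Selector interface is a hypothetical stand-in that copies only the three documented signatures, and utf8For() and copyBytes() are illustrative helpers, not part of the real API.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class Utf8LookupExample {
    // Hypothetical stand-in with the same defaults as described: UTF-8 lookup is unsupported
    // unless an implementation overrides both default methods.
    interface Selector {
        String lookupName(int id);

        default boolean supportsLookupNameUtf8() {
            return false;
        }

        default ByteBuffer lookupNameUtf8(int id) {
            throw new UnsupportedOperationException();
        }
    }

    // Prefers the direct UTF-8 path when supported, otherwise encodes the String value.
    // Returns null when the dictionary value itself is null.
    static ByteBuffer utf8For(Selector selector, int id) {
        if (selector.supportsLookupNameUtf8()) {
            return selector.lookupNameUtf8(id);
        }
        String name = selector.lookupName(id);
        return name == null ? null : ByteBuffer.wrap(name.getBytes(StandardCharsets.UTF_8));
    }

    // Copies the remaining bytes so the result stays usable after the selector's backing
    // data is released. duplicate() leaves the caller's position and limit untouched.
    static byte[] copyBytes(ByteBuffer buffer) {
        if (buffer == null) {
            return null;
        }
        byte[] out = new byte[buffer.remaining()];
        buffer.duplicate().get(out);
        return out;
    }

    public static void main(String[] args) {
        Selector selector = id -> id == 0 ? "A" : null;
        byte[] bytes = copyBytes(utf8For(selector, 0));
        System.out.println("bytes for id 0: " + bytes.length);
    }
}
```

        Copying is only needed when the bytes must outlive the selector; within the selector's lifetime the returned buffer can be read directly.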
      • nameLookupPossibleInAdvance

        boolean nameLookupPossibleInAdvance()
        Returns true if lookupName(int) can be called with any id from 0 (inclusive) to getValueCardinality() (exclusive) before the rows containing those ids have been returned.

        Returns false if lookupName(int) can be called with ids from the most recent row (or row vector) returned by this DimensionSelector, but not before that row has been returned. If getValueCardinality() of this selector additionally returns CARDINALITY_UNKNOWN, lookupName() cannot be called with ids from any row (or row vector) other than the most recent one, i.e. names for ids cannot be looked up "later". If getValueCardinality() returns a non-negative number, lookupName() can be called with any id from any row (or row vector) returned since the creation of this DimensionSelector.

        If lookupName(int) is called with an ineligible id, the result is undefined: an exception may be thrown, null may be returned, or some other arbitrary value may be returned.
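
        As a rough sketch of how a caller might use this flag, the helper below builds an id-to-name table up front only when that is permitted, and otherwise signals that names must be looked up row by row. The Selector interface, fixed() and precomputeNames() are hypothetical, CARDINALITY_UNKNOWN is assumed to be -1, and the id range is assumed to end at getValueCardinality() - 1.

```java
public class AdvanceLookupExample {
    // Assumption: mirrors the documented sentinel for unknown cardinality.
    static final int CARDINALITY_UNKNOWN = -1;

    // Hypothetical stand-in exposing only the methods this sketch needs.
    interface Selector {
        int getValueCardinality();
        String lookupName(int id);
        boolean nameLookupPossibleInAdvance();
    }

    // Toy selector over a fixed dictionary, with a configurable "in advance" flag.
    static Selector fixed(boolean advance, String... dict) {
        return new Selector() {
            public int getValueCardinality() { return dict.length; }
            public String lookupName(int id) { return dict[id]; }
            public boolean nameLookupPossibleInAdvance() { return advance; }
        };
    }

    // Returns a table of all names indexed by id, or null when ids may only be
    // resolved after the rows containing them have been returned.
    static String[] precomputeNames(Selector selector) {
        int cardinality = selector.getValueCardinality();
        if (!selector.nameLookupPossibleInAdvance() || cardinality == CARDINALITY_UNKNOWN) {
            return null;
        }
        String[] names = new String[cardinality];
        for (int id = 0; id < cardinality; id++) {
            names[id] = selector.lookupName(id);
        }
        return names;
    }

    public static void main(String[] args) {
        String[] table = precomputeNames(fixed(true, "A", "B"));
        System.out.println("precomputed entries: " + (table == null ? 0 : table.length));
    }
}
```

        When precomputeNames() returns null, a caller would fall back to calling lookupName(int) only on ids taken from the current row.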