Class FloatDimensionIndexer

    • Field Detail

    • Constructor Detail

      • FloatDimensionIndexer

        public FloatDimensionIndexer()
    • Method Detail

      • processRowValsToUnsortedEncodedKeyComponent

        public EncodedKeyComponent<Float> processRowValsToUnsortedEncodedKeyComponent(@Nullable
                                                                                      Object dimValues,
                                                                                      boolean reportParseExceptions)
        Description copied from interface: DimensionIndexer
        Encodes the given row value(s) of the dimension to be used within a row key. It also updates the internal state of the DimensionIndexer, e.g. the dimLookup.

        For example, the dictionary-encoded String-type column will return an int[] containing dictionary IDs.

        Specified by:
        processRowValsToUnsortedEncodedKeyComponent in interface DimensionIndexer<Float,Float,Float>
        Parameters:
        dimValues - Value(s) of the dimension in a row. This can either be a single value or a list of values (for multi-valued dimensions)
        reportParseExceptions - true if parse exceptions should be reported, false otherwise
        Returns:
        Encoded dimension value(s) to be used as a component for the row key. Contains an object of the DimensionIndexer and the effective size of the key component in bytes.
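        As a rough illustration of the contract described above, the following sketch converts a single raw row value to a Float and pairs it with a size estimate. It is not Druid's implementation: the class EncodeRowValueSketch, its nested Encoded holder, and the choice of IllegalArgumentException for reported parse failures are all hypothetical, and multi-valued input is deliberately not handled.

```java
// Hypothetical sketch; none of these names are Druid APIs.
final class EncodeRowValueSketch {

    /** Stand-in for an encoded key component: the value plus its estimated size in bytes. */
    static final class Encoded {
        final Float component;
        final long sizeBytes;

        Encoded(Float component, long sizeBytes) {
            this.component = component;
            this.sizeBytes = sizeBytes;
        }
    }

    /**
     * Converts one raw row value to a Float.
     * Unparseable input yields a null component unless reportParseExceptions is true,
     * in which case an exception is thrown (an assumed policy, for illustration).
     */
    static Encoded encode(Object dimValue, boolean reportParseExceptions) {
        Float parsed = null;
        if (dimValue instanceof Number) {
            parsed = ((Number) dimValue).floatValue();
        } else if (dimValue instanceof String) {
            try {
                parsed = Float.valueOf((String) dimValue);
            } catch (NumberFormatException e) {
                if (reportParseExceptions) {
                    throw new IllegalArgumentException("Could not convert [" + dimValue + "] to float", e);
                }
            }
        } else if (dimValue != null && reportParseExceptions) {
            throw new IllegalArgumentException("Unsupported value type: " + dimValue.getClass().getName());
        }
        // A float occupies Float.BYTES bytes; used here as the size estimate.
        return new Encoded(parsed, Float.BYTES);
    }

    public static void main(String[] args) {
        Encoded ok = encode("1.5", true);
        System.out.println(ok.component + " (" + ok.sizeBytes + " bytes)");
        System.out.println(encode("not a number", false).component);
    }
}
```

        With reportParseExceptions set to false, the sketch silently maps bad input to a null component; with true, the caller sees the failure immediately.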
      • setSparseIndexed

        public void setSparseIndexed()
        Description copied from interface: DimensionIndexer
        This method is called while building an IncrementalIndex whenever a known dimension column (either declared through an explicit schema in the ingestion spec, or auto-discovered while processing rows) is absent from a row being processed, so that the indexer can account for rows in which the column is missing, if necessary. This lets a string DimensionSelector built on top of an IncrementalIndex accurately report DimensionDictionarySelector.nameLookupPossibleInAdvance(), because the indexer can track whether it has any implicitly null-valued rows. At index persist/merge time, every missing column in a row is explicitly replaced with the appropriate null or default value.
        Specified by:
        setSparseIndexed in interface DimensionIndexer<Float,Float,Float>
      • getUnsortedEncodedValueFromSorted

        public Float getUnsortedEncodedValueFromSorted(Float sortedIntermediateValue)
        Description copied from interface: DimensionIndexer
        Given an encoded value that was ordered by associated actual value, return the equivalent encoded value ordered by time of ingestion. Using the example in the DimensionIndexer class description, getUnsortedEncodedValueFromSorted(2) would return 0.
        Specified by:
        getUnsortedEncodedValueFromSorted in interface DimensionIndexer<Float,Float,Float>
        Parameters:
        sortedIntermediateValue - value to convert
        Returns:
        converted value
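        The sorted-to-unsorted conversion can be sketched for the dictionary-encoded String case. The class SortedIdSketch below is hypothetical, and the ingestion order "World", "Hello", "Apple" is an assumption chosen only so that sorted ID 2 maps to unsorted ID 0, as stated above; the sketch also assumes distinct values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; not part of Druid.
final class SortedIdSketch {
    // Index = unsorted ID (order of first ingestion), value = actual value.
    private final List<String> unsorted;
    // Index = sorted ID (order of actual value), value = unsorted ID.
    private final int[] sortedToUnsorted;

    SortedIdSketch(List<String> valuesInIngestionOrder) {
        this.unsorted = new ArrayList<>(valuesInIngestionOrder);
        List<String> sorted = new ArrayList<>(valuesInIngestionOrder);
        Collections.sort(sorted);
        this.sortedToUnsorted = new int[sorted.size()];
        for (int sortedId = 0; sortedId < sorted.size(); sortedId++) {
            sortedToUnsorted[sortedId] = unsorted.indexOf(sorted.get(sortedId));
        }
    }

    int getUnsortedEncodedValueFromSorted(int sortedId) {
        return sortedToUnsorted[sortedId];
    }

    public static void main(String[] args) {
        // Assumed ingestion order: "World"=0, "Hello"=1, "Apple"=2.
        SortedIdSketch ids = new SortedIdSketch(Arrays.asList("World", "Hello", "Apple"));
        // Sorted ID 2 is "World", which was ingested first.
        System.out.println(ids.getUnsortedEncodedValueFromSorted(2));
    }
}
```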
      • getSortedIndexedValues

        public CloseableIndexed<Float> getSortedIndexedValues()
        Description copied from interface: DimensionIndexer
        Returns an indexed structure of this dimension's sorted actual values. The integer IDs represent the ordering of the sorted values. Using the example in the DimensionIndexer class description: "Apple"=0, "Hello"=1, "World"=2.
        Specified by:
        getSortedIndexedValues in interface DimensionIndexer<Float,Float,Float>
        Returns:
        Sorted index of actual values
      • getMinValue

        public Float getMinValue()
        Description copied from interface: DimensionIndexer
        Get the minimum dimension value seen by this indexer. NOTE: On an in-memory segment (IncrementalIndex), we can determine min/max values by looking at the stream of row values seen in calls to processSingleRowValToIndexKey(). However, on a disk-backed segment (QueryableIndex), the numeric dimensions do not currently have any supporting index structures that can be used to efficiently determine min/max values. When numeric dimension support is added, the segment format should be changed to store min/max values, to avoid performing a full-column scan to determine these values for numeric dims.
        Specified by:
        getMinValue in interface DimensionIndexer<Float,Float,Float>
        Returns:
        min value
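        The note above says that, for an in-memory segment, min/max can be derived from the stream of row values seen by the indexer. A minimal sketch of that idea follows; MinMaxTrackerSketch and its methods are hypothetical names, and nothing here claims that FloatDimensionIndexer tracks values this way.

```java
// Hypothetical illustration only; not part of Druid.
final class MinMaxTrackerSketch {
    private Float min; // null until the first non-null value is observed
    private Float max;

    /** Records one row value; null values are ignored. */
    void observe(Float value) {
        if (value == null) {
            return;
        }
        if (min == null || Float.compare(value, min) < 0) {
            min = value;
        }
        if (max == null || Float.compare(value, max) > 0) {
            max = value;
        }
    }

    Float getMinValue() {
        return min;
    }

    Float getMaxValue() {
        return max;
    }

    public static void main(String[] args) {
        MinMaxTrackerSketch tracker = new MinMaxTrackerSketch();
        tracker.observe(3.5f);
        tracker.observe(-1.0f);
        tracker.observe(null);
        tracker.observe(2.0f);
        System.out.println(tracker.getMinValue() + " .. " + tracker.getMaxValue());
    }
}
```

        A disk-backed segment has no such running state, which is why the note calls for storing min/max in the segment format rather than scanning the full column.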
      • compareUnsortedEncodedKeyComponents

        public int compareUnsortedEncodedKeyComponents(@Nullable
                                                       Float lhs,
                                                       @Nullable
                                                       Float rhs)
        Description copied from interface: DimensionIndexer
        Compares the row values for this DimensionIndexer's dimension from a Row key. The dimension value arrays within a Row key always use the "unsorted" ordering for encoded values. The row values are passed to this function as an Object; the implementer should cast them to the type appropriate for this dimension. For example, a dictionary-encoded String implementation would cast the Objects to int[] arrays. When comparing, if the two arrays have different lengths, the shorter array should be ordered first. Otherwise, the implementer of this function should iterate through the unsorted encoded values, converting them to their actual type (e.g., performing a dictionary lookup for a dict-encoded String dimension), and compare the actual values until a difference is found. Refer to StringDimensionIndexer.compareUnsortedEncodedKeyComponents() for a reference implementation. The comparison rules used by this method should match the rules used by DimensionHandler.getEncodedValueSelectorComparator(); otherwise, incorrect ordering/merging of rows can occur during ingestion, causing issues such as imperfect rollup.
        Specified by:
        compareUnsortedEncodedKeyComponents in interface DimensionIndexer<Float,Float,Float>
        Parameters:
        lhs - dimension value array from a Row key
        rhs - dimension value array from a Row key
        Returns:
        comparison of the two arrays
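        The array comparison rule described above (shorter array first, otherwise compare looked-up actual values position by position) can be sketched as follows. KeyComponentCompareSketch and its List-based dictionary are hypothetical stand-ins; StringDimensionIndexer remains the reference implementation.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of Druid.
final class KeyComponentCompareSketch {

    /**
     * Compares two arrays of unsorted dictionary IDs.
     * The dictionary maps an unsorted ID (list index) to its actual value.
     */
    static int compare(int[] lhs, int[] rhs, List<String> dictionary) {
        // Rule 1: arrays of different lengths order by length, shorter first.
        if (lhs.length != rhs.length) {
            return Integer.compare(lhs.length, rhs.length);
        }
        // Rule 2: compare actual values, not IDs, until a difference is found.
        for (int i = 0; i < lhs.length; i++) {
            int c = dictionary.get(lhs[i]).compareTo(dictionary.get(rhs[i]));
            if (c != 0) {
                return c;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        // Assumed unsorted IDs: "World"=0, "Hello"=1, "Apple"=2.
        List<String> dictionary = Arrays.asList("World", "Hello", "Apple");
        // ID 0 < ID 2 numerically, but "World" sorts after "Apple".
        System.out.println(compare(new int[]{0}, new int[]{2}, dictionary) > 0);
    }
}
```

        Comparing raw unsorted IDs instead of looked-up values would order rows by ingestion time, which is the kind of mismatch the description warns can lead to imperfect rollup.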
      • fillBitmapsFromUnsortedEncodedKeyComponent

        public void fillBitmapsFromUnsortedEncodedKeyComponent(Float key,
                                                               int rowNum,
                                                               MutableBitmap[] bitmapIndexes,
                                                               BitmapFactory factory)
        Description copied from interface: DimensionIndexer
        Helper function for building bitmap indexes for integer-encoded dimensions. Called by IncrementalIndexAdapter as it iterates through its sequence of rows. Given a row value array from a Row key, with the current row number indicated by "rowNum", set the index for "rowNum" in the bitmap index for each value that appears in the row value array. For example, if key is an int[] array with values [1,3,4] for a dictionary-encoded String dimension, and rowNum is 27, this function would set bit 27 in bitmapIndexes[1], bitmapIndexes[3], and bitmapIndexes[4]. See StringDimensionIndexer.fillBitmapsFromUnsortedEncodedKeyComponent() for a reference implementation. If a dimension type does not support bitmap indexes, this function will not be called and can be left unimplemented.
        Specified by:
        fillBitmapsFromUnsortedEncodedKeyComponent in interface DimensionIndexer<Float,Float,Float>
        Parameters:
        key - dimension value array from a Row key
        rowNum - current row number
        bitmapIndexes - array of bitmaps, indexed by integer dimension value
        factory - bitmap factory
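        The [1,3,4] / row 27 example above can be sketched with java.util.BitSet standing in for MutableBitmap. BitmapFillSketch is a hypothetical class, and creating a missing bitmap with new BitSet() is an assumption that replaces the BitmapFactory parameter of the real signature.

```java
import java.util.BitSet;

// Hypothetical illustration only; BitSet stands in for Druid's MutableBitmap.
final class BitmapFillSketch {

    /** Sets bit rowNum in the bitmap of every dictionary ID that appears in key. */
    static void fillBitmaps(int[] key, int rowNum, BitSet[] bitmapIndexes) {
        for (int dictId : key) {
            if (bitmapIndexes[dictId] == null) {
                // Assumed lazy creation; the real method receives a factory for this.
                bitmapIndexes[dictId] = new BitSet();
            }
            bitmapIndexes[dictId].set(rowNum);
        }
    }

    public static void main(String[] args) {
        BitSet[] bitmapIndexes = new BitSet[5];
        fillBitmaps(new int[]{1, 3, 4}, 27, bitmapIndexes);
        // Bit 27 is now set in bitmapIndexes[1], [3], and [4]; [0] and [2] are untouched.
        System.out.println(bitmapIndexes[1].get(27) && bitmapIndexes[3].get(27) && bitmapIndexes[4].get(27));
    }
}
```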