Class SpillingGrouper<KeyType>

  • All Implemented Interfaces:
    Closeable, AutoCloseable, Grouper<KeyType>

    public class SpillingGrouper<KeyType>
    extends Object
    implements Grouper<KeyType>
    Grouper based around a single underlying BufferHashGrouper. Not thread-safe. When the underlying grouper is full, its contents are sorted and written to temporary files using "spillMapper".
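The spill-when-full behavior described above can be illustrated with a minimal sketch. This is not Druid's implementation: `SpillSketch`, its `maxKeys` limit, the tab-separated file format, and the use of a `HashMap` in place of a BufferHashGrouper are all assumptions made for illustration.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration of spill-when-full; not Druid's SpillingGrouper. */
public class SpillSketch implements AutoCloseable {
    private final int maxKeys; // assumed capacity of the in-memory table
    private final Map<String, Long> table = new HashMap<>();
    private final List<Path> spillFiles = new ArrayList<>();

    public SpillSketch(int maxKeys) {
        if (maxKeys < 1) {
            throw new IllegalArgumentException("maxKeys must be at least 1");
        }
        this.maxKeys = maxKeys;
    }

    /** Adds value to the sum for key; spills first if a new key would not fit. */
    public void aggregate(String key, long value) {
        if (!table.containsKey(key) && table.size() >= maxKeys) {
            spill();
        }
        table.merge(key, value, Long::sum);
    }

    /** Writes the table to a temporary file in key order, then clears it. */
    private void spill() {
        try {
            Path file = Files.createTempFile("spill-sketch", ".tsv");
            try (BufferedWriter out = Files.newBufferedWriter(file)) {
                // TreeMap gives the sorted order that the spilled run is written in.
                for (Map.Entry<String, Long> e : new TreeMap<>(table).entrySet()) {
                    out.write(e.getKey() + "\t" + e.getValue());
                    out.newLine();
                }
            }
            spillFiles.add(file);
            table.clear();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the lines of the spill file at the given index. */
    public List<String> readSpill(int index) {
        try {
            return Files.readAllLines(spillFiles.get(index));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public int spillCount() {
        return spillFiles.size();
    }

    public int inMemoryKeys() {
        return table.size();
    }

    /** Deletes the temporary files and drops the in-memory state. */
    @Override
    public void close() {
        try {
            for (Path file : spillFiles) {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        spillFiles.clear();
        table.clear();
    }
}
```

Sorting each run before writing it is what makes a later sorted read possible without loading every spilled run into memory at once.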
    • Method Detail

      • isInitialized

        public boolean isInitialized()
        Description copied from interface: Grouper
        Checks whether this grouper is initialized.
        Specified by:
        isInitialized in interface Grouper<KeyType>
        Returns:
        true if the grouper is already initialized, otherwise false.
      • aggregate

        public AggregateResult aggregate(KeyType key,
                                         int keyHash)
        Description copied from interface: Grouper
        Aggregate the current row with the provided key. Some implementations are thread-safe and some are not.
        Specified by:
        aggregate in interface Grouper<KeyType>
        Parameters:
        key - key object
        keyHash - result of Grouper.hashFunction() on the key
        Returns:
        an AggregateResult that is ok if the row was aggregated, or not ok if a resource limit was hit
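A caller typically computes the key hash, passes it to aggregate, and stops when the result is not ok. The sketch below uses stand-in types rather than Druid's classes: `Result`, `CountingGrouper`, its `maxKeys` limit, and the `ToIntFunction` return type of `hashFunction()` are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToIntFunction;

/** Hypothetical stand-ins showing how a caller might react to a not-ok result. */
public class AggregateSketch {
    /** Stand-in for AggregateResult: ok, or not ok with a reason. */
    public static final class Result {
        public final boolean ok;
        public final String reason;

        Result(boolean ok, String reason) {
            this.ok = ok;
            this.reason = reason;
        }
    }

    /** Stand-in grouper that counts rows per key and refuses new keys past a limit. */
    public static final class CountingGrouper {
        private final Map<String, Integer> counts = new HashMap<>();
        private final int maxKeys; // assumed resource limit

        public CountingGrouper(int maxKeys) {
            this.maxKeys = maxKeys;
        }

        public ToIntFunction<String> hashFunction() {
            return String::hashCode;
        }

        public Result aggregate(String key, int keyHash) {
            // A real hash-table grouper would use keyHash to locate a bucket;
            // this stand-in delegates to HashMap and ignores it.
            if (!counts.containsKey(key) && counts.size() >= maxKeys) {
                return new Result(false, "key limit reached");
            }
            counts.merge(key, 1, Integer::sum);
            return new Result(true, null);
        }
    }

    /** Aggregates rows until a limit is hit; returns how many rows were aggregated. */
    public static int aggregateAll(CountingGrouper grouper, Iterable<String> keys) {
        int aggregated = 0;
        for (String key : keys) {
            int keyHash = grouper.hashFunction().applyAsInt(key);
            Result result = grouper.aggregate(key, keyHash);
            if (!result.ok) {
                break; // the row was not aggregated; the caller decides what to do next
            }
            aggregated++;
        }
        return aggregated;
    }
}
```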
      • reset

        public void reset()
        Description copied from interface: Grouper
        Reset the grouper to its initial state.
        Specified by:
        reset in interface Grouper<KeyType>
      • mergeAndGetDictionary

        public List<String> mergeAndGetDictionary()
        Returns a dictionary of string keys added to this grouper. Note that the dictionary of keySerde is spilled to local storage whenever the inner grouper is spilled. If there are spilled dictionaries, this method loads them from disk and returns a merged dictionary.
        Returns:
        a dictionary which is a list of unique strings
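The merge step can be pictured as a union of several string lists with duplicates removed. The following is a minimal sketch under assumptions: `DictionaryMergeSketch` is a hypothetical name, the spilled dictionaries are passed in as already-loaded lists instead of being read from disk, and first-seen ordering is a choice made here, not something the method documents.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration of merging dictionaries into a list of unique strings. */
public class DictionaryMergeSketch {
    /**
     * Merges previously spilled dictionaries with the current in-memory one.
     * Each string appears once in the result, in first-seen order.
     */
    public static List<String> merge(List<List<String>> spilled, List<String> inMemory) {
        Set<String> merged = new LinkedHashSet<>();
        for (List<String> dictionary : spilled) {
            merged.addAll(dictionary);
        }
        merged.addAll(inMemory);
        return new ArrayList<>(merged);
    }
}
```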
      • isSpillingAllowed

        public boolean isSpillingAllowed()
      • setSpillingAllowed

        public void setSpillingAllowed(boolean spillingAllowed)
      • iterator

        public CloseableIterator<Grouper.Entry<KeyType>> iterator(boolean sorted)
        Description copied from interface: Grouper
        Iterate through entries.

        Some implementations allow writes even after this method is called. After you are done with the iterator returned by this method, you should either call Grouper.close() (if you are done with the Grouper) or Grouper.reset() (if you want to reuse it). Some implementations allow calling Grouper.iterator(boolean) again if you want another iterator. However, this method must not be called by multiple threads concurrently.

        If "sorted" is true then the iterator will return sorted results. It will use KeyType's natural ordering on deserialized objects, and will use the Grouper.KeySerde.bufferComparator() on serialized objects. Woe be unto you if these comparators are not equivalent.

        Callers must process and discard the returned Grouper.Entry objects immediately because some implementations can reuse the key objects.

        Specified by:
        iterator in interface Grouper<KeyType>
        Parameters:
        sorted - return sorted results
        Returns:
        entry iterator
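The requirement to process entries immediately matters when an iterator hands back the same mutable object on every call. The sketch below shows the safe pattern of copying what you need before advancing, and closing the iterator when done. `Entry` and `ReusingIterator` are hypothetical stand-ins, not Druid's Grouper.Entry or CloseableIterator.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical illustration of consuming an iterator that reuses its entry object. */
public class IteratorReuseSketch {
    /** Mutable entry that the iterator overwrites on every next() call. */
    public static final class Entry {
        public String key;
        public long value;
    }

    /** Stand-in for a closeable iterator that returns one shared Entry instance. */
    public static final class ReusingIterator implements Iterator<Entry>, Closeable {
        private final String[] keys;
        private final long[] values;
        private final Entry shared = new Entry();
        private int pos = 0;
        public boolean closed = false;

        public ReusingIterator(String[] keys, long[] values) {
            this.keys = keys;
            this.values = values;
        }

        @Override
        public boolean hasNext() {
            return pos < keys.length;
        }

        @Override
        public Entry next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            shared.key = keys[pos];
            shared.value = values[pos];
            pos++;
            return shared; // same object every time
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Copies each entry's contents before advancing, then closes the iterator. */
    public static List<String> drain(ReusingIterator iterator) {
        List<String> out = new ArrayList<>();
        try (ReusingIterator it = iterator) {
            while (it.hasNext()) {
                Entry e = it.next();
                out.add(e.key + "=" + e.value); // copy now; e changes on the next call
            }
        }
        return out;
    }
}
```

Collecting the `Entry` references themselves into a list would leave every element pointing at the last row, which is the failure mode the warning above guards against.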