Interface Hash

  • All Known Implementing Classes:
    Char2ObjectOpenHashMap

    public interface Hash
    Basic data for all hash-based classes.

    Historical note

    Warning: the following comments are here for historical reasons, and apply just to the double hash classes that can be optionally generated. The standard fastutil distribution since 6.1.0 uses linear-probing hash tables, and tables are always sized as powers of two.

    The classes in fastutil are built around open-addressing hashing implemented via double hashing. Following Knuth's suggestions in the third volume of The Art of Computer Programming, we use for the table size a prime p such that p-2 is also prime. In this way hashing is implemented with modulo p, and secondary hashing with modulo p-2.
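
    As a rough illustration of the scheme just described, the following sketch enumerates the order in which table indices would be probed for a given hash value. It assumes that the step between probes is the secondary hash (modulo p-2) plus one; the class and method names are hypothetical and are not part of fastutil.

```java
/** Illustrative only: probe order under double hashing with a prime table size p. */
final class ProbeSequence {
    /** Returns the p table indices in the order they would be probed for hash value h. */
    static int[] of(int h, int p) {
        final int first = Math.floorMod(h, p);         // primary hash, modulo p
        final int step = Math.floorMod(h, p - 2) + 1;  // secondary hash plus one, in [1, p-2]
        final int[] seq = new int[p];
        // Since p is prime and 0 < step < p, repeatedly adding step modulo p visits every index once.
        for (int j = 0, i = first; j < p; j++, i = (i + step) % p) seq[j] = i;
        return seq;
    }
}
```

    For instance, with p = 13 (11 is also prime) and hash value 18, the probe starts at index 5 and advances by 8, so the first indices visited are 5, 0, 8 and 3.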

    Entries in a table can be in three states: FREE, OCCUPIED or REMOVED. The naive handling of removed entries requires that you search for a free entry as if they were occupied. However, fastutil implements two useful optimizations, based on the following invariant:

    Let i_0, i_1, …, i_{p-1} be the permutation of the table indices induced by the key k, that is, i_0 is the hash of k and the following indices are obtained by adding (modulo p) the secondary hash plus one. If there is an OCCUPIED entry with key k, its index in the sequence above comes before the indices of any REMOVED entries with key k.

    When we search for the key k we scan the entries in the sequence i_0, i_1, …, i_{p-1} and stop when k is found, when we have finished the sequence or when we find a FREE entry. Note that the correctness of this procedure is not completely trivial. Indeed, when we stop at a REMOVED entry with key k we must rely on the invariant to be sure that no OCCUPIED entry with the same key can appear later. If we frequently insert and remove the same entries, this optimization can be very effective (note, however, that when using objects as keys or values, deleted entries are set to a special fixed value to optimize garbage collection).

    Moreover, during the probe we keep the index of the first REMOVED entry we meet. If we actually have to insert a new element, we use that entry if we can, which avoids polluting another FREE entry. Since this position comes a fortiori before any REMOVED entries with the same key, we are also keeping the invariant true.
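
    The search and insertion rules above can be sketched as a toy set of int keys. This is not the fastutil implementation: the class name, the fixed table size of 13, the state values and the absence of rehashing are all assumptions made to keep the example short.

```java
/** Illustrative only: a fixed-size int set with FREE/OCCUPIED/REMOVED entry states. */
final class ToyDoubleHashSet {
    // State values are arbitrary for this sketch.
    private static final byte FREE = 0, OCCUPIED = 1, REMOVED = 2;
    private static final int P = 13; // 13 and 11 are both prime

    private final int[] key = new int[P];
    private final byte[] state = new byte[P]; // all FREE initially
    private int size;

    private static int first(int k) { return Math.floorMod(k, P); }
    private static int step(int k) { return Math.floorMod(k, P - 2) + 1; }

    int size() { return size; }

    boolean contains(int k) {
        final int s = step(k);
        for (int n = 0, i = first(k); n < P; n++, i = (i + s) % P) {
            if (state[i] == FREE) return false;
            // A REMOVED entry keeps its old key; by the invariant, no OCCUPIED k can follow it.
            if (key[i] == k) return state[i] == OCCUPIED;
        }
        return false;
    }

    boolean add(int k) {
        final int s = step(k);
        int slot = -1; // first REMOVED entry met, else the FREE entry that ends the probe
        for (int n = 0, i = first(k); n < P; n++, i = (i + s) % P) {
            if (state[i] == FREE) { if (slot < 0) slot = i; break; }
            if (state[i] == OCCUPIED) { if (key[i] == k) return false; continue; }
            if (slot < 0) slot = i;  // remember the first REMOVED entry
            if (key[i] == k) break;  // REMOVED with key k: k cannot be OCCUPIED later
        }
        if (slot < 0) throw new IllegalStateException("table full (no rehashing in this sketch)");
        key[slot] = k;
        state[slot] = OCCUPIED;
        size++;
        return true;
    }

    boolean remove(int k) {
        final int s = step(k);
        for (int n = 0, i = first(k); n < P; n++, i = (i + s) % P) {
            if (state[i] == FREE) return false;
            if (key[i] == k) {
                if (state[i] == REMOVED) return false;
                state[i] = REMOVED; // the key stays in place so later probes can stop here
                size--;
                return true;
            }
        }
        return false;
    }
}
```

    In this sketch, adding 5 and 18 (which collide at index 5), removing 5 and then adding 31 (which also hashes to index 5) reuses the REMOVED entry at index 5 instead of consuming a FREE one.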

    • Nested Class Summary

      Nested Classes 
      Modifier and Type Interface Description
      static interface  Hash.Strategy<K>
      A generic hash strategy.
    • Field Summary

      Fields 
      Modifier and Type Field Description
      static int DEFAULT_GROWTH_FACTOR
      The default growth factor of a hash table.
      static int DEFAULT_INITIAL_SIZE
      The initial default size of a hash table.
      static float DEFAULT_LOAD_FACTOR
      The default load factor of a hash table.
      static float FAST_LOAD_FACTOR
      The load factor for a (usually small) table that is meant to be particularly fast.
      static byte FREE
      The state of a free hash table entry.
      static byte OCCUPIED
      The state of an occupied hash table entry.
      static int[] PRIMES
      A list of primes to be used as table sizes.
      static byte REMOVED
      The state of a hash table entry freed by a deletion.
      static float VERY_FAST_LOAD_FACTOR
      The load factor for a (usually very small) table that is meant to be extremely fast.
    • Field Detail

      • DEFAULT_INITIAL_SIZE

        static final int DEFAULT_INITIAL_SIZE
        The initial default size of a hash table.
        See Also:
        Constant Field Values
      • DEFAULT_LOAD_FACTOR

        static final float DEFAULT_LOAD_FACTOR
        The default load factor of a hash table.
        See Also:
        Constant Field Values
      • FAST_LOAD_FACTOR

        static final float FAST_LOAD_FACTOR
        The load factor for a (usually small) table that is meant to be particularly fast.
        See Also:
        Constant Field Values
      • VERY_FAST_LOAD_FACTOR

        static final float VERY_FAST_LOAD_FACTOR
        The load factor for a (usually very small) table that is meant to be extremely fast.
        See Also:
        Constant Field Values
      • DEFAULT_GROWTH_FACTOR

        static final int DEFAULT_GROWTH_FACTOR
        The default growth factor of a hash table.
        See Also:
        Constant Field Values
      • OCCUPIED

        static final byte OCCUPIED
        The state of an occupied hash table entry.
        See Also:
        Constant Field Values
      • REMOVED

        static final byte REMOVED
        The state of a hash table entry freed by a deletion.
        See Also:
        Constant Field Values
      • PRIMES

        static final int[] PRIMES
        A list of primes to be used as table sizes. The i-th element is the largest prime p smaller than 2^((i+28)/16) and such that p-2 is also prime (or 1, for the first few entries).
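
        To make the rule concrete, the sketch below applies it literally by brute force for small i. It assumes that the exponent (i+28)/16 is real-valued and that "smaller than" is strict; the class and method names are hypothetical, and the results have not been checked against the array shipped with fastutil.

```java
/** Illustrative only: largest prime p below 2^((i+28)/16) such that p-2 is also prime. */
final class TwinPrimeSizes {
    static boolean isPrime(int n) {
        if (n < 2) return false;
        for (int d = 2; (long) d * d <= n; d++) if (n % d == 0) return false;
        return true;
    }

    /** Returns the candidate table size for index i, or 1 if no suitable prime exists. */
    static int largestBelow(int i) {
        final double bound = Math.pow(2.0, (i + 28) / 16.0);
        // Start from the largest integer strictly below the bound; 5 is the smallest p with p-2 prime.
        for (int p = (int) Math.ceil(bound) - 1; p >= 5; p--)
            if (isPrime(p) && isPrime(p - 2)) return p;
        return 1;
    }
}
```

        Under these assumptions, i = 0 gives a bound of about 3.36 and therefore the value 1, while i = 20 gives a bound of 8 and the value 7 (since 5 is also prime).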