Class OpenBitSet

  • All Implemented Interfaces:
    Serializable, Cloneable, BitMask

    public class OpenBitSet
    extends Object
    implements BitMask
    An "open" BitSet implementation that allows direct access to the array of words storing the bits.

    Unlike java.util.bitset, the fact that bits are packed into an array of longs is part of the interface. This allows efficient implementation of other algorithms by someone other than the author. It also allows one to efficiently implement alternate serialization or interchange formats.

    OpenBitSet is faster than java.util.BitSet in most operations and *much* faster at calculating cardinality of sets and results of set operations. It can also handle sets of larger cardinality (up to 64 * 2**32-1)

    The goals of OpenBitSet are the fastest implementation possible, and maximum code reuse. Extra safety and encapsulation may always be built on top, but if that's built in, the cost can never be removed (and hence people re-implement their own version in order to get better performance). If you want a "safe", totally encapsulated (and slower and limited) BitSet class, use java.util.BitSet.

    Performance Results

    Test system: Pentium 4, Sun Java 1.5_06 -server -Xbatch -Xmx64M
    BitSet size = 1,000,000
    Results are java.util.BitSet time divided by OpenBitSet time.
    cardinality intersect_count union nextSetBit get iterator
    50% full 3.36 3.96 1.44 1.46 1.99 1.58
    1% full 3.31 3.90   1.04   0.99

    Test system: AMD Opteron, 64 bit linux, Sun Java 1.5_06 -server -Xbatch -Xmx64M
    BitSet size = 1,000,000
    Results are java.util.BitSet time divided by OpenBitSet time.
    cardinality intersect_count union nextSetBit get iterator
    50% full 2.50 3.50 1.00 1.03 1.12 1.25
    1% full 2.51 3.49   1.00   1.02
    See Also:
    Serialized Form
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      static class  OpenBitSet.BitUtil  
    • Constructor Summary

      Constructors 
      Constructor Description
      OpenBitSet()  
      OpenBitSet​(long numBits)
      Constructs an OpenBitSet large enough to hold numBits.
      OpenBitSet​(long[] bits, int numWords)
      Constructs an OpenBitSet from an existing long[].
    • Method Summary

      All Methods Static Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      void and​(OpenBitSet other)  
      void andNot​(OpenBitSet other)  
      static long andNotCount​(OpenBitSet a, OpenBitSet b)
      Returns the popcount or cardinality of "a and not b" or "intersection(a, not(b))".
      static int bits2words​(long numBits)
      returns the number of 64 bit words it would take to hold numBits
      long capacity()
      Returns the current capacity in bits (1 greater than the index of the last bit)
      long cardinality()  
      void clear​(int startIndex, int endIndex)
      Clears a range of bits.
      void clear​(long index)
      clears a bit, allowing access beyond the current set size without changing the size.
      void clear​(long startIndex, long endIndex)
      Clears a range of bits.
      OpenBitSet clone()  
      void ensureCapacity​(long numBits)
      Ensure that the long[] is big enough to hold numBits, expanding it if necessary.
      void ensureCapacityWords​(int numWords)
      Expand the long[] with the size given as a number of words (64 bit longs).
      boolean equals​(Object o)
      returns true if both sets have the same bits set
      protected int expandingWordNum​(long index)  
      void fastClear​(int index)
      clears a bit.
      void fastClear​(long index)
      clears a bit.
      void fastFlip​(int index)
      flips a bit.
      void fastFlip​(long index)
      flips a bit.
      boolean fastGet​(int index)
      Returns true or false for the specified bit index.
      boolean fastGet​(long index)
      Returns true or false for the specified bit index.
      void fastSet​(int index)
      Sets the bit at the specified index.
      void fastSet​(long index)
      Sets the bit at the specified index.
      void flip​(long index)
      flips a bit, expanding the set size if necessary
      void flip​(long startIndex, long endIndex)
      Flips a range of bits, expanding the set size if necessary
      boolean flipAndGet​(int index)
      flips a bit and returns the resulting bit value.
      boolean flipAndGet​(long index)
      flips a bit and returns the resulting bit value.
      boolean get​(int index)
      Returns true or false for the specified bit index.
      boolean get​(long index)
      Returns true or false for the specified bit index
      boolean getAndSet​(int index)
      Sets a bit and returns the previous value.
      boolean getAndSet​(long index)
      Sets a bit and returns the previous value.
      int getBit​(int index)
      returns 1 if the bit is set, 0 if not.
      long[] getBits()
      Expert: returns the long[] storing the bits
      String getInstancingStatement()  
      int getNumWords()
      Expert: gets the number of longs in the array that are in use
      static long[] grow​(long[] array, int minSize)  
      int hashCode()  
      void intersect​(OpenBitSet other)
      this = this AND other
      static long intersectionCount​(OpenBitSet a, OpenBitSet b)
      Returns the popcount or cardinality of the intersection of the two sets.
      boolean intersects​(BitMask mask)  
      boolean intersects​(OpenBitSet other)
      returns true if the sets have any elements in common
      boolean isAllSet()  
      boolean isEmpty()
      Returns true if there are no set bits
      boolean isSet​(int index)  
      int length()  
      int nextSetBit​(int index)
      Returns the index of the first set bit starting at the index specified
      long nextSetBit​(long index)
      Returns the index of the first set bit starting at the index specified
      void or​(OpenBitSet other)  
      static int oversize​(int minTargetSize, int bytesPerElement)
      Returns an array size >= minTargetSize, generally over-allocating exponentially to achieve amortized linear-time cost as the array grows.
      int prevSetBit​(int index)
      Returns the index of the first set bit starting downwards at the index specified
      long prevSetBit​(long index)
      Returns the index of the first set bit starting downwards at the index specified
      void remove​(OpenBitSet other)
      Remove all elements set in other. this = this AND_NOT other
      BitMask reset​(int index)  
      BitMask resetAll​(BitMask mask)  
      BitMask set​(int index)  
      void set​(long index)
      sets a bit, expanding the set size if necessary
      void set​(long startIndex, long endIndex)
      Sets a range of bits, expanding the set size if necessary
      BitMask setAll​(BitMask mask)  
      void setBits​(long[] bits)
      Expert: sets a new long[] to use as the bit storage
      void setNumWords​(int nWords)
      Expert: sets the number of longs in the array that are in use
      long size()
      Returns the current capacity of this set.
      String toString()  
      void trimTrailingZeros()
      Lowers numWords, the number of words in use, by checking for trailing zero words.
      void union​(OpenBitSet other)
      this = this OR other
      static long unionCount​(OpenBitSet a, OpenBitSet b)
      Returns the popcount or cardinality of the union of the two sets.
      void xor​(OpenBitSet other)
      this = this XOR other
      static long xorCount​(OpenBitSet a, OpenBitSet b)
      Returns the popcount or cardinality of the exclusive-or of the two sets.
    • Field Detail

      • bits

        protected long[] bits
      • wlen

        protected int wlen
      • OS_ARCH

        public static final String OS_ARCH
      • JRE_IS_MINIMUM_JAVA7

        public static final boolean JRE_IS_MINIMUM_JAVA7
      • JRE_IS_MINIMUM_JAVA8

        public static final boolean JRE_IS_MINIMUM_JAVA8
      • JRE_IS_64BIT

        public static final boolean JRE_IS_64BIT
        True iff running on a 64bit JVM
    • Constructor Detail

      • OpenBitSet

        public OpenBitSet​(long numBits)
        Constructs an OpenBitSet large enough to hold numBits.
      • OpenBitSet

        public OpenBitSet()
      • OpenBitSet

        public OpenBitSet​(long[] bits,
                          int numWords)
        Constructs an OpenBitSet from an existing long[].
        The first 64 bits are in long[0], with bit index 0 at the least significant bit, and bit index 63 at the most significant. Given a bit index, the word containing it is long[index/64], and it is at bit number index%64 within that word.

        numWords are the number of elements in the array that contain set bits (non-zero longs). numWords should be <= bits.length, and any existing words in the array at position >= numWords should be zero.

    • Method Detail

      • capacity

        public long capacity()
        Returns the current capacity in bits (1 greater than the index of the last bit)
      • size

        public long size()
        Returns the current capacity of this set. Included for compatibility. This is *not* equal to cardinality()
      • length

        public int length()
      • isEmpty

        public boolean isEmpty()
        Returns true if there are no set bits
        Specified by:
        isEmpty in interface BitMask
      • getBits

        public long[] getBits()
        Expert: returns the long[] storing the bits
      • setBits

        public void setBits​(long[] bits)
        Expert: sets a new long[] to use as the bit storage
      • getNumWords

        public int getNumWords()
        Expert: gets the number of longs in the array that are in use
      • setNumWords

        public void setNumWords​(int nWords)
        Expert: sets the number of longs in the array that are in use
      • get

        public boolean get​(int index)
        Returns true or false for the specified bit index.
      • fastGet

        public boolean fastGet​(int index)
        Returns true or false for the specified bit index. The index should be less than the OpenBitSet size
      • get

        public boolean get​(long index)
        Returns true or false for the specified bit index
      • fastGet

        public boolean fastGet​(long index)
        Returns true or false for the specified bit index. The index should be less than the OpenBitSet size.
      • getBit

        public int getBit​(int index)
        returns 1 if the bit is set, 0 if not. The index should be less than the OpenBitSet size
      • set

        public void set​(long index)
        sets a bit, expanding the set size if necessary
      • fastSet

        public void fastSet​(int index)
        Sets the bit at the specified index. The index should be less than the OpenBitSet size.
      • fastSet

        public void fastSet​(long index)
        Sets the bit at the specified index. The index should be less than the OpenBitSet size.
      • set

        public void set​(long startIndex,
                        long endIndex)
        Sets a range of bits, expanding the set size if necessary
        Parameters:
        startIndex - lower index
        endIndex - one-past the last bit to set
      • expandingWordNum

        protected int expandingWordNum​(long index)
      • fastClear

        public void fastClear​(int index)
        clears a bit. The index should be less than the OpenBitSet size.
      • fastClear

        public void fastClear​(long index)
        clears a bit. The index should be less than the OpenBitSet size.
      • clear

        public void clear​(long index)
        clears a bit, allowing access beyond the current set size without changing the size.
      • clear

        public void clear​(int startIndex,
                          int endIndex)
        Clears a range of bits. Clearing past the end does not change the size of the set.
        Parameters:
        startIndex - lower index
        endIndex - one-past the last bit to clear
      • clear

        public void clear​(long startIndex,
                          long endIndex)
        Clears a range of bits. Clearing past the end does not change the size of the set.
        Parameters:
        startIndex - lower index
        endIndex - one-past the last bit to clear
      • getAndSet

        public boolean getAndSet​(int index)
        Sets a bit and returns the previous value. The index should be less than the OpenBitSet size.
      • getAndSet

        public boolean getAndSet​(long index)
        Sets a bit and returns the previous value. The index should be less than the OpenBitSet size.
      • fastFlip

        public void fastFlip​(int index)
        flips a bit. The index should be less than the OpenBitSet size.
      • fastFlip

        public void fastFlip​(long index)
        flips a bit. The index should be less than the OpenBitSet size.
      • flip

        public void flip​(long index)
        flips a bit, expanding the set size if necessary
      • flipAndGet

        public boolean flipAndGet​(int index)
        flips a bit and returns the resulting bit value. The index should be less than the OpenBitSet size.
      • flipAndGet

        public boolean flipAndGet​(long index)
        flips a bit and returns the resulting bit value. The index should be less than the OpenBitSet size.
      • flip

        public void flip​(long startIndex,
                         long endIndex)
        Flips a range of bits, expanding the set size if necessary
        Parameters:
        startIndex - lower index
        endIndex - one-past the last bit to flip
      • cardinality

        public long cardinality()
        Returns:
        the number of set bits
      • intersectionCount

        public static long intersectionCount​(OpenBitSet a,
                                             OpenBitSet b)
        Returns the popcount or cardinality of the intersection of the two sets. Neither set is modified.
      • unionCount

        public static long unionCount​(OpenBitSet a,
                                      OpenBitSet b)
        Returns the popcount or cardinality of the union of the two sets. Neither set is modified.
      • andNotCount

        public static long andNotCount​(OpenBitSet a,
                                       OpenBitSet b)
        Returns the popcount or cardinality of "a and not b" or "intersection(a, not(b))". Neither set is modified.
      • xorCount

        public static long xorCount​(OpenBitSet a,
                                    OpenBitSet b)
        Returns the popcount or cardinality of the exclusive-or of the two sets. Neither set is modified.
      • nextSetBit

        public int nextSetBit​(int index)
        Returns the index of the first set bit starting at the index specified. -1 is returned if there are no more set bits.
      • nextSetBit

        public long nextSetBit​(long index)
        Returns the index of the first set bit starting at the index specified. -1 is returned if there are no more set bits.
      • prevSetBit

        public int prevSetBit​(int index)
        Returns the index of the first set bit starting downwards at the index specified. -1 is returned if there are no more set bits.
      • prevSetBit

        public long prevSetBit​(long index)
        Returns the index of the first set bit starting downwards at the index specified. -1 is returned if there are no more set bits.
      • intersect

        public void intersect​(OpenBitSet other)
        this = this AND other
      • union

        public void union​(OpenBitSet other)
        this = this OR other
      • remove

        public void remove​(OpenBitSet other)
        Remove all elements set in other. this = this AND_NOT other
      • xor

        public void xor​(OpenBitSet other)
        this = this XOR other
      • andNot

        public void andNot​(OpenBitSet other)
      • intersects

        public boolean intersects​(OpenBitSet other)
        returns true if the sets have any elements in common
      • ensureCapacityWords

        public void ensureCapacityWords​(int numWords)
        Expand the long[] with the size given as a number of words (64 bit longs). getNumWords() is unchanged by this call.
      • grow

        public static long[] grow​(long[] array,
                                  int minSize)
      • oversize

        public static int oversize​(int minTargetSize,
                                   int bytesPerElement)
        Returns an array size >= minTargetSize, generally over-allocating exponentially to achieve amortized linear-time cost as the array grows. NOTE: this was originally borrowed from Python 2.4.2 listobject.c sources (attribution in LICENSE.txt), but has now been substantially changed based on discussions from java-dev thread with subject "Dynamic array reallocation algorithms", started on Jan 12 2010.
        Parameters:
        minTargetSize - Minimum required value to be returned.
        bytesPerElement - Bytes used by each element of the array.
      • ensureCapacity

        public void ensureCapacity​(long numBits)
        Ensure that the long[] is big enough to hold numBits, expanding it if necessary. getNumWords() is unchanged by this call.
      • trimTrailingZeros

        public void trimTrailingZeros()
        Lowers numWords, the number of words in use, by checking for trailing zero words.
      • bits2words

        public static int bits2words​(long numBits)
        returns the number of 64 bit words it would take to hold numBits
      • equals

        public boolean equals​(Object o)
        returns true if both sets have the same bits set
        Overrides:
        equals in class Object
      • hashCode

        public int hashCode()
        Overrides:
        hashCode in class Object
      • isSet

        public boolean isSet​(int index)
        Specified by:
        isSet in interface BitMask
      • isAllSet

        public boolean isAllSet()
        Specified by:
        isAllSet in interface BitMask