Class ImmutableExternalPrefixMap

All Implemented Interfaces:
PrefixMap<MutableString>, StringMap<MutableString>, Function<CharSequence,Long>, Object2LongFunction<CharSequence>, Size64, Serializable, Function<CharSequence,Long>, ToLongFunction<CharSequence>

public class ImmutableExternalPrefixMap
extends AbstractPrefixMap
implements Serializable
An immutable prefix map mostly stored in external memory.
Since:
2.0
Author:
Sebastiano Vigna
See Also:
ImmutableExternalPrefixMap, Serialized Form
  • Field Details

    • serialVersionUID

      public static final long serialVersionUID
      See Also:
      Constant Field Values
    • STD_BLOCK_SIZE

      public static final int STD_BLOCK_SIZE
      The standard block size (in bytes).
      See Also:
      Constant Field Values
    • intervalApproximator

      protected final ImmutableBinaryTrie<CharSequence> intervalApproximator
      The in-memory data structure used to approximate intervals.
    • blockSize

      protected final long blockSize
      The block size of this map (in bits).
    • decoder

      protected final Decoder decoder
      A decoder used to read data from the dump stream.
    • symbol2char

      protected final char[] symbol2char
      A map (given by an array) from symbols in the coder to characters.
    • char2symbol

      protected final Char2IntOpenHashMap char2symbol
      A map from characters to symbols of the coder.
    • size

      protected final long size
      The number of terms in this map.
    • blockStart

      protected final long[][] blockStart
      The index of the first word in each block, plus an additional entry containing Function.size().
    • blockOffset

      protected final long[][] blockOffset
      A big array parallel to blockStart giving the offset, in blocks, in the dump file of the corresponding word in blockStart. If there are no overflows, this will just be an initial segment of the natural numbers, but overflows cause jumps.
    • selfContained

      protected final boolean selfContained
      Whether this map is self-contained.
    • iteratorIsUsable

      protected transient boolean iteratorIsUsable
      If true, the creation of the last DumpStreamIterator was not followed by a call to any get method.
    • dumpStream

      protected transient InputBitStream dumpStream
      A reference to the dump stream.
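
The relationship between blockStart, blockOffset and blockSize described above can be illustrated with a small sketch. This is not the class's actual code: it assumes plain long[] arrays instead of big arrays, assumes that the entries of blockStart are strictly increasing, and assumes that a block's bit position in the dump stream is its offset in blocks times the block size in bits. The class and method names (BlockLookupSketch, blockOf, bitPosition) and the sample values are hypothetical.

```java
import java.util.Arrays;

// Hypothetical sketch: locating the block that holds a term, given its index.
final class BlockLookupSketch {

    // Returns p such that blockStart[p] <= termIndex < blockStart[p + 1].
    // The last entry of blockStart is assumed to be the number of terms.
    static int blockOf(long[] blockStart, long termIndex) {
        long size = blockStart[blockStart.length - 1];
        if (termIndex < 0 || termIndex >= size) {
            throw new IndexOutOfBoundsException("term index " + termIndex);
        }
        int pos = Arrays.binarySearch(blockStart, termIndex);
        // On a miss, binarySearch returns -(insertionPoint) - 1;
        // the containing block is the one just before the insertion point.
        return pos >= 0 ? pos : -pos - 2;
    }

    // Assumed conversion: offset in blocks times block size in bits.
    static long bitPosition(long[] blockStart, long[] blockOffset,
                            long blockSizeInBits, long termIndex) {
        return blockOffset[blockOf(blockStart, termIndex)] * blockSizeInBits;
    }

    public static void main(String[] args) {
        long[] blockStart = {0, 4, 9, 12}; // 12 terms in three blocks
        long[] blockOffset = {0, 1, 3};    // the second block overflowed, so the third starts at 3
        long blockSizeInBits = 1024L * 8;  // arbitrary block size of 1024 bytes
        System.out.println(blockOf(blockStart, 5));
        System.out.println(bitPosition(blockStart, blockOffset, blockSizeInBits, 10));
    }
}
```

In the sample data the jump from 1 to 3 in blockOffset stands for an overflow of the second block, which is the situation the blockOffset description refers to.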
  • Constructor Details

    • ImmutableExternalPrefixMap

      public ImmutableExternalPrefixMap(Iterable<? extends CharSequence> terms, int blockSizeInBytes, CharSequence dumpStreamFilename) throws IOException
      Creates an external prefix map with specified block size and dump stream.

      This constructor does not assume that CharSequence instances returned by terms.iterator() will be distinct. Thus, it can be safely used with FileLinesCollection.

      Parameters:
      terms - an iterable whose iterator will enumerate in lexicographical order the terms for the map.
      blockSizeInBytes - the block size (in bytes).
      dumpStreamFilename - the name of the dump stream, or null for a self-contained map.
      Throws:
      IOException
    • ImmutableExternalPrefixMap

      public ImmutableExternalPrefixMap(Iterable<? extends CharSequence> terms, CharSequence dumpStreamFilename) throws IOException
      Creates an external prefix map with block size STD_BLOCK_SIZE and specified dump stream.

      This constructor does not assume that CharSequence instances returned by terms.iterator() will be distinct. Thus, it can be safely used with FileLinesCollection.

      Parameters:
      terms - an iterable whose iterator will enumerate in lexicographical order the terms for the map.
      dumpStreamFilename - the name of the dump stream, or null for a self-contained map.
      Throws:
      IOException
    • ImmutableExternalPrefixMap

      public ImmutableExternalPrefixMap(Iterable<? extends CharSequence> terms, int blockSizeInBytes) throws IOException
      Creates an external prefix map with specified block size.

      This constructor does not assume that CharSequence instances returned by terms.iterator() will be distinct. Thus, it can be safely used with FileLinesCollection.

      Parameters:
      terms - an iterable whose iterator will enumerate in lexicographical order the terms for the map.
      blockSizeInBytes - the block size (in bytes).
      Throws:
      IOException
    • ImmutableExternalPrefixMap

      public ImmutableExternalPrefixMap(Iterable<? extends CharSequence> terms) throws IOException
      Creates an external prefix map with block size STD_BLOCK_SIZE.

      This constructor does not assume that CharSequence instances returned by terms.iterator() will be distinct. Thus, it can be safely used with FileLinesCollection.

      Parameters:
      terms - an iterable whose iterator will enumerate in lexicographical order the terms for the map.
      Throws:
      IOException
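
All four constructors expect terms to be enumerated in lexicographical order. A caller who is unsure about the input could run a check along the following lines before building a map. This is only a sketch under stated assumptions: the helper name isStrictlySorted is hypothetical, the order is taken to be plain character-by-character comparison (as in String.compareTo), and repeated terms are treated as a violation. Each term is copied to a String before comparison, since the iterator is not required to return distinct instances.

```java
import java.util.List;

// Hypothetical pre-check, not part of ImmutableExternalPrefixMap.
final class SortedTermsCheck {

    // Returns true if the terms are in strictly increasing lexicographical order.
    static boolean isStrictlySorted(Iterable<? extends CharSequence> terms) {
        String previous = null;
        for (CharSequence term : terms) {
            // Copy the content: the iterator may hand back the same mutable
            // instance on every call, so a reference cannot be kept.
            String current = term.toString();
            if (previous != null && previous.compareTo(current) >= 0) {
                return false;
            }
            previous = current;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isStrictlySorted(List.of("a", "ab", "b")));
        System.out.println(isStrictlySorted(List.of("b", "a")));
    }
}
```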
  • Method Details