Class Lucene90BlockTreeTermsReader
java.lang.Object
org.apache.lucene.index.Fields
org.apache.lucene.codecs.FieldsProducer
org.apache.lucene.backward_codecs.lucene90.blocktree.Lucene90BlockTreeTermsReader
- All Implemented Interfaces:
Closeable, AutoCloseable, Iterable<String>
A block-based terms index and dictionary that assigns terms to variable length blocks according
to how they share prefixes. The terms index is a prefix trie whose leaves are term blocks. The
advantage of this approach is that seekExact is often able to determine a term cannot exist
without doing any IO, and intersection with Automata is very fast. Note that this terms
dictionary has its own fixed terms index (i.e., it does not support a pluggable terms index
implementation).
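To make the "no IO" claim concrete, here is a minimal sketch of the idea, not Lucene's implementation. The class ToyBlockTreeIndex, its methods, and the blockReads counter are hypothetical names invented for illustration. The in-memory trie stands in for the terms index, and each leaf's list stands in for a term block that would normally live on disk.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a prefix-trie terms index; hypothetical, not Lucene's actual data structure. */
final class ToyBlockTreeIndex {

  /** Trie node. A non-null block marks a leaf holding the terms that share this prefix. */
  private static final class Node {
    final Map<Character, Node> children = new HashMap<>();
    List<String> block; // stands in for a term block stored on disk
  }

  private final Node root = new Node();

  /** Counts simulated block loads, i.e. the IO the index tries to avoid. */
  int blockReads = 0;

  /** Registers a block whose terms all start with the given prefix. */
  void addBlock(String prefix, List<String> terms) {
    Node node = root;
    for (char c : prefix.toCharArray()) {
      node = node.children.computeIfAbsent(c, k -> new Node());
    }
    node.block = terms;
  }

  /** Returns true if the term exists; a block is loaded only if the trie has a matching prefix. */
  boolean seekExact(String term) {
    Node node = root;
    Node deepest = (root.block != null) ? root : null;
    for (char c : term.toCharArray()) {
      node = node.children.get(c);
      if (node == null) {
        break; // the trie has no longer prefix for this term
      }
      if (node.block != null) {
        deepest = node;
      }
    }
    if (deepest == null) {
      return false; // answered from the in-memory index alone, no block read
    }
    blockReads++; // simulated IO: scan the one candidate block
    return deepest.block.contains(term);
  }
}
```

In this sketch, a lookup for a term whose prefix is absent from the trie returns false while blockReads stays at zero. A lookup that reaches a leaf costs exactly one simulated block read.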
NOTE: this terms dictionary supports min/maxItemsPerBlock during indexing to control how much memory the terms index uses.
The data structure used by this implementation is very similar to a burst trie (http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.18.3499), but with added logic to break up too-large blocks of all terms sharing a given prefix into smaller ones.
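The splitting idea can be sketched as follows, under simplifying assumptions. ToyBlockSplitter and its methods are hypothetical names. The sketch models only an upper bound (maxItemsPerBlock), ignores minItemsPerBlock, and uses burst-trie-style splitting on the next character. It does not reproduce the logic of Lucene90BlockTreeTermsWriter.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy block assignment by shared prefix; hypothetical, not Lucene's writer logic. */
final class ToyBlockSplitter {

  /** Maps each block's shared prefix to its terms, splitting blocks larger than maxItemsPerBlock. */
  static Map<String, List<String>> assign(List<String> sortedTerms, int maxItemsPerBlock) {
    Map<String, List<String>> blocks = new LinkedHashMap<>();
    split("", sortedTerms, maxItemsPerBlock, blocks);
    return blocks;
  }

  private static void split(
      String prefix, List<String> terms, int maxItems, Map<String, List<String>> out) {
    if (terms.size() <= maxItems) {
      out.put(prefix, terms); // small enough: one block for this prefix
      return;
    }
    // Too large: regroup by a prefix that is one character longer.
    List<String> exact = new ArrayList<>(); // terms equal to the prefix itself
    Map<String, List<String>> byNextChar = new TreeMap<>();
    for (String term : terms) {
      if (term.length() == prefix.length()) {
        exact.add(term);
      } else {
        String longer = term.substring(0, prefix.length() + 1);
        byNextChar.computeIfAbsent(longer, k -> new ArrayList<>()).add(term);
      }
    }
    if (!exact.isEmpty()) {
      out.put(prefix, exact);
    }
    for (Map.Entry<String, List<String>> e : byNextChar.entrySet()) {
      split(e.getKey(), e.getValue(), maxItems, out);
    }
  }
}
```

A smaller maxItemsPerBlock yields more blocks with longer prefixes, and therefore a larger index. This is the memory trade-off the NOTE above refers to.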
Use CheckIndex with the -verbose option to see
summary statistics on the blocks in the dictionary.
See Lucene90BlockTreeTermsWriter.
- WARNING: This API is experimental and might change in incompatible ways in the next release.
-
Field Summary
Fields
Modifier and Type    Field                        Description
static final String  TERMS_CODEC_NAME             Extension of terms codec name
static final String  TERMS_EXTENSION              Extension of terms file
static final String  TERMS_INDEX_CODEC_NAME       Extension of terms index codec name
static final String  TERMS_INDEX_EXTENSION        Extension of terms index file
static final String  TERMS_META_CODEC_NAME        Extension of terms meta codec name
static final String  TERMS_META_EXTENSION         Extension of terms meta file
static final int     VERSION_CURRENT              Current terms format.
static final int     VERSION_FST_CONTINUOUS_ARCS  The version that specializes arc storage for continuous labels in the FST.
static final int     VERSION_MSB_VLONG_OUTPUT     Version that encodes output as MSB VLong for better output sharing in the FST, see GITHUB#12620.
static final int     VERSION_START                Initial terms format.
Fields inherited from class org.apache.lucene.index.Fields
EMPTY_ARRAY
Constructor Summary
Constructors
Constructor                                                                              Description
Lucene90BlockTreeTermsReader(PostingsReaderBase postingsReader, SegmentReadState state)  Sole constructor.
Method Summary
Methods inherited from class org.apache.lucene.codecs.FieldsProducer
getMergeInstance
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface java.lang.Iterable
forEach, spliterator
-
Field Details
-
TERMS_EXTENSION
Extension of terms file
-
TERMS_CODEC_NAME
Extension of terms codec name
-
VERSION_START
public static final int VERSION_START
Initial terms format.
-
VERSION_MSB_VLONG_OUTPUT
public static final int VERSION_MSB_VLONG_OUTPUT
Version that encodes output as MSB VLong for better output sharing in the FST, see GITHUB#12620.
-
VERSION_FST_CONTINUOUS_ARCS
public static final int VERSION_FST_CONTINUOUS_ARCS
The version that specializes arc storage for continuous labels in the FST.
-
VERSION_CURRENT
public static final int VERSION_CURRENT
Current terms format.
-
TERMS_INDEX_EXTENSION
Extension of terms index file
-
TERMS_INDEX_CODEC_NAME
Extension of terms index codec name
-
TERMS_META_EXTENSION
Extension of terms meta file
-
TERMS_META_CODEC_NAME
Extension of terms meta codec name
-
-
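The VERSION_* fields above form an ordered sequence of format revisions. The following sketch shows how a reader could use such constants to reject unsupported files and to gate format-specific decoding. ToyVersionGate and its methods are hypothetical, and the numeric values are assumptions chosen for illustration only. The real values are listed under Constant Field Values.

```java
/** Toy version gate; hypothetical class, and the numeric values below are assumed. */
final class ToyVersionGate {
  // Assumed values for illustration only; check Constant Field Values for the real ones.
  static final int VERSION_START = 0;
  static final int VERSION_MSB_VLONG_OUTPUT = 1;
  static final int VERSION_FST_CONTINUOUS_ARCS = 2;
  static final int VERSION_CURRENT = VERSION_FST_CONTINUOUS_ARCS;

  /** Returns the version if it lies in the supported range, otherwise throws. */
  static int checkVersion(int actual) {
    if (actual < VERSION_START || actual > VERSION_CURRENT) {
      throw new IllegalArgumentException(
          "unsupported terms format version " + actual
              + " (supported: " + VERSION_START + " to " + VERSION_CURRENT + ")");
    }
    return actual;
  }

  /** True if a file written at this version encodes outputs as MSB VLong. */
  static boolean usesMsbVLongOutput(int version) {
    return version >= VERSION_MSB_VLONG_OUTPUT;
  }
}
```

Because each revision only adds to the previous one, a single comparison against the constant that introduced a feature is enough to select the right decoding path.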
Constructor Details
-
Lucene90BlockTreeTermsReader
public Lucene90BlockTreeTermsReader(PostingsReaderBase postingsReader, SegmentReadState state) throws IOException
Sole constructor.
- Throws:
IOException
-
-
Method Details
-
close
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface Closeable
- Specified by:
close in class FieldsProducer
- Throws:
IOException
-
iterator
-
terms
- Specified by:
terms in class Fields
- Throws:
IOException
-
size
public int size()
checkIntegrity
- Specified by:
checkIntegrity in class FieldsProducer
- Throws:
IOException
-
toString
-