Package org.apache.lucene.codecs.bloom
Class BloomFilteringPostingsFormat
- java.lang.Object
  - org.apache.lucene.codecs.PostingsFormat
    - org.apache.lucene.codecs.bloom.BloomFilteringPostingsFormat
- All Implemented Interfaces:
NamedSPILoader.NamedSPI
public final class BloomFilteringPostingsFormat extends PostingsFormat
A PostingsFormat useful for low doc-frequency fields such as primary keys. Bloom filters are maintained in a ".blm" file, which offers "fast-fail" for reads in segments known to have no record of the key. A choice of delegate PostingsFormat is used to record all other postings data.
A choice of BloomFilterFactory can be passed to tailor Bloom filter settings on a per-field basis. The default configuration is DefaultBloomFilterFactory, which allocates a ~8mb bitset and hashes values using MurmurHash64. This should be suitable for most purposes.
The format of the blm file is as follows:
- BloomFilter (.blm) --> Header, DelegatePostingsFormatName, NumFilteredFields, Filter^NumFilteredFields, Footer
- Filter --> FieldNumber, FuzzySet
- FuzzySet --> See FuzzySet.serialize(DataOutput)
- Header --> IndexHeader
- DelegatePostingsFormatName --> String. The name of a ServiceProvider-registered PostingsFormat
- NumFilteredFields --> Uint32
- FieldNumber --> Uint32. The number of the field in this segment
- Footer --> CodecFooter
WARNING: This API is experimental and might change in incompatible ways in the next release.
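The ordering of the .blm grammar above can be illustrated with a small, self-contained sketch. It is not Lucene's writer: the class name BlmLayoutSketch, the two marker constants, and the length prefix before each filter are assumptions made for illustration, and java.io.DataOutputStream does not reproduce the byte encoding of IndexHeader, String, FuzzySet, or CodecFooter. Only the sequence of parts follows the grammar.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: follows the order of the .blm grammar, not Lucene's byte encoding. */
final class BlmLayoutSketch {
    // Hypothetical stand-ins for IndexHeader and CodecFooter.
    static final int HEADER_MARKER = 0x424C4D31;
    static final int FOOTER_MARKER = 0x424C4D46;

    /** Result of reading the sketch format back. */
    static final class Parsed {
        final String delegateName;
        final Map<Integer, byte[]> filters;

        Parsed(String delegateName, Map<Integer, byte[]> filters) {
            this.delegateName = delegateName;
            this.filters = filters;
        }
    }

    static byte[] write(String delegateName, Map<Integer, byte[]> filters) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(HEADER_MARKER);                 // Header
            out.writeUTF(delegateName);                  // DelegatePostingsFormatName
            out.writeInt(filters.size());                // NumFilteredFields
            for (Map.Entry<Integer, byte[]> e : filters.entrySet()) {
                out.writeInt(e.getKey());                // Filter: FieldNumber
                out.writeInt(e.getValue().length);       // sketch-only length prefix
                out.write(e.getValue());                 // Filter: FuzzySet (opaque bytes here)
            }
            out.writeInt(FOOTER_MARKER);                 // Footer
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Parsed read(byte[] blm) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(blm))) {
            if (in.readInt() != HEADER_MARKER) throw new IOException("bad header");
            String delegateName = in.readUTF();
            int numFilteredFields = in.readInt();
            Map<Integer, byte[]> filters = new LinkedHashMap<>();
            for (int i = 0; i < numFilteredFields; i++) {
                int fieldNumber = in.readInt();
                byte[] fuzzySet = new byte[in.readInt()];
                in.readFully(fuzzySet);
                filters.put(fieldNumber, fuzzySet);
            }
            if (in.readInt() != FOOTER_MARKER) throw new IOException("bad footer");
            return new Parsed(delegateName, filters);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Reading the delegate name before any filter mirrors the grammar: a reader has to know which PostingsFormat holds the actual postings before it can pair each per-field filter with that data.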
Field Summary
Fields (Modifier and Type, Field):
- static String BLOOM_CODEC_NAME
- static int VERSION_CURRENT
- static int VERSION_START
Fields inherited from class org.apache.lucene.codecs.PostingsFormat
EMPTY
Constructor Summary
Constructors (Constructor, Description):
- BloomFilteringPostingsFormat()
- BloomFilteringPostingsFormat(PostingsFormat delegatePostingsFormat): Creates Bloom filters for a selection of fields created in the index.
- BloomFilteringPostingsFormat(PostingsFormat delegatePostingsFormat, BloomFilterFactory bloomFilterFactory): Creates Bloom filters for a selection of fields created in the index.
Method Summary
Methods (Modifier and Type, Method):
- FieldsConsumer fieldsConsumer(SegmentWriteState state)
- FieldsProducer fieldsProducer(SegmentReadState state)
- String toString()
Methods inherited from class org.apache.lucene.codecs.PostingsFormat
availablePostingsFormats, forName, getName, reloadPostingsFormats
Field Detail
BLOOM_CODEC_NAME
public static final String BLOOM_CODEC_NAME
- See Also: Constant Field Values
VERSION_START
public static final int VERSION_START
- See Also: Constant Field Values
VERSION_CURRENT
public static final int VERSION_CURRENT
- See Also: Constant Field Values
Constructor Detail
BloomFilteringPostingsFormat
public BloomFilteringPostingsFormat(PostingsFormat delegatePostingsFormat, BloomFilterFactory bloomFilterFactory)
Creates Bloom filters for a selection of fields created in the index. This is recorded as a set of Bitsets held as a segment summary in an additional "blm" file. This PostingsFormat delegates to a choice of delegate PostingsFormat for encoding all other postings data.
- Parameters:
delegatePostingsFormat - The PostingsFormat that records all the non-bloom filter data, i.e. postings info.
bloomFilterFactory - The BloomFilterFactory responsible for sizing BloomFilters appropriately
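The "fast-fail" and delegation behaviour described here can be sketched without Lucene. The following is a toy model under stated assumptions: FastFailSketch, its FNV-1a-style hash with two seeds, and the HashMap standing in for the delegate's postings are all hypothetical and are not how FuzzySet or any delegate PostingsFormat is implemented. It only shows the control flow: consult the bitset first, and touch the delegate only when the bitset cannot rule the key out.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: a toy Bloom filter in front of a "delegate" lookup. */
final class FastFailSketch {
    private final BitSet bits;
    private final int numBits;
    // Stands in for the delegate's postings data (key -> doc id).
    private final Map<String, Integer> delegate = new HashMap<>();
    // Counts how often the delegate was actually consulted.
    int delegateLookups = 0;

    FastFailSketch(int numBits) {
        this.numBits = numBits;
        this.bits = new BitSet(numBits);
    }

    // FNV-1a-style hash; the seed is an assumption used to derive a second position.
    private static int hash(byte[] data, int seed) {
        int h = 0x811C9DC5 ^ seed;
        for (byte b : data) {
            h ^= (b & 0xFF);
            h *= 0x01000193;
        }
        return h;
    }

    private int[] positions(String key) {
        byte[] data = key.getBytes(StandardCharsets.UTF_8);
        return new int[] {
            Math.floorMod(hash(data, 0), numBits),
            Math.floorMod(hash(data, 0x9E3779B9), numBits)
        };
    }

    void add(String key, int docId) {
        for (int p : positions(key)) {
            bits.set(p);
        }
        delegate.put(key, docId);
    }

    /** Returns the doc id for the key, or -1 if the key is absent. */
    int lookup(String key) {
        for (int p : positions(key)) {
            if (!bits.get(p)) {
                return -1; // fast-fail: the key is definitely not in this segment
            }
        }
        delegateLookups++; // the filter says "maybe", so ask the delegate
        Integer doc = delegate.get(key);
        return doc == null ? -1 : doc;
    }

    public static void main(String[] args) {
        FastFailSketch segment = new FastFailSketch(1 << 16);
        segment.add("user-42", 7);
        System.out.println(segment.lookup("user-42"));
        System.out.println(segment.lookup("user-43"));
    }
}
```

A Bloom filter never reports a stored key as absent, so the fast-fail path is safe; a false positive only costs one extra delegate lookup, which then returns the correct "not found" answer.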
BloomFilteringPostingsFormat
public BloomFilteringPostingsFormat(PostingsFormat delegatePostingsFormat)
Creates Bloom filters for a selection of fields created in the index. This is recorded as a set of Bitsets held as a segment summary in an additional "blm" file. This PostingsFormat delegates to a choice of delegate PostingsFormat for encoding all other postings data. This choice of constructor defaults to the DefaultBloomFilterFactory for configuring per-field BloomFilters.
- Parameters:
delegatePostingsFormat - The PostingsFormat that records all the non-bloom filter data, i.e. postings info.
BloomFilteringPostingsFormat
public BloomFilteringPostingsFormat()
Method Detail
fieldsConsumer
public FieldsConsumer fieldsConsumer(SegmentWriteState state) throws IOException
- Specified by: fieldsConsumer in class PostingsFormat
- Throws: IOException
fieldsProducer
public FieldsProducer fieldsProducer(SegmentReadState state) throws IOException
- Specified by: fieldsProducer in class PostingsFormat
- Throws: IOException
toString
public String toString()
- Overrides: toString in class PostingsFormat