Class BlockWriter
- java.lang.Object
  - org.apache.lucene.codecs.uniformsplit.BlockWriter
- Direct Known Subclasses:
STBlockWriter
public class BlockWriter extends Object
Writes blocks in the block file.
According to the Uniform Split technique, writing combines three steps per block, repeated for all the blocks of a field:
- Select the term with the shortest minimal distinguishing prefix (MDP) in the neighborhood of the target block size (+- delta size).
- The selected term becomes the first term of the next block, and its MDP is the next block key.
- The current block is written to the block file, and its block key is added to the index dictionary.
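The three steps above can be sketched in a few lines. This is a minimal illustration, not the Lucene implementation: the class `SplitSketch` and both of its methods are hypothetical, terms are plain `String`s rather than byte sequences, the MDP of a term is assumed to be measured against the immediately preceding term in sorted order, and ties between equally short MDPs are broken by taking the first candidate.

```java
import java.util.List;

public class SplitSketch {

    // Length of the shortest prefix of `current` that distinguishes it from
    // `previous` (assumed MDP definition; terms are sorted and distinct).
    static int mdpLength(String previous, String current) {
        int limit = Math.min(previous.length(), current.length());
        for (int i = 0; i < limit; i++) {
            if (previous.charAt(i) != current.charAt(i)) {
                return i + 1;
            }
        }
        // `previous` is a strict prefix of `current`: one more character is needed.
        return Math.min(limit + 1, current.length());
    }

    // Returns the index of the term that becomes the first term of the next
    // block, or -1 if the window [target - delta, target + delta] is empty.
    static int chooseSplitIndex(List<String> sortedTerms, int targetNumBlockLines, int deltaNumLines) {
        int from = Math.max(1, targetNumBlockLines - deltaNumLines);
        int to = Math.min(sortedTerms.size() - 1, targetNumBlockLines + deltaNumLines);
        int best = -1;
        int bestLength = Integer.MAX_VALUE;
        for (int i = from; i <= to; i++) {
            int length = mdpLength(sortedTerms.get(i - 1), sortedTerms.get(i));
            if (length < bestLength) { // first shortest MDP wins in this sketch
                bestLength = length;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> terms = List.of("apple", "apply", "apricot", "banana", "band", "bandit");
        int split = chooseSplitIndex(terms, 3, 1);
        String first = terms.get(split);
        String blockKey = first.substring(0, mdpLength(terms.get(split - 1), first));
        System.out.println("next block starts at " + first + ", block key " + blockKey);
    }
}
```

Choosing the split point by shortest MDP rather than by a fixed line count keeps the block keys, and therefore the index dictionary, small.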
This stateful BlockWriter is called repeatedly to add all the BlockLine terms of a field. Then finishLastBlock(org.apache.lucene.codecs.uniformsplit.IndexDictionary.Builder) is called, after which this BlockWriter can be reused to add the terms of another field.
- WARNING: This API is experimental and might change in incompatible ways in the next release.
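The add, finish, reuse cycle described above can be illustrated with a toy stand-in. Everything in this sketch is hypothetical: `ToyBlockWriter` is not the Lucene class, it splits blocks at a fixed line count instead of using the MDP-based selection, and it collects blocks in memory instead of writing to a block file.

```java
import java.util.ArrayList;
import java.util.List;

public class LifecycleSketch {

    // Hypothetical stand-in for a stateful block writer (not the Lucene class).
    static class ToyBlockWriter {
        final int maxLines;
        final List<String> pendingLines = new ArrayList<>();
        final List<List<String>> writtenBlocks = new ArrayList<>();

        ToyBlockWriter(int maxLines) {
            this.maxLines = maxLines;
        }

        // Buffers a term; writes the block once it is full.
        void addLine(String term) {
            pendingLines.add(term);
            if (pendingLines.size() == maxLines) {
                flush();
            }
        }

        // Writes whatever remains as the last block of the field and resets
        // the state, so the same writer can serve the next field.
        void finishLastBlock() {
            if (!pendingLines.isEmpty()) {
                flush();
            }
        }

        private void flush() {
            writtenBlocks.add(new ArrayList<>(pendingLines));
            pendingLines.clear();
        }
    }

    // Writes every field with one reused writer and returns all blocks.
    static List<List<String>> writeFields(List<List<String>> fields, int maxLines) {
        ToyBlockWriter writer = new ToyBlockWriter(maxLines);
        for (List<String> fieldTerms : fields) {
            for (String term : fieldTerms) {
                writer.addLine(term);
            }
            writer.finishLastBlock(); // no block ever spans two fields
        }
        return writer.writtenBlocks;
    }

    public static void main(String[] args) {
        System.out.println(writeFields(List.of(List.of("a", "b", "c"), List.of("x")), 2));
    }
}
```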
-
-
Field Summary
Fields
Modifier and Type                          Field
protected BlockEncoder                     blockEncoder
protected BlockHeader.Serializer           blockHeaderWriter
protected List<BlockLine>                  blockLines
protected ByteBuffersDataOutput            blockLinesWriteBuffer
protected BlockLine.Serializer             blockLineWriter
protected IndexOutput                      blockOutput
protected ByteBuffersDataOutput            blockWriteBuffer
protected int                              deltaNumLines
protected FieldMetadata                    fieldMetadata
protected BytesRef                         lastTerm
protected BlockHeader                      reusableBlockHeader
protected BytesRef                         scratchBytesRef
protected int                              targetNumBlockLines
protected DeltaBaseTermStateSerializer     termStateSerializer
protected ByteBuffersDataOutput            termStatesWriteBuffer
-
Constructor Summary
Constructors
Modifier     Constructor
protected    BlockWriter(IndexOutput blockOutput, int targetNumBlockLines, int deltaNumLines, BlockEncoder blockEncoder)
-
Method Summary
Methods
Modifier and Type, Method and Description
protected void addBlockKey(List<BlockLine> blockLines, IndexDictionary.Builder dictionaryBuilder)
    Adds a new block key with its corresponding block file pointer to the IndexDictionary.Builder.
protected void addLine(BytesRef term, BlockTermState blockTermState, IndexDictionary.Builder dictionaryBuilder)
    Adds a new BlockLine term for the current field.
protected BlockHeader.Serializer createBlockHeaderSerializer()
protected BlockLine.Serializer createBlockLineSerializer()
protected DeltaBaseTermStateSerializer createDeltaBaseTermStateSerializer()
protected void finishLastBlock(IndexDictionary.Builder dictionaryBuilder)
    This method is called when there are no more terms for the field.
protected void splitAndWriteBlock(IndexDictionary.Builder dictionaryBuilder)
    Defines the new block start according to targetNumBlockLines and deltaNumLines.
protected void updateFieldMetadata(long blockStartFP)
    Updates the field metadata after all lines of the block have been written.
protected void writeBlock(List<BlockLine> blockLines, IndexDictionary.Builder dictionaryBuilder)
    Writes a block and adds its block key to the dictionary builder.
protected void writeBlockLine(boolean isIncrementalEncodingSeed, BlockLine line, BlockLine previousLine)
-
-
-
Field Detail
-
targetNumBlockLines
protected final int targetNumBlockLines
-
deltaNumLines
protected final int deltaNumLines
-
blockOutput
protected final IndexOutput blockOutput
-
blockLinesWriteBuffer
protected final ByteBuffersDataOutput blockLinesWriteBuffer
-
termStatesWriteBuffer
protected final ByteBuffersDataOutput termStatesWriteBuffer
-
blockHeaderWriter
protected final BlockHeader.Serializer blockHeaderWriter
-
blockLineWriter
protected final BlockLine.Serializer blockLineWriter
-
termStateSerializer
protected final DeltaBaseTermStateSerializer termStateSerializer
-
blockEncoder
protected final BlockEncoder blockEncoder
-
blockWriteBuffer
protected final ByteBuffersDataOutput blockWriteBuffer
-
fieldMetadata
protected FieldMetadata fieldMetadata
-
lastTerm
protected BytesRef lastTerm
-
reusableBlockHeader
protected final BlockHeader reusableBlockHeader
-
scratchBytesRef
protected BytesRef scratchBytesRef
-
-
Constructor Detail
-
BlockWriter
protected BlockWriter(IndexOutput blockOutput, int targetNumBlockLines, int deltaNumLines, BlockEncoder blockEncoder)
-
-
Method Detail
-
createBlockHeaderSerializer
protected BlockHeader.Serializer createBlockHeaderSerializer()
-
createBlockLineSerializer
protected BlockLine.Serializer createBlockLineSerializer()
-
createDeltaBaseTermStateSerializer
protected DeltaBaseTermStateSerializer createDeltaBaseTermStateSerializer()
-
addLine
protected void addLine(BytesRef term, BlockTermState blockTermState, IndexDictionary.Builder dictionaryBuilder) throws IOException
Adds a new BlockLine term for the current field.
This method determines whether the new term is part of the current block or of the next block. In the latter case, a new block is started (including one or more of the most recently added lines), the current block is written to the block file, and the current block key is added to the IndexDictionary.Builder.
- Parameters:
term - The block line term. The BytesRef instance is used directly; the caller is responsible for making a deep copy if needed. This is required because a list of block lines is kept until the current block is written, and each line must have a different term instance.
blockTermState - Block line details.
dictionaryBuilder - The builder to which the block keys are added.
- Throws:
IOException
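The deep-copy requirement on term is easiest to see with a reused buffer. The sketch below is self-contained and uses plain byte arrays as a stand-in for BytesRef; the class `DeepCopySketch` and its method are hypothetical. In real Lucene code the copy would typically be made with `BytesRef.deepCopyOf(term)` before the term is handed to addLine.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DeepCopySketch {

    // Buffers three one-byte terms ("a", "b", "c") produced through a single
    // reused array, the way a term iterator might reuse its scratch buffer.
    static List<byte[]> bufferLines(boolean deepCopy) {
        byte[] reused = new byte[1];
        List<byte[]> lines = new ArrayList<>();
        for (byte b = 'a'; b <= 'c'; b++) {
            reused[0] = b;
            // Without a copy, every buffered line aliases the same array.
            lines.add(deepCopy ? Arrays.copyOf(reused, reused.length) : reused);
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println((char) bufferLines(false).get(0)[0]); // first line was overwritten
        System.out.println((char) bufferLines(true).get(0)[0]);  // first line is preserved
    }
}
```

Because the writer holds several lines in memory before deciding where the block ends, an aliased buffer would silently turn every pending term into the last one added.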
-
finishLastBlock
protected void finishLastBlock(IndexDictionary.Builder dictionaryBuilder) throws IOException
This method is called when there are no more terms for the field. It writes the remaining lines added with addLine(org.apache.lucene.util.BytesRef, org.apache.lucene.codecs.BlockTermState, org.apache.lucene.codecs.uniformsplit.IndexDictionary.Builder) as the last block of the field and resets the state of this BlockWriter, which can then be used for another field.
- Throws:
IOException
-
splitAndWriteBlock
protected void splitAndWriteBlock(IndexDictionary.Builder dictionaryBuilder) throws IOException
Defines the new block start according to targetNumBlockLines and deltaNumLines. The new block is started (including one or more of the most recently added lines), the current block is written to the block file, and the current block key is added to the IndexDictionary.Builder.
- Throws:
IOException
-
writeBlock
protected void writeBlock(List<BlockLine> blockLines, IndexDictionary.Builder dictionaryBuilder) throws IOException
Writes a block and adds its block key to the dictionary builder.
- Throws:
IOException
-
updateFieldMetadata
protected void updateFieldMetadata(long blockStartFP)
Updates the field metadata after all lines of the block have been written.
-
writeBlockLine
protected void writeBlockLine(boolean isIncrementalEncodingSeed, BlockLine line, BlockLine previousLine) throws IOException
- Throws:
IOException
-
addBlockKey
protected void addBlockKey(List<BlockLine> blockLines, IndexDictionary.Builder dictionaryBuilder) throws IOException
Adds a new block key with its corresponding block file pointer to the IndexDictionary.Builder. The block key is the MDP (see TermBytes) of the first term of the block.
- Throws:
IOException
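To show why each block key is stored together with a block file pointer, here is a minimal sketch of a dictionary built from such pairs. It is an assumption-laden illustration, not the Lucene IndexDictionary: the class `DictionarySketch`, its methods, and the sample file pointers are all hypothetical, keys are `String`s, and a lookup is assumed to route a term to the block with the greatest key that is less than or equal to the term.

```java
import java.util.Map;
import java.util.TreeMap;

public class DictionarySketch {

    // Hypothetical stand-in for the index dictionary: block key -> block file pointer.
    private final TreeMap<String, Long> keyToFilePointer = new TreeMap<>();

    // Records the key of a block together with the position where the block starts.
    void addBlockKey(String blockKey, long blockFilePointer) {
        keyToFilePointer.put(blockKey, blockFilePointer);
    }

    // File pointer of the only block that may contain `term`,
    // or -1 if the term sorts before every block key.
    long seekBlock(String term) {
        Map.Entry<String, Long> entry = keyToFilePointer.floorEntry(term);
        return entry == null ? -1L : entry.getValue();
    }

    public static void main(String[] args) {
        DictionarySketch dictionary = new DictionarySketch();
        dictionary.addBlockKey("a", 0L);
        dictionary.addBlockKey("b", 128L);
        dictionary.addBlockKey("ca", 300L);
        System.out.println(dictionary.seekBlock("banana"));
        System.out.println(dictionary.seekBlock("cab"));
    }
}
```

Since a block key is a prefix of the first term of its block and still sorts after the last term of the previous block, a floor lookup on the short key routes terms to the same block as a lookup on the full first term would.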
-
-