Package io.netty.buffer.search
Class AbstractMultiSearchProcessorFactory
java.lang.Object
io.netty.buffer.search.AbstractMultiSearchProcessorFactory
All Implemented Interfaces:
MultiSearchProcessorFactory, SearchProcessorFactory
Direct Known Subclasses:
AhoCorasicSearchProcessorFactory
public abstract class AbstractMultiSearchProcessorFactory
extends Object
implements MultiSearchProcessorFactory
Base class for precomputed factories that create MultiSearchProcessors.
The purpose of MultiSearchProcessor is to perform efficient simultaneous search for multiple needles
in the haystack, while scanning every byte of the input sequentially, only once. While it can also be used
to search for just a single needle, using a SearchProcessorFactory would be more efficient for
doing that.
See the documentation of AbstractSearchProcessorFactory for a comprehensive description of common usage.
In addition to the functionality provided by SearchProcessor, MultiSearchProcessor adds
a method to get the index of the needle found at the current position of the MultiSearchProcessor:
MultiSearchProcessor.getFoundNeedleId().
Note: in some cases one needle can be a suffix of another needle, e.g. {"BC", "ABC"},
and multiple needles can then be found ending at the same position of the haystack.
In such a case, MultiSearchProcessor.getFoundNeedleId() returns the index of the longest matching needle
in the array of needles.
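The tie-breaking rule described in the note can be modelled without Netty by a brute-force reference check. The sketch below is only an illustration of the rule, not how the factory finds matches; the class and method names are hypothetical, and it assumes all needles are non-empty.

```java
import java.nio.charset.StandardCharsets;

final class LongestNeedleAtPosition {
    /**
     * Returns the index (in needles) of the longest needle that ends exactly at
     * haystack[end], or -1 if no needle ends there. Assumes non-empty needles.
     */
    static int longestNeedleEndingAt(byte[] haystack, int end, byte[][] needles) {
        int best = -1;
        for (int id = 0; id < needles.length; id++) {
            byte[] needle = needles[id];
            int start = end - needle.length + 1;
            if (start < 0) {
                continue; // needle is longer than the haystack prefix ending at 'end'
            }
            boolean matches = true;
            for (int i = 0; i < needle.length; i++) {
                if (haystack[start + i] != needle[i]) {
                    matches = false;
                    break;
                }
            }
            // Prefer the longer needle when several end at the same position.
            if (matches && (best < 0 || needle.length > needles[best].length)) {
                best = id;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        byte[] haystack = "XABC".getBytes(StandardCharsets.UTF_8);
        byte[][] needles = {
            "BC".getBytes(StandardCharsets.UTF_8),
            "ABC".getBytes(StandardCharsets.UTF_8)
        };
        // Both needles end at index 3; "ABC" (index 1) is the longer one.
        System.out.println(longestNeedleEndingAt(haystack, 3, needles)); // prints 1
    }
}
```

With needles {"BC", "ABC"} and haystack "XABC", both needles end at index 3, and the reported index is 1 because "ABC" is the longer of the two.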
Usage example (given that the haystack is a ByteBuf containing "ABCD" and the needles are "AB", "BC" and "CD"):
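// The static factory method is declared on AbstractMultiSearchProcessorFactory (see Method Summary).
MultiSearchProcessorFactory factory = AbstractMultiSearchProcessorFactory.newAhoCorasicSearchProcessorFactory(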
"AB".getBytes(CharsetUtil.UTF_8), "BC".getBytes(CharsetUtil.UTF_8), "CD".getBytes(CharsetUtil.UTF_8));
MultiSearchProcessor processor = factory.newSearchProcessor();
int idx1 = haystack.forEachByte(processor);
// idx1 is 1 (index of the last character of the occurrence of "AB" in the haystack)
// processor.getFoundNeedleId() is 0 (index of "AB" in needles[])
int continueFrom1 = idx1 + 1;
// continue the search starting from the next character
int idx2 = haystack.forEachByte(continueFrom1, haystack.readableBytes() - continueFrom1, processor);
// idx2 is 2 (index of the last character of the occurrence of "BC" in the haystack)
// processor.getFoundNeedleId() is 1 (index of "BC" in needles[])
int continueFrom2 = idx2 + 1;
int idx3 = haystack.forEachByte(continueFrom2, haystack.readableBytes() - continueFrom2, processor);
// idx3 is 3 (index of the last character of the occurrence of "CD" in the haystack)
// processor.getFoundNeedleId() is 2 (index of "CD" in needles[])
int continueFrom3 = idx3 + 1;
int idx4 = haystack.forEachByte(continueFrom3, haystack.readableBytes() - continueFrom3, processor);
// idx4 is -1 (no more occurrences of any of the needles)
// This search session is complete, processor should be discarded.
// To search for the same needles again, reuse the same AbstractMultiSearchProcessorFactory
// to get a new MultiSearchProcessor.
Constructor Summary
Constructors
AbstractMultiSearchProcessorFactory()
Method Summary
Modifier and Type: static AhoCorasicSearchProcessorFactory
Method: newAhoCorasicSearchProcessorFactory(byte[]... needles)
Description: Creates a MultiSearchProcessorFactory based on the Aho–Corasick string search algorithm.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface io.netty.buffer.search.MultiSearchProcessorFactory
newSearchProcessor
Constructor Details
AbstractMultiSearchProcessorFactory
public AbstractMultiSearchProcessorFactory()
Method Details
newAhoCorasicSearchProcessorFactory
public static AhoCorasicSearchProcessorFactory newAhoCorasicSearchProcessorFactory(byte[]... needles)
Creates a MultiSearchProcessorFactory based on the Aho–Corasick string search algorithm.
Precomputation (this method) time is linear in the size of input (O(Σ|needles|)).
The factory allocates and retains an array of 256 * X ints plus another array of X ints, where X is the sum of lengths of each entry of needles minus the sum of lengths of repeated prefixes of the needles.
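As a rough illustration of the memory figure above, X can be read as the number of distinct non-empty prefixes across all needles. The sketch below computes X under that reading and the resulting size of the two int arrays. The class and method names are hypothetical, and the estimate ignores array headers and any extra root state the real implementation may keep.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;

final class TrieSizeEstimate {
    /** Counts distinct non-empty prefixes across all needles (X, under our reading of the text). */
    static int distinctPrefixCount(byte[]... needles) {
        Set<String> prefixes = new HashSet<>();
        for (byte[] needle : needles) {
            for (int len = 1; len <= needle.length; len++) {
                // ISO_8859_1 maps each byte to exactly one char, so distinct
                // byte prefixes always become distinct strings.
                prefixes.add(new String(needle, 0, len, StandardCharsets.ISO_8859_1));
            }
        }
        return prefixes.size();
    }

    /** Approximate payload of the two arrays: (256 * X + X) ints at 4 bytes each. */
    static long approxRetainedBytes(int x) {
        return (256L * x + x) * Integer.BYTES;
    }

    public static void main(String[] args) {
        int x = distinctPrefixCount(
            "AB".getBytes(StandardCharsets.UTF_8),
            "BC".getBytes(StandardCharsets.UTF_8),
            "CD".getBytes(StandardCharsets.UTF_8));
        System.out.println(x);                      // prints 6
        System.out.println(approxRetainedBytes(x)); // prints 6168
    }
}
```

For needles that share a prefix, such as "ABC" and "ABD", the shared "AB" is counted once, so X is 4 rather than 6.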
Search (the actual application of MultiSearchProcessor) time is linear in the size of the ByteBuf on which the search is performed (O(|haystack|)). Every byte of the ByteBuf is processed only once, sequentially, regardless of the number of needles being searched for.
Parameters:
needles - a varargs array of arrays of bytes to search for
Returns:
a new instance of AhoCorasicSearchProcessorFactory precomputed for the given needles
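To make the cost model above concrete, here is a minimal, self-contained sketch of an Aho–Corasick automaton with a dense 256-wide transition table. It is not Netty's implementation: the class name, field names and the findAll method are hypothetical, and it only illustrates why precomputation is linear in the total needle length and why search touches each haystack byte once.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DenseAhoCorasickSketch {
    private static final int ALPHABET = 256;
    private final int[] next;     // next[state * 256 + byte] = following state
    private final int[] needleId; // longest needle ending in this state, or -1

    DenseAhoCorasickSketch(byte[]... needles) {
        int maxStates = 1; // state 0 is the root
        for (byte[] n : needles) {
            if (n.length == 0) {
                throw new IllegalArgumentException("needle must be non-empty");
            }
            maxStates += n.length;
        }
        int[] table = new int[maxStates * ALPHABET]; // 0 means "no trie edge yet"
        int[] ids = new int[maxStates];
        Arrays.fill(ids, -1);
        int states = 1;
        // Phase 1: build the trie; shared prefixes reuse existing states.
        for (int id = 0; id < needles.length; id++) {
            int s = 0;
            for (byte b : needles[id]) {
                int slot = s * ALPHABET + (b & 0xFF);
                if (table[slot] == 0) {
                    table[slot] = states++;
                }
                s = table[slot];
            }
            ids[s] = id;
        }
        // Phase 2: breadth-first pass that resolves failure links and fills
        // every missing edge, turning the trie into a full transition table.
        int[] fail = new int[states];
        ArrayDeque<Integer> queue = new ArrayDeque<>();
        for (int c = 0; c < ALPHABET; c++) {
            if (table[c] != 0) {
                queue.add(table[c]); // depth-1 states fail to the root
            }
        }
        while (!queue.isEmpty()) {
            int s = queue.remove();
            if (ids[s] < 0) {
                ids[s] = ids[fail[s]]; // inherit the longest needle that is a proper suffix
            }
            for (int c = 0; c < ALPHABET; c++) {
                int slot = s * ALPHABET + c;
                int viaFail = table[fail[s] * ALPHABET + c];
                if (table[slot] != 0) {
                    fail[table[slot]] = viaFail;
                    queue.add(table[slot]);
                } else {
                    table[slot] = viaFail;
                }
            }
        }
        next = Arrays.copyOf(table, states * ALPHABET);
        needleId = Arrays.copyOf(ids, states);
    }

    /** Returns {endIndex, needleId} for every position at which some needle ends. */
    List<int[]> findAll(byte[] haystack) {
        List<int[]> hits = new ArrayList<>();
        int state = 0;
        for (int i = 0; i < haystack.length; i++) {
            state = next[state * ALPHABET + (haystack[i] & 0xFF)]; // one lookup per byte
            if (needleId[state] >= 0) {
                hits.add(new int[] {i, needleId[state]});
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        DenseAhoCorasickSketch ac = new DenseAhoCorasickSketch(
            "AB".getBytes(StandardCharsets.UTF_8),
            "BC".getBytes(StandardCharsets.UTF_8),
            "CD".getBytes(StandardCharsets.UTF_8));
        for (int[] hit : ac.findAll("ABCD".getBytes(StandardCharsets.UTF_8))) {
            System.out.println("end=" + hit[0] + " needle=" + hit[1]);
        }
        // prints end=1 needle=0, end=2 needle=1, end=3 needle=2 (one per line)
    }
}
```

The sketch keeps one row of 256 ints per trie state plus one int per state for the needle index, which mirrors the shape of the memory figure quoted above. The dense table trades memory for a branch-free inner loop: each haystack byte costs a single array lookup regardless of how many needles were registered.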