Class EarlyTerminatingSortingCollector
- java.lang.Object
  - org.apache.lucene.search.Collector
    - org.apache.lucene.index.sorter.EarlyTerminatingSortingCollector
public class EarlyTerminatingSortingCollector extends Collector
A Collector that terminates collection of documents early on a per-segment basis, if the segment was sorted according to the given Sorter.

NOTE: the Collector detects sorted segments according to SortingMergePolicy, so it's best used in conjunction with it. Also, it collects up to a specified number of documents from each segment, and is therefore mostly suitable for use in conjunction with collectors such as TopDocsCollector, and not e.g. TotalHitCountCollector.

NOTE: If you wrap a TopDocsCollector that sorts in the same order as the index order, the returned TopDocsCollector.topDocs() will be correct. However, the total hit count will be underestimated since not all matching documents will have been collected.

NOTE: This Collector uses Sorter.getID() to detect whether a segment was sorted with the same Sorter as the one given in EarlyTerminatingSortingCollector(Collector, Sorter, int). This has two implications:
- if Sorter.getID() is not implemented correctly and returns different identifiers for equivalent Sorters, this collector will not detect sorted segments,
- if you suddenly change the IndexWriter's SortingMergePolicy to sort according to another criterion and if both the old and the new Sorters have the same identifier, this Collector will incorrectly detect sorted segments.
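
The notes above about correct top hits and an underestimated hit count can be illustrated with a small model. The following is a minimal sketch in plain Java, not Lucene code: the class EarlyTerminationSketch, its search method and its Result type are hypothetical names, each segment is assumed to be an array of sort keys already stored in index order, and every document is assumed to match the query.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Simplified model (not Lucene code): a segment is an int[] of sort keys in ascending index order. */
final class EarlyTerminationSketch {

    /** Outcome of a simulated search: the best keys found and how many documents were collected. */
    static final class Result {
        final List<Integer> topKeys;
        final int collectedHits;

        Result(List<Integer> topKeys, int collectedHits) {
            this.topKeys = topKeys;
            this.collectedHits = collectedHits;
        }
    }

    /**
     * Collects at most numDocsToCollect documents from each sorted segment,
     * then keeps the numHits smallest keys across all segments.
     */
    static Result search(int[][] sortedSegments, int numDocsToCollect, int numHits) {
        List<Integer> candidates = new ArrayList<>();
        int collected = 0;
        for (int[] segment : sortedSegments) {
            int limit = Math.min(numDocsToCollect, segment.length);
            for (int i = 0; i < limit; i++) {
                candidates.add(segment[i]);
                collected++;
            }
            // The rest of the segment is skipped: because the segment is sorted,
            // later documents cannot rank ahead of the ones already collected from it.
        }
        Collections.sort(candidates);
        int end = Math.min(numHits, candidates.size());
        return new Result(new ArrayList<>(candidates.subList(0, end)), collected);
    }

    public static void main(String[] args) {
        int[][] segments = { {1, 4, 7, 9}, {2, 3, 8}, {5, 6} };
        Result early = search(segments, 3, 3);                 // stop after 3 docs per segment
        Result full = search(segments, Integer.MAX_VALUE, 3);  // visit every document
        System.out.println(early.topKeys + " " + early.collectedHits);
        System.out.println(full.topKeys + " " + full.collectedHits);
    }
}
```

In this model both runs return the same top three keys, [1, 2, 3], but the early-terminating run counts 8 collected documents where the exhaustive run counts 9. This is the underestimated hit count the note describes.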
Constructor Summary
Constructors
- EarlyTerminatingSortingCollector(Collector in, Sorter sorter, int numDocsToCollect): Create a new EarlyTerminatingSortingCollector instance.
Method Summary
Methods (modifier and type, method, description)
- boolean acceptsDocsOutOfOrder(): Return true if this collector does not require the matching docIDs to be delivered in int sort order (smallest to largest) to Collector.collect(int).
- void collect(int doc): Called once for every document matching a query, with the unbased document number.
- void setNextReader(AtomicReaderContext context): Called before collecting from each AtomicReaderContext.
- void setScorer(Scorer scorer): Called before successive calls to Collector.collect(int).
Constructor Detail
EarlyTerminatingSortingCollector
public EarlyTerminatingSortingCollector(Collector in, Sorter sorter, int numDocsToCollect)
Create a new EarlyTerminatingSortingCollector instance.
- Parameters:
  in - the collector to wrap
  sorter - the same sorter as the one which is used by IndexWriter's SortingMergePolicy
  numDocsToCollect - the number of documents to collect on each segment. When wrapping a TopDocsCollector, this number should be the number of hits.
Method Detail
setScorer
public void setScorer(Scorer scorer) throws IOException
Description copied from class: Collector
Called before successive calls to Collector.collect(int). Implementations that need the score of the current document (passed in to Collector.collect(int)) should save the passed-in Scorer and call scorer.score() when needed.
- Specified by: setScorer in class Collector
- Throws: IOException
collect
public void collect(int doc) throws IOException
Description copied from class: Collector
Called once for every document matching a query, with the unbased document number.

Note: The collection of the current segment can be terminated by throwing a CollectionTerminatedException. In this case, the last docs of the current AtomicReaderContext will be skipped and IndexSearcher will swallow the exception and continue collection with the next leaf.

Note: This is called in an inner search loop. For good search performance, implementations of this method should not call IndexSearcher.doc(int) or IndexReader.document(int) on every hit. Doing so can slow searches by an order of magnitude or more.
- Specified by: collect in class Collector
- Throws: IOException
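
The per-segment termination described in the first note can be sketched as a search loop that catches an exception per leaf and moves on. This is a minimal model in plain Java, not Lucene's implementation: LeafTerminationSketch, TerminateLeaf (a stand-in for CollectionTerminatedException), LeafCollector, searchLeaves and collectUpTo are all hypothetical names, and a leaf is reduced to a count of matching documents.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model (not Lucene code) of terminating collection one leaf at a time. */
final class LeafTerminationSketch {

    /** Hypothetical stand-in for CollectionTerminatedException. */
    static final class TerminateLeaf extends RuntimeException {
    }

    /** Minimal collector contract for this sketch; Lucene's Collector has more methods. */
    interface LeafCollector {
        void startLeaf(int leafIndex);

        void collect(int doc);
    }

    /** Visits every leaf; a TerminateLeaf thrown by the collector skips only the rest of that leaf. */
    static void searchLeaves(int[] matchingDocsPerLeaf, LeafCollector collector) {
        for (int leaf = 0; leaf < matchingDocsPerLeaf.length; leaf++) {
            collector.startLeaf(leaf);
            try {
                for (int doc = 0; doc < matchingDocsPerLeaf[leaf]; doc++) {
                    collector.collect(doc);
                }
            } catch (TerminateLeaf e) {
                // Swallowed on purpose: collection continues with the next leaf.
            }
        }
    }

    /** Collects at most maxPerLeaf documents from each leaf and records them as "leaf:doc". */
    static List<String> collectUpTo(int[] matchingDocsPerLeaf, final int maxPerLeaf) {
        final List<String> seen = new ArrayList<>();
        searchLeaves(matchingDocsPerLeaf, new LeafCollector() {
            private int leaf;
            private int count;

            @Override
            public void startLeaf(int leafIndex) {
                leaf = leafIndex;
                count = 0; // the per-leaf counter restarts for every leaf
            }

            @Override
            public void collect(int doc) {
                seen.add(leaf + ":" + doc);
                if (++count >= maxPerLeaf) {
                    throw new TerminateLeaf();
                }
            }
        });
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(collectUpTo(new int[] {3, 1, 4}, 2));
    }
}
```

With leaves holding 3, 1 and 4 matching documents and a limit of 2 per leaf, the sketch prints [0:0, 0:1, 1:0, 2:0, 2:1]: the third document of leaf 0 and the last two of leaf 2 are never visited, while leaf 1 is collected in full.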
setNextReader
public void setNextReader(AtomicReaderContext context) throws IOException
Description copied from class: Collector
Called before collecting from each AtomicReaderContext. All doc ids in Collector.collect(int) will correspond to IndexReaderContext.reader(). Add AtomicReaderContext.docBase to the current IndexReaderContext.reader()'s internal document id to re-base ids in Collector.collect(int).
- Specified by: setNextReader in class Collector
- Parameters: context - next atomic reader context
- Throws: IOException
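
The re-basing arithmetic described above is a single addition. The sketch below is plain Java, not Lucene code: DocBaseSketch, docBases and rebase are hypothetical names, and the model assumes each leaf's docBase equals the total number of documents in the leaves before it.

```java
import java.util.Arrays;

/** Simplified model (not Lucene code) of turning leaf-local doc ids into index-wide ids. */
final class DocBaseSketch {

    /** Assumption of this model: a leaf's docBase is the running total of the preceding leaves' sizes. */
    static int[] docBases(int[] maxDocPerLeaf) {
        int[] bases = new int[maxDocPerLeaf.length];
        int next = 0;
        for (int i = 0; i < maxDocPerLeaf.length; i++) {
            bases[i] = next;
            next += maxDocPerLeaf[i];
        }
        return bases;
    }

    /** Re-bases a leaf-local document id by adding the leaf's docBase. */
    static int rebase(int docBase, int leafDocId) {
        return docBase + leafDocId;
    }

    public static void main(String[] args) {
        int[] bases = docBases(new int[] {3, 1, 4});
        System.out.println(Arrays.toString(bases)); // [0, 3, 4]
        System.out.println(rebase(bases[2], 1));    // 5
    }
}
```

A collector that needs index-wide ids would typically remember the docBase it receives for the current leaf and apply this addition inside collect.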
acceptsDocsOutOfOrder
public boolean acceptsDocsOutOfOrder()
Description copied from class: Collector
Return true if this collector does not require the matching docIDs to be delivered in int sort order (smallest to largest) to Collector.collect(int).

Most Lucene Query implementations will visit matching docIDs in order. However, some queries (currently limited to certain cases of BooleanQuery) can achieve faster searching if the Collector allows them to deliver the docIDs out of order.

Many collectors don't mind getting docIDs out of order, so it's important to return true here.
- Specified by: acceptsDocsOutOfOrder in class Collector