Package org.apache.lucene.misc.search
Class DiversifiedTopDocsCollector
java.lang.Object
org.apache.lucene.search.TopDocsCollector<DiversifiedTopDocsCollector.ScoreDocKey>
org.apache.lucene.misc.search.DiversifiedTopDocsCollector
- All Implemented Interfaces:
Collector
public abstract class DiversifiedTopDocsCollector
extends TopDocsCollector<DiversifiedTopDocsCollector.ScoreDocKey>
A TopDocsCollector that controls diversity in results by ensuring no more than maxHitsPerKey results from a common source are collected in the final results.
An example application might be a product search in a marketplace where no more than 3 results per retailer are permitted in search results.
To compare behaviour with other forms of collector, a useful analogy might be the problem of making a compilation album of 1967's top hit records:
- A vanilla query's results might look like a "Best of the Beatles" album - high quality but not much diversity
- A GroupingSearch would produce the equivalent of "The 10 top-selling artists of 1967 - some killer and quite a lot of filler"
- A "diversified" query would be the top 20 hit records of that year - with a max of 3 Beatles hits in order to maintain diversity
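This collector improves on GroupingSearch-style queries by:
- Working in one pass over the data
- Not requiring the client to guess how many groups are required
- Removing low-scoring "filler" which sits at the end of each group's hits
This is an abstract class; subclasses supply the grouping key for each document by implementing getKeys.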
WARNING: This API is experimental and might change in incompatible ways in the next release.
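The "one pass over the data" property can be illustrated with a JDK-only sketch. This is not Lucene's implementation: the class DiversifiedTopN, its Hit type, and the collect and topHits methods are hypothetical names, and the design (one global min-heap plus one min-heap per key) is an assumption about how such a collector could be built. Ties in score are simply rejected here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

/** Illustrative only: a JDK-only, one-pass diversified top-N. Not Lucene's implementation. */
final class DiversifiedTopN {
  /** Hypothetical scored hit carrying a grouping key. */
  static final class Hit {
    final int doc;
    final float score;
    final long key;

    Hit(int doc, float score, long key) {
      this.doc = doc;
      this.score = score;
      this.key = key;
    }
  }

  // Ascending score order, so every queue keeps its weakest hit at the head.
  private static final Comparator<Hit> BY_SCORE =
      Comparator.comparingDouble((Hit h) -> h.score);

  private final int numHits;
  private final int maxHitsPerKey;
  private final PriorityQueue<Hit> global = new PriorityQueue<>(BY_SCORE);
  private final Map<Long, PriorityQueue<Hit>> perKey = new HashMap<>();

  DiversifiedTopN(int numHits, int maxHitsPerKey) {
    if (numHits < 1 || maxHitsPerKey < 1) {
      throw new IllegalArgumentException("numHits and maxHitsPerKey must be positive");
    }
    this.numHits = numHits;
    this.maxHitsPerKey = maxHitsPerKey;
  }

  /** Offers one hit; returns false if it was rejected outright. */
  boolean collect(Hit hit) {
    PriorityQueue<Hit> keyQueue =
        perKey.computeIfAbsent(hit.key, k -> new PriorityQueue<>(BY_SCORE));
    if (keyQueue.size() >= maxHitsPerKey) {
      // The key is at its quota: the new hit must beat that key's weakest kept hit.
      Hit weakestForKey = keyQueue.peek();
      if (hit.score <= weakestForKey.score) {
        return false;
      }
      keyQueue.poll();
      global.remove(weakestForKey);
    } else if (global.size() >= numHits) {
      // The result set is full: the new hit must beat the weakest hit overall.
      Hit weakest = global.peek();
      if (hit.score <= weakest.score) {
        return false;
      }
      global.poll();
      perKey.get(weakest.key).remove(weakest);
    }
    keyQueue.add(hit);
    global.add(hit);
    return true;
  }

  /** Returns the kept hits, best score first. */
  List<Hit> topHits() {
    List<Hit> out = new ArrayList<>(global);
    out.sort(BY_SCORE.reversed());
    return out;
  }
}
```

Each hit is offered once through collect, and no group count has to be chosen up front. With numHits = 5 and maxHitsPerKey = 3, a key that produces five strong hits still occupies at most three slots, leaving the remaining two for other keys.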
Nested Class Summary
Modifier and Type, Class, Description
- static class DiversifiedTopDocsCollector.ScoreDocKey: An extension to ScoreDoc that includes a key used for grouping purposes
Field Summary
Fields inherited from class org.apache.lucene.search.TopDocsCollector
EMPTY_TOPDOCS, pq, totalHits, totalHitsRelation
Constructor Summary
DiversifiedTopDocsCollector(int numHits, int maxHitsPerKey)
Method Summary
Modifier and Type, Method, Description
- protected abstract NumericDocValues getKeys(LeafReaderContext context): Get a source of values used for grouping keys
- LeafCollector getLeafCollector(LeafReaderContext context)
- protected DiversifiedTopDocsCollector.ScoreDocKey insert(DiversifiedTopDocsCollector.ScoreDocKey addition, int docBase, NumericDocValues keys)
- protected TopDocs newTopDocs(ScoreDoc[] results, int start)
Methods inherited from class org.apache.lucene.search.TopDocsCollector
getTotalHits, populateResults, topDocs, topDocs, topDocs, topDocsSize
Field Details
maxNumPerKey
protected int maxNumPerKey
Constructor Details
DiversifiedTopDocsCollector
public DiversifiedTopDocsCollector(int numHits, int maxHitsPerKey)
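To make the two constructor arguments concrete, the following JDK-only sketch models the selection they describe, assuming numHits is the total number of results to keep and maxHitsPerKey is the cap on results sharing one key. It is not Lucene code: DiversifyDemo, Hit, and diversify are hypothetical names, and the sketch sorts all hits up front purely for clarity.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: models the meaning of numHits and maxHitsPerKey without Lucene. */
final class DiversifyDemo {
  /** Hypothetical stand-in for a scored document with a grouping key (e.g. a retailer id). */
  static final class Hit {
    final int doc;
    final float score;
    final long key;

    Hit(int doc, float score, long key) {
      this.doc = doc;
      this.score = score;
      this.key = key;
    }
  }

  /** Keeps the best numHits hits while admitting at most maxHitsPerKey hits per key. */
  static List<Hit> diversify(List<Hit> hits, int numHits, int maxHitsPerKey) {
    List<Hit> sorted = new ArrayList<>(hits);
    sorted.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());

    Map<Long, Integer> takenPerKey = new HashMap<>();
    List<Hit> out = new ArrayList<>();
    for (Hit h : sorted) {
      if (out.size() == numHits) {
        break; // the result list is full
      }
      int used = takenPerKey.getOrDefault(h.key, 0);
      if (used < maxHitsPerKey) {
        takenPerKey.put(h.key, used + 1);
        out.add(h);
      }
      // Otherwise this key has used its quota and the hit is skipped.
    }
    return out;
  }
}
```

For the marketplace example, calling diversify(hits, 5, 3) on four strong hits from one retailer and two weaker hits from another keeps three from the first retailer and both from the second, whereas an uncapped top 5 would take all four from the first.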
Method Details
getKeys
protected abstract NumericDocValues getKeys(LeafReaderContext context)
Get a source of values used for grouping keys
scoreMode
public ScoreMode scoreMode()
newTopDocs
protected TopDocs newTopDocs(ScoreDoc[] results, int start)
- Overrides:
newTopDocs in class TopDocsCollector<DiversifiedTopDocsCollector.ScoreDocKey>
insert
protected DiversifiedTopDocsCollector.ScoreDocKey insert(DiversifiedTopDocsCollector.ScoreDocKey addition, int docBase, NumericDocValues keys) throws IOException
- Throws:
IOException
getLeafCollector
public LeafCollector getLeafCollector(LeafReaderContext context) throws IOException
- Throws:
IOException