Package org.apache.lucene.misc.search
Class DiversifiedTopDocsCollector
java.lang.Object
  org.apache.lucene.search.TopDocsCollector<DiversifiedTopDocsCollector.ScoreDocKey>
    org.apache.lucene.misc.search.DiversifiedTopDocsCollector
All Implemented Interfaces: Collector
public abstract class DiversifiedTopDocsCollector extends TopDocsCollector<DiversifiedTopDocsCollector.ScoreDocKey>
A TopDocsCollector that controls diversity in results by ensuring that no more than maxHitsPerKey results from a common source are collected in the final results.
An example application might be a product search in a marketplace where no more than 3 results per retailer are permitted in search results.
To compare behaviour with other forms of collector, a useful analogy might be the problem of making a compilation album of 1967's top hit records:
- A vanilla query's results might look like a "Best of the Beatles" album - high quality but not much diversity
- A GroupingSearch would produce the equivalent of "The 10 top-selling artists of 1967 - some killer and quite a lot of filler"
- A "diversified" query would be the top 20 hit records of that year - with a max of 3 Beatles hits in order to maintain diversity
This collector improves on GroupingSearch-style queries by:
- Working in one pass over the data
- Not requiring the client to guess how many groups are required
- Removing low-scoring "filler" which sits at the end of each group's hits
- WARNING: This API is experimental and might change in incompatible ways in the next release.
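The per-key cap described above can be illustrated with a minimal, Lucene-free sketch. This is not the collector's implementation: the real class works in one pass over the data, whereas this sketch sorts the hits first for simplicity. The names `DiversifySketch`, `Hit`, and `diversify` are hypothetical and exist only for this illustration; `numHits` and `maxHitsPerKey` mirror the constructor parameters.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DiversifySketch {
    /** Hypothetical hit type for this sketch: doc id, score, numeric grouping key. */
    static final class Hit {
        final int doc;
        final float score;
        final long key;

        Hit(int doc, float score, long key) {
            this.doc = doc;
            this.score = score;
            this.key = key;
        }
    }

    /**
     * Returns up to numHits hits in descending score order,
     * keeping at most maxHitsPerKey hits for any single key.
     */
    static List<Hit> diversify(List<Hit> hits, int numHits, int maxHitsPerKey) {
        List<Hit> sorted = new ArrayList<>(hits);
        // Highest score first; the sort is stable, so ties keep input order.
        sorted.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());

        Map<Long, Integer> acceptedPerKey = new HashMap<>();
        List<Hit> result = new ArrayList<>();
        for (Hit h : sorted) {
            if (result.size() == numHits) {
                break; // result list is full
            }
            int accepted = acceptedPerKey.getOrDefault(h.key, 0);
            if (accepted < maxHitsPerKey) {
                result.add(h);
                acceptedPerKey.put(h.key, accepted + 1);
            }
            // Otherwise this key already has its quota; skip the hit.
        }
        return result;
    }

    public static void main(String[] args) {
        // Key 1 plays the role of the over-represented source (the "Beatles").
        List<Hit> hits = List.of(
            new Hit(0, 9.0f, 1L), new Hit(1, 8.0f, 1L), new Hit(2, 7.0f, 1L),
            new Hit(3, 6.0f, 1L), new Hit(4, 5.0f, 2L), new Hit(5, 4.0f, 3L));
        List<Integer> docs = new ArrayList<>();
        for (Hit h : diversify(hits, 5, 3)) {
            docs.add(h.doc);
        }
        System.out.println(docs); // prints [0, 1, 2, 4, 5]
    }
}
```

In this sketch the fourth hit for key 1 (doc 3) is dropped even though it outscores docs 4 and 5, which is the trade-off the "max of 3 Beatles hits" analogy describes.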
-
-
Nested Class Summary
Nested Classes
static class DiversifiedTopDocsCollector.ScoreDocKey
An extension to ScoreDoc that includes a key used for grouping purposes.
-
Field Summary
Fields
protected int maxNumPerKey
-
Fields inherited from class org.apache.lucene.search.TopDocsCollector
EMPTY_TOPDOCS, pq, totalHits, totalHitsRelation
-
-
Constructor Summary
Constructors
DiversifiedTopDocsCollector(int numHits, int maxHitsPerKey)
-
Method Summary
protected abstract NumericDocValues getKeys(LeafReaderContext context)
Get a source of values used for grouping keys.
LeafCollector getLeafCollector(LeafReaderContext context)
protected DiversifiedTopDocsCollector.ScoreDocKey insert(DiversifiedTopDocsCollector.ScoreDocKey addition, int docBase, NumericDocValues keys)
protected TopDocs newTopDocs(ScoreDoc[] results, int start)
ScoreMode scoreMode()
-
Methods inherited from class org.apache.lucene.search.TopDocsCollector
getTotalHits, populateResults, topDocs, topDocs, topDocs, topDocsSize
-
Method Detail
-
getKeys
protected abstract NumericDocValues getKeys(LeafReaderContext context)
Get a source of values used for grouping keys
-
scoreMode
public ScoreMode scoreMode()
-
newTopDocs
protected TopDocs newTopDocs(ScoreDoc[] results, int start)
- Overrides:
newTopDocs in class TopDocsCollector<DiversifiedTopDocsCollector.ScoreDocKey>
-
insert
protected DiversifiedTopDocsCollector.ScoreDocKey insert(DiversifiedTopDocsCollector.ScoreDocKey addition, int docBase, NumericDocValues keys) throws IOException
- Throws:
IOException
-
getLeafCollector
public LeafCollector getLeafCollector(LeafReaderContext context) throws IOException
- Throws:
IOException
-