public class DFRSimilarity extends SimilarityBase
The DFR scoring formula is composed of three separate components: the
basic model, the aftereffect and an additional
normalization component, represented by the classes
BasicModel, AfterEffect and Normalization, respectively.
The names of these classes were chosen to match the names of
their counterparts in the Terrier IR engine.
To construct a DFRSimilarity, you must specify the implementations for all three components of DFR:
1. BasicModel: Basic model of information content:
   - BasicModelBE: Limiting form of Bose-Einstein
   - BasicModelG: Geometric approximation of Bose-Einstein
   - BasicModelP: Poisson approximation of the Binomial
   - BasicModelD: Divergence approximation of the Binomial
   - BasicModelIn: Inverse document frequency
   - BasicModelIne: Inverse expected document frequency [mixture of Poisson and IDF]
   - BasicModelIF: Inverse term frequency [approximation of I(ne)]
2. AfterEffect: First normalization of information gain:
   - AfterEffectL: Laplace's law of succession
   - AfterEffectB: Ratio of two Bernoulli processes
   - AfterEffect.NoAfterEffect: no first normalization
3. Normalization: Second (length) normalization:
   - NormalizationH1: Uniform distribution of term frequency
   - NormalizationH2: term frequency density inversely related to length
   - NormalizationH3: term frequency normalization provided by Dirichlet prior
   - NormalizationZ: term frequency normalization provided by a Zipfian relation
   - Normalization.NoNormalization: no second normalization
Note that qtf, the multiplicity of term-occurrence in the query, is not handled by this implementation.
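To make the three-part composition concrete, here is a minimal standalone sketch of one combination (inverse document frequency, Laplace aftereffect, H2 length normalization). It does not use Lucene: the `DfrSketch` class, its `Stats` holder and the three small interfaces are hypothetical stand-ins for illustration. The formulas are the commonly cited textbook forms of these components, and multiplying the basic-model score by the aftereffect is an assumption of this sketch; the page above does not spell out either. As in the note above, qtf is not handled.

```java
// Standalone illustration only: every type here is a hypothetical stand-in,
// not Lucene's BasicModel, AfterEffect, or Normalization.
final class DfrSketch {

    /** Per-term collection statistics; the field names are illustrative. */
    static final class Stats {
        final double numberOfDocuments; // N: documents in the collection
        final double docFreq;           // n: documents containing the term
        final double avgFieldLength;    // average document length

        Stats(double numberOfDocuments, double docFreq, double avgFieldLength) {
            this.numberOfDocuments = numberOfDocuments;
            this.docFreq = docFreq;
            this.avgFieldLength = avgFieldLength;
        }
    }

    interface Normalization { double tfn(Stats s, double tf, double len); }
    interface BasicModel { double score(Stats s, double tfn); }
    interface AfterEffect { double score(Stats s, double tfn); }

    static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    // H2 (assumed form, with c = 1): tfn = tf * log2(1 + avgLength / length)
    static final Normalization H2 =
        (s, tf, len) -> tf * log2(1.0 + s.avgFieldLength / len);

    // In (assumed form): tfn * log2((N + 1) / (n + 0.5))
    static final BasicModel IN =
        (s, tfn) -> tfn * log2((s.numberOfDocuments + 1.0) / (s.docFreq + 0.5));

    // L (assumed form): 1 / (tfn + 1)
    static final AfterEffect L = (s, tfn) -> 1.0 / (tfn + 1.0);

    /** Normalizes tf first, then multiplies the basic model by the aftereffect. */
    static double score(BasicModel basicModel, AfterEffect afterEffect,
                        Normalization normalization, Stats s, double tf, double len) {
        if (basicModel == null || afterEffect == null || normalization == null) {
            throw new NullPointerException("all three components are required");
        }
        double tfn = normalization.tfn(s, tf, len);
        return basicModel.score(s, tfn) * afterEffect.score(s, tfn);
    }

    public static void main(String[] args) {
        // Toy numbers: 5 documents, the term occurs in 1, average length 100.
        Stats s = new Stats(5, 1, 100);
        System.out.println(score(IN, L, H2, s, 2, 100));
    }
}
```

With the classes listed above, the analogous Lucene construction would presumably be `new DFRSimilarity(new BasicModelIn(), new AfterEffectL(), new NormalizationH2())`, assuming each component's no-argument constructor is available in your version.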
See Also: BasicModel, AfterEffect, Normalization
Nested classes/interfaces inherited from class Similarity: Similarity.SimScorer, Similarity.SimWeight
Constructor and Description |
---|
DFRSimilarity(BasicModel basicModel, AfterEffect afterEffect, Normalization normalization): Creates DFRSimilarity from the three components. |
Modifier and Type | Method and Description |
---|---|
AfterEffect | getAfterEffect(): Returns the first normalization. |
BasicModel | getBasicModel(): Returns the basic model of information content. |
Normalization | getNormalization(): Returns the second normalization. |
String | toString(): Subclasses must override this method to return the name of the Similarity and preferably the values of parameters (if any) as well. |
Methods inherited from class SimilarityBase: computeNorm, computeWeight, getDiscountOverlaps, log2, setDiscountOverlaps, simScorer
Methods inherited from class Similarity: coord, queryNorm
public DFRSimilarity(BasicModel basicModel, AfterEffect afterEffect, Normalization normalization)
Note that null values are not allowed:
if you want no normalization or after-effect, instead pass
Normalization.NoNormalization or AfterEffect.NoAfterEffect respectively.
Parameters:
basicModel - Basic model of information content
afterEffect - First normalization of information gain
normalization - Second (length) normalization
public String toString()
Overrides: toString in class SimilarityBase
public BasicModel getBasicModel()
public AfterEffect getAfterEffect()
public Normalization getNormalization()
Copyright © 2010 - 2020 Adobe. All Rights Reserved