Package com.cobber.fta
Class Histogram
java.lang.Object
    com.cobber.fta.Histogram
public class Histogram extends Object
This class encapsulates a Histogram and provides Histogram data. If the data fits in the Cardinality Map, the histogram values are generated directly from that map. Once the cardinality exceeds maxCardinality, the data is tracked using an algorithm based on Yael Ben-Haim and Elad Tom-Tov, "A streaming parallel decision tree algorithm", J. Machine Learning Research 11 (2010), pp. 849--872. All data is stored in the Cardinality Map until it is exhausted; from that point on, all values not captured in the Cardinality Map are added (via accept) to the underlying Histogram Sketch. When a Histogram is requested, it is generated either directly from the Cardinality Map or, if maxCardinality has been exceeded, from the Sketch after all entries captured in the Cardinality Map have been added to it.
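The two-tier scheme described above can be sketched roughly as follows. This is an illustration only, not the com.cobber.fta implementation: the class TwoTierTracker, its fields, and its methods are hypothetical names, values are assumed to be numeric, and the sketch is reduced to the core Ben-Haim/Tom-Tov update (insert a bin, then merge the two closest bins when the bin budget is exceeded).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of exact tracking with overflow into a sketch.
class TwoTierTracker {
    private final int maxCardinality;   // limit on distinct values tracked exactly (assumed)
    private final int maxBins;          // limit on sketch bins, must be >= 1 (assumed)
    private final Map<Double, Long> exact = new TreeMap<>();
    // Each bin is {centroid, count}; the list is kept sorted by centroid.
    private final List<double[]> bins = new ArrayList<>();

    TwoTierTracker(int maxCardinality, int maxBins) {
        this.maxCardinality = maxCardinality;
        this.maxBins = maxBins;
    }

    // Track exactly while there is room in the map, otherwise feed the sketch.
    void accept(double value) {
        if (exact.containsKey(value) || exact.size() < maxCardinality)
            exact.merge(value, 1L, Long::sum);
        else
            addToSketch(value, 1L);
    }

    // Insert a bin; if over budget, merge the two bins with the closest centroids.
    private void addToSketch(double value, long count) {
        int i = 0;
        while (i < bins.size() && bins.get(i)[0] < value)
            i++;
        if (i < bins.size() && bins.get(i)[0] == value) {
            bins.get(i)[1] += count;
            return;
        }
        bins.add(i, new double[] { value, count });
        if (bins.size() <= maxBins)
            return;
        int closest = 0;
        for (int j = 1; j < bins.size() - 1; j++) {
            double gap = bins.get(j + 1)[0] - bins.get(j)[0];
            double best = bins.get(closest + 1)[0] - bins.get(closest)[0];
            if (gap < best)
                closest = j;
        }
        double[] left = bins.get(closest);
        double[] right = bins.remove(closest + 1);
        double total = left[1] + right[1];
        left[0] = (left[0] * left[1] + right[0] * right[1]) / total; // count-weighted centroid
        left[1] = total;
    }

    // True once at least one value has been sent to the sketch.
    boolean overflowed() {
        return !bins.isEmpty();
    }

    // When a histogram is needed after overflow, fold the exact counts into the sketch.
    List<double[]> sketchWithExactFolded() {
        exact.forEach((value, count) -> addToSketch(value, count));
        exact.clear();
        return bins;
    }
}
```

The point of the design is that low-cardinality data keeps exact counts and costs only a map, while high-cardinality data is summarized in bounded memory. No observation is lost, because the total count across the bins always equals the number of values accepted.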
-
Nested Class Summary
Modifier and Type    Class              Description
class                Histogram.Entry    A Histogram Entry captures the low and high bounds for each bucket along with the number of entries in the bucket.
-
Method Summary
Modifier and Type    Method                                                 Description
Histogram.Entry[]    getHistogram(int buckets)                              Get the histogram with the supplied number of buckets.
Histogram            merge(Histogram other)
void                 setCardinality(Map<String,Long> map)
void                 setCardinalityOverflow(HistogramSPDT histogramSPDT)
-
Method Detail
-
setCardinality
public void setCardinality(Map<String,Long> map)
-
setCardinalityOverflow
public void setCardinalityOverflow(HistogramSPDT histogramSPDT)
-
getHistogram
public Histogram.Entry[] getHistogram(int buckets)
Get the histogram with the supplied number of buckets.
Parameters:
buckets - the number of buckets in the Histogram
Returns:
An array of length 'buckets' that constitutes the Histogram (or null if cardinality is zero).
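As a rough illustration of the contract above (an array of length 'buckets', or null when there is nothing to report), the following sketch builds a histogram directly from a cardinality map. It is not the library's code: BucketSketch and ExactHistogramSketch are hypothetical names, the map keys are assumed to parse as doubles, and equal-width buckets are an assumption made for simplicity.

```java
import java.util.Map;

// Hypothetical stand-in for a bucket: low and high bounds plus a count.
class BucketSketch {
    final double low;
    final double high;
    long count;

    BucketSketch(double low, double high) {
        this.low = low;
        this.high = high;
    }
}

class ExactHistogramSketch {
    // Build equal-width buckets over [min, max] from value -> count pairs.
    static BucketSketch[] fromCardinality(Map<String, Long> cardinality, int buckets) {
        if (cardinality.isEmpty())
            return null; // mirrors "null if cardinality is zero"

        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (String key : cardinality.keySet()) {
            double value = Double.parseDouble(key);
            min = Math.min(min, value);
            max = Math.max(max, value);
        }

        double width = (max - min) / buckets;
        BucketSketch[] result = new BucketSketch[buckets];
        for (int i = 0; i < buckets; i++) {
            double high = (i == buckets - 1) ? max : min + (i + 1) * width;
            result[i] = new BucketSketch(min + i * width, high);
        }

        for (Map.Entry<String, Long> entry : cardinality.entrySet()) {
            double value = Double.parseDouble(entry.getKey());
            // The maximum value belongs to the last bucket; a zero width puts everything in bucket 0.
            int index = width == 0 ? 0 : Math.min(buckets - 1, (int) ((value - min) / width));
            result[index].count += entry.getValue();
        }
        return result;
    }
}
```

For example, the counts {"1": 2, "2": 1, "10": 3} with three buckets give bounds [1, 4), [4, 7), and [7, 10]. The counts per bucket are 3, 0, and 3.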