Class HistogramSPDT


  • public class HistogramSPDT
    extends Object
    Encapsulates a histogram and provides histogram data. While the number of distinct values stays within maxCardinality, the class simply uses a map to generate the histogram values. Once the number of distinct values exceeds maxCardinality, the data is tracked using an algorithm based on Yael Ben-Haim and Elad Tom-Tov, "A streaming parallel decision tree algorithm", J. Machine Learning Research 11 (2010), pp. 849-872.
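    The update step of the cited algorithm can be illustrated with a minimal sketch. This is not the source of HistogramSPDT: the class SpdtSketch, its nested Bin type, and the maxBins parameter are hypothetical names chosen for illustration. The sketch keeps counts per distinct value until the bin limit is exceeded, then merges the two closest bins into their count-weighted average, as described by Ben-Haim and Tom-Tov. It assumes positive counts.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration class; not part of the documented API. */
final class SpdtSketch {
    /** A bin is a (value, count) pair; value is the centroid of the merged points. */
    static final class Bin {
        double value;
        long count;
        Bin(double value, long count) { this.value = value; this.count = count; }
    }

    private final int maxBins;                        // assumed analogue of maxCardinality
    private final List<Bin> bins = new ArrayList<>(); // kept sorted by value

    SpdtSketch(int maxBins) { this.maxBins = maxBins; }

    void accept(double value, long count) {
        // Find the first bin whose value is >= the new value.
        int i = 0;
        while (i < bins.size() && bins.get(i).value < value) i++;
        if (i < bins.size() && bins.get(i).value == value) {
            bins.get(i).count += count;               // existing value: add to its count
            return;
        }
        bins.add(i, new Bin(value, count));           // new value: insert in sorted position
        if (bins.size() > maxBins) mergeClosestPair();
    }

    /** Replace the two adjacent bins with the smallest gap by their weighted average. */
    private void mergeClosestPair() {
        int best = 0;
        double bestGap = Double.POSITIVE_INFINITY;
        for (int i = 0; i + 1 < bins.size(); i++) {
            double gap = bins.get(i + 1).value - bins.get(i).value;
            if (gap < bestGap) { bestGap = gap; best = i; }
        }
        Bin a = bins.get(best);
        Bin b = bins.remove(best + 1);
        long total = a.count + b.count;
        a.value = (a.value * a.count + b.value * b.count) / total;
        a.count = total;
    }

    List<Bin> bins() { return bins; }
}
```

    With maxBins set to 3, adding the values 1, 2, 10 and 20 (each with count 1) leaves three bins, the first of which is the merged centroid of 1 and 2.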
    • Method Summary

      Modifier and Type Method Description
      void accept(double value, long count)
      Add a new data point with an associated count to the histogram approximation.
      void accept(String value, long count)
      Add a new String-valued data point with an associated count to the histogram approximation.
      ArrayList<HistogramSPDT.Bin> getBins()
      Accessor for the ArrayList of all bins.
      double getMaxValue()
      Retrieve the maximum value.
      double getMinValue()
      Retrieve the minimum value.
      protected int locateInsertion(double value)
      HistogramSPDT merge(HistogramSPDT other)
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Method Detail

      • accept

        public void accept(String value,
                           long count)
        Add a new String-valued data point with an associated count to the histogram approximation. The String is converted to a double using the StringConverter passed to the constructor.
        Parameters:
        value - The String data point to add to the histogram approximation.
        count - The count associated with this data point.
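        The relationship between the two accept overloads can be sketched as a simple delegation. The interface of StringConverter is not shown on this page, so the sketch below uses java.util.function.ToDoubleFunction<String> as a stand-in, and the class StringAcceptSketch with its weightedSum and totalCount fields is hypothetical; it only illustrates converting the String and forwarding to the double-valued overload.

```java
import java.util.function.ToDoubleFunction;

/** Hypothetical illustration class; not part of the documented API. */
final class StringAcceptSketch {
    // Stand-in for the StringConverter passed to the constructor (assumption).
    private final ToDoubleFunction<String> converter;
    private double weightedSum;
    private long totalCount;

    StringAcceptSketch(ToDoubleFunction<String> converter) {
        this.converter = converter;
    }

    /** Convert the String to a double, then reuse the numeric overload. */
    void accept(String value, long count) {
        accept(converter.applyAsDouble(value), count);
    }

    /** Record a numeric data point weighted by its count. */
    void accept(double value, long count) {
        weightedSum += value * count;
        totalCount += count;
    }

    long totalCount() { return totalCount; }

    double mean() { return weightedSum / totalCount; }
}
```

        For example, constructing the sketch with Double::parseDouble lets accept("2.5", 2) and accept(5.0, 2) feed the same accumulator.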
      • accept

        public void accept(double value,
                           long count)
        Add a new data point with an associated count to the histogram approximation.
        Parameters:
        value - The data point to add to the histogram approximation.
        count - The count associated with this data point.
      • getBins

        public ArrayList<HistogramSPDT.Bin> getBins()
        Accessor for the ArrayList of all bins.
        Returns:
        The bins that constitute the Histogram.
      • getMinValue

        public double getMinValue()
        Retrieve the minimum value.
        Returns:
        The minimum value ever seen by this histogram.
      • getMaxValue

        public double getMaxValue()
        Retrieve the maximum value.
        Returns:
        The maximum value ever seen by this histogram.
      • locateInsertion

        protected int locateInsertion(double value)
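        This page gives no description for locateInsertion. Assuming, from the name and signature only, that it returns the index at which a value would be inserted into bins kept sorted by value, one plausible approach is a binary search. The class InsertionSketch and the sortedValues array below are hypothetical and stand in for whatever internal bin storage the real class uses.

```java
/** Hypothetical illustration class; not part of the documented API. */
final class InsertionSketch {
    /**
     * Return the index of the first element that is >= value,
     * or sortedValues.length if every element is smaller.
     * Assumes sortedValues is sorted in ascending order.
     */
    static int locateInsertion(double[] sortedValues, double value) {
        int lo = 0;
        int hi = sortedValues.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;      // midpoint without int overflow
            if (sortedValues[mid] < value) {
                lo = mid + 1;               // insertion point is to the right of mid
            } else {
                hi = mid;                   // mid itself may be the insertion point
            }
        }
        return lo;
    }
}
```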