Class AgentDigest


  • public class AgentDigest
    extends com.tdunning.math.stats.AbstractTDigest
    NOTE: This is a pruned and modified version of MergingDigest. It does not support queries (cdf/quantiles) or the traditional encodings.

    Maintains a t-digest by collecting new points in a buffer that is then sorted occasionally and merged into a sorted array that contains previously computed centroids.

    This can be very fast because the cost of sorting and merging is amortized over several insertions. If we keep N centroids in total and the input buffer is k long, then the amortized cost per insertion is something like

    N/k + log k

    These costs even out when N/k = log k. Balancing costs is often a good place to start in optimizing an algorithm. For different values of compression factor, the following table shows estimated asymptotic values of N and suggested values of k:

    Compression   N     k
    50            78    25
    100           157   42
    200           314   73
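
    The balance point N/k = log k can be checked numerically. The sketch below is illustrative only and is not part of AgentDigest: the class and method names are hypothetical, and it assumes that "log" means the natural logarithm, a reading that the N and k values in the table are consistent with.

```java
final class BufferSizeSketch {
    /** Cost model from the text, N/k + log k, with log taken as the natural log (an assumption). */
    static double amortizedCost(double n, double k) {
        return n / k + Math.log(k);
    }

    /** Finds k with N/k = ln k by bisection on f(k) = k ln k - N, which is increasing for k >= 1. */
    static double balancedK(double n) {
        double lo = 1.0;        // f(1) = -N < 0
        double hi = n + 2.0;    // f(N + 2) > 0 for every N > 0
        for (int i = 0; i < 100; i++) {
            double mid = 0.5 * (lo + hi);
            if (mid * Math.log(mid) < n) lo = mid; else hi = mid;
        }
        return 0.5 * (lo + hi);
    }

    public static void main(String[] args) {
        double[] ns = {78, 157, 314};   // N values from the table above
        for (double n : ns) {
            double k = balancedK(n);
            System.out.printf("N=%.0f balanced k=%.1f cost=%.2f%n", n, k, amortizedCost(n, k));
        }
    }
}
```

    Under that assumption, the balanced k for each N in the table lands within about one of the suggested k.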

    The virtues of this kind of t-digest implementation include:

    • No allocation is required after initialization
    • The data structure automatically compresses existing centroids when possible
    • No Java object overhead is incurred for centroids since data is kept in primitive arrays

    The current implementation takes the liberty of using ping-pong buffers to implement the merge, which carries a substantial memory penalty. The complexity of an in-place merge was not considered worthwhile: even with the overhead, the memory cost is less than 40 bytes per centroid, which is much less than half of what the AVLTreeDigest uses. Speed tests are still incomplete, so it is uncertain whether the merge strategy is faster than the tree strategy.
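
    The buffer-and-merge scheme described above can be sketched roughly as follows. This is a simplified illustration, not the AgentDigest source: the class, field and method names are hypothetical, every point has weight 1, and the merge rule (an arcsine scale function that allows a centroid to span at most one unit of scale) is an assumption about how centroid sizes are limited. It does show the three properties listed: primitive arrays, no allocation after construction, and ping-pong buffers for the merge.

```java
import java.util.Arrays;

final class BufferedMergeSketch {
    private final double compression;
    private double[] mean, weight;            // current centroids, sorted by mean
    private double[] spareMean, spareWeight;  // ping-pong target used during a merge
    private final double[] buffer;            // unsorted incoming points, weight 1 each
    private int centroids, buffered;
    private double totalWeight;               // weight already merged into centroids

    BufferedMergeSketch(double compression, int bufferSize) {
        this.compression = compression;
        int capacity = (int) Math.ceil(2 * compression) + 10; // generous bound for this merge rule
        mean = new double[capacity];
        weight = new double[capacity];
        spareMean = new double[capacity];
        spareWeight = new double[capacity];
        buffer = new double[bufferSize];
    }

    void add(double x) {
        if (buffered == buffer.length) merge();   // the sort and merge run once per full buffer
        buffer[buffered++] = x;
    }

    long size() { return (long) totalWeight + buffered; }

    int centroidCount() { merge(); return centroids; }

    double meanAt(int i) { return mean[i]; }

    double weightAt(int i) { return weight[i]; }

    /** Assumed scale function: maps a quantile q in [0, 1] onto [0, compression]. */
    private double scale(double q) {
        return compression * (Math.asin(2 * Math.min(1.0, q) - 1) / Math.PI + 0.5);
    }

    private void merge() {
        if (buffered == 0) return;
        Arrays.sort(buffer, 0, buffered);
        double total = totalWeight + buffered;
        int i = 0, j = 0, out = -1;
        double wSoFar = 0, kStart = 0;
        while (i < centroids || j < buffered) {
            // Take the smaller of the next old centroid and the next buffered point.
            boolean takeCentroid = j >= buffered || (i < centroids && mean[i] <= buffer[j]);
            double m = takeCentroid ? mean[i] : buffer[j];
            double w = takeCentroid ? weight[i] : 1.0;
            if (takeCentroid) i++; else j++;
            double kEnd = scale((wSoFar + w) / total);
            if (out >= 0 && kEnd - kStart <= 1) {
                // Still within one unit of scale: fold into the current output centroid.
                spareWeight[out] += w;
                spareMean[out] += (m - spareMean[out]) * w / spareWeight[out];
            } else {
                // Start a new output centroid.
                out++;
                spareMean[out] = m;
                spareWeight[out] = w;
                kStart = scale(wSoFar / total);
            }
            wSoFar += w;
        }
        // Swap the ping-pong arrays instead of copying.
        double[] t = mean; mean = spareMean; spareMean = t;
        t = weight; weight = spareWeight; spareWeight = t;
        centroids = out + 1;
        totalWeight = total;
        buffered = 0;
    }
}
```

    In this sketch the second pair of arrays doubles the centroid storage, which is the memory penalty mentioned above; an in-place merge would avoid it at the cost of more intricate index handling.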

    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      static class  AgentDigest.AgentDigestMarshaller
      Stateless AgentDigest codec for chronicle maps
    • Field Summary

      • Fields inherited from class com.tdunning.math.stats.AbstractTDigest

        gen, recordAllData
    • Constructor Summary

      Constructors 
      Constructor Description
      AgentDigest​(short compression, long dispatchTimeMillis)  
    • Method Summary

      Methods 
      Modifier and Type Method Description
      void add​(double x, int w)  
      void add​(double x, int w, List<Double> history)  
      void asBytes​(ByteBuffer buf)  
      void asSmallBytes​(ByteBuffer buf)  
      int byteSize()  
      double cdf​(double x)  
      int centroidCount()
      Number of centroids of this AgentDigest (does compress if necessary)
      Collection<com.tdunning.math.stats.Centroid> centroids()
      Not clear to me that this is a good idea; maybe just add the temp points and existing centroids rather than merging first?
      void compress()  
      double compression()  
      long getDispatchTimeMillis()
      Time at which this digest should be dispatched to wavefront.
      double quantile​(double q)  
      com.tdunning.math.stats.TDigest recordAllData()
      Turns on internal data recording.
      long size()  
      int smallByteSize()  
      wavefront.report.Histogram toHistogram​(int duration)
      Creates a reporting Histogram from this AgentDigest (marked with the supplied duration).
      • Methods inherited from class com.tdunning.math.stats.AbstractTDigest

        add, add, createCentroid, decode, encode, interpolate, isRecording, merge, weightedAverage, weightedAverageSorted
      • Methods inherited from class com.tdunning.math.stats.TDigest

        checkValue, createArrayDigest, createArrayDigest, createAvlTreeDigest, createDigest, createTreeDigest
    • Constructor Detail

      • AgentDigest

        public AgentDigest​(short compression,
                           long dispatchTimeMillis)
    • Method Detail

      • recordAllData

        public com.tdunning.math.stats.TDigest recordAllData()
        Turns on internal data recording.
        Overrides:
        recordAllData in class com.tdunning.math.stats.AbstractTDigest
      • add

        public void add​(double x,
                        int w)
        Specified by:
        add in class com.tdunning.math.stats.TDigest
      • add

        public void add​(double x,
                        int w,
                        List<Double> history)
      • compress

        public void compress()
        Specified by:
        compress in class com.tdunning.math.stats.TDigest
      • size

        public long size()
        Specified by:
        size in class com.tdunning.math.stats.TDigest
      • cdf

        public double cdf​(double x)
        Specified by:
        cdf in class com.tdunning.math.stats.TDigest
      • quantile

        public double quantile​(double q)
        Specified by:
        quantile in class com.tdunning.math.stats.TDigest
      • centroids

        public Collection<com.tdunning.math.stats.Centroid> centroids()
        Not clear to me that this is a good idea; maybe just add the temp points and existing centroids rather than merging first?
        Specified by:
        centroids in class com.tdunning.math.stats.TDigest
      • compression

        public double compression()
        Specified by:
        compression in class com.tdunning.math.stats.TDigest
      • byteSize

        public int byteSize()
        Specified by:
        byteSize in class com.tdunning.math.stats.TDigest
      • smallByteSize

        public int smallByteSize()
        Specified by:
        smallByteSize in class com.tdunning.math.stats.TDigest
      • centroidCount

        public int centroidCount()
        Number of centroids of this AgentDigest (does compress if necessary)
      • toHistogram

        public wavefront.report.Histogram toHistogram​(int duration)
        Creates a reporting Histogram from this AgentDigest (marked with the supplied duration).
      • asBytes

        public void asBytes​(ByteBuffer buf)
        Specified by:
        asBytes in class com.tdunning.math.stats.TDigest
      • asSmallBytes

        public void asSmallBytes​(ByteBuffer buf)
        Specified by:
        asSmallBytes in class com.tdunning.math.stats.TDigest
      • getDispatchTimeMillis

        public long getDispatchTimeMillis()
        Time at which this digest should be dispatched to wavefront.