public class HistogramData
extends java.lang.Object
implements java.io.Serializable
We may consider using Apache Commons or HdrHistogram library in the future for advanced features such as sparsely populated histograms.
Modifier and Type | Class and Description |
---|---|
static interface | HistogramData.BucketType |
static class | HistogramData.ExponentialBuckets |
static class | HistogramData.LinearBuckets |
Constructor and Description |
---|
HistogramData(HistogramData.BucketType bucketType) - Create a histogram. |
Modifier and Type | Method and Description |
---|---|
void | clear() |
boolean | equals(@Nullable java.lang.Object object) |
static HistogramData | exponential(int scale, int numBuckets) - Returns a histogram object with exponential boundaries. |
HistogramData | getAndReset() - Copies all updates to a new histogram object and resets 'this' histogram. |
long | getBottomBucketCount() |
double | getBottomBucketMean() |
HistogramData.BucketType | getBucketType() |
long | getCount(int bucketIndex) - TODO(https://github.com/apache/beam/issues/20853): Update this function to allow indexing the -INF and INF buckets (using 0 and length - 1). Get the bucket count for the given bucketIndex. |
double | getMean() |
java.lang.String | getPercentileString(java.lang.String elemType, java.lang.String unit) |
double | getSumOfSquaredDeviations() |
long | getTopBucketCount() |
double | getTopBucketMean() |
long | getTotalCount() |
int | hashCode() |
void | incBottomBucketCount(long count) |
void | incBucketCount(int bucketIndex, long count) |
void | incTopBucketCount(long count) |
static HistogramData | linear(double start, double width, int numBuckets) - TODO(https://github.com/apache/beam/issues/20853): Update this function to define numBuckets total, including the infinite buckets. |
double | p50() |
double | p90() |
double | p99() |
void | record(double... values) |
void | record(double value) |
void | update(HistogramData other) |
public HistogramData(HistogramData.BucketType bucketType)
bucketType - a bucket type for a new histogram instance.
public HistogramData.BucketType getBucketType()
public static HistogramData linear(double start, double width, int numBuckets)
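As a rough illustration of how the three parameters listed below lay out the buckets, the following standalone sketch maps a value to a bucket index. It is not Beam's implementation: the class `LinearBucketsSketch`, the half-open `[lower, upper)` intervals, and the `-1` / `numBuckets` return values used to signal the bottom and top buckets are assumptions made for illustration only.

```java
// Hypothetical sketch, not part of the Beam API.
final class LinearBucketsSketch {
    final double start;
    final double width;
    final int numBuckets;

    LinearBucketsSketch(double start, double width, int numBuckets) {
        this.start = start;
        this.width = width;
        this.numBuckets = numBuckets;
    }

    /** Upper bound of the last finite bucket: start + width * numBuckets. */
    double upperBound() {
        return start + width * numBuckets;
    }

    /**
     * Maps a value to a bucket index.
     * Assumption: -1 stands for the bottom (underflow) bucket and
     * numBuckets stands for the top (overflow) bucket.
     */
    int bucketIndex(double value) {
        if (value < start) {
            return -1;
        }
        if (value >= upperBound()) {
            return numBuckets;
        }
        int index = (int) ((value - start) / width);
        // Guard against floating-point rounding at the upper edge.
        return Math.min(index, numBuckets - 1);
    }

    public static void main(String[] args) {
        LinearBucketsSketch buckets = new LinearBucketsSketch(0.0, 10.0, 5);
        System.out.println("upper bound = " + buckets.upperBound());
        System.out.println("index of 25.0 = " + buckets.bucketIndex(25.0));
    }
}
```

With `start = 0`, `width = 10` and `numBuckets = 5`, the finite buckets cover `[0, 50)`. Halving `width` while doubling `numBuckets` covers the same range with finer resolution.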
start - Lower bound of the first bucket.
width - Bucket width. A smaller width gives better resolution for percentile estimation.
numBuckets - The number of buckets. The upper bound of the last bucket is start + width * numBuckets.
public static HistogramData exponential(int scale, int numBuckets)
scale determines a coefficient 'base' which specifies bucket boundaries.
base = 2**(2**(-scale)), e.g.
scale=1 => base=2**(1/2)=sqrt(2)
scale=0 => base=2**(1)=2
scale=-1 => base=2**(2)=4
This bucketing strategy makes it simple and numerically stable to compute bucket indexes for datapoints.
Bucket boundaries are given by the following table, where n=numBuckets.
| Bucket Index | Bucket Boundaries |
|---|---|
| Underflow | (-inf, 0) |
| 0 | [0, base) |
| 1 | [base, base^2) |
| 2 | [base^2, base^3) |
| i | [base^i, base^(i+1)) |
| n-1 | [base^(n-1), base^n) |
| Overflow | [base^n, inf) |
Example scale/boundaries:
When scale=1, buckets 0,1,2...i have lower bounds 0, 2^(1/2), 2^(2/2), ... 2^(i/2).
When scale=0, buckets 0,1,2...i have lower bounds 0, 2, 2^2, ... 2^(i).
When scale=-1, buckets 0,1,2...i have lower bounds 0, 4, 4^2, ... 4^(i).
The scale parameter is similar to OpenTelemetry's notion of ExponentialHistogram. Bucket boundaries are modified to make them compatible with GCP's exponential histogram.
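The boundary table above can be reproduced with a few lines of arithmetic. The sketch below is a standalone illustration of the formula base = 2**(2**(-scale)), not Beam's implementation. The class `ExponentialBoundsSketch` and its method names are hypothetical, and clipping of numBuckets is not modelled.

```java
// Hypothetical sketch, not part of the Beam API.
final class ExponentialBoundsSketch {

    /** base = 2^(2^(-scale)), as given in the description above. */
    static double base(int scale) {
        return Math.pow(2.0, Math.pow(2.0, -scale));
    }

    /** Inclusive lower bound of finite bucket i: 0 for bucket 0, base^i otherwise. */
    static double lowerBound(int scale, int i) {
        return i == 0 ? 0.0 : Math.pow(base(scale), i);
    }

    /** Exclusive upper bound of finite bucket i: base^(i+1). */
    static double upperBound(int scale, int i) {
        return Math.pow(base(scale), i + 1);
    }

    public static void main(String[] args) {
        for (int scale : new int[] {1, 0, -1}) {
            System.out.println("scale=" + scale + " base=" + base(scale));
            for (int i = 0; i < 4; i++) {
                System.out.println(
                    "  bucket " + i + ": [" + lowerBound(scale, i)
                        + ", " + upperBound(scale, i) + ")");
            }
        }
    }
}
```

Bucket 0 is the only finite bucket whose lower bound does not follow base^i, since it starts at 0 rather than at 1.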
numBuckets - The number of buckets. Clipped so that the largest bucket's lower bound is not greater than 2^32-1 (uint32 max).
scale - Integer in the range [-3, 3] which determines bucket boundaries. Larger values imply more fine-grained buckets.
public void record(double... values)
public void update(HistogramData other)
public void incBucketCount(int bucketIndex, long count)
public void incTopBucketCount(long count)
public void incBottomBucketCount(long count)
public void clear()
public HistogramData getAndReset()
public void record(double value)
public long getTotalCount()
public java.lang.String getPercentileString(java.lang.String elemType, java.lang.String unit)
public long getCount(int bucketIndex)
This method does not guarantee atomicity when multiple buckets are accessed sequentially, i.e. other threads may alter the values between consecutive invocations. To sum the total number of elements in the histogram, use `getTotalCount()` instead.
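The difference between the two access patterns can be sketched with a small standalone class. This is an illustration under assumptions, not Beam's implementation: `BucketCountsSketch`, its fields, and the use of `synchronized` are hypothetical, and the bottom and top buckets are left out.

```java
// Hypothetical sketch, not part of the Beam API.
final class BucketCountsSketch {
    private final long[] buckets;
    private long total;

    BucketCountsSketch(int numBuckets) {
        this.buckets = new long[numBuckets];
    }

    /** Updates one bucket and the running total under a single lock. */
    synchronized void inc(int bucketIndex, long count) {
        buckets[bucketIndex] += count;
        total += count;
    }

    /** Each call is atomic on its own. */
    synchronized long getCount(int bucketIndex) {
        return buckets[bucketIndex];
    }

    /** One consistent read of the total. */
    synchronized long getTotalCount() {
        return total;
    }

    /**
     * Not atomic as a whole: the lock is released between getCount() calls,
     * so a concurrent inc() can land in a bucket that was already visited.
     */
    long sumBucketByBucket() {
        long sum = 0;
        for (int i = 0; i < buckets.length; i++) {
            sum += getCount(i);
        }
        return sum;
    }
}
```

Without concurrent writers both approaches return the same value. With concurrent writers, only `getTotalCount()` in this sketch reflects a single point in time.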
bucketIndex - index of the bucket
public long getTopBucketCount()
public double getTopBucketMean()
public long getBottomBucketCount()
public double getBottomBucketMean()
public double getMean()
public double getSumOfSquaredDeviations()
public double p99()
public double p90()
public double p50()
public boolean equals(@Nullable java.lang.Object object)
equals in class java.lang.Object
public int hashCode()
hashCode in class java.lang.Object