HistogramData (core 2.55.1 API)

java.lang.Object
- org.apache.beam.sdk.util.HistogramData

All Implemented Interfaces:

java.io.Serializable
```
public class HistogramData
extends java.lang.Object
implements java.io.Serializable
```
A histogram that supports percentile estimation with linear interpolation.
Apache Commons or the HdrHistogram library may be considered in the future for advanced features such as sparsely populated histograms.
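
To make "percentile estimation with linear interpolation" concrete, here is a minimal, self-contained sketch over equal-width buckets. It is not Beam's implementation: the class `PercentileSketch`, the method `estimate`, and the assumption that every recorded value falls inside the bucketed range are all hypothetical choices made for illustration.

```java
/** Illustrative sketch only; names and behavior are assumptions, not the Beam source. */
final class PercentileSketch {

  /**
   * Estimates the value at percentile p (0 < p <= 1) from equal-width bucket counts.
   * Bucket i is assumed to cover [start + i * width, start + (i + 1) * width).
   */
  static double estimate(long[] counts, double start, double width, double p) {
    if (p <= 0.0 || p > 1.0) {
      throw new IllegalArgumentException("p must be in (0, 1]");
    }
    long total = 0;
    for (long c : counts) {
      total += c;
    }
    if (total == 0) {
      throw new IllegalStateException("no values recorded");
    }
    double target = p * total; // rank of the element we are looking for
    long seen = 0;             // number of elements in the buckets already passed
    for (int i = 0; i < counts.length; i++) {
      if (counts[i] > 0 && seen + counts[i] >= target) {
        // Assume values are spread evenly inside the bucket and interpolate.
        double fraction = (target - seen) / counts[i];
        return start + width * (i + fraction);
      }
      seen += counts[i];
    }
    return start + width * counts.length; // only reachable through rounding
  }

  public static void main(String[] args) {
    long[] counts = {10, 10, 10, 10}; // four buckets of width 10 starting at 0
    System.out.println("p50 ~ " + estimate(counts, 0.0, 10.0, 0.5));
    System.out.println("p90 ~ " + estimate(counts, 0.0, 10.0, 0.9));
  }
}
```

The estimate is only as precise as the bucket width, which is why narrower buckets give better percentile resolution.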

See Also:

Serialized Form

Nested Class Summary

Nested Classes
Modifier and Type	Class and Description
`static interface`	`HistogramData.BucketType`
`static class`	`HistogramData.ExponentialBuckets`
`static class`	`HistogramData.LinearBuckets`

Constructor Summary

Constructors
Constructor and Description
`HistogramData(HistogramData.BucketType bucketType)` Create a histogram.

Method Summary

Modifier and Type	Method and Description
`void`	`clear()`
`boolean`	`equals(@Nullable java.lang.Object object)`
`static HistogramData`	`exponential(int scale, int numBuckets)` Returns a histogram object with exponential boundaries.
`HistogramData`	`getAndReset()` Copies all updates to a new histogram object and resets 'this' histogram.
`long`	`getBottomBucketCount()`
`double`	`getBottomBucketMean()`
`HistogramData.BucketType`	`getBucketType()`
`long`	`getCount(int bucketIndex)` Get the bucket count for the given bucketIndex. TODO(https://github.com/apache/beam/issues/20853): Update this function to allow indexing the -INF and INF bucket (using 0 and length -1).
`double`	`getMean()`
`java.lang.String`	`getPercentileString(java.lang.String elemType, java.lang.String unit)`
`double`	`getSumOfSquaredDeviations()`
`long`	`getTopBucketCount()`
`double`	`getTopBucketMean()`
`long`	`getTotalCount()`
`int`	`hashCode()`
`void`	`incBottomBucketCount(long count)`
`void`	`incBucketCount(int bucketIndex, long count)`
`void`	`incTopBucketCount(long count)`
`static HistogramData`	`linear(double start, double width, int numBuckets)` Create a histogram with linear buckets. TODO(https://github.com/apache/beam/issues/20853): Update this function to define numBuckets total, including the infinite buckets.
`double`	`p50()`
`double`	`p90()`
`double`	`p99()`
`void`	`record(double... values)`
`void`	`record(double value)`
`void`	`update(HistogramData other)`

Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - HistogramData
```
public HistogramData(HistogramData.BucketType bucketType)
```
    Create a histogram.
    
    Parameters:
    
    bucketType - a bucket type for a new histogram instance.
- Method Detail
  - getBucketType
```
public HistogramData.BucketType getBucketType()
```
  - linear
```
public static HistogramData linear(double start,
                                   double width,
                                   int numBuckets)
```
    Create a histogram with linear buckets. TODO(https://github.com/apache/beam/issues/20853): Update this function to define numBuckets total, including the infinite buckets.
    
    Parameters:
    
    start - Lower bound of a starting bucket.
    
    width - Bucket width. Smaller width implies a better resolution for percentile estimation.
    
    numBuckets - The number of buckets. Upper bound of an ending bucket is defined by start + width * numBuckets.
    
    Returns:
    
    a new HistogramData instance.
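    
    The relationship between start, width and numBuckets can be sketched as follows. This is a hypothetical helper, not Beam code: the class `LinearBucketSketch`, the half-open bucket intervals, and the use of -1 and numBuckets as underflow and overflow markers are assumptions made for illustration.

```java
/** Illustrative sketch only; names and interval conventions are assumptions. */
final class LinearBucketSketch {

  /** Upper bound of the last finite bucket: start + width * numBuckets. */
  static double upperBound(double start, double width, int numBuckets) {
    return start + width * numBuckets;
  }

  /**
   * Maps a value to a bucket index, assuming bucket i covers
   * [start + i * width, start + (i + 1) * width).
   * Returns -1 for values below start and numBuckets for values at or above the upper bound.
   */
  static int bucketIndex(double value, double start, double width, int numBuckets) {
    if (value < start) {
      return -1; // would be counted in the bottom (underflow) bucket
    }
    if (value >= upperBound(start, width, numBuckets)) {
      return numBuckets; // would be counted in the top (overflow) bucket
    }
    int index = (int) ((value - start) / width);
    return Math.min(index, numBuckets - 1); // guard against floating-point rounding
  }

  public static void main(String[] args) {
    // Five buckets of width 10 starting at 0, so the finite range is [0, 50).
    System.out.println(upperBound(0.0, 10.0, 5));
    System.out.println(bucketIndex(25.0, 0.0, 10.0, 5));
  }
}
```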
  - exponential
```
public static HistogramData exponential(int scale,
                                        int numBuckets)
```
    Returns a histogram object with exponential boundaries. The input parameter scale determines a coefficient 'base' which specifies bucket boundaries.
```
 base = 2**(2**(-scale)) e.g.
 scale=1 => base=2**(1/2)=sqrt(2)
 scale=0 => base=2**(1)=2
 scale=-1 => base=2**(2)=4
 
```
    This bucketing strategy makes it simple and numerically stable to compute bucket indexes for datapoints.
```
 Bucket boundaries are given by the following table where n=numBuckets.
 | Bucket Index | Bucket Boundaries     |
 |--------------|-----------------------|
 | Underflow    | (-inf, 0)             |
 | 0            | [0, base)             |
 | 1            | [base, base^2)        |
 | 2            | [base^2, base^3)      |
 | i            | [base^i, base^(i+1))  |
 | n-1          | [base^(n-1), base^n)  |
 | Overflow     | [base^n, inf)         |
 
```
```
 Example scale/boundaries:
 When scale=1, buckets 0,1,2...i have lower bounds 0, 2^(1/2), 2^(2/2), ... 2^(i/2).
 When scale=0, buckets 0,1,2...i have lower bounds 0, 2, 2^2, ... 2^(i).
 When scale=-1, buckets 0,1,2...i have lower bounds 0, 4, 4^2, ... 4^(i).
 
```
    The scale parameter is similar to OpenTelemetry's notion of ExponentialHistogram. Bucket boundaries are modified to make them compatible with GCP's exponential histogram.
    
    Parameters:
    
    scale - Integer in the range [-3, 3] which determines bucket boundaries. Larger values imply more fine-grained buckets.
    
    numBuckets - The number of buckets. Clipped so that the largest bucket's lower bound is not greater than 2^32-1 (uint32 max).
    
    Returns:
    
    a new HistogramData instance.
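    
    The formula base = 2**(2**(-scale)) and the boundary table above can be checked with a small standalone sketch. It is not Beam's internal code: the class `ExponentialBoundsSketch` and its methods are hypothetical, -1 and numBuckets are assumed markers for the underflow and overflow buckets, and the clipping of numBuckets described above is not modeled.

```java
/** Illustrative sketch only; mirrors the boundary table, not the Beam source. */
final class ExponentialBoundsSketch {

  /** base = 2^(2^(-scale)), e.g. scale=1 gives sqrt(2), scale=0 gives 2, scale=-1 gives 4. */
  static double base(int scale) {
    return Math.pow(2.0, Math.pow(2.0, -scale));
  }

  /** Lower bound of bucket i: 0 for bucket 0, base^i otherwise. */
  static double lowerBound(int scale, int i) {
    return i == 0 ? 0.0 : Math.pow(base(scale), i);
  }

  /** Exclusive upper bound of bucket i: base^(i+1). */
  static double upperBound(int scale, int i) {
    return Math.pow(base(scale), i + 1);
  }

  /**
   * Finds the bucket for a value by scanning the boundaries in order.
   * Returns -1 for the underflow bucket and numBuckets for the overflow bucket.
   * A linear scan is used for clarity; it is not meant to be fast.
   */
  static int bucketIndex(double value, int scale, int numBuckets) {
    if (value < 0) {
      return -1;
    }
    for (int i = 0; i < numBuckets; i++) {
      if (value < upperBound(scale, i)) {
        return i;
      }
    }
    return numBuckets;
  }

  public static void main(String[] args) {
    for (int scale = -1; scale <= 1; scale++) {
      System.out.printf("scale=%d base=%.4f%n", scale, base(scale));
    }
  }
}
```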
  - record
```
public void record(double... values)
```
  - update
```
public void update(HistogramData other)
```
  - incBucketCount
```
public void incBucketCount(int bucketIndex,
                           long count)
```
  - incTopBucketCount
```
public void incTopBucketCount(long count)
```
  - incBottomBucketCount
```
public void incBottomBucketCount(long count)
```
  - clear
```
public void clear()
```
  - getAndReset
```
public HistogramData getAndReset()
```
    Copies all updates to a new histogram object and resets 'this' histogram.
    
    Returns:
    
    New histogram object that has the same updates as 'this'.
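    
    The copy-and-reset behavior described here can be illustrated with a small stand-in class. This is a sketch of the general pattern under stated assumptions, not the HistogramData source: `SnapshotCounterSketch`, its fields, and the choice of `synchronized` methods are all hypothetical.

```java
/** Illustrative stand-in for a resettable set of bucket counters; not Beam code. */
final class SnapshotCounterSketch {
  private long[] counts;

  SnapshotCounterSketch(int numBuckets) {
    this.counts = new long[numBuckets];
  }

  synchronized void inc(int bucketIndex, long count) {
    counts[bucketIndex] += count;
  }

  synchronized long total() {
    long sum = 0;
    for (long c : counts) {
      sum += c;
    }
    return sum;
  }

  /** Moves all recorded updates into a new object and leaves this one empty. */
  synchronized SnapshotCounterSketch getAndReset() {
    SnapshotCounterSketch snapshot = new SnapshotCounterSketch(counts.length);
    snapshot.counts = counts;          // hand the recorded counts to the snapshot
    counts = new long[counts.length];  // start over with fresh, zeroed buckets
    return snapshot;
  }

  public static void main(String[] args) {
    SnapshotCounterSketch live = new SnapshotCounterSketch(3);
    live.inc(0, 3);
    live.inc(2, 4);
    SnapshotCounterSketch snapshot = live.getAndReset();
    System.out.println("snapshot=" + snapshot.total() + " live=" + live.total());
  }
}
```

    Doing the copy and the reset under one lock means no update is lost or counted twice between the two steps, which is the main reason to offer them as a single operation.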
  - record
```
public void record(double value)
```
  - getTotalCount
```
public long getTotalCount()
```
  - getPercentileString
```
public java.lang.String getPercentileString(java.lang.String elemType,
                                            java.lang.String unit)
```
  - getCount
```
public long getCount(int bucketIndex)
```
    Get the bucket count for the given bucketIndex. TODO(https://github.com/apache/beam/issues/20853): Update this function to allow indexing the -INF and INF bucket (using 0 and length -1).
    This method does not guarantee atomicity when multiple buckets are accessed sequentially, i.e. other threads may alter the values between consecutive invocations. To get the total number of elements in the histogram, use `getTotalCount()` instead.
    
    Parameters:
    
    bucketIndex - index of the bucket
    
    Returns:
    
    The number of elements in the specified bucket
  - getTopBucketCount
```
public long getTopBucketCount()
```
  - getTopBucketMean
```
public double getTopBucketMean()
```
  - getBottomBucketCount
```
public long getBottomBucketCount()
```
  - getBottomBucketMean
```
public double getBottomBucketMean()
```
  - getMean
```
public double getMean()
```
  - getSumOfSquaredDeviations
```
public double getSumOfSquaredDeviations()
```
  - p99
```
public double p99()
```
  - p90
```
public double p90()
```
  - p50
```
public double p50()
```
  - equals
```
public boolean equals(@Nullable java.lang.Object object)
```
    Overrides:
    
    equals in class java.lang.Object
  - hashCode
```
public int hashCode()
```
    Overrides:
    
    hashCode in class java.lang.Object
