- All Implemented Interfaces:
- Limit
public final class Gradient2Limit
extends AbstractLimit
Concurrency limit algorithm that adjusts the limit based on the gradient of change between the current average RTT and
a long-term exponentially smoothed average RTT. Unlike traditional congestion control algorithms, we use the average
instead of the minimum, since RPC methods can be very bursty due to factors such as non-homogeneous request
processing complexity and a wide distribution of data sizes. We have also found that using the minimum can result
in a bias towards an impractically low base RTT, resulting in excessive load shedding. An exponential decay is
applied to the base RTT so that the value is kept stable yet is allowed to adapt to long-term changes in latency
characteristics.
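The long-term smoothed average described above can be illustrated with a plain exponential moving average. This is only a minimal sketch: the class name `ExpAvgRtt`, the `window` parameter, and the weighting `2 / (window + 1)` are assumptions made for illustration and are not taken from this class.

```java
// Hypothetical sketch: exponentially smoothed average of RTT samples.
// The names and the weighting formula are illustrative assumptions.
final class ExpAvgRtt {
    private final double factor; // weight given to each new sample
    private double value;        // current smoothed RTT
    private boolean seeded;      // false until the first sample arrives

    ExpAvgRtt(int window) {
        if (window < 1) {
            throw new IllegalArgumentException("window must be >= 1");
        }
        this.factor = 2.0 / (window + 1);
    }

    // Folds one RTT sample into the average and returns the updated value.
    double add(double sampleRtt) {
        if (!seeded) {
            value = sampleRtt; // the first sample seeds the average
            seeded = true;
        } else {
            value = value * (1 - factor) + sampleRtt * factor;
        }
        return value;
    }

    double get() {
        return value;
    }
}
```

With a larger `window`, each sample moves the average less, which is what keeps the base RTT stable while still letting it drift with long-term latency changes.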
The core algorithm recalculates the limit every sampling window (e.g. 1 second) using the formula:
// Calculate the gradient limiting to the range [0.5, 1.0] to filter outliers
gradient = max(0.5, min(1.0, longtermRtt / currentRtt));
// Calculate the new limit by applying the gradient and allowing for some queuing
newLimit = gradient * currentLimit + queueSize;
// Update the limit using a smoothing factor (default 0.2)
newLimit = currentLimit * (1 - smoothing) + newLimit * smoothing;
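The formula above can be written as a single Java method. This is a minimal sketch, not the class's actual implementation: the name `GradientStep.nextLimit` is hypothetical, all values are treated as doubles, and `queueSize` is passed in as a plain number.

```java
// Hypothetical sketch of one limit update, following the formula above.
final class GradientStep {
    static double nextLimit(double currentLimit,
                            double longtermRtt,
                            double currentRtt,
                            double queueSize,
                            double smoothing) {
        if (currentRtt <= 0) {
            throw new IllegalArgumentException("currentRtt must be positive");
        }
        // Clamp the gradient to [0.5, 1.0] to filter outliers.
        double gradient = Math.max(0.5, Math.min(1.0, longtermRtt / currentRtt));
        // Apply the gradient and allow for some queuing.
        double newLimit = gradient * currentLimit + queueSize;
        // Blend the old and new limits using the smoothing factor.
        return currentLimit * (1 - smoothing) + newLimit * smoothing;
    }
}
```

For example, `GradientStep.nextLimit(100, 50, 100, 4, 0.2)` yields a gradient of 0.5, an unsmoothed limit of 54, and a smoothed limit of about 90.8, so a doubling of RTT lowers the limit by roughly 9% in one window rather than halving it.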
The limit can be in one of three main states:
1. Steady state
In this state the average RTT is very stable and the current measurement whipsaws around this value, sometimes reducing
the limit, sometimes increasing it.
2. Transition from steady state to load
In this state either the RPS or the latency has spiked. The gradient is < 1.0 due to a growing request queue that
cannot be handled by the system. Excess requests are rejected due to the low limit. The baseline RTT grows using
exponential decay but lags the current measurement, which keeps the gradient < 1.0 and the limit low.
3. Transition from load to steady state
In this state the system goes back to steady state after a prolonged period of excessive load. Requests aren't rejected
and the sample RTT remains low. During this state the long-term RTT may take some time to return to normal and could
potentially be several multiples higher than the current RTT.
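One consequence of the formula in this third state: while the long-term RTT exceeds the current RTT, the ratio is above 1.0 and the gradient is clamped to 1.0, so each window raises the limit by `smoothing * queueSize`. The sketch below illustrates this under simplifying assumptions: the class name `RecoverySketch` is hypothetical, and both RTT values are held fixed for the whole run, which a real system would not do.

```java
// Hypothetical sketch: limit growth during recovery, with RTTs held constant.
final class RecoverySketch {
    // Returns the limit after each of the given number of sampling windows.
    static double[] simulate(double limit,
                             double longtermRtt,
                             double currentRtt,
                             double queueSize,
                             double smoothing,
                             int windows) {
        double[] limits = new double[windows];
        for (int i = 0; i < windows; i++) {
            // The ratio exceeds 1.0 during recovery, so the clamp yields 1.0.
            double gradient = Math.max(0.5, Math.min(1.0, longtermRtt / currentRtt));
            double target = gradient * limit + queueSize;
            limit = limit * (1 - smoothing) + target * smoothing;
            limits[i] = limit;
        }
        return limits;
    }
}
```

Calling `RecoverySketch.simulate(20, 300, 100, 4, 0.2, 5)` raises the limit by about 0.8 per window, from 20 to roughly 24 after five windows, regardless of how far the long-term RTT sits above the current RTT.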