This is the default for the fixed number of retries an LB implementation is willing to make if an unavailable (Status != Open) node is returned from the underlying pick.
For randomized LBs (P2C* and Aperture*) this can be interpreted as a probability. For example, imagine that half of the replica set is down; the probability of picking two unavailable nodes is 0.25. If we repeat that process 5 times, the total probability of seeing 5 unavailable nodes in a row is (0.25 ^ 5) ≈ 0.1%. This means that if half of the cluster is down, the LB will make a bad choice (when a better choice may have been available) for 0.1% of requests.
Note that this doesn't mean that 0.1% of requests will fail when P2C operates on a half-dead cluster, since Finagle clients have additional layers of requeues above the load balancer.
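As a quick illustration of the arithmetic above, here is a minimal sketch; the helper name is hypothetical and not part of Finagle:

```scala
// Probability that `effort` consecutive P2C picks each return two unavailable
// nodes, given that a fraction `down` of the replica set is unavailable.
// One pick sees two down nodes with probability down^2; retries are independent.
def badStreakProbability(down: Double, effort: Int): Double =
  math.pow(down * down, effort)

// Half the cluster down, 5 retries: (0.5 * 0.5)^5 = 0.25^5 ≈ 0.001, i.e. ~0.1%.
val p = badStreakProbability(down = 0.5, effort = 5)
```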
The aperture load-band balancer balances load to the smallest subset ("aperture") of services so that the concurrent load to each service, measured over a window specified by smoothWin, stays within the load band delimited by lowLoad and highLoad.
The default load band configuration attempts to create a 1:1 relationship between a client's offered load and the aperture size. Given a homogeneous replica set, this optimizes for at least one concurrent request per node and gives the balancer sufficient data to compare load across nodes.
Among the benefits of aperture balancing is that a client uses resources commensurate with its offered load: it does not need to maintain sessions with every endpoint in the cluster.
The time window to use when calculating the rps observed by this load balancer instance. The smoothed rps value is then used to determine the size of the aperture (along with the lowLoad and highLoad parameters).
The lower threshold on average load (as calculated by rps over the smoothing window / # of endpoints). Once this threshold is reached, the aperture is narrowed. Put differently, if there is an average of lowLoad requests across the endpoints, then we are not concentrating our concurrency well, so we narrow the aperture.
The upper threshold on average load (as calculated by rps / # of instances). Once this threshold is reached, the aperture is widened. Put differently, if there is an average of highLoad requests across the endpoints, then we are oversubscribing the endpoints and need to widen the aperture.
The minimum aperture allowed. Note that this value is checked to ensure that it is not larger than the number of endpoints.
This is the fixed number of retries the LB is willing to make if an unavailable node (Status != Open) is returned from the underlying pick. See the constant MaxEffort for more details on how we pick the default.
The PRNG used for flipping coins. Override for deterministic tests.
Enables the aperture instance to make use of the coordinate in com.twitter.finagle.loadbalancer.aperture.ProcessCoordinate to calculate an order for the endpoints. In short, this allows coordination for apertures across process boundaries to avoid load concentration when deployed in a distributed system.
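For concreteness, here is a hedged sketch of constructing an aperture balancer; the parameter values are illustrative choices rather than recommended settings, and the exact signature (and the duration-conversion import) may vary across Finagle versions:

```scala
import com.twitter.conversions.DurationOps._
import com.twitter.finagle.loadbalancer.Balancers

// Illustrative settings only: tune the load band to your traffic.
val apertureLb = Balancers.aperture(
  smoothWin = 32.seconds, // window for smoothing the observed rps
  lowLoad = 0.5,          // narrow the aperture when average load drops below this
  highLoad = 2.0,         // widen the aperture when average load exceeds this
  minAperture = 1         // never shrink below one endpoint
)
```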
See the user guide for more details.
Like aperture but using the Peak EWMA (exponentially weighted moving average) load metric.
Peak EWMA uses a moving average over an endpoint's round-trip time (RTT) that is highly sensitive to peaks. This average is then weighted by the number of outstanding requests, effectively increasing our resolution per request and providing a higher-fidelity measurement of server responsiveness than standard least-loaded balancing. It is designed to react to slow endpoints more quickly than least-loaded by penalizing them when they exhibit slow response times. This load metric operates under the assumption that a loaded endpoint takes time to recover, so it is generally safe for the advertised load to incorporate an endpoint's history. However, this assumption breaks down in the presence of long-polling clients.
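The following is a simplified, self-contained sketch of that load metric, not Finagle's implementation: the RTT estimate decays toward recent observations but jumps immediately to any value above the current estimate, and the advertised load weights the estimate by the number of outstanding requests.

```scala
final class PeakEwmaSketch(decayNanos: Double) {
  private[this] var cost = 0.0                // smoothed RTT estimate (nanos)
  private[this] var stamp = System.nanoTime() // time of last observation
  private[this] var pending = 0               // outstanding requests

  def requestStarted(): Unit = synchronized { pending += 1 }

  def requestFinished(rttNanos: Long): Unit = synchronized {
    val now = System.nanoTime()
    val elapsed = math.max(now - stamp, 0L).toDouble
    stamp = now
    pending -= 1
    if (rttNanos.toDouble > cost) {
      cost = rttNanos.toDouble                // peaks are adopted instantly
    } else {
      val w = math.exp(-elapsed / decayNanos) // otherwise decay toward the new RTT
      cost = cost * w + rttNanos * (1.0 - w)
    }
  }

  // Load advertised to the balancer: RTT estimate weighted by concurrency.
  def load: Double = synchronized { cost * (pending + 1) }
}
```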
The time window to use when calculating the rps observed by this load balancer instance. The smoothed rps value is used to determine the size of the aperture (along with the lowLoad and highLoad parameters). In the context of peakEwma, this value is also used to average over the latency observed per endpoint. It's unlikely that you would want to measure latency on a different time scale than rps, so we couple the two.
The lower threshold on average load (as calculated by rps over the smoothing window / # of endpoints). Once this threshold is reached, the aperture is narrowed. Put differently, if there is an average of lowLoad requests across the endpoints, then we are not concentrating our concurrency well, so we narrow the aperture.
The upper threshold on average load (as calculated by rps / # of instances). Once this threshold is reached, the aperture is widened. Put differently, if there is an average of highLoad requests across the endpoints, then we are oversubscribing the endpoints and need to widen the aperture.
The minimum aperture allowed. Note that this value is checked to ensure that it is not larger than the number of endpoints.
This is the fixed number of retries the LB is willing to make if an unavailable node (Status != Open) is returned from the underlying pick. See the constant MaxEffort for more details on how we pick the default.
The PRNG used for flipping coins. Override for deterministic tests.
Enables the aperture instance to make use of the coordinate in com.twitter.finagle.loadbalancer.aperture.ProcessCoordinate to calculate an order for the endpoints. In short, this allows coordination for apertures across process boundaries to avoid load concentration when deployed in a distributed system.
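A short, hedged usage sketch follows; the values are illustrative assumptions. Note that smoothWin here also sets the latency-averaging window, as described above:

```scala
import com.twitter.conversions.DurationOps._
import com.twitter.finagle.loadbalancer.Balancers

// smoothWin doubles as the per-endpoint latency window for the Peak EWMA metric.
val apertureEwmaLb = Balancers.aperturePeakEwma(
  smoothWin = 32.seconds,
  lowLoad = 0.5,
  highLoad = 2.0
)
```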
See the user guide for more details.
An efficient strictly least-loaded balancer that maintains an internal heap to select least-loaded endpoints.
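A minimal usage sketch, assuming the heap constructor accepts a scala.util.Random (check your Finagle version's signature); a fixed seed makes tie-breaking among equally loaded endpoints reproducible in tests:

```scala
import scala.util.Random

import com.twitter.finagle.loadbalancer.Balancers

val heapLb = Balancers.heap(new Random(42L))
```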
See the user guide for more details.
An O(1), concurrent, least-loaded fair load balancer. This uses the ideas behind "power of 2 choices" [1].
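A minimal sketch of the pick step under the power-of-two-choices scheme; the Node type here is hypothetical, and Finagle's real nodes track load internally:

```scala
import scala.util.Random

// Hypothetical node type for illustration; `load` returns outstanding requests.
final case class Node(id: Int, load: () => Int)

// Pick two distinct nodes uniformly at random and keep the less loaded one.
def p2cPick(nodes: IndexedSeq[Node], rng: Random): Node = {
  require(nodes.size >= 2, "p2c needs at least two nodes")
  val n = nodes.size
  val i = rng.nextInt(n)
  val j = (i + 1 + rng.nextInt(n - 1)) % n // guaranteed distinct from i
  if (nodes(i).load() <= nodes(j).load()) nodes(i) else nodes(j)
}
```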
This is the fixed number of retries the LB is willing to make if an unavailable node (Status != Open) is returned from the underlying pick. See the constant MaxEffort for more details on how we pick the default.
The PRNG used for flipping coins. Override for deterministic tests.

[1] Michael Mitzenmacher. 2001. The Power of Two Choices in Randomized Load Balancing. IEEE Trans. Parallel Distrib. Syst. 12, 10 (October 2001), 1094-1104.
Like p2c but using the Peak EWMA (exponentially weight moving average) load metric.
Like p2c but using the Peak EWMA (exponentially weight moving average) load metric.
Peak EWMA uses a moving average over an endpoint's round-trip time (RTT) that is highly sensitive to peaks. This average is then weighted by the number of outstanding requests, effectively increasing our resolution per request and providing a higher-fidelity measurement of server responsiveness than standard least-loaded balancing. It is designed to react to slow endpoints more quickly than least-loaded by penalizing them when they exhibit slow response times. This load metric operates under the assumption that a loaded endpoint takes time to recover, so it is generally safe for the advertised load to incorporate an endpoint's history. However, this assumption breaks down in the presence of long-polling clients.
The window of latency observations.
This is the fixed number of retries the LB is willing to make if an unavailable node (Status != Open) is returned from the underlying pick. See the constant MaxEffort for more details on how we pick the default.
The PRNG used for flipping coins. Override for deterministic tests.
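A hedged usage sketch; the decay window value is an illustrative assumption, and the parameter name (decayTime here) should be checked against your Finagle version:

```scala
import com.twitter.conversions.DurationOps._
import com.twitter.finagle.loadbalancer.Balancers

// The window over which per-endpoint latency observations decay.
val p2cEwmaLb = Balancers.p2cPeakEwma(decayTime = 10.seconds)
```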
See the user guide for more details.
A simple round robin balancer that chooses the next endpoint in the list for each request.
WARNING: Unlike other balancers available in Finagle, this does not take latency into account and will happily direct load to slow or oversubscribed services. We recommend using one of the other load balancers for typical production use.
This is the fixed number of retries the LB is willing to make if an unavailable node (Status != Open) is returned from the underlying pick. See the constant MaxEffort for more details on how we pick the default.
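A minimal sketch of the selection logic (not Finagle's implementation) makes the warning above concrete: the pick is a pure counter stride and never consults load or latency.

```scala
import java.util.concurrent.atomic.AtomicLong

// Strides through the endpoint list; floorMod keeps the index valid even if
// the counter ever overflows Long.MaxValue.
final class RoundRobinSketch[A](endpoints: IndexedSeq[A]) {
  private[this] val next = new AtomicLong(0L)

  def pick(): A = {
    val i = java.lang.Math.floorMod(next.getAndIncrement(), endpoints.size.toLong).toInt
    endpoints(i)
  }
}
```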
Constructor methods for various load balancers. The methods take balancer-specific parameters and return a LoadBalancerFactory that allows you to easily inject a balancer into the Finagle client stack via the withLoadBalancer method.

Example: configuring a client with a load balancer
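A sketch of that pattern; the protocol (Http), address, and label are illustrative, and any Finagle protocol object with a .client works the same way:

```scala
import com.twitter.finagle.Http
import com.twitter.finagle.loadbalancer.Balancers

// Inject a balancer into the client stack; protocol and address are assumptions.
val client = Http.client
  .withLoadBalancer(Balancers.p2c())
  .newService("localhost:8080", "example-client")
```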
See the user guide for more details.