
  package root
  package com
  package twitter

    Start with com.twitter.finagle.

  package finagle

    Finagle is an extensible RPC system.

    Services are represented by class com.twitter.finagle.Service. Clients make use of com.twitter.finagle.Service objects while servers implement them.

    Finagle contains a number of protocol implementations; each of these implements com.twitter.finagle.Client and/or com.twitter.finagle.Server. For example, Finagle's HTTP implementation, com.twitter.finagle.Http (in package finagle-http), exposes both.

    Thus a simple HTTP server is built like this:

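    import com.twitter.finagle.{Http, Service}
    import com.twitter.finagle.http.{Request, Response}
    import com.twitter.util.{Await, Future}
    val service = new Service[Request, Response] {
      // Respond immediately with an empty HTTP 200 OK.
      def apply(req: Request): Future[Response] =
        Future.value(Response())
    }
    val server = Http.server.serve(":8080", service)
    // Block until the server stops serving.
    Await.ready(server)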

    We first define a service to which requests are dispatched. In this case, the service returns immediately with an HTTP 200 OK response that has no content.

    This service is then served via the Http protocol on TCP port 8080. Finally we wait for the server to stop serving.

    We can now query our web server:

    % curl -D - localhost:8080
    HTTP/1.1 200 OK

    Building an HTTP client is also simple. (Note that type annotations are added for illustration.)

    import com.twitter.finagle.{Http, Service}
    import com.twitter.finagle.http.{Request, Response}
    import com.twitter.util.{Future, Return, Throw}
    val client: Service[Request, Response] = Http.client.newService("localhost:8080")
    val f: Future[Response] = client(Request()).respond {
      case Return(rep) =>
        printf("Got HTTP response %s\n", rep)
      case Throw(exc) =>
        printf("Got error %s\n", exc)
    }

    Http.client.newService("localhost:8080") constructs a new com.twitter.finagle.Service instance connected to localhost TCP port 8080. We then issue an HTTP/1.1 GET request to URI "/". The service returns a com.twitter.util.Future representing the result of the operation. We listen to this future, printing an appropriate message when the response arrives.

    The Finagle homepage contains useful documentation and resources for using Finagle.

  package loadbalancer

    This package implements client side load balancing algorithms.

    As an end-user, see the Balancers API to create instances which can be used to configure a Finagle client with various load balancing strategies.

    As an implementor, each algorithm gets its own subdirectory and is exposed via the Balancers object. Several convenient traits are provided which factor out common behavior and can be mixed in (i.e. Balancer, DistributorT, NodeT, and Updating).

  package aperture
  package exp
  • BalancerRegistry
  • Balancers
  • EndpointFactory
  • FlagBalancerFactory
  • LoadBalancerFactory
  • Metadata
  • NoNodesOpenException
  • WhenNoNodesOpen
  • WhenNoNodesOpens
  • defaultBalancer
  • perHostStats

object Balancers

Constructor methods for various load balancers. The methods take balancer-specific parameters and return a LoadBalancerFactory, which you can inject into the Finagle client stack via the withLoadBalancer method.

  1. configuring a client with a load balancer
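
A minimal sketch of the example named above, assuming the finagle-http module is on the classpath. The destination "localhost:8080" is a placeholder, and Balancers.aperture() is called with its default parameters.

```scala
import com.twitter.finagle.{Http, Service}
import com.twitter.finagle.http.{Request, Response}
import com.twitter.finagle.loadbalancer.Balancers

// Build an HTTP client that uses the aperture balancer with default settings.
// "localhost:8080" is an illustrative destination, not a required value.
val client: Service[Request, Response] =
  Http.client
    .withLoadBalancer(Balancers.aperture())
    .newService("localhost:8080")
```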

See also

The user guide for more details.

Value Members

  4. val MaxEffort: Int

    This is the default for the fixed number of retries an LB implementation is willing to make if an unavailable (Status != Open) node is returned from the underlying pick.

    For randomized LBs (P2C* and Aperture*) this can be interpreted as a probability. For example, if half of the replica set is down, the probability that a single pick samples two unavailable nodes is 0.5 * 0.5 = 0.25. If we repeat that pick 5 times, the probability of seeing 5 unavailable results in a row is 0.25 ^ 5 ≈ 0.1%. This means that if half of the cluster is down, the LB makes a bad choice (when a better choice may have been available) for about 0.1% of requests.

    Note that this doesn't mean that 0.1% of requests will fail when P2C operates on a half-dead cluster, since Finagle clients have additional layers of requeues above the load balancer.
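
    The arithmetic above can be checked with a few lines of plain Scala. This is only a worked calculation; the object and method names are hypothetical and are not part of Finagle.

```scala
object MaxEffortOdds {
  // Probability that both nodes sampled in a single two-choice pick are
  // unavailable, given the fraction of the replica set that is down.
  def bothDown(fractionDown: Double): Double = fractionDown * fractionDown

  // Probability that `effort` consecutive picks all return unavailable nodes.
  def allPicksBad(fractionDown: Double, effort: Int): Double =
    math.pow(bothDown(fractionDown), effort.toDouble)

  def main(args: Array[String]): Unit = {
    // Half of the cluster down, five attempts, as in the text above.
    val p = allPicksBad(fractionDown = 0.5, effort = 5)
    println(f"${p * 100}%.4f%% of picks exhaust all five attempts")
  }
}
```

    With half of the nodes down and five attempts, the result is 0.25 ^ 5 = 0.0009765625, or roughly 0.1%.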

  5. def aperture(smoothWin: Duration = 15.seconds, lowLoad: Double = 0.875, highLoad: Double = 1.125, minAperture: Int = 1, maxEffort: Int = MaxEffort, rng: Rng = Rng.threadLocal, useDeterministicOrdering: Option[Boolean] = None): LoadBalancerFactory

    The aperture load-band balancer balances load to the smallest subset ("aperture") of services so that the concurrent load to each service, measured over a window specified by smoothWin, stays within the load band delimited by lowLoad and highLoad.

    The default load band configuration attempts to create a 1:1 relationship between a client's offered load and the aperture size. Given a homogeneous replica set, this optimizes for at least one concurrent request per node and gives the balancer sufficient data to compare load across nodes.

    Among the benefits of aperture balancing are:

    1. A client uses resources commensurate to offered load. In particular, it does not have to open sessions with every service in a large cluster. This is especially important when offered load and cluster capacity are mismatched.

    2. It balances over fewer, and thus warmer, services. This enhances the efficacy of the fail-fast mechanisms, etc. This also means that clients pay the penalty of session establishment less frequently.

    3. It increases the efficacy of least-loaded balancing which, in order to work well, requires concurrent load. The load-band balancer effectively arranges load in a manner that ensures a higher level of per-service concurrency.

    smoothWin: The time window to use when calculating the rps observed by this load balancer instance. The smoothed rps value is then used to determine the size of the aperture (along with the lowLoad and highLoad parameters).

    lowLoad: The lower threshold on average load (as calculated by rps over smooth window / # of endpoints). Once this threshold is reached, the aperture is narrowed. Put differently, if there is an average of lowLoad requests across the endpoints, then we are not concentrating our concurrency well, so we narrow the aperture.

    highLoad: The upper threshold on average load (as calculated by rps / # of instances). Once this threshold is reached, the aperture is widened. Put differently, if there is an average of highLoad requests across the endpoints, then we are oversubscribing the endpoints and need to widen the aperture.

    minAperture: The minimum aperture allowed. Note, this value is checked to ensure that it is not larger than the number of endpoints.

    maxEffort: The fixed number of retries the LB is willing to make if an unavailable node (Status != Open) is returned from the underlying pick. See the constant MaxEffort for more details on how we pick the default.

    rng: The PRNG used for flipping coins. Override for deterministic tests.

    useDeterministicOrdering: Enables the aperture instance to make use of the coordinate in com.twitter.finagle.loadbalancer.aperture.ProcessCoordinate to calculate an order for the endpoints. In short, this allows coordination for apertures across process boundaries to avoid load concentration when deployed in a distributed system.

    See also

    The user guide for more details.
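
    The load band described for aperture can be illustrated with a small, self-contained sketch. It is not Finagle's implementation: the object, the Adjustment type, and the adjust function are hypothetical names, and the rule of adjusting by comparing average load per endpoint against the band is an assumption drawn from the parameter descriptions.

```scala
object LoadBandSketch {
  sealed trait Adjustment
  case object Narrow extends Adjustment
  case object Widen extends Adjustment
  case object Hold extends Adjustment

  // offeredLoad: smoothed load observed by the client.
  // aperture:    number of endpoints currently inside the aperture.
  def adjust(
    offeredLoad: Double,
    aperture: Int,
    lowLoad: Double = 0.875,
    highLoad: Double = 1.125,
    minAperture: Int = 1
  ): Adjustment = {
    require(aperture > 0, "aperture must be positive")
    val avg = offeredLoad / aperture
    if (avg >= highLoad) Widen                             // endpoints oversubscribed
    else if (avg <= lowLoad && aperture > minAperture) Narrow // load spread too thin
    else Hold                                              // inside the load band
  }
}
```

    For instance, an offered load of 10 over 4 endpoints averages 2.5 per endpoint and suggests widening, while a load of 2 over the same 4 endpoints averages 0.5 and suggests narrowing.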

  6. def aperturePeakEwma(smoothWin: Duration = 15.seconds, lowLoad: Double = 0.875, highLoad: Double = 1.125, minAperture: Int = 1, maxEffort: Int = MaxEffort, rng: Rng = Rng.threadLocal, useDeterministicOrdering: Option[Boolean] = None): LoadBalancerFactory

    Like aperture but using the Peak EWMA (exponentially weighted moving average) load metric.

    Peak EWMA uses a moving average over an endpoint's round-trip time (RTT) that is highly sensitive to peaks. This average is then weighted by the number of outstanding requests. This effectively increases our per-request resolution and provides a higher-fidelity measurement of server responsiveness than the standard least-loaded metric. It is designed to react to slow endpoints more quickly than least-loaded by penalizing them when they exhibit slow response times. This load metric operates under the assumption that a loaded endpoint takes time to recover, so it is generally safe for the advertised load to incorporate an endpoint's history. However, this assumption breaks down in the presence of long-polling clients.
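
    One way to picture a peak-sensitive moving average is the sketch below. It is an illustration under stated assumptions, not Finagle's code: the State class, the exponential decay weight, and the (pending + 1) weighting are all hypothetical choices made for this example.

```scala
object PeakEwmaSketch {
  // Hypothetical state: current cost estimate and the time it was last updated.
  final case class State(costMs: Double, stampMs: Long)

  // Fold one round-trip-time observation into the estimate.
  def observe(s: State, rttMs: Double, nowMs: Long, decayMs: Double): State = {
    val elapsed = math.max(nowMs - s.stampMs, 0L).toDouble
    val w = math.exp(-elapsed / decayMs) // weight given to history
    val cost =
      if (rttMs > s.costMs) rttMs           // peak: adopt the slow sample at once
      else s.costMs * w + rttMs * (1.0 - w) // otherwise decay toward the sample
    State(cost, nowMs)
  }

  // Advertised load: cost weighted by outstanding requests (assumed form).
  def load(s: State, pending: Int): Double = s.costMs * (pending + 1)
}
```

    The asymmetry is the point: a slow response raises the estimate immediately, while fast responses lower it only gradually over the decay window.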


    smoothWin: The time window to use when calculating the rps observed by this load balancer instance. The smoothed rps value is used to determine the size of the aperture (along with the lowLoad and highLoad parameters). In the context of peakEwma, this value is also used to average over the latency observed per endpoint. It's unlikely that you would want to measure the latency on a different time scale than rps, so we couple the two.

    lowLoad: The lower threshold on average load (as calculated by rps over smooth window / # of endpoints). Once this threshold is reached, the aperture is narrowed. Put differently, if there is an average of lowLoad requests across the endpoints, then we are not concentrating our concurrency well, so we narrow the aperture.

    highLoad: The upper threshold on average load (as calculated by rps / # of instances). Once this threshold is reached, the aperture is widened. Put differently, if there is an average of highLoad requests across the endpoints, then we are oversubscribing the endpoints and need to widen the aperture.

    minAperture: The minimum aperture allowed. Note, this value is checked to ensure that it is not larger than the number of endpoints.

    maxEffort: The fixed number of retries the LB is willing to make if an unavailable node (Status != Open) is returned from the underlying pick. See the constant MaxEffort for more details on how we pick the default.

    rng: The PRNG used for flipping coins. Override for deterministic tests.

    useDeterministicOrdering: Enables the aperture instance to make use of the coordinate in com.twitter.finagle.loadbalancer.aperture.ProcessCoordinate to calculate an order for the endpoints. In short, this allows coordination for apertures across process boundaries to avoid load concentration when deployed in a distributed system.

    See also

    The user guide for more details.

  19. def p2c(maxEffort: Int = MaxEffort, rng: Rng = Rng.threadLocal): LoadBalancerFactory

    An O(1), concurrent, least-loaded fair load balancer. This uses the ideas behind "power of 2 choices" [1].

    maxEffort: The fixed number of retries the LB is willing to make if an unavailable node (Status != Open) is returned from the underlying pick. See the constant MaxEffort for more details on how we pick the default.

    rng: The PRNG used for flipping coins. Override for deterministic tests.

    [1] Michael Mitzenmacher. 2001. The Power of Two Choices in Randomized Load Balancing. IEEE Trans. Parallel Distrib. Syst. 12, 10 (October 2001), 1094-1104.
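
    The "power of 2 choices" idea can be sketched in plain Scala as follows. This is a simplified illustration, not Finagle's implementation: loads are modeled as a hypothetical sequence of integers, and node status and retries are ignored.

```scala
import scala.util.Random

object P2CSketch {
  // Sample two distinct indices uniformly at random and return the index
  // whose load is lower. Each pick is O(1) regardless of the number of nodes.
  def pick(loads: IndexedSeq[Int], rng: Random): Int = {
    require(loads.nonEmpty, "no nodes to pick from")
    if (loads.size == 1) 0
    else {
      val a = rng.nextInt(loads.size)
      // Draw the second index from the remaining positions so that a != b.
      val r = rng.nextInt(loads.size - 1)
      val b = if (r >= a) r + 1 else r
      if (loads(a) <= loads(b)) a else b
    }
  }
}
```

    Passing a seeded Random makes the picks reproducible, which mirrors the advice to override the PRNG for deterministic tests.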

  20. def p2cPeakEwma(decayTime: Duration = 10.seconds, maxEffort: Int = MaxEffort, rng: Rng = Rng.threadLocal): LoadBalancerFactory

    Like p2c but using the Peak EWMA (exponentially weighted moving average) load metric.

    Peak EWMA uses a moving average over an endpoint's round-trip time (RTT) that is highly sensitive to peaks. This average is then weighted by the number of outstanding requests. This effectively increases our per-request resolution and provides a higher-fidelity measurement of server responsiveness than the standard least-loaded metric. It is designed to react to slow endpoints more quickly than least-loaded by penalizing them when they exhibit slow response times. This load metric operates under the assumption that a loaded endpoint takes time to recover, so it is generally safe for the advertised load to incorporate an endpoint's history. However, this assumption breaks down in the presence of long-polling clients.


    decayTime: The window of latency observations.

    maxEffort: The fixed number of retries the LB is willing to make if an unavailable node (Status != Open) is returned from the underlying pick. See the constant MaxEffort for more details on how we pick the default.

    rng: The PRNG used for flipping coins. Override for deterministic tests.

    See also

    The user guide for more details.
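
    A hedged usage sketch for p2cPeakEwma, overriding decayTime and supplying a seeded PRNG for a deterministic test. It assumes finagle-http is on the classpath and that Rng(seed) from com.twitter.finagle.util is available in your Finagle version; the 5-second window, the seed, and the destination are illustrative values only.

```scala
import com.twitter.finagle.{Http, Service}
import com.twitter.finagle.http.{Request, Response}
import com.twitter.finagle.loadbalancer.Balancers
import com.twitter.finagle.util.Rng
import com.twitter.util.Duration

// Shorter latency window than the 10-second default, plus a fixed seed so
// that balancer decisions are reproducible in a test.
val balancer = Balancers.p2cPeakEwma(
  decayTime = Duration.fromSeconds(5),
  rng = Rng(12345L)
)

// "localhost:8080" is a placeholder destination.
val client: Service[Request, Response] =
  Http.client.withLoadBalancer(balancer).newService("localhost:8080")
```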

  21. def roundRobin(maxEffort: Int = MaxEffort): LoadBalancerFactory

    A simple round robin balancer that chooses the next endpoint in the list for each request.

    WARNING: Unlike other balancers available in Finagle, this does not take latency into account and will happily direct load to slow or oversubscribed services. We recommend using one of the other load balancers for typical production use.


    maxEffort: The fixed number of retries the LB is willing to make if an unavailable node (Status != Open) is returned from the underlying pick. See the constant MaxEffort for more details on how we pick the default.
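
    To make the interaction between round robin and maxEffort concrete, here is a simplified, single-threaded sketch. It is not Finagle's implementation: Node and Picker are hypothetical types, and returning None once the effort budget is spent is a choice made for this example.

```scala
object RoundRobinSketch {
  // Hypothetical endpoint model: `open` stands in for Status == Open.
  final case class Node(name: String, open: Boolean)

  final class Picker(nodes: IndexedSeq[Node], maxEffort: Int) {
    private var next = 0 // index of the next endpoint to try (not thread-safe)

    // Walk the list in order, skipping unavailable nodes, for at most
    // maxEffort attempts per pick.
    def pick(): Option[Node] = {
      var attempts = 0
      while (attempts < maxEffort && nodes.nonEmpty) {
        val n = nodes(next)
        next = (next + 1) % nodes.size
        if (n.open) return Some(n)
        attempts += 1
      }
      None
    }
  }
}
```

    Note that the order of picks depends only on list position and availability. Latency and load never enter the decision, which is the limitation the warning above describes.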

