Interface ScalingPolicyMetric.Builder

    • Method Detail

      • invocationsPerInstance

        ScalingPolicyMetric.Builder invocationsPerInstance(Integer invocationsPerInstance)

        The number of invocations sent to a model, normalized by InstanceCount in each ProductionVariant. Each request is recorded with the value 1/numberOfInstances, where numberOfInstances is the number of active instances for the ProductionVariant behind the endpoint at the time of the request.

        Parameters:
        invocationsPerInstance - The number of invocations sent to a model, normalized by InstanceCount in each ProductionVariant, with each request recorded as 1/numberOfInstances.
        Returns:
        Returns a reference to this object so that method calls can be chained together.
      • modelLatency

        ScalingPolicyMetric.Builder modelLatency(Integer modelLatency)

        The time a model takes to respond, as viewed from SageMaker. This interval includes the local communication time needed to send the request to the model's container and to fetch the response from it, plus the time taken to complete the inference in the container.

        Parameters:
        modelLatency - The time a model takes to respond, as viewed from SageMaker, including local communication with the model's container and the inference time in the container.
        Returns:
        Returns a reference to this object so that method calls can be chained together.
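
The two setters above can be chained because each returns the builder. The following is a minimal sketch of that pattern and of the 1/numberOfInstances normalization. It assumes a hypothetical stand-in class, ScalingPolicyMetricSketch, which mirrors only the two setters documented here so that it compiles without the SDK on the classpath. It is not the SDK class. The helper normalizedInvocations and all numeric values are likewise illustrative assumptions.

```java
// Hypothetical stand-in that mirrors only the two documented setters.
// It is NOT the SDK class; it exists so this sketch is self-contained.
final class ScalingPolicyMetricSketch {
    private final Integer invocationsPerInstance;
    private final Integer modelLatency;

    private ScalingPolicyMetricSketch(Builder b) {
        this.invocationsPerInstance = b.invocationsPerInstance;
        this.modelLatency = b.modelLatency;
    }

    static Builder builder() {
        return new Builder();
    }

    Integer invocationsPerInstance() {
        return invocationsPerInstance;
    }

    Integer modelLatency() {
        return modelLatency;
    }

    static final class Builder {
        private Integer invocationsPerInstance;
        private Integer modelLatency;

        // Each setter returns this builder, which is what allows chaining.
        Builder invocationsPerInstance(Integer invocationsPerInstance) {
            this.invocationsPerInstance = invocationsPerInstance;
            return this;
        }

        Builder modelLatency(Integer modelLatency) {
            this.modelLatency = modelLatency;
            return this;
        }

        ScalingPolicyMetricSketch build() {
            return new ScalingPolicyMetricSketch(this);
        }
    }

    // Illustrative helper (an assumption, not part of any SDK): sums the
    // per-request contribution 1/numberOfInstances over a number of requests,
    // assuming the active instance count stays fixed for all of them.
    static double normalizedInvocations(int requests, int numberOfInstances) {
        if (numberOfInstances <= 0) {
            throw new IllegalArgumentException("numberOfInstances must be positive");
        }
        double total = 0.0;
        for (int i = 0; i < requests; i++) {
            total += 1.0 / numberOfInstances;
        }
        return total;
    }

    public static void main(String[] args) {
        // 600 requests spread over 4 active instances.
        double perInstance = normalizedInvocations(600, 4);

        // The latency value is an arbitrary placeholder; the text above
        // does not state a unit for modelLatency.
        ScalingPolicyMetricSketch metric = ScalingPolicyMetricSketch.builder()
                .invocationsPerInstance((int) Math.round(perInstance))
                .modelLatency(200)
                .build();

        System.out.println("invocationsPerInstance = " + metric.invocationsPerInstance());
        System.out.println("modelLatency = " + metric.modelLatency());
    }
}
```

With the real SDK type, the same chained calls would be made on ScalingPolicyMetric.Builder. The stand-in is used here only to keep the example runnable on its own.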