Class EndpointConfigurationShadowProductionVariantArgs


  • public final class EndpointConfigurationShadowProductionVariantArgs
    extends com.pulumi.resources.ResourceArgs
    • Method Detail

      • acceleratorType

        public java.util.Optional<com.pulumi.core.Output<java.lang.String>> acceleratorType()
        Returns:
        The size of the Elastic Inference (EI) instance to use for the production variant.
      • containerStartupHealthCheckTimeoutInSeconds

        public java.util.Optional<com.pulumi.core.Output<java.lang.Integer>> containerStartupHealthCheckTimeoutInSeconds()
        Returns:
        The timeout value, in seconds, for your inference container to pass the health check run by SageMaker Hosting. For more information about health checks, see [How Your Container Should Respond to Health Check (Ping) Requests](https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-inference-code.html#your-algorithms-inference-algo-ping-requests). Valid values are between `60` and `3600`.
      • enableSsmAccess

        public java.util.Optional<com.pulumi.core.Output<java.lang.Boolean>> enableSsmAccess()
        Returns:
        Whether to turn on native Amazon Web Services Systems Manager (SSM) access for a production variant behind an endpoint. By default, SSM access is disabled for all production variants behind an endpoint.
      • inferenceAmiVersion

        public java.util.Optional<com.pulumi.core.Output<java.lang.String>> inferenceAmiVersion()
        Returns:
        Specifies an option from a collection of preconfigured Amazon Machine Images (AMIs). Each image is configured by Amazon Web Services with a set of software and driver versions. Amazon Web Services optimizes these configurations for different machine learning workloads.
      • initialInstanceCount

        public java.util.Optional<com.pulumi.core.Output<java.lang.Integer>> initialInstanceCount()
        Returns:
        Initial number of instances used for auto-scaling.
      • initialVariantWeight

        public java.util.Optional<com.pulumi.core.Output<java.lang.Double>> initialVariantWeight()
        Returns:
        Determines initial traffic distribution among all of the models that you specify in the endpoint configuration. If unspecified, it defaults to `1.0`.
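As a rough illustration of how weights could translate into traffic shares, the sketch below assumes that each variant's share is its weight divided by the sum of all weights in the endpoint configuration. The class `VariantWeightSketch` and its method `trafficShares` are hypothetical helpers written for this example; they are not part of the Pulumi SDK.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the Pulumi SDK.
final class VariantWeightSketch {

    // Assumption: a variant's traffic share is its weight divided by the
    // sum of the weights of all variants in the endpoint configuration.
    static Map<String, Double> trafficShares(Map<String, Double> weights) {
        double total = 0.0;
        for (double w : weights.values()) {
            if (w < 0.0) {
                throw new IllegalArgumentException("weights must be non-negative");
            }
            total += w;
        }
        if (total <= 0.0) {
            throw new IllegalArgumentException("at least one weight must be positive");
        }
        // LinkedHashMap keeps the variants in insertion order.
        Map<String, Double> shares = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : weights.entrySet()) {
            shares.put(e.getKey(), e.getValue() / total);
        }
        return shares;
    }

    public static void main(String[] args) {
        Map<String, Double> weights = new LinkedHashMap<>();
        weights.put("variant-a", 1.0); // the documented default weight
        weights.put("variant-b", 3.0);
        System.out.println(trafficShares(weights));
    }
}
```

Under this assumption, weights of `1.0` and `3.0` give the two variants one quarter and three quarters of the traffic, respectively.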
      • instanceType

        public java.util.Optional<com.pulumi.core.Output<java.lang.String>> instanceType()
        Returns:
        The type of instance to start.
      • modelDataDownloadTimeoutInSeconds

        public java.util.Optional<com.pulumi.core.Output<java.lang.Integer>> modelDataDownloadTimeoutInSeconds()
        Returns:
        The timeout value, in seconds, to download and extract the model that you want to host from Amazon S3 to the individual inference instance associated with this production variant. Valid values are between `60` and `3600`.
      • modelName

        public com.pulumi.core.Output<java.lang.String> modelName()
        Returns:
        The name of the model to use.
      • variantName

        public java.util.Optional<com.pulumi.core.Output<java.lang.String>> variantName()
        Returns:
        The name of the variant. If omitted, this provider will assign a random, unique name.
      • volumeSizeInGb

        public java.util.Optional<com.pulumi.core.Output<java.lang.Integer>> volumeSizeInGb()
        Returns:
        The size, in GB, of the ML storage volume attached to each individual inference instance associated with the production variant. Valid values are between `1` and `512`.
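
The documented value ranges can be checked before a configuration is submitted. The following is a minimal sketch of such a pre-flight check in plain Java, assuming the bounds are inclusive. The class `ShadowVariantRangeSketch` and its methods are hypothetical names introduced for this example and do not exist in the Pulumi SDK.

```java
// Hypothetical pre-flight checks for illustration only; not part of the Pulumi SDK.
final class ShadowVariantRangeSketch {

    // Returns the value unchanged if it lies in [min, max]; otherwise throws.
    // Assumption: the documented bounds are inclusive.
    static int requireInRange(String name, int value, int min, int max) {
        if (value < min || value > max) {
            throw new IllegalArgumentException(
                name + " must be between " + min + " and " + max + ", got " + value);
        }
        return value;
    }

    // containerStartupHealthCheckTimeoutInSeconds: 60 to 3600
    static int healthCheckTimeout(int seconds) {
        return requireInRange("containerStartupHealthCheckTimeoutInSeconds", seconds, 60, 3600);
    }

    // modelDataDownloadTimeoutInSeconds: 60 to 3600
    static int modelDownloadTimeout(int seconds) {
        return requireInRange("modelDataDownloadTimeoutInSeconds", seconds, 60, 3600);
    }

    // volumeSizeInGb: 1 to 512
    static int volumeSizeInGb(int gb) {
        return requireInRange("volumeSizeInGb", gb, 1, 512);
    }

    public static void main(String[] args) {
        System.out.println(healthCheckTimeout(600));
        System.out.println(modelDownloadTimeout(3600));
        System.out.println(volumeSizeInGb(50));
    }
}
```

A validated value can then be passed to the corresponding builder setter, so that an out-of-range setting fails locally rather than at deployment time.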