Interface PipelineOptions

  • All Superinterfaces:
    HasDisplayData
    All Known Subinterfaces:
    ApplicationNameOptions, ExecutorOptions, ExperimentalOptions, FileStagingOptions, ManualDockerEnvironmentOptions, MetricsOptions, PortablePipelineOptions, RemoteEnvironmentOptions, ResourceHintsOptions, SdkHarnessOptions, StreamingOptions, TestPipeline.TestValueProviderOptions, TestPipelineOptions

    @ThreadSafe
    public interface PipelineOptions
    extends HasDisplayData
    PipelineOptions are used to configure Pipelines. You can extend PipelineOptions to create custom configuration options specific to your Pipeline, for both local execution and execution via a PipelineRunner.

    PipelineOptions and their subinterfaces represent a collection of properties which can be manipulated in a type safe manner. PipelineOptions is backed by a dynamic Proxy which allows for type safe manipulation of properties in an extensible fashion through plain old Java interfaces.

    PipelineOptions can be created with PipelineOptionsFactory.create() and PipelineOptionsFactory.as(Class). They can be created from command-line arguments with PipelineOptionsFactory.fromArgs(String[]). They can be converted to another type by invoking as(Class) and can be accessed from within a DoFn by invoking getPipelineOptions() on the input Context object.

    Please don't implement PipelineOptions, it implies that it is backwards-incompatible to add new options. User-implemented PipelineOptions is not accepted by Pipeline.

    For example:

    
     // The most common way to construct PipelineOptions is via command-line argument parsing:
     public static void main(String[] args) {
       // Will parse the arguments passed into the application and construct a PipelineOptions
       // Note that --help will print registered options, and --help=PipelineOptionsClassName
       // will print out usage for the specific class.
       PipelineOptions options =
           PipelineOptionsFactory.fromArgs(args).create();
    
       Pipeline p = Pipeline.create(options);
       ...
       p.run();
     }
    
     // To create options for the DirectRunner:
     DirectOptions directRunnerOptions =
         PipelineOptionsFactory.as(DirectOptions.class);
    
     // To cast from one type to another using the as(Class) method:
     ApplicationNameOptions applicationNameOptions =
         directPipelineOptions.as(ApplicationNameOptions.class);
    
     // Options for the same property are shared between types
     // The statement below will print out the name of the enclosing class by default
     System.out.println(applicationNameOptions.getApplicationName());
    
     // Prints out registered options.
     PipelineOptionsFactory.printHelp(System.out);
    
     // Prints out options which are available to be set on ApplicationNameOptions
     PipelineOptionsFactory.printHelp(System.out, ApplicationNameOptions.class);
     

    Defining Your Own PipelineOptions

    Defining your own PipelineOptions is the way for you to make configuration options available for both local execution and execution via a PipelineRunner. By having PipelineOptionsFactory as your command-line interpreter, you will provide a standardized way for users to interact with your application via the command-line.

    To define your own PipelineOptions, you create a public interface which extends PipelineOptions and define getter/setter pairs. These getter/setter pairs define a collection of JavaBean properties.

    For example:

    
     // Creates a user defined property called "myProperty"
     public interface MyOptions extends PipelineOptions {
       String getMyProperty();
       void setMyProperty(String value);
     }
     

    Note: Please see the section on Registration below when using custom property types.

    Restrictions

    Since PipelineOptions can be "cast" to multiple types dynamically using as(Class), a property must conform to the following set of restrictions:

    • Any property with the same name must have the same return type for all derived interfaces of PipelineOptions.
    • Every bean property of any interface derived from PipelineOptions must have a getter and setter method.
    • Every method must conform to being a getter or setter for a JavaBean.
    • The derived interface of PipelineOptions must be composable with every interface part registered with the PipelineOptionsFactory.
    • Only getters may be annotated with @JsonIgnore.
    • If any getter is annotated with @JsonIgnore, then all getters for this property must be annotated with @JsonIgnore.
    • If any getter is annotated with JsonDeserialize and JsonSerialize, then all getters for this property must also be.

    Annotations For PipelineOptions

    @Description can be used to annotate an interface or a getter with useful information which is output when --help is invoked via PipelineOptionsFactory.fromArgs(String[]).

    @Default represents a set of annotations that can be used to annotate getter properties on PipelineOptions with information representing the default value to be returned if no value is specified. Any default implementation (using the default keyword) is ignored.

    @Hidden hides an option from being listed when --help is invoked via PipelineOptionsFactory.fromArgs(String[]).

    @Validation represents a set of annotations that can be used to annotate getter properties on PipelineOptions with information representing the validation criteria to be used when validating with the PipelineOptionsValidator. Validation will be performed if during construction of the PipelineOptions, PipelineOptionsFactory.withValidation() is invoked.

    @JsonIgnore is used to prevent a property from being serialized and available during execution of DoFn. See the Serialization section below for more details.

    @JsonSerialize and @JsonDeserialize is used to control how a property is (de)serialized when the PipelineOptions are (de)serialized to JSON. See the Serialization section below for more details.

    Registration Of PipelineOptions

    Registration of PipelineOptions by an application guarantees that the PipelineOptions is composable during execution of their Pipeline and meets the restrictions listed above or will fail during registration. Registration also lists the registered PipelineOptions when --help is invoked via PipelineOptionsFactory.fromArgs(String[]).

    Registration can be performed by invoking PipelineOptionsFactory.register(java.lang.Class<? extends org.apache.beam.sdk.options.PipelineOptions>) within a users application or via automatic registration by creating a ServiceLoader entry and a concrete implementation of the PipelineOptionsRegistrar interface.

    It is optional but recommended to use one of the many build time tools such as AutoService to generate the necessary META-INF files automatically.

    A list of registered options can be fetched from PipelineOptionsFactory.getRegisteredOptions().

    Serialization Of PipelineOptions

    PipelineOptions is intentionally not marked Serializable, in order to discourage pipeline authors from capturing PipelineOptions at pipeline construction time, because a pipeline may be saved as a template and run with a different set of options than the ones it was constructed with. See Pipeline.run(PipelineOptions).

    However, PipelineRunners require support for options to be serialized. Each property within PipelineOptions must be able to be serialized using Jackson's ObjectMapper or the getter method for the property annotated with @JsonIgnore.

    Jackson supports serialization of many types and supports a useful set of annotations to aid in serialization of custom types. We point you to the public Jackson documentation when attempting to add serialization support for your custom types. Note that PipelineOptions relies on Jackson's ability to automatically configure the ObjectMapper with additional modules via ObjectMapper.findModules().

    To further customize serialization, getter methods may be annotated with @JsonSerialize and @JsonDeserialize. @JsonDeserialize is also used when parsing command line arguments.

    Note: A property must be annotated with BOTH@JsonDeserialize and @JsonSerialize or neither. It is an error to have a property annotated with only @JsonDeserialize or @JsonSerialize.

    Note: It is an error to have the same property available in multiple interfaces with only some of them being annotated with @JsonIgnore. It is also an error to mark a setter for a property with @JsonIgnore.

    • Method Detail

      • as

        <T extends PipelineOptions> T as​(java.lang.Class<T> kls)
        Transforms this object into an object of type <T> saving each property that has been manipulated. <T> must extend PipelineOptions.

        If <T> is not registered with the PipelineOptionsFactory, then we attempt to verify that <T> is composable with every interface that this instance of the PipelineOptions has seen.

        Parameters:
        kls - The class of the type to transform to.
        Returns:
        An object of type kls.
      • getRunner

        @Required
        @InstanceFactory(DirectRunner.class)
        java.lang.Class<? extends PipelineRunner<?>> getRunner()
        The pipeline runner that will be used to execute the pipeline. For registered runners, the class name can be specified, otherwise the fully qualified name needs to be specified.
      • setRunner

        void setRunner​(java.lang.Class<? extends PipelineRunner<?>> kls)
      • getStableUniqueNames

        @Required
        @Enum("WARNING")
        PipelineOptions.CheckEnabled getStableUniqueNames()
        Whether to check for stable unique names on each transform. This is necessary to support updating of pipelines.
      • getTempLocation

        java.lang.String getTempLocation()
        A pipeline level default location for storing temporary files.

        This can be a path of any file system.

        getTempLocation() can be used as a default location in other PipelineOptions.

        If it is unset, PipelineRunner can override it.

      • setTempLocation

        void setTempLocation​(java.lang.String value)
      • setJobName

        void setJobName​(java.lang.String jobName)
      • outputRuntimeOptions

        java.util.Map<java.lang.String,​java.util.Map<java.lang.String,​java.lang.Object>> outputRuntimeOptions()
        Returns a map of properties which correspond to ValueProvider.RuntimeValueProvider, keyed by the property name. The value is a map containing type and default information.
      • setOptionsId

        void setOptionsId​(long id)
      • getUserAgent

        @InstanceFactory(UserAgentFactory.class)
        java.lang.String getUserAgent()
        A user agent string as per RFC2616, describing the pipeline to external services.

        https://www.ietf.org/rfc/rfc2616.txt

        It should follow the BNF Form:

        
         user agent         = 1*(product | comment)
         product            = token ["/" product-version]
         product-version    = token
         
        Where a token is a series of characters without a separator.

        The string defaults to [name]/[version] based on the properties of the Apache Beam release.

      • setUserAgent

        void setUserAgent​(java.lang.String userAgent)