Class DataflowPipelineJob

  • All Implemented Interfaces:
    org.apache.beam.sdk.PipelineResult
    Direct Known Subclasses:
    DataflowTemplateJob

    public class DataflowPipelineJob
    extends java.lang.Object
    implements org.apache.beam.sdk.PipelineResult
    A DataflowPipelineJob represents a job submitted to Dataflow using DataflowRunner.
    • Nested Class Summary

      • Nested classes/interfaces inherited from interface org.apache.beam.sdk.PipelineResult

        org.apache.beam.sdk.PipelineResult.State
    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected static org.apache.beam.sdk.util.FluentBackoff STATUS_BACKOFF_FACTORY  
    • Constructor Summary

      Constructors 
      Constructor Description
      DataflowPipelineJob(DataflowClient dataflowClient, java.lang.String jobId, DataflowPipelineOptions dataflowOptions, java.util.Map<org.apache.beam.sdk.runners.AppliedPTransform<?,?,?>,java.lang.String> transformStepNames)
      Constructs the job.
      DataflowPipelineJob(DataflowClient dataflowClient, java.lang.String jobId, DataflowPipelineOptions dataflowOptions, java.util.Map<org.apache.beam.sdk.runners.AppliedPTransform<?,?,?>,java.lang.String> transformStepNames, @Nullable org.apache.beam.model.pipeline.v1.RunnerApi.Pipeline pipelineProto)
      Constructs the job.
    • Method Summary

      Modifier and Type Method Description
      org.apache.beam.sdk.PipelineResult.State cancel()  
      DataflowPipelineOptions getDataflowOptions()  
      java.lang.String getJobId()
      Get the id of this job.
      @Nullable org.apache.beam.model.pipeline.v1.RunnerApi.Pipeline getPipelineProto()
      Get the Runner API pipeline proto if available.
      java.lang.String getProjectId()
      Get the project this job exists in.
      java.lang.String getRegion()
      Get the region this job exists in.
      DataflowPipelineJob getReplacedByJob()
      Returns a new DataflowPipelineJob for the job that replaced this one, if applicable.
      org.apache.beam.sdk.PipelineResult.State getState()  
      protected @Nullable org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.collect.BiMap<org.apache.beam.sdk.runners.AppliedPTransform<?,?,?>,java.lang.String> getTransformStepNames()  
      org.apache.beam.sdk.metrics.MetricResults metrics()  
      @Nullable org.apache.beam.sdk.PipelineResult.State waitUntilFinish()  
      @Nullable org.apache.beam.sdk.PipelineResult.State waitUntilFinish(org.joda.time.Duration duration)  
      @Nullable org.apache.beam.sdk.PipelineResult.State waitUntilFinish(org.joda.time.Duration duration, MonitoringUtil.JobMessagesHandler messageHandler)
      Waits until the pipeline finishes and returns the final status.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • STATUS_BACKOFF_FACTORY

        protected static final org.apache.beam.sdk.util.FluentBackoff STATUS_BACKOFF_FACTORY
    • Constructor Detail

      • DataflowPipelineJob

        public DataflowPipelineJob(DataflowClient dataflowClient,
                                   java.lang.String jobId,
                                   DataflowPipelineOptions dataflowOptions,
                                   java.util.Map<org.apache.beam.sdk.runners.AppliedPTransform<?,?,?>,java.lang.String> transformStepNames,
                                   @Nullable org.apache.beam.model.pipeline.v1.RunnerApi.Pipeline pipelineProto)
        Constructs the job.
        Parameters:
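        dataflowClient - the client used to communicate with the Dataflow service
        jobId - the job id
        dataflowOptions - used to configure the client for the Dataflow service
        transformStepNames - a mapping from AppliedPTransforms to step names
        pipelineProto - the Runner API pipeline proto, or null if not available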
      • DataflowPipelineJob

        public DataflowPipelineJob(DataflowClient dataflowClient,
                                   java.lang.String jobId,
                                   DataflowPipelineOptions dataflowOptions,
                                   java.util.Map<org.apache.beam.sdk.runners.AppliedPTransform<?,?,?>,java.lang.String> transformStepNames)
        Constructs the job.
        Parameters:
        dataflowClient - the client used to communicate with the Dataflow service
        jobId - the job id
        dataflowOptions - used to configure the client for the Dataflow service
        transformStepNames - a mapping from AppliedPTransforms to step names
    • Method Detail

      • getJobId

        public java.lang.String getJobId()
        Get the id of this job.
      • getProjectId

        public java.lang.String getProjectId()
        Get the project this job exists in.
      • getPipelineProto

        public @Nullable org.apache.beam.model.pipeline.v1.RunnerApi.Pipeline getPipelineProto()
        Get the Runner API pipeline proto if available.
      • getRegion

        public java.lang.String getRegion()
        Get the region this job exists in.
      • getTransformStepNames

        protected @Nullable org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.collect.BiMap<org.apache.beam.sdk.runners.AppliedPTransform<?,?,?>,java.lang.String> getTransformStepNames()
      • getReplacedByJob

        public DataflowPipelineJob getReplacedByJob()
        Returns a new DataflowPipelineJob for the job that replaced this one, if applicable.
        Throws:
        java.lang.IllegalStateException - if called before the job has terminated or if the job terminated but was not updated
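        Because getReplacedByJob() signals "no replacement" with an IllegalStateException, callers may prefer to wrap the call. The following is a minimal sketch, not part of the Beam API: ReplacementLookup and ReplaceableJob are hypothetical names, and ReplaceableJob is a local stand-in for DataflowPipelineJob so that the example compiles without the Beam dependency.

```java
import java.util.Optional;

/** Hypothetical helper (not part of Beam): turns the documented exception into an empty Optional. */
final class ReplacementLookup {

    /** Local stand-in for the single DataflowPipelineJob method used in this sketch. */
    interface ReplaceableJob {
        /** Documented to throw IllegalStateException if the job has not terminated or was not updated. */
        ReplaceableJob getReplacedByJob();
    }

    /** Returns the replacing job, or an empty Optional when the documented exception is thrown. */
    static Optional<ReplaceableJob> replacementOf(ReplaceableJob job) {
        try {
            return Optional.ofNullable(job.getReplacedByJob());
        } catch (IllegalStateException e) {
            // Either the job has not terminated yet, or it terminated without being updated.
            return Optional.empty();
        }
    }
}
```

        With a real DataflowPipelineJob the same try/catch applies directly to job.getReplacedByJob(); the stand-in interface exists only to keep the sketch self-contained.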
      • waitUntilFinish

        public @Nullable org.apache.beam.sdk.PipelineResult.State waitUntilFinish()
        Specified by:
        waitUntilFinish in interface org.apache.beam.sdk.PipelineResult
      • waitUntilFinish

        public @Nullable org.apache.beam.sdk.PipelineResult.State waitUntilFinish(org.joda.time.Duration duration)
        Specified by:
        waitUntilFinish in interface org.apache.beam.sdk.PipelineResult
      • waitUntilFinish

        public @Nullable org.apache.beam.sdk.PipelineResult.State waitUntilFinish(org.joda.time.Duration duration,
                                                                                  MonitoringUtil.JobMessagesHandler messageHandler)
                                                                           throws java.io.IOException,
                                                                                  java.lang.InterruptedException
        Waits until the pipeline finishes and returns the final status.
        Parameters:
        duration - The time to wait for the job to finish. Provide a value less than 1 ms for an infinite wait.
        messageHandler - If non-null, this handler is invoked for each batch of messages received.
        Returns:
        The final state of the job, or null on timeout or if the thread is interrupted.
        Throws:
        java.io.IOException - If there is a persistent problem getting job information.
        java.lang.InterruptedException
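        Since the return value is null on timeout or interruption, callers need a null check before inspecting the state. The sketch below shows one way to classify the result. WaitOutcome, describe, and the returned labels are hypothetical, and the nested State enum is a local stand-in that mirrors the constant names of PipelineResult.State so that the example compiles without the Beam dependency.

```java
/** Hypothetical helper (not part of Beam): classifies the nullable result of waitUntilFinish. */
final class WaitOutcome {

    /** Local stand-in mirroring the constant names of org.apache.beam.sdk.PipelineResult.State. */
    enum State { UNKNOWN, STOPPED, RUNNING, DONE, FAILED, CANCELLED, UPDATED, UNRECOGNIZED }

    /** Maps the value returned by a bounded wait to a coarse, illustrative label. */
    static String describe(State finalState) {
        if (finalState == null) {
            // Documented case: the wait timed out or the thread was interrupted.
            return "TIMED_OUT_OR_INTERRUPTED";
        }
        switch (finalState) {
            case DONE:
                return "SUCCEEDED";
            case UPDATED:
                // The job was replaced; getReplacedByJob() is the documented way to reach its successor.
                return "REPLACED";
            case FAILED:
            case CANCELLED:
                return "UNSUCCESSFUL";
            default:
                return "OTHER";
        }
    }
}
```

        In real code the argument would be the value returned by a call such as job.waitUntilFinish(Duration.standardMinutes(10)), with the stand-in enum replaced by PipelineResult.State.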
      • cancel

        public org.apache.beam.sdk.PipelineResult.State cancel()
                                                        throws java.io.IOException
        Specified by:
        cancel in interface org.apache.beam.sdk.PipelineResult
        Throws:
        java.io.IOException
      • getState

        public org.apache.beam.sdk.PipelineResult.State getState()
        Specified by:
        getState in interface org.apache.beam.sdk.PipelineResult
      • metrics

        public org.apache.beam.sdk.metrics.MetricResults metrics()
        Specified by:
        metrics in interface org.apache.beam.sdk.PipelineResult