Class JobGraph
- java.lang.Object
  - org.apache.flink.runtime.jobgraph.JobGraph
- All Implemented Interfaces:
- Serializable, ExecutionPlan
public class JobGraph extends Object implements ExecutionPlan
The JobGraph represents a Flink dataflow program at the low level that the JobManager accepts. All programs from higher-level APIs are transformed into JobGraphs.
The JobGraph is a graph of vertices and intermediate results that are connected together to form a DAG. Note that iterations (feedback edges) are currently not encoded inside the JobGraph but inside certain special vertices that establish the feedback channel amongst themselves.
The JobGraph defines the job-wide configuration settings, while each vertex and intermediate result defines the characteristics of the concrete operation and intermediate data.
- See Also:
- Serialized Form
-
-
Constructor Summary
JobGraph(String jobName)
    Constructs a new job graph with the given name, the given ExecutionConfig, and a random job ID.
JobGraph(org.apache.flink.api.common.JobID jobId, String jobName)
    Constructs a new job graph with the given job ID (or a random ID, if null is passed), the given name and the given execution configuration (see ExecutionConfig).
JobGraph(org.apache.flink.api.common.JobID jobId, String jobName, JobVertex... vertices)
    Constructs a new job graph with the given name, the given ExecutionConfig, the given jobId or a random one if null supplied, and the given job vertices.
-
Method Summary
void addJar(org.apache.flink.core.fs.Path jar)
    Adds the path of a JAR file required to run the job on a task manager.
void addJars(List<URL> jarFilesToAttach)
    Adds the given jar files to the JobGraph via addJar(org.apache.flink.core.fs.Path).
void addUserArtifact(String name, org.apache.flink.api.common.cache.DistributedCache.DistributedCacheEntry file)
    Adds the path of a custom file required to run the job on a task manager.
void addUserJarBlobKey(PermanentBlobKey key)
    Adds the BLOB referenced by the key to the JobGraph's dependencies.
void addVertex(JobVertex vertex)
    Adds a new task vertex to the job graph if it is not already included.
void enableApproximateLocalRecovery(boolean enabled)
JobVertex findVertexByID(JobVertexID id)
    Searches for a vertex with a matching ID and returns it.
JobCheckpointingSettings getCheckpointingSettings()
    Gets the settings for asynchronous snapshots.
List<URL> getClasspaths()
    Gets the classpath required for the job.
Set<CoLocationGroup> getCoLocationGroups()
    Returns all CoLocationGroup instances associated with this JobGraph.
long getInitialClientHeartbeatTimeout()
    Gets the initial client heartbeat timeout.
org.apache.flink.configuration.Configuration getJobConfiguration()
    Returns the configuration object for this job.
org.apache.flink.api.common.JobID getJobID()
    Returns the ID of the job.
List<org.apache.flink.core.execution.JobStatusHook> getJobStatusHooks()
JobType getJobType()
    Gets the type of the job.
int getMaximumParallelism()
    Gets the maximum parallelism of all operations in this job graph.
String getName()
    Returns the name assigned to the job graph.
int getNumberOfVertices()
    Returns the number of all vertices.
SavepointRestoreSettings getSavepointRestoreSettings()
    Returns the configured savepoint restore settings.
org.apache.flink.util.SerializedValue<org.apache.flink.api.common.ExecutionConfig> getSerializedExecutionConfig()
    Returns the serialized ExecutionConfig.
Set<SlotSharingGroup> getSlotSharingGroups()
Map<String,org.apache.flink.api.common.cache.DistributedCache.DistributedCacheEntry> getUserArtifacts()
    Gets the map of assigned user artifacts, keyed by artifact name.
List<PermanentBlobKey> getUserJarBlobKeys()
    Returns a list of BLOB keys referring to the JAR files required to run this job.
List<org.apache.flink.core.fs.Path> getUserJars()
    Gets the list of assigned user jar paths.
Iterable<JobVertex> getVertices()
    Returns an Iterable to iterate all vertices registered with the job graph.
JobVertex[] getVerticesAsArray()
    Returns an array of all job vertices that are registered with the job graph.
List<JobVertex> getVerticesSortedTopologicallyFromSources()
boolean hasUsercodeJarFiles()
    Checks whether the JobGraph has user code JAR files attached.
boolean isApproximateLocalRecoveryEnabled()
boolean isDynamic()
    Checks if the execution plan is dynamic.
boolean isEmpty()
    Checks if the execution plan is empty.
boolean isPartialResourceConfigured()
    Checks if partial resource configuration is specified.
void setClasspaths(List<URL> paths)
    Sets the classpaths required to run the job on a task manager.
void setDynamic(boolean dynamic)
void setExecutionConfig(org.apache.flink.api.common.ExecutionConfig executionConfig)
    Sets the execution config.
void setInitialClientHeartbeatTimeout(long initialClientHeartbeatTimeout)
void setJobConfiguration(org.apache.flink.configuration.Configuration jobConfiguration)
void setJobID(org.apache.flink.api.common.JobID jobID)
    Sets the ID of the job.
void setJobStatusHooks(List<org.apache.flink.core.execution.JobStatusHook> hooks)
void setJobType(JobType type)
void setSavepointRestoreSettings(SavepointRestoreSettings settings)
    Sets the savepoint restore settings.
void setSerializedExecutionConfig(org.apache.flink.util.SerializedValue<org.apache.flink.api.common.ExecutionConfig> serializedExecutionConfig)
void setSnapshotSettings(JobCheckpointingSettings settings)
    Sets the settings for asynchronous snapshots.
void setUserArtifactBlobKey(String entryName, PermanentBlobKey blobKey)
    Sets a user artifact blob key for a specified user artifact.
void setUserArtifactRemotePath(String entryName, String remotePath)
String toString()
void writeUserArtifactEntriesToConfiguration()
    Writes user artifact entries to the job configuration.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
-
Methods inherited from interface org.apache.flink.streaming.api.graph.ExecutionPlan
isCheckpointingEnabled
-
-
-
-
Constructor Detail
-
JobGraph
public JobGraph(String jobName)
Constructs a new job graph with the given name, the given ExecutionConfig, and a random job ID. The ExecutionConfig will be serialized and can't be modified afterwards.
- Parameters:
- jobName - The name of the job.
-
JobGraph
public JobGraph(@Nullable org.apache.flink.api.common.JobID jobId, String jobName)
Constructs a new job graph with the given job ID (or a random ID, if null is passed), the given name and the given execution configuration (see ExecutionConfig). The ExecutionConfig will be serialized and can't be modified afterwards.
- Parameters:
- jobId - The id of the job. A random ID is generated, if null is passed.
- jobName - The name of the job.
-
JobGraph
public JobGraph(@Nullable org.apache.flink.api.common.JobID jobId, String jobName, JobVertex... vertices)
Constructs a new job graph with the given name, the given ExecutionConfig, the given jobId or a random one if null supplied, and the given job vertices. The ExecutionConfig will be serialized and can't be modified afterwards.
- Parameters:
- jobId - The id of the job. A random ID is generated, if null is passed.
- jobName - The name of the job.
- vertices - The vertices to add to the graph.
-
-
Method Detail
-
getJobID
public org.apache.flink.api.common.JobID getJobID()
Returns the ID of the job.
- Specified by:
- getJobID in interface ExecutionPlan
- Returns:
- the ID of the job
-
setJobID
public void setJobID(org.apache.flink.api.common.JobID jobID)
Sets the ID of the job.
-
getName
public String getName()
Returns the name assigned to the job graph.
- Specified by:
- getName in interface ExecutionPlan
- Returns:
- the name assigned to the job graph
-
isPartialResourceConfigured
public boolean isPartialResourceConfigured()
Description copied from interface: ExecutionPlan
Checks if partial resource configuration is specified.
- Specified by:
- isPartialResourceConfigured in interface ExecutionPlan
- Returns:
- true if partial resource configuration is set; false otherwise
-
isEmpty
public boolean isEmpty()
Description copied from interface: ExecutionPlan
Checks if the execution plan is empty.
- Specified by:
- isEmpty in interface ExecutionPlan
- Returns:
- true if the plan is empty; false otherwise
-
setJobConfiguration
public void setJobConfiguration(org.apache.flink.configuration.Configuration jobConfiguration)
-
getJobConfiguration
public org.apache.flink.configuration.Configuration getJobConfiguration()
Returns the configuration object for this job. Job-wide parameters should be set into that configuration object.
- Specified by:
- getJobConfiguration in interface ExecutionPlan
- Returns:
- The configuration object for this job.
-
getSerializedExecutionConfig
public org.apache.flink.util.SerializedValue<org.apache.flink.api.common.ExecutionConfig> getSerializedExecutionConfig()
Returns the serialized ExecutionConfig.
- Specified by:
- getSerializedExecutionConfig in interface ExecutionPlan
- Returns:
- the serialized ExecutionConfig
-
setJobType
public void setJobType(JobType type)
-
getJobType
public JobType getJobType()
Description copied from interface: ExecutionPlan
Gets the type of the job.
- Specified by:
- getJobType in interface ExecutionPlan
- Returns:
- the job type
-
setDynamic
public void setDynamic(boolean dynamic)
-
isDynamic
public boolean isDynamic()
Description copied from interface: ExecutionPlan
Checks if the execution plan is dynamic.
- Specified by:
- isDynamic in interface ExecutionPlan
- Returns:
- true if the execution plan is dynamic; false otherwise
-
enableApproximateLocalRecovery
public void enableApproximateLocalRecovery(boolean enabled)
-
isApproximateLocalRecoveryEnabled
public boolean isApproximateLocalRecoveryEnabled()
-
setSavepointRestoreSettings
public void setSavepointRestoreSettings(SavepointRestoreSettings settings)
Sets the savepoint restore settings.
- Specified by:
- setSavepointRestoreSettings in interface ExecutionPlan
- Parameters:
- settings - The savepoint restore settings.
-
getSavepointRestoreSettings
public SavepointRestoreSettings getSavepointRestoreSettings()
Returns the configured savepoint restore settings.
- Specified by:
- getSavepointRestoreSettings in interface ExecutionPlan
- Returns:
- The configured savepoint restore settings.
-
setExecutionConfig
public void setExecutionConfig(org.apache.flink.api.common.ExecutionConfig executionConfig) throws IOException
Sets the execution config. This method eagerly serializes the ExecutionConfig for future RPC transport. Further modification of the referenced ExecutionConfig object will not affect this serialized copy.
- Parameters:
- executionConfig - The ExecutionConfig to be serialized.
- Throws:
- IOException - Thrown if the serialization of the ExecutionConfig fails
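The eager-serialization behavior can be illustrated with a minimal, JDK-only sketch. It does not use any Flink classes: EagerSnapshotDemo and DemoConfig are hypothetical stand-ins, and plain Java serialization is assumed in place of Flink's SerializedValue. Unlike setExecutionConfig, the sketch wraps IOException in an unchecked exception to keep the example short.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

/** Hypothetical holder that snapshots a config into bytes as soon as it is set. */
final class EagerSnapshotDemo {

    /** Hypothetical stand-in for a mutable config object; not Flink's ExecutionConfig. */
    static final class DemoConfig implements Serializable {
        private static final long serialVersionUID = 1L;
        int parallelism;

        DemoConfig(int parallelism) {
            this.parallelism = parallelism;
        }
    }

    private byte[] serializedConfig;

    /** Serializes the config immediately; later changes to the argument are not captured. */
    void setConfig(DemoConfig config) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(config);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        serializedConfig = bytes.toByteArray();
    }

    /** Deserializes a fresh copy from the bytes stored by setConfig. */
    DemoConfig readConfig() {
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(serializedConfig))) {
            return (DemoConfig) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        DemoConfig config = new DemoConfig(4);
        EagerSnapshotDemo holder = new EagerSnapshotDemo();
        holder.setConfig(config);

        config.parallelism = 16; // mutate the original after the snapshot was taken

        System.out.println(holder.readConfig().parallelism); // prints 4
    }
}
```

The practical consequence for callers is that any change made to the config object after the setter returns requires calling the setter again to take effect.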
-
setSerializedExecutionConfig
public void setSerializedExecutionConfig(org.apache.flink.util.SerializedValue<org.apache.flink.api.common.ExecutionConfig> serializedExecutionConfig)
-
addVertex
public void addVertex(JobVertex vertex)
Adds a new task vertex to the job graph if it is not already included.
- Parameters:
- vertex - the new task vertex to be added
-
getVertices
public Iterable<JobVertex> getVertices()
Returns an Iterable to iterate all vertices registered with the job graph.
- Returns:
- an Iterable to iterate all vertices registered with the job graph
-
getVerticesAsArray
public JobVertex[] getVerticesAsArray()
Returns an array of all job vertices that are registered with the job graph. The order in which the vertices appear in the array is not defined.
- Returns:
- an array of all job vertices that are registered with the job graph
-
getNumberOfVertices
public int getNumberOfVertices()
Returns the number of all vertices.
- Returns:
- The number of all vertices.
-
getSlotSharingGroups
public Set<SlotSharingGroup> getSlotSharingGroups()
-
getCoLocationGroups
public Set<CoLocationGroup> getCoLocationGroups()
Returns all CoLocationGroup instances associated with this JobGraph.
- Returns:
- The associated CoLocationGroup instances.
-
setSnapshotSettings
public void setSnapshotSettings(JobCheckpointingSettings settings)
Sets the settings for asynchronous snapshots. A value of null means that snapshotting is not enabled.
- Parameters:
- settings - The snapshot settings
-
getCheckpointingSettings
public JobCheckpointingSettings getCheckpointingSettings()
Gets the settings for asynchronous snapshots. This method returns null when checkpointing is not enabled.
- Specified by:
- getCheckpointingSettings in interface ExecutionPlan
- Returns:
- The snapshot settings
-
findVertexByID
public JobVertex findVertexByID(JobVertexID id)
Searches for a vertex with a matching ID and returns it.
- Parameters:
- id - the ID of the vertex to search for
- Returns:
- the vertex with the matching ID or null if no vertex with such ID could be found
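Because the lookup returns null rather than throwing when no vertex matches, callers have to handle the missing case themselves. The following JDK-only sketch shows one way to do that with Optional. VertexLookupDemo is a hypothetical stand-in that uses String IDs and names in place of JobVertexID and JobVertex; it does not call Flink.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical registry that mimics a "match or null" lookup contract. */
final class VertexLookupDemo {

    // Stand-in for the graph's vertices, keyed by ID (assumed representation).
    private final Map<String, String> verticesById = new LinkedHashMap<>();

    void register(String id, String name) {
        verticesById.put(id, name);
    }

    /** Returns the registered name for the ID, or null if the ID is unknown. */
    String findById(String id) {
        return verticesById.get(id);
    }

    /** Null-safe wrapper so callers cannot forget the "not found" case. */
    Optional<String> lookup(String id) {
        return Optional.ofNullable(findById(id));
    }

    public static void main(String[] args) {
        VertexLookupDemo graph = new VertexLookupDemo();
        graph.register("v1", "source");

        System.out.println(graph.lookup("v1").orElse("<no such vertex>"));
        System.out.println(graph.lookup("v2").orElse("<no such vertex>"));
    }
}
```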
-
setClasspaths
public void setClasspaths(List<URL> paths)
Sets the classpaths required to run the job on a task manager.
- Parameters:
- paths - paths of the directories/JAR files required to run the job on a task manager
-
getClasspaths
public List<URL> getClasspaths()
Description copied from interface: ExecutionPlan
Gets the classpath required for the job.
- Specified by:
- getClasspaths in interface ExecutionPlan
- Returns:
- a list of classpath URLs
-
getMaximumParallelism
public int getMaximumParallelism()
Gets the maximum parallelism of all operations in this job graph.
- Specified by:
- getMaximumParallelism in interface ExecutionPlan
- Returns:
- The maximum parallelism of this job graph
-
getVerticesSortedTopologicallyFromSources
public List<JobVertex> getVerticesSortedTopologicallyFromSources() throws org.apache.flink.api.common.InvalidProgramException
- Throws:
org.apache.flink.api.common.InvalidProgramException
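This method carries no description here, but its name indicates an ordering in which every vertex appears after all of its inputs, starting from the sources. The sketch below illustrates that kind of ordering with Kahn's algorithm on a plain adjacency map. It is not Flink's implementation: TopologicalOrderDemo, the String vertex names, and the use of IllegalStateException for a cyclic graph are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a sources-first topological ordering (Kahn's algorithm). */
final class TopologicalOrderDemo {

    /**
     * Orders vertices so that each one appears after all of its upstream vertices.
     * The map goes from a vertex to its downstream vertices.
     * Throws IllegalStateException if the graph contains a cycle.
     */
    static List<String> sortFromSources(Map<String, List<String>> downstream) {
        // Count how many inputs each vertex is still waiting for.
        Map<String, Integer> pendingInputs = new LinkedHashMap<>();
        for (String vertex : downstream.keySet()) {
            pendingInputs.put(vertex, 0);
        }
        for (List<String> targets : downstream.values()) {
            for (String target : targets) {
                pendingInputs.merge(target, 1, Integer::sum);
            }
        }

        // Sources are the vertices without any input.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : pendingInputs.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }

        List<String> sorted = new ArrayList<>();
        while (!ready.isEmpty()) {
            String vertex = ready.remove();
            sorted.add(vertex);
            for (String target : downstream.getOrDefault(vertex, Collections.emptyList())) {
                // A target becomes ready once its last pending input has been emitted.
                if (pendingInputs.merge(target, -1, Integer::sum) == 0) {
                    ready.add(target);
                }
            }
        }

        if (sorted.size() != pendingInputs.size()) {
            throw new IllegalStateException("graph contains a cycle");
        }
        return sorted;
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = new LinkedHashMap<>();
        graph.put("source", Arrays.asList("map", "sink"));
        graph.put("map", Arrays.asList("sink"));
        graph.put("sink", Collections.emptyList());

        System.out.println(sortFromSources(graph)); // prints [source, map, sink]
    }
}
```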
-
addJar
public void addJar(org.apache.flink.core.fs.Path jar)
Adds the path of a JAR file required to run the job on a task manager.
- Parameters:
- jar - path of the JAR file required to run the job on a task manager
-
addJars
public void addJars(List<URL> jarFilesToAttach)
Adds the given jar files to the JobGraph via addJar(org.apache.flink.core.fs.Path).
- Parameters:
- jarFilesToAttach - a list of the URLs of the jar files to attach to the jobgraph.
- Throws:
- RuntimeException - if a jar URL is not valid.
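The contract above (convert each URL, delegate to the single-jar method, fail with a RuntimeException on an invalid URL) can be sketched without Flink as follows. AttachJarsDemo is hypothetical, java.net.URI stands in for org.apache.flink.core.fs.Path, and the assumption that "not valid" means the URL cannot be converted to a URI is a guess for illustration only.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of attaching a list of jar URLs one by one. */
final class AttachJarsDemo {

    // java.net.URI is used here as a stand-in for Flink's Path type.
    private final List<URI> jars = new ArrayList<>();

    /** Single-jar variant that the bulk method delegates to. */
    void addJar(URI jar) {
        jars.add(jar);
    }

    /** Converts each URL and delegates; an unconvertible URL becomes a RuntimeException. */
    void addJars(List<URL> jarFilesToAttach) {
        for (URL jar : jarFilesToAttach) {
            try {
                addJar(jar.toURI());
            } catch (URISyntaxException e) {
                throw new RuntimeException("Invalid jar URL: " + jar, e);
            }
        }
    }

    List<URI> jars() {
        return jars;
    }

    /** Test helper that builds a URL without a checked exception. */
    @SuppressWarnings("deprecation") // new URL(String) is deprecated on recent JDKs
    static URL url(String spec) {
        try {
            return new URL(spec);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        AttachJarsDemo demo = new AttachJarsDemo();
        demo.addJars(Arrays.asList(url("file:/tmp/job.jar")));
        System.out.println(demo.jars());

        try {
            // The unescaped space makes the URL unconvertible to a URI.
            demo.addJars(Arrays.asList(url("file:/tmp/my job.jar")));
        } catch (RuntimeException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```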
-
getUserJars
public List<org.apache.flink.core.fs.Path> getUserJars()
Gets the list of assigned user jar paths.
- Specified by:
- getUserJars in interface ExecutionPlan
- Returns:
- The list of assigned user jar paths
-
addUserArtifact
public void addUserArtifact(String name, org.apache.flink.api.common.cache.DistributedCache.DistributedCacheEntry file)
Adds the path of a custom file required to run the job on a task manager.
- Parameters:
- name - a name under which this artifact will be accessible through DistributedCache
- file - path of a custom file required to run the job on a task manager
-
getUserArtifacts
public Map<String,org.apache.flink.api.common.cache.DistributedCache.DistributedCacheEntry> getUserArtifacts()
Gets the map of assigned user artifacts, keyed by artifact name.
- Specified by:
- getUserArtifacts in interface ExecutionPlan
- Returns:
- The map of assigned user artifacts
-
addUserJarBlobKey
public void addUserJarBlobKey(PermanentBlobKey key)
Adds the BLOB referenced by the key to the JobGraph's dependencies.
- Specified by:
- addUserJarBlobKey in interface ExecutionPlan
- Parameters:
- key - BLOB key of the JAR file required to run the job on a task manager
-
hasUsercodeJarFiles
public boolean hasUsercodeJarFiles()
Checks whether the JobGraph has user code JAR files attached.
- Returns:
- True, if the JobGraph has user code JAR files attached, false otherwise.
-
getUserJarBlobKeys
public List<PermanentBlobKey> getUserJarBlobKeys()
Returns a list of BLOB keys referring to the JAR files required to run this job.
- Specified by:
- getUserJarBlobKeys in interface ExecutionPlan
- Returns:
- list of BLOB keys referring to the JAR files required to run this job
-
setUserArtifactBlobKey
public void setUserArtifactBlobKey(String entryName, PermanentBlobKey blobKey) throws IOException
Description copied from interface: ExecutionPlan
Sets a user artifact blob key for a specified user artifact.
- Specified by:
- setUserArtifactBlobKey in interface ExecutionPlan
- Parameters:
- entryName - the name of the user artifact
- blobKey - the blob key corresponding to the user artifact
- Throws:
- IOException - if an error occurs during the operation
-
setUserArtifactRemotePath
public void setUserArtifactRemotePath(String entryName, String remotePath)
-
writeUserArtifactEntriesToConfiguration
public void writeUserArtifactEntriesToConfiguration()
Description copied from interface: ExecutionPlan
Writes user artifact entries to the job configuration.
- Specified by:
- writeUserArtifactEntriesToConfiguration in interface ExecutionPlan
-
setJobStatusHooks
public void setJobStatusHooks(List<org.apache.flink.core.execution.JobStatusHook> hooks)
-
getJobStatusHooks
public List<org.apache.flink.core.execution.JobStatusHook> getJobStatusHooks()
-
setInitialClientHeartbeatTimeout
public void setInitialClientHeartbeatTimeout(long initialClientHeartbeatTimeout)
-
getInitialClientHeartbeatTimeout
public long getInitialClientHeartbeatTimeout()
Description copied from interface: ExecutionPlan
Gets the initial client heartbeat timeout.
- Specified by:
- getInitialClientHeartbeatTimeout in interface ExecutionPlan
- Returns:
- the timeout duration in milliseconds
-
-