Class FileMergingSnapshotManagerBase
- java.lang.Object
-
- org.apache.flink.runtime.checkpoint.filemerging.FileMergingSnapshotManagerBase
-
- All Implemented Interfaces:
Closeable, AutoCloseable, FileMergingSnapshotManager
- Direct Known Subclasses:
AcrossCheckpointFileMergingSnapshotManager, WithinCheckpointFileMergingSnapshotManager
public abstract class FileMergingSnapshotManagerBase extends Object implements FileMergingSnapshotManager
Base implementation of FileMergingSnapshotManager.
-
-
Nested Class Summary
Nested Classes
protected static class FileMergingSnapshotManagerBase.DirectoryHandleWithReferenceTrack - This class wraps DirectoryStreamStateHandle with a reference count maintained by ongoing checkpoints.
-
Nested classes/interfaces inherited from interface org.apache.flink.runtime.checkpoint.filemerging.FileMergingSnapshotManager
FileMergingSnapshotManager.SpaceStat, FileMergingSnapshotManager.SubtaskKey
-
-
Field Summary
Fields
protected org.apache.flink.core.fs.Path checkpointDir
protected PhysicalFilePool.Type filePoolType - Type of physical file pool.
protected org.apache.flink.core.fs.FileSystem fs - The FileSystem that this manager works on.
protected Executor ioExecutor - The executor for I/O operations in this manager.
protected Object lock - Guard for initFileSystem(org.apache.flink.core.fs.FileSystem, org.apache.flink.core.fs.Path, org.apache.flink.core.fs.Path, org.apache.flink.core.fs.Path, int), restoreStateHandles(long, org.apache.flink.runtime.checkpoint.filemerging.FileMergingSnapshotManager.SubtaskKey, java.util.stream.Stream<org.apache.flink.runtime.state.filemerging.SegmentFileStateHandle>) and uploadedStates.
protected org.apache.flink.core.fs.Path managedExclusiveStateDir - The private state files are merged across subtasks, so there is only one directory for merged files within one TM per job.
protected FileMergingSnapshotManagerBase.DirectoryHandleWithReferenceTrack managedExclusiveStateDirHandle - The DirectoryStreamStateHandle with its ongoing checkpoint reference count for the private state directory, one for each TaskManager and job.
protected long maxPhysicalFileSize - Max size for a physical file.
protected float maxSpaceAmplification
protected FileMergingMetricGroup metricGroup - The metric group for file merging snapshot manager.
protected PhysicalFile.PhysicalFileDeleter physicalFileDeleter
protected org.apache.flink.core.fs.Path sharedStateDir
protected boolean shouldSyncAfterClosingLogicalFile - File-system dependent value.
protected FileMergingSnapshotManager.SpaceStat spaceStat - The current space statistic, updated on file creation/deletion.
protected org.apache.flink.core.fs.Path taskOwnedStateDir
protected TreeMap<Long,Set<LogicalFile>> uploadedStates
protected int writeBufferSize - The buffer size for writing files to the file system.
-
Constructor Summary
Constructors
FileMergingSnapshotManagerBase(String id, long maxFileSize, PhysicalFilePool.Type filePoolType, float maxSpaceAmplification, Executor ioExecutor, org.apache.flink.metrics.MetricGroup parentMetricGroup)
-
Method Summary
Methods
void close()
boolean couldReusePreviousStateHandle(StreamStateHandle stateHandle) - Check whether previous state handles could further be reused considering the space amplification.
FileMergingCheckpointStateOutputStream createCheckpointStateOutputStream(FileMergingSnapshotManager.SubtaskKey subtaskKey, long checkpointId, CheckpointedStateScope scope) - Create a new FileMergingCheckpointStateOutputStream.
protected LogicalFile createLogicalFile(PhysicalFile physicalFile, long startOffset, long length, FileMergingSnapshotManager.SubtaskKey subtaskKey) - Create a logical file on a physical file.
protected PhysicalFile createPhysicalFile(FileMergingSnapshotManager.SubtaskKey subtaskKey, CheckpointedStateScope scope) - Create a physical file in the right location (managed directory), which is specified by the scope of this checkpoint and the current subtask.
protected PhysicalFilePool createPhysicalPool() - Create physical pool by filePoolType.
protected void deletePhysicalFile(org.apache.flink.core.fs.Path filePath, long size) - Delete a physical file by given file path.
protected void discardCheckpoint(long checkpointId) - The callback which will be triggered when all subtasks have discarded the checkpoint (aborted or subsumed).
void discardSingleLogicalFile(LogicalFile logicalFile, long checkpointId)
protected org.apache.flink.core.fs.Path generatePhysicalFilePath(org.apache.flink.core.fs.Path dirPath) - Generate a file path for a physical file.
String getId()
LogicalFile getLogicalFile(LogicalFile.LogicalFileId fileId)
org.apache.flink.core.fs.Path getManagedDir(FileMergingSnapshotManager.SubtaskKey subtaskKey, CheckpointedStateScope scope) - Get the managed directory of the file-merging snapshot manager, created in FileMergingSnapshotManager.initFileSystem(org.apache.flink.core.fs.FileSystem, org.apache.flink.core.fs.Path, org.apache.flink.core.fs.Path, org.apache.flink.core.fs.Path, int) or FileMergingSnapshotManager.registerSubtaskForSharedStates(org.apache.flink.runtime.checkpoint.filemerging.FileMergingSnapshotManager.SubtaskKey).
DirectoryStreamStateHandle getManagedDirStateHandle(FileMergingSnapshotManager.SubtaskKey subtaskKey, CheckpointedStateScope scope) - Get the DirectoryStreamStateHandle of the managed directory, created in FileMergingSnapshotManager.initFileSystem(org.apache.flink.core.fs.FileSystem, org.apache.flink.core.fs.Path, org.apache.flink.core.fs.Path, org.apache.flink.core.fs.Path, int) or FileMergingSnapshotManager.registerSubtaskForSharedStates(org.apache.flink.runtime.checkpoint.filemerging.FileMergingSnapshotManager.SubtaskKey).
protected abstract PhysicalFile getOrCreatePhysicalFileForCheckpoint(FileMergingSnapshotManager.SubtaskKey subtaskKey, long checkpointId, CheckpointedStateScope scope) - Get a reused physical file or create one.
void initFileSystem(org.apache.flink.core.fs.FileSystem fileSystem, org.apache.flink.core.fs.Path checkpointBaseDir, org.apache.flink.core.fs.Path sharedStateDir, org.apache.flink.core.fs.Path taskOwnedStateDir, int writeBufferSize) - Initialize the file system, recording the checkpoint path the manager should work with.
void notifyCheckpointAborted(FileMergingSnapshotManager.SubtaskKey subtaskKey, long checkpointId) - This method is called as a notification once a distributed checkpoint has been aborted.
void notifyCheckpointComplete(FileMergingSnapshotManager.SubtaskKey subtaskKey, long checkpointId) - Notifies the manager that the checkpoint with the given checkpointId completed and was committed.
void notifyCheckpointStart(FileMergingSnapshotManager.SubtaskKey subtaskKey, long checkpointId) - SubtaskCheckpointCoordinatorImpl uses this method to let the file merging manager know that an ongoing checkpoint may reference the managed dirs.
void notifyCheckpointSubsumed(FileMergingSnapshotManager.SubtaskKey subtaskKey, long checkpointId) - This method is called as a notification once a distributed checkpoint has been subsumed.
void registerSubtaskForSharedStates(FileMergingSnapshotManager.SubtaskKey subtaskKey) - Register a subtask and create the managed directory for shared states.
void restoreStateHandles(long checkpointId, FileMergingSnapshotManager.SubtaskKey subtaskKey, Stream<SegmentFileStateHandle> stateHandles) - Restore and re-register the SegmentFileStateHandles into FileMergingSnapshotManager.
protected abstract void returnPhysicalFileForNextReuse(FileMergingSnapshotManager.SubtaskKey subtaskKey, long checkpointId, PhysicalFile physicalFile) - Try to return an existing physical file to the manager for next reuse.
void reusePreviousStateHandle(long checkpointId, Collection<? extends StreamStateHandle> stateHandles) - A callback method which is called when previous state handles are reused by following checkpoint(s).
void unregisterSubtask(FileMergingSnapshotManager.SubtaskKey subtaskKey) - Unregister a subtask.
-
-
-
Field Detail
-
ioExecutor
protected final Executor ioExecutor
The executor for I/O operations in this manager.
-
lock
protected final Object lock
Guard for initFileSystem(org.apache.flink.core.fs.FileSystem, org.apache.flink.core.fs.Path, org.apache.flink.core.fs.Path, org.apache.flink.core.fs.Path, int), restoreStateHandles(long, org.apache.flink.runtime.checkpoint.filemerging.FileMergingSnapshotManager.SubtaskKey, java.util.stream.Stream<org.apache.flink.runtime.state.filemerging.SegmentFileStateHandle>) and uploadedStates.
-
uploadedStates
protected TreeMap<Long,Set<LogicalFile>> uploadedStates
-
fs
protected org.apache.flink.core.fs.FileSystem fs
The FileSystem that this manager works on.
-
checkpointDir
protected org.apache.flink.core.fs.Path checkpointDir
-
sharedStateDir
protected org.apache.flink.core.fs.Path sharedStateDir
-
taskOwnedStateDir
protected org.apache.flink.core.fs.Path taskOwnedStateDir
-
writeBufferSize
protected int writeBufferSize
The buffer size for writing files to the file system.
-
shouldSyncAfterClosingLogicalFile
protected boolean shouldSyncAfterClosingLogicalFile
File-system dependent value. Marks whether the file system this manager runs on needs a sync for visibility. If true, a file sync is performed after writing each segment.
-
maxPhysicalFileSize
protected long maxPhysicalFileSize
Max size for a physical file.
-
filePoolType
protected PhysicalFilePool.Type filePoolType
Type of physical file pool.
-
maxSpaceAmplification
protected final float maxSpaceAmplification
-
physicalFileDeleter
protected PhysicalFile.PhysicalFileDeleter physicalFileDeleter
-
managedExclusiveStateDir
protected org.apache.flink.core.fs.Path managedExclusiveStateDir
The private state files are merged across subtasks, so there is only one directory for merged files within one TM per job.
-
managedExclusiveStateDirHandle
protected FileMergingSnapshotManagerBase.DirectoryHandleWithReferenceTrack managedExclusiveStateDirHandle
The DirectoryStreamStateHandle with its ongoing checkpoint reference count for the private state directory, one for each TaskManager and job.
-
spaceStat
protected FileMergingSnapshotManager.SpaceStat spaceStat
The current space statistic, updated on file creation/deletion.
-
metricGroup
protected FileMergingMetricGroup metricGroup
The metric group for file merging snapshot manager.
-
-
Constructor Detail
-
FileMergingSnapshotManagerBase
public FileMergingSnapshotManagerBase(String id, long maxFileSize, PhysicalFilePool.Type filePoolType, float maxSpaceAmplification, Executor ioExecutor, org.apache.flink.metrics.MetricGroup parentMetricGroup)
-
-
Method Detail
-
initFileSystem
public void initFileSystem(org.apache.flink.core.fs.FileSystem fileSystem, org.apache.flink.core.fs.Path checkpointBaseDir, org.apache.flink.core.fs.Path sharedStateDir, org.apache.flink.core.fs.Path taskOwnedStateDir, int writeBufferSize) throws IllegalArgumentException
Description copied from interface: FileMergingSnapshotManager
Initialize the file system, recording the checkpoint path the manager should work with.
The layout of the checkpoint directory:
 /user-defined-checkpoint-dir
     /{job-id} (checkpointBaseDir)
         |
         + --shared/
             |
             + --subtask-1/
                 + -- merged shared state files
             + --subtask-2/
                 + -- merged shared state files
         + --taskowned/
             + -- merged private state files
         + --chk-1/
         + --chk-2/
         + --chk-3/
The directories are initialized in this method rather than in the constructor because the FileMergingSnapshotManager itself belongs to the TaskStateManager, which is initialized when receiving a task, while the base directories for checkpoints are created by FsCheckpointStorageAccess when the state backend initializes per subtask. After the checkpoint directories are initialized, the managed subdirectories are initialized here.
Note: This method may be called several times. The implementation should ensure idempotency and throw IllegalArgumentException when any of the paths in the parameters changes across calls.
- Specified by:
initFileSystem in interface FileMergingSnapshotManager
- Parameters:
fileSystem - The filesystem to write to.
checkpointBaseDir - The base directory for checkpoints.
sharedStateDir - The directory for shared checkpoint data.
taskOwnedStateDir - The name of the directory for state not owned/released by the master, but by the TaskManagers.
writeBufferSize - The buffer size for writing files to the file system.
- Throws:
IllegalArgumentException - thrown if these three paths are not deterministic across calls.
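The idempotency requirement for initFileSystem can be illustrated with a minimal sketch. It is not the Flink implementation: IdempotentInitSketch and its initCount counter are hypothetical names, and java.nio.file.Path stands in for org.apache.flink.core.fs.Path so that the example compiles without Flink on the classpath.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Objects;

// Hypothetical stand-in, not the Flink class. It only models the
// "repeatable call, fixed paths" contract described for initFileSystem.
class IdempotentInitSketch {
    private final Object lock = new Object();
    private Path checkpointDir;
    private Path sharedStateDir;
    private Path taskOwnedStateDir;
    private int initCount = 0; // number of calls that actually initialized state

    void initFileSystem(Path checkpointBaseDir, Path sharedDir, Path taskOwnedDir) {
        Objects.requireNonNull(checkpointBaseDir);
        Objects.requireNonNull(sharedDir);
        Objects.requireNonNull(taskOwnedDir);
        synchronized (lock) {
            if (checkpointDir != null) {
                // Repeated call: accept it only if all three paths are unchanged.
                if (!checkpointDir.equals(checkpointBaseDir)
                        || !sharedStateDir.equals(sharedDir)
                        || !taskOwnedStateDir.equals(taskOwnedDir)) {
                    throw new IllegalArgumentException(
                            "Checkpoint paths must not change across calls");
                }
                return;
            }
            checkpointDir = checkpointBaseDir;
            sharedStateDir = sharedDir;
            taskOwnedStateDir = taskOwnedDir;
            initCount++;
        }
    }

    int initCount() {
        synchronized (lock) {
            return initCount;
        }
    }

    public static void main(String[] args) {
        IdempotentInitSketch manager = new IdempotentInitSketch();
        Path base = Paths.get("/checkpoints/job");
        manager.initFileSystem(base, base.resolve("shared"), base.resolve("taskowned"));
        // Same arguments again: silently accepted, nothing is re-initialized.
        manager.initFileSystem(base, base.resolve("shared"), base.resolve("taskowned"));
        System.out.println("initializations: " + manager.initCount());
    }
}
```

The check and the assignment happen under one lock, mirroring the role the lock field is documented to play for initFileSystem.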
-
registerSubtaskForSharedStates
public void registerSubtaskForSharedStates(FileMergingSnapshotManager.SubtaskKey subtaskKey)
Description copied from interface: FileMergingSnapshotManager
Register a subtask and create the managed directory for shared states.
- Specified by:
registerSubtaskForSharedStates in interface FileMergingSnapshotManager
- Parameters:
subtaskKey - the subtask key identifying a subtask.
- See Also:
initFileSystem for layout information.
-
unregisterSubtask
public void unregisterSubtask(FileMergingSnapshotManager.SubtaskKey subtaskKey)
Description copied from interface: FileMergingSnapshotManager
Unregister a subtask.
- Specified by:
unregisterSubtask in interface FileMergingSnapshotManager
- Parameters:
subtaskKey - the subtask key identifying a subtask.
-
createLogicalFile
protected LogicalFile createLogicalFile(@Nonnull PhysicalFile physicalFile, long startOffset, long length, @Nonnull FileMergingSnapshotManager.SubtaskKey subtaskKey)
Create a logical file on a physical file.
- Parameters:
physicalFile - the underlying physical file.
startOffset - the offset in the physical file that the logical file starts from.
length - the length of the logical file.
subtaskKey - the id of the subtask that the logical file belongs to.
- Returns:
the created logical file.
-
createPhysicalFile
@Nonnull protected PhysicalFile createPhysicalFile(FileMergingSnapshotManager.SubtaskKey subtaskKey, CheckpointedStateScope scope) throws IOException
Create a physical file in the right location (managed directory), which is specified by the scope of this checkpoint and the current subtask.
- Parameters:
subtaskKey - the FileMergingSnapshotManager.SubtaskKey of the current subtask.
scope - the scope of the checkpoint.
- Returns:
the created physical file.
- Throws:
IOException - if anything goes wrong with the file system.
-
createCheckpointStateOutputStream
public FileMergingCheckpointStateOutputStream createCheckpointStateOutputStream(FileMergingSnapshotManager.SubtaskKey subtaskKey, long checkpointId, CheckpointedStateScope scope)
Description copied from interface: FileMergingSnapshotManager
Create a new FileMergingCheckpointStateOutputStream. According to the file merging strategy, the streams returned by multiple calls to this function may share the same underlying physical file, and each stream writes to a segment of the physical file.
- Specified by:
createCheckpointStateOutputStream in interface FileMergingSnapshotManager
- Parameters:
subtaskKey - The subtask key identifying the subtask.
checkpointId - ID of the checkpoint.
scope - The state's scope, whether it is exclusive or shared.
- Returns:
An output stream that writes state for the given checkpoint.
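The idea that several writers share one physical file, each owning a segment identified by a start offset and a length, can be sketched as follows. This is an illustration only: SegmentSketch and Segment are hypothetical names, and an in-memory ByteArrayOutputStream stands in for the physical file rather than a real FileMergingCheckpointStateOutputStream.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch: one in-memory "physical file" shared by several writers.
class SegmentSketch {
    // Stand-in for a logical file: a (startOffset, length) view on the physical file.
    static final class Segment {
        final long startOffset;
        final long length;

        Segment(long startOffset, long length) {
            this.startOffset = startOffset;
            this.length = length;
        }
    }

    private final ByteArrayOutputStream physical = new ByteArrayOutputStream();

    // Appends the data and returns the segment describing where it landed.
    synchronized Segment append(byte[] data) {
        long start = physical.size();
        physical.write(data, 0, data.length);
        return new Segment(start, data.length);
    }

    // Reads back exactly the bytes covered by one segment.
    synchronized byte[] read(Segment segment) {
        byte[] all = physical.toByteArray();
        int from = (int) segment.startOffset;
        int to = (int) (segment.startOffset + segment.length);
        return Arrays.copyOfRange(all, from, to);
    }

    public static void main(String[] args) {
        SegmentSketch file = new SegmentSketch();
        Segment first = file.append("alpha".getBytes(StandardCharsets.UTF_8));
        Segment second = file.append("beta".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(file.read(first), StandardCharsets.UTF_8));
        System.out.println(new String(file.read(second), StandardCharsets.UTF_8));
    }
}
```

Each state handle then only needs to remember the file plus its own offset and length, which is what allows many small states to live in a few larger files.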
-
generatePhysicalFilePath
protected org.apache.flink.core.fs.Path generatePhysicalFilePath(org.apache.flink.core.fs.Path dirPath)
Generate a file path for a physical file.
- Parameters:
dirPath - the parent directory path for the physical file.
- Returns:
the generated file path for a physical file.
-
deletePhysicalFile
protected final void deletePhysicalFile(org.apache.flink.core.fs.Path filePath, long size)
Delete a physical file by the given file path. Uses the I/O executor to do the deletion.
- Parameters:
filePath - the given file path to delete.
-
createPhysicalPool
protected final PhysicalFilePool createPhysicalPool()
Create physical pool by filePoolType.
- Returns:
physical file pool.
-
getOrCreatePhysicalFileForCheckpoint
@Nonnull protected abstract PhysicalFile getOrCreatePhysicalFileForCheckpoint(FileMergingSnapshotManager.SubtaskKey subtaskKey, long checkpointId, CheckpointedStateScope scope) throws IOException
Get a reused physical file or create one. This will be called in checkpoint output stream creation logic.
Basic logic of file reusing: whenever a physical file is needed, this method is called with necessary information provided for acquiring a file. The file will not be reused until it is written and returned to the reused pool by calling returnPhysicalFileForNextReuse(org.apache.flink.runtime.checkpoint.filemerging.FileMergingSnapshotManager.SubtaskKey, long, org.apache.flink.runtime.checkpoint.filemerging.PhysicalFile).
- Parameters:
subtaskKey - the subtask key for the caller
checkpointId - the checkpoint id
scope - checkpoint scope
- Returns:
the requested physical file.
- Throws:
IOException - thrown if anything goes wrong with file system.
-
returnPhysicalFileForNextReuse
protected abstract void returnPhysicalFileForNextReuse(FileMergingSnapshotManager.SubtaskKey subtaskKey, long checkpointId, PhysicalFile physicalFile) throws IOException
Try to return an existing physical file to the manager for next reuse. If this physical file is no longer needed (for reusing), it will be closed.
For the basic logic of file reusing, see getOrCreatePhysicalFileForCheckpoint(org.apache.flink.runtime.checkpoint.filemerging.FileMergingSnapshotManager.SubtaskKey, long, org.apache.flink.runtime.state.CheckpointedStateScope).
- Parameters:
subtaskKey - the subtask key for the caller
checkpointId - in which checkpoint this physical file is requested.
physicalFile - the physical file being returned
- Throws:
IOException - thrown if anything goes wrong with file system.
- See Also:
getOrCreatePhysicalFileForCheckpoint(SubtaskKey, long, CheckpointedStateScope)
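The acquire/return cycle described for getOrCreatePhysicalFileForCheckpoint and returnPhysicalFileForNextReuse can be sketched with a small pool. Everything here is hypothetical (ReusePoolSketch, FileSlot, giveBack), and the rule that a file stops being reusable once it reaches a size cap is an assumption made for illustration; the concrete subclasses decide the real policy.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of "get or create, write, then return for next reuse".
class ReusePoolSketch {
    // Stand-in for a physical file.
    static final class FileSlot {
        final int id;
        long size = 0;       // bytes written so far
        boolean open = true; // false once the file is closed for good

        FileSlot(int id) {
            this.id = id;
        }
    }

    private final Deque<FileSlot> reusable = new ArrayDeque<>();
    private final long maxFileSize; // assumed cap, analogous to maxPhysicalFileSize
    private int nextId = 0;

    ReusePoolSketch(long maxFileSize) {
        this.maxFileSize = maxFileSize;
    }

    // Hands out a previously returned file if one exists, otherwise creates one.
    // A file that is checked out is never handed to a second caller.
    synchronized FileSlot getOrCreate() {
        FileSlot slot = reusable.poll();
        return slot != null ? slot : new FileSlot(nextId++);
    }

    // Takes a written file back. It is kept for reuse only while it is below
    // the size cap; otherwise it is closed.
    synchronized void giveBack(FileSlot slot) {
        if (slot.size < maxFileSize) {
            reusable.offer(slot);
        } else {
            slot.open = false;
        }
    }

    public static void main(String[] args) {
        ReusePoolSketch pool = new ReusePoolSketch(10);
        FileSlot first = pool.getOrCreate();
        first.size += 4; // pretend a segment of 4 bytes was written
        pool.giveBack(first);
        FileSlot again = pool.getOrCreate();
        System.out.println("reused same file: " + (again == first));
    }
}
```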
-
discardCheckpoint
protected void discardCheckpoint(long checkpointId) throws IOException
The callback which will be triggered when all subtasks have discarded the checkpoint (aborted or subsumed).
- Parameters:
checkpointId - the discarded checkpoint id.
- Throws:
IOException - if anything goes wrong with file system.
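One way to picture "triggered when all subtasks discarded" is a per-checkpoint set of subtasks that still hold the checkpoint, with the callback firing when the last one lets go. The sketch below is an assumption about the bookkeeping, not a description of the Flink code; DiscardTrackerSketch, register and notifyDiscarded are hypothetical names, and subtasks are identified by plain strings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: fire a discard callback once every subtask that
// participated in a checkpoint has reported it as aborted or subsumed.
class DiscardTrackerSketch {
    private final Map<Long, Set<String>> pending = new HashMap<>();
    private final List<Long> discarded = new ArrayList<>();

    // Records that a subtask wrote state for the given checkpoint.
    synchronized void register(long checkpointId, String subtask) {
        pending.computeIfAbsent(checkpointId, id -> new HashSet<>()).add(subtask);
    }

    // Called on abort or subsume; the last subtask triggers the callback.
    synchronized void notifyDiscarded(long checkpointId, String subtask) {
        Set<String> subtasks = pending.get(checkpointId);
        if (subtasks == null || !subtasks.remove(subtask)) {
            return; // unknown checkpoint or duplicate notification
        }
        if (subtasks.isEmpty()) {
            pending.remove(checkpointId);
            discardCheckpoint(checkpointId);
        }
    }

    // Stand-in for the real callback; here it only records the id.
    private void discardCheckpoint(long checkpointId) {
        discarded.add(checkpointId);
    }

    synchronized List<Long> discarded() {
        return new ArrayList<>(discarded);
    }

    public static void main(String[] args) {
        DiscardTrackerSketch tracker = new DiscardTrackerSketch();
        tracker.register(1L, "subtask-1");
        tracker.register(1L, "subtask-2");
        tracker.notifyDiscarded(1L, "subtask-1");
        tracker.notifyDiscarded(1L, "subtask-2");
        System.out.println("discarded checkpoints: " + tracker.discarded());
    }
}
```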
-
notifyCheckpointStart
public void notifyCheckpointStart(FileMergingSnapshotManager.SubtaskKey subtaskKey, long checkpointId)
SubtaskCheckpointCoordinatorImpl uses this method to let the file merging manager know that an ongoing checkpoint may reference the managed dirs.
- Specified by:
notifyCheckpointStart in interface FileMergingSnapshotManager
- Parameters:
subtaskKey - the subtask key identifying the subtask.
checkpointId - The ID of the checkpoint that has been started.
-
notifyCheckpointComplete
public void notifyCheckpointComplete(FileMergingSnapshotManager.SubtaskKey subtaskKey, long checkpointId) throws Exception
Description copied from interface: FileMergingSnapshotManager
Notifies the manager that the checkpoint with the given checkpointId completed and was committed.
- Specified by:
notifyCheckpointComplete in interface FileMergingSnapshotManager
- Parameters:
subtaskKey - the subtask key identifying the subtask.
checkpointId - The ID of the checkpoint that has been completed.
- Throws:
Exception - thrown if anything goes wrong with the listener.
-
notifyCheckpointAborted
public void notifyCheckpointAborted(FileMergingSnapshotManager.SubtaskKey subtaskKey, long checkpointId) throws Exception
Description copied from interface: FileMergingSnapshotManager
This method is called as a notification once a distributed checkpoint has been aborted.
- Specified by:
notifyCheckpointAborted in interface FileMergingSnapshotManager
- Parameters:
subtaskKey - the subtask key identifying the subtask.
checkpointId - The ID of the checkpoint that has been aborted.
- Throws:
Exception - thrown if anything goes wrong with the listener.
-
notifyCheckpointSubsumed
public void notifyCheckpointSubsumed(FileMergingSnapshotManager.SubtaskKey subtaskKey, long checkpointId) throws Exception
Description copied from interface: FileMergingSnapshotManager
This method is called as a notification once a distributed checkpoint has been subsumed.
- Specified by:
notifyCheckpointSubsumed in interface FileMergingSnapshotManager
- Parameters:
subtaskKey - the subtask key identifying the subtask.
checkpointId - The ID of the checkpoint that has been subsumed.
- Throws:
Exception - thrown if anything goes wrong with the listener.
-
reusePreviousStateHandle
public void reusePreviousStateHandle(long checkpointId, Collection<? extends StreamStateHandle> stateHandles)
Description copied from interface: FileMergingSnapshotManager
A callback method which is called when previous state handles are reused by following checkpoint(s).
- Specified by:
reusePreviousStateHandle in interface FileMergingSnapshotManager
- Parameters:
checkpointId - the checkpoint that reuses the handles.
stateHandles - the handles to be reused.
-
couldReusePreviousStateHandle
public boolean couldReusePreviousStateHandle(StreamStateHandle stateHandle)
Description copied from interface: FileMergingSnapshotManager
Check whether previous state handles could further be reused considering the space amplification.
- Specified by:
couldReusePreviousStateHandle in interface FileMergingSnapshotManager
- Parameters:
stateHandle - the handle to be reused.
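The page does not define space amplification, so the following sketch rests on an assumption: amplification is taken as the bytes occupied by physical files divided by the bytes still referenced by logical files, and reuse is allowed while that ratio stays at or below maxSpaceAmplification. SpaceAmplificationSketch and both of its methods are hypothetical and are not part of the Flink API.

```java
// Hypothetical sketch of a space-amplification check. The ratio definition
// (physical bytes / logical bytes) is an assumption, not taken from Flink.
class SpaceAmplificationSketch {
    // Ratio of bytes on disk to bytes that are still referenced.
    static double amplification(long physicalBytes, long logicalBytes) {
        if (logicalBytes <= 0) {
            // Nothing referenced: any occupied space is pure overhead.
            return physicalBytes > 0 ? Double.POSITIVE_INFINITY : 1.0;
        }
        return (double) physicalBytes / logicalBytes;
    }

    // Reuse of old handles keeps old physical files alive, so it is only
    // allowed while the ratio stays within the configured bound.
    static boolean couldReuse(long physicalBytes, long logicalBytes, float maxSpaceAmplification) {
        return amplification(physicalBytes, logicalBytes) <= maxSpaceAmplification;
    }

    public static void main(String[] args) {
        System.out.println(couldReuse(150, 100, 2.0f));
        System.out.println(couldReuse(300, 100, 2.0f));
    }
}
```

Under this assumed definition, refusing reuse forces state to be rewritten into fresh files, which lets mostly-dead physical files be deleted sooner.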
-
discardSingleLogicalFile
public void discardSingleLogicalFile(LogicalFile logicalFile, long checkpointId) throws IOException
- Throws:
IOException
-
getManagedDir
public org.apache.flink.core.fs.Path getManagedDir(FileMergingSnapshotManager.SubtaskKey subtaskKey, CheckpointedStateScope scope)
Description copied from interface: FileMergingSnapshotManager
Get the managed directory of the file-merging snapshot manager, created in FileMergingSnapshotManager.initFileSystem(org.apache.flink.core.fs.FileSystem, org.apache.flink.core.fs.Path, org.apache.flink.core.fs.Path, org.apache.flink.core.fs.Path, int) or FileMergingSnapshotManager.registerSubtaskForSharedStates(org.apache.flink.runtime.checkpoint.filemerging.FileMergingSnapshotManager.SubtaskKey).
- Specified by:
getManagedDir in interface FileMergingSnapshotManager
- Parameters:
subtaskKey - the subtask key identifying the subtask.
scope - the checkpoint scope.
- Returns:
the managed directory for one subtask in specified checkpoint scope.
-
getManagedDirStateHandle
public DirectoryStreamStateHandle getManagedDirStateHandle(FileMergingSnapshotManager.SubtaskKey subtaskKey, CheckpointedStateScope scope)
Description copied from interface: FileMergingSnapshotManager
Get the DirectoryStreamStateHandle of the managed directory, created in FileMergingSnapshotManager.initFileSystem(org.apache.flink.core.fs.FileSystem, org.apache.flink.core.fs.Path, org.apache.flink.core.fs.Path, org.apache.flink.core.fs.Path, int) or FileMergingSnapshotManager.registerSubtaskForSharedStates(org.apache.flink.runtime.checkpoint.filemerging.FileMergingSnapshotManager.SubtaskKey).
- Specified by:
getManagedDirStateHandle in interface FileMergingSnapshotManager
- Parameters:
subtaskKey - the subtask key identifying the subtask.
scope - the checkpoint scope.
- Returns:
the DirectoryStreamStateHandle for one subtask in specified checkpoint scope.
-
close
public void close() throws IOException
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface Closeable
- Throws:
IOException
-
getId
@VisibleForTesting public String getId()
-
restoreStateHandles
public void restoreStateHandles(long checkpointId, FileMergingSnapshotManager.SubtaskKey subtaskKey, Stream<SegmentFileStateHandle> stateHandles)
Description copied from interface: FileMergingSnapshotManager
Restore and re-register the SegmentFileStateHandles into FileMergingSnapshotManager.
- Specified by:
restoreStateHandles in interface FileMergingSnapshotManager
- Parameters:
checkpointId - the restored checkpoint id.
subtaskKey - the subtask key identifying the subtask.
stateHandles - the restored segment file handles.
-
getLogicalFile
@VisibleForTesting public LogicalFile getLogicalFile(LogicalFile.LogicalFileId fileId)
-
-