Class PermanentBlobCache
- java.lang.Object
-
- org.apache.flink.runtime.blob.AbstractBlobCache
-
- org.apache.flink.runtime.blob.PermanentBlobCache
-
- All Implemented Interfaces:
Closeable, AutoCloseable, JobPermanentBlobService, PermanentBlobService
public class PermanentBlobCache extends AbstractBlobCache implements JobPermanentBlobService
Provides a cache for permanent BLOB files, including per-job ref-counting and a staged cleanup.

When requesting BLOBs via getFile(JobID, PermanentBlobKey), the cache will first attempt to serve the file from its local cache. Only if the local cache does not contain the desired BLOB will it try to download it from a distributed HA file system (if available) or the BLOB server.

If files for a job are not needed any more, they will enter a staged, i.e. deferred, cleanup. Files may thus still be accessible upon recovery and do not need to be re-downloaded.
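The lookup order described above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not Flink's implementation: `BlobSource` and `TieredBlobLookup` are hypothetical names, BLOBs are held in memory as byte arrays rather than files, and keys are plain strings instead of `PermanentBlobKey`.

```java
import java.io.FileNotFoundException;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for a remote place BLOBs can be fetched from; not a Flink type.
interface BlobSource {
    Optional<byte[]> fetch(String key);
}

// Illustrative model of the documented lookup order: local cache, then HA store, then BLOB server.
final class TieredBlobLookup {
    private final Map<String, byte[]> localCache = new HashMap<>();
    private final BlobSource haStore;    // may be null when no HA store is available
    private final BlobSource blobServer;

    TieredBlobLookup(BlobSource haStore, BlobSource blobServer) {
        this.haStore = haStore;
        this.blobServer = blobServer;
    }

    byte[] get(String key) throws FileNotFoundException {
        // 1. Serve from the local cache if possible.
        byte[] cached = localCache.get(key);
        if (cached != null) {
            return cached;
        }
        // 2. Otherwise try the HA store (if any), then 3. the BLOB server.
        Optional<byte[]> remote = haStore == null ? Optional.empty() : haStore.fetch(key);
        if (!remote.isPresent()) {
            remote = blobServer.fetch(key);
        }
        byte[] data = remote.orElseThrow(() -> new FileNotFoundException("BLOB not found: " + key));
        // Keep a local copy so that later requests need no download.
        localCache.put(key, data);
        return data;
    }
}
```

In this model a second `get` for the same key is answered from `localCache` without contacting either remote source, which is the effect the class description attributes to the local cache.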
-
-
Field Summary
-
Fields inherited from class org.apache.flink.runtime.blob.AbstractBlobCache
blobClientConfig, blobView, log, numFetchRetries, readWriteLock, serverAddress, shutdownHook, shutdownRequested, storageDir, tempFileCounter
-
-
Constructor Summary
Constructors
Constructor and Description
PermanentBlobCache(org.apache.flink.configuration.Configuration blobClientConfig, File storageDir, BlobView blobView, InetSocketAddress serverAddress)
PermanentBlobCache(org.apache.flink.configuration.Configuration blobClientConfig, File storageDir, BlobView blobView, InetSocketAddress serverAddress, BlobCacheSizeTracker blobCacheSizeTracker)
PermanentBlobCache(org.apache.flink.configuration.Configuration blobClientConfig, org.apache.flink.util.Reference<File> storageDir, BlobView blobView, InetSocketAddress serverAddress)
    Instantiates a new cache for permanent BLOBs which are also available in an HA store.
PermanentBlobCache(org.apache.flink.configuration.Configuration blobClientConfig, org.apache.flink.util.Reference<File> storageDir, BlobView blobView, InetSocketAddress serverAddress, BlobCacheSizeTracker blobCacheSizeTracker)
-
Method Summary
Modifier and Type, Method and Description
protected void cancelCleanupTask()
    Cancels any cleanup task that subclasses may be executing.
File getFile(org.apache.flink.api.common.JobID jobId, PermanentBlobKey key)
    Returns the path to a local copy of the file associated with the provided job ID and blob key.
int getNumberOfReferenceHolders(org.apache.flink.api.common.JobID jobId)
File getStorageLocation(org.apache.flink.api.common.JobID jobId, BlobKey key)
    Returns a file handle to the file associated with the given blob key on the blob server.
byte[] readFile(org.apache.flink.api.common.JobID jobId, PermanentBlobKey blobKey)
    Returns the content of the file for the BLOB with the provided job ID and blob key.
void registerJob(org.apache.flink.api.common.JobID jobId)
    Registers use of job-related BLOBs.
void releaseJob(org.apache.flink.api.common.JobID jobId)
    Unregisters use of job-related BLOBs and allows them to be released.
-
Methods inherited from class org.apache.flink.runtime.blob.AbstractBlobCache
close, getFileInternal, getPort, getStorageDir, setBlobServerAddress
-
-
-
-
Constructor Detail
-
PermanentBlobCache
@VisibleForTesting
public PermanentBlobCache(org.apache.flink.configuration.Configuration blobClientConfig, File storageDir, BlobView blobView, @Nullable InetSocketAddress serverAddress) throws IOException
Throws:
IOException
-
PermanentBlobCache
@VisibleForTesting
public PermanentBlobCache(org.apache.flink.configuration.Configuration blobClientConfig, File storageDir, BlobView blobView, @Nullable InetSocketAddress serverAddress, BlobCacheSizeTracker blobCacheSizeTracker) throws IOException
Throws:
IOException
-
PermanentBlobCache
public PermanentBlobCache(org.apache.flink.configuration.Configuration blobClientConfig, org.apache.flink.util.Reference<File> storageDir, BlobView blobView, @Nullable InetSocketAddress serverAddress) throws IOException
Instantiates a new cache for permanent BLOBs which are also available in an HA store.
Parameters:
blobClientConfig - global configuration
storageDir - storage directory for the cached blobs
blobView - (distributed) HA blob store file system to retrieve files from first
serverAddress - address of the BlobServer to use for fetching files from, or null if none yet
Throws:
IOException - thrown if the (local or distributed) file storage cannot be created or is not usable
-
PermanentBlobCache
@VisibleForTesting
public PermanentBlobCache(org.apache.flink.configuration.Configuration blobClientConfig, org.apache.flink.util.Reference<File> storageDir, BlobView blobView, @Nullable InetSocketAddress serverAddress, BlobCacheSizeTracker blobCacheSizeTracker) throws IOException
Throws:
IOException
-
-
Method Detail
-
registerJob
public void registerJob(org.apache.flink.api.common.JobID jobId)
Registers use of job-related BLOBs.
Using any other method to access BLOBs, e.g. getFile(org.apache.flink.api.common.JobID, org.apache.flink.runtime.blob.PermanentBlobKey), is only valid between calls to registerJob(JobID) and releaseJob(JobID).
Specified by:
registerJob in interface JobPermanentBlobService
Parameters:
jobId - ID of the job this blob belongs to
See Also:
releaseJob(JobID)
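The requirement that BLOB access happens only between registerJob and releaseJob suggests a try/finally calling pattern. The following is a minimal sketch of that pattern, not Flink code: `JobBlobAccess` is a hypothetical, simplified mirror of the three documented methods in which `String` replaces `JobID` and `PermanentBlobKey`, and `BlobUsagePattern` is an illustrative helper.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical mirror of the documented methods; the real signatures take
// JobID and PermanentBlobKey, replaced by String here to stay self-contained.
interface JobBlobAccess {
    void registerJob(String jobId);
    File getFile(String jobId, String blobKey) throws IOException;
    void releaseJob(String jobId);
}

final class BlobUsagePattern {
    // Reads the size of a BLOB while the job is registered, and always
    // releases the job afterwards, even if getFile throws.
    static long blobSizeWhileRegistered(JobBlobAccess cache, String jobId, String blobKey)
            throws IOException {
        cache.registerJob(jobId);
        try {
            return cache.getFile(jobId, blobKey).length();
        } finally {
            cache.releaseJob(jobId);
        }
    }
}
```

The file is used inside the try block on purpose: once the job is released, its files become eligible for the deferred cleanup, so a path kept beyond that point may stop being valid.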
-
releaseJob
public void releaseJob(org.apache.flink.api.common.JobID jobId)
Unregisters use of job-related BLOBs and allows them to be released.
Specified by:
releaseJob in interface JobPermanentBlobService
Parameters:
jobId - ID of the job this blob belongs to
See Also:
registerJob(JobID)
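The interplay of registerJob, releaseJob and the staged cleanup can be sketched as a per-job reference counter with a deferred deletion deadline. This is an illustrative model only, not the class's actual code: `JobRefCounter`, the `cleanupDelayMillis` parameter, the explicit `nowMillis` clock arguments, and the decision to ignore unbalanced releases are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative model of per-job ref-counting with a staged (deferred) cleanup.
final class JobRefCounter {
    private static final class RefCount {
        int references;
        long keepUntil = -1; // -1 means no cleanup is scheduled
    }

    private final Map<String, RefCount> jobs = new HashMap<>();
    private final long cleanupDelayMillis;

    JobRefCounter(long cleanupDelayMillis) {
        this.cleanupDelayMillis = cleanupDelayMillis;
    }

    synchronized void registerJob(String jobId) {
        RefCount ref = jobs.computeIfAbsent(jobId, id -> new RefCount());
        ref.references++;
        ref.keepUntil = -1; // a re-registered job cancels its pending cleanup
    }

    synchronized void releaseJob(String jobId, long nowMillis) {
        RefCount ref = jobs.get(jobId);
        if (ref == null || ref.references == 0) {
            return; // unbalanced release; ignored in this sketch
        }
        if (--ref.references == 0) {
            // Last holder gone: schedule the cleanup instead of deleting right away.
            ref.keepUntil = nowMillis + cleanupDelayMillis;
        }
    }

    synchronized int getNumberOfReferenceHolders(String jobId) {
        RefCount ref = jobs.get(jobId);
        return ref == null ? 0 : ref.references;
    }

    // Periodic cleanup step: drop jobs that are unreferenced and past their deadline.
    synchronized void runCleanup(long nowMillis) {
        jobs.values().removeIf(
                ref -> ref.references == 0 && ref.keepUntil >= 0 && nowMillis >= ref.keepUntil);
    }

    synchronized boolean isTracked(String jobId) {
        return jobs.containsKey(jobId);
    }
}
```

Because the deadline is only set when the last reference is released, a job that is registered again before the cleanup runs keeps its entry, which models why files can remain available on recovery without a new download.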
-
getNumberOfReferenceHolders
public int getNumberOfReferenceHolders(org.apache.flink.api.common.JobID jobId)
-
getFile
public File getFile(org.apache.flink.api.common.JobID jobId, PermanentBlobKey key) throws IOException
Returns the path to a local copy of the file associated with the provided job ID and blob key.
We will first attempt to serve the BLOB from the local storage. If the BLOB is not in there, we will try to download it from the HA store, or directly from the BlobServer.
Specified by:
getFile in interface PermanentBlobService
Parameters:
jobId - ID of the job this blob belongs to
key - blob key associated with the requested file
Returns:
The path to the file.
Throws:
FileNotFoundException - if the BLOB does not exist
IOException - if any other error occurs when retrieving the file
-
readFile
public byte[] readFile(org.apache.flink.api.common.JobID jobId, PermanentBlobKey blobKey) throws IOException
Returns the content of the file for the BLOB with the provided job ID and blob key.
The method will first attempt to serve the BLOB from the local cache. If the BLOB is not in the cache, the method will try to download it from the HA store, or directly from the BlobServer.
Compared to getFile, readFile makes sure that the file is fully read under the same write lock in which it is accessed. This avoids the scenario in which a path is returned while the file is being deleted concurrently by another thread.
Specified by:
readFile in interface PermanentBlobService
Parameters:
jobId - ID of the job this blob belongs to
blobKey - BLOB key associated with the requested file
Returns:
The content of the BLOB.
Throws:
FileNotFoundException - if the BLOB does not exist
IOException - if any other error occurs when retrieving the file
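The difference between returning a path and returning the content can be sketched with a plain read/write lock. This is a hedged illustration rather than the actual implementation: `LockedBlobReader` and its methods `locate`, `readFully` and `delete` are hypothetical names, and only the documented idea (reading the whole file while the lock is held) is taken from the text above.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustrative model: deletion and full reads exclude each other through one lock.
final class LockedBlobReader {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // getFile-like: the lock is released before the caller opens the path,
    // so a concurrent delete can invalidate the returned path.
    Path locate(Path file) throws IOException {
        lock.readLock().lock();
        try {
            if (!Files.exists(file)) {
                throw new FileNotFoundException(file.toString());
            }
            return file;
        } finally {
            lock.readLock().unlock();
        }
    }

    // readFile-like: the content is read completely while the write lock is held,
    // so no delete can happen between the lookup and the read.
    byte[] readFully(Path file) throws IOException {
        lock.writeLock().lock();
        try {
            return Files.readAllBytes(file);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Cleanup-like: deletion takes the same write lock.
    void delete(Path file) throws IOException {
        lock.writeLock().lock();
        try {
            Files.deleteIfExists(file);
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

In this model a path obtained from `locate` can point at a file that `delete` removes a moment later, whereas the bytes returned by `readFully` stay usable regardless of later deletions.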
-
getStorageLocation
@VisibleForTesting public File getStorageLocation(org.apache.flink.api.common.JobID jobId, BlobKey key) throws IOException
Returns a file handle to the file associated with the given blob key on the blob server.
Parameters:
jobId - ID of the job this blob belongs to (or null if job-unrelated)
key - identifying the file
Returns:
file handle to the file
Throws:
IOException - if creating the directory fails
-
cancelCleanupTask
protected void cancelCleanupTask()
Description copied from class: AbstractBlobCache
Cancels any cleanup task that subclasses may be executing.
This is called during AbstractBlobCache.close().
Specified by:
cancelCleanupTask in class AbstractBlobCache
-
-