public class CommitOperations extends Object implements org.apache.hadoop.fs.statistics.IOStatisticsSource

Modifier and Type | Class and Description |
---|---|
`class` | `CommitOperations.CommitContext`: Commit context. |
`static class` | `CommitOperations.MaybeIOE`: A holder for a possible IOException; the call `CommitOperations.MaybeIOE.maybeRethrow()` will throw any exception passed into the constructor, and be a no-op if none was. |
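
The `MaybeIOE` holder described above can be illustrated with a small stand-alone sketch. `MaybeIoeSketch` is a hypothetical class written for this example, not the Hadoop implementation; it assumes only the behaviour stated in the table: rethrow the exception passed to the constructor, or do nothing if there was none.

```java
import java.io.IOException;

/** Hypothetical stand-in for a "maybe an IOException" holder; not the Hadoop class. */
final class MaybeIoeSketch {
  // The held exception; null means that nothing failed.
  private final IOException exception;

  MaybeIoeSketch(IOException exception) {
    this.exception = exception;
  }

  /** Rethrow the held exception, or return normally if there is none. */
  void maybeRethrow() throws IOException {
    if (exception != null) {
      throw exception;
    }
  }

  public static void main(String[] args) {
    try {
      new MaybeIoeSketch(null).maybeRethrow(); // no-op: nothing was stored
      new MaybeIoeSketch(new IOException("abort failed")).maybeRethrow();
    } catch (IOException e) {
      System.out.println("rethrown: " + e.getMessage());
    }
  }
}
```

One plausible use of such a holder is to let a bulk operation run to completion and surface a failure only once, at the end, when the caller invokes `maybeRethrow()`.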

Modifier and Type | Field and Description |
---|---|
`static org.apache.hadoop.fs.PathFilter` | `PENDING_FILTER`: Filter to find all `.pending` files. |
`static org.apache.hadoop.fs.PathFilter` | `PENDINGSET_FILTER`: Filter to find all `.pendingset` files. |
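
As a rough illustration of what the two filters select, the sketch below uses plain `java.util.function.Predicate<String>` values in place of Hadoop's `PathFilter`. It assumes the filters match on the filename suffix, which the descriptions above suggest but do not state; `PendingFilterSketch` and its members are hypothetical names for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

/** Hypothetical suffix filters; stand-ins for PENDING_FILTER and PENDINGSET_FILTER. */
final class PendingFilterSketch {
  // Assumption: a file is selected when its name ends with the suffix.
  static final Predicate<String> PENDING = name -> name.endsWith(".pending");
  static final Predicate<String> PENDINGSET = name -> name.endsWith(".pendingset");

  /** Return the names accepted by the filter, preserving input order. */
  static List<String> select(List<String> names, Predicate<String> filter) {
    return names.stream().filter(filter).collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<String> names =
        Arrays.asList("part-0000.pending", "task_01.pendingset", "_SUCCESS");
    System.out.println(select(names, PENDING));
    System.out.println(select(names, PENDINGSET));
  }
}
```

Because `.pendingset` does not end in `.pending`, the two suffix tests select disjoint sets of files.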

Constructor and Description |
---|
`CommitOperations(S3AFileSystem fs)`: Instantiate. |
`CommitOperations(S3AFileSystem fs, CommitterStatistics committerStatistics)`: Instantiate. |

Modifier and Type | Method and Description |
---|---|
`CommitOperations.MaybeIOE` | `abortAllSinglePendingCommits(org.apache.hadoop.fs.Path pendingDir, boolean recursive)`: Enumerate all pending files in a directory or tree and abort them. |
`int` | `abortPendingUploadsUnderPath(org.apache.hadoop.fs.Path dest)`: Abort all pending uploads to the destination FS under a path. |
`void` | `addFileSystemStatistics(Map<String,Long> dest)`: Add the filesystem statistics to the map, overwriting anything with the same name. |
`void` | `createSuccessMarker(org.apache.hadoop.fs.Path outputPath, SuccessData successData, boolean addMetrics)`: Save the success data to the `_SUCCESS` file. |
`void` | `deleteSuccessMarker(org.apache.hadoop.fs.Path outputPath)`: Delete any existing `_SUCCESS` file. |
`static Optional<Long>` | `extractMagicFileLength(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path)`: Get the magic file length of a file. |
`org.apache.hadoop.fs.statistics.IOStatistics` | `getIOStatistics()` |
`protected CommitterStatistics` | `getStatistics()` |
`CommitOperations.CommitContext` | `initiateCommitOperation(org.apache.hadoop.fs.Path path)`: Begin the final commit. |
`void` | `jobCompleted(boolean success)`: Note that a job has completed. |
`List<com.amazonaws.services.s3.model.MultipartUpload>` | `listPendingUploadsUnderPath(org.apache.hadoop.fs.Path dest)`: List all pending uploads to the destination FS under a path. |
`org.apache.commons.lang3.tuple.Pair<PendingSet,List<org.apache.commons.lang3.tuple.Pair<org.apache.hadoop.fs.LocatedFileStatus,IOException>>>` | `loadSinglePendingCommits(org.apache.hadoop.fs.Path pendingDir, boolean recursive)`: Load all single pending commits in the directory. |
`List<org.apache.hadoop.fs.LocatedFileStatus>` | `locateAllSinglePendingCommits(org.apache.hadoop.fs.Path pendingDir, boolean recursive)`: Locate all files with the pending suffix under a directory. |
`protected org.apache.hadoop.fs.RemoteIterator<org.apache.hadoop.fs.LocatedFileStatus>` | `ls(org.apache.hadoop.fs.Path path, boolean recursive)`: List files. |
`IOException` | `makeIOE(String key, Exception ex)`: Convert any exception to an IOE, if needed. |
`void` | `revertCommit(SinglePendingCommit commit, BulkOperationState operationState)`: Revert a pending commit by deleting the destination. |
`void` | `taskCompleted(boolean success)`: Note that a task has completed. |
`static List<com.amazonaws.services.s3.model.PartETag>` | `toPartEtags(List<String> tagIds)`: Convert an ordered list of strings to a list of index etag parts. |
`String` | `toString()` |
`SinglePendingCommit` | `uploadFileToPendingCommit(File localFile, org.apache.hadoop.fs.Path destPath, String partition, long uploadPartSize, org.apache.hadoop.util.Progressable progress)`: Upload all the data in the local file, returning the information needed to commit the work. |
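
The `makeIOE` entry above is described as converting any exception to an IOE, if needed. A minimal sketch of that idea follows. It is an assumption about the general pattern, not the Hadoop code: an existing `IOException` is passed through unchanged, and anything else is wrapped in a plain `java.io.IOException` whose message starts with the key. `MakeIoeSketch` is a hypothetical name.

```java
import java.io.IOException;

/** Hypothetical sketch of "convert to IOException if needed"; not the Hadoop method. */
final class MakeIoeSketch {
  /**
   * Return ex itself when it is already an IOException; otherwise wrap it,
   * keeping the original as the cause and the key in the message.
   */
  static IOException makeIOE(String key, Exception ex) {
    if (ex instanceof IOException) {
      return (IOException) ex; // no conversion needed
    }
    return new IOException(key + ": " + ex, ex);
  }

  public static void main(String[] args) {
    IOException same = makeIOE("dest/file", new IOException("listing failed"));
    IOException wrapped = makeIOE("dest/file", new IllegalStateException("bad state"));
    System.out.println(same.getMessage());
    System.out.println(wrapped.getMessage());
  }
}
```

Passing existing IOExceptions through untouched keeps their concrete type and stack trace, so callers that catch a specific subclass still see it.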
public static final org.apache.hadoop.fs.PathFilter PENDINGSET_FILTER
public static final org.apache.hadoop.fs.PathFilter PENDING_FILTER

public CommitOperations(S3AFileSystem fs)
Parameters:
fs - FS to bind to

public CommitOperations(S3AFileSystem fs, CommitterStatistics committerStatistics)
Parameters:
fs - FS to bind to
committerStatistics - committer statistics

public static List<com.amazonaws.services.s3.model.PartETag> toPartEtags(List<String> tagIds)
Parameters:
tagIds - list of tags

protected CommitterStatistics getStatistics()
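
The conversion performed by `toPartEtags` (an ordered list of tag strings to a list of indexed parts) might look like the sketch below. `Part` is a hypothetical stand-in for the AWS SDK `PartETag` type, and pairing each tag with a 1-based position is an assumption, chosen because S3 multipart part numbers start at 1.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: pair each etag with its position in the list. */
final class PartEtagSketch {
  /** Stand-in for the AWS SDK PartETag type (part number plus etag). */
  static final class Part {
    final int partNumber;
    final String etag;

    Part(int partNumber, String etag) {
      this.partNumber = partNumber;
      this.etag = etag;
    }
  }

  /** Convert an ordered list of etags into numbered parts, starting at 1. */
  static List<Part> toParts(List<String> tagIds) {
    List<Part> parts = new ArrayList<>(tagIds.size());
    for (int i = 0; i < tagIds.size(); i++) {
      parts.add(new Part(i + 1, tagIds.get(i))); // assumption: 1-based numbering
    }
    return parts;
  }

  public static void main(String[] args) {
    for (Part p : toParts(Arrays.asList("etag-a", "etag-b", "etag-c"))) {
      System.out.println(p.partNumber + " -> " + p.etag);
    }
  }
}
```

The input order matters: the position of each tag in the list is the only source of its part number.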

public org.apache.hadoop.fs.statistics.IOStatistics getIOStatistics()
Specified by:
getIOStatistics in interface org.apache.hadoop.fs.statistics.IOStatisticsSource

public List<org.apache.hadoop.fs.LocatedFileStatus> locateAllSinglePendingCommits(org.apache.hadoop.fs.Path pendingDir, boolean recursive) throws IOException
Parameters:
pendingDir - directory
recursive - recursive listing?
Throws:
IOException - if there is a problem listing the path.

public org.apache.commons.lang3.tuple.Pair<PendingSet,List<org.apache.commons.lang3.tuple.Pair<org.apache.hadoop.fs.LocatedFileStatus,IOException>>> loadSinglePendingCommits(org.apache.hadoop.fs.Path pendingDir, boolean recursive) throws IOException
Parameters:
pendingDir - directory containing commits
recursive - do a recursive scan?
Throws:
IOException - on a failure to list the files.

public IOException makeIOE(String key, Exception ex)
Parameters:
key - key to use in a path exception
ex - exception

public CommitOperations.MaybeIOE abortAllSinglePendingCommits(org.apache.hadoop.fs.Path pendingDir, boolean recursive) throws IOException
Parameters:
pendingDir - directory of pending operations
recursive - recurse?
Throws:
IOException - if there is a problem listing the path.

protected org.apache.hadoop.fs.RemoteIterator<org.apache.hadoop.fs.LocatedFileStatus> ls(org.apache.hadoop.fs.Path path, boolean recursive) throws IOException
Parameters:
path - path
recursive - recursive listing?
Throws:
IOException - failure

public List<com.amazonaws.services.s3.model.MultipartUpload> listPendingUploadsUnderPath(org.apache.hadoop.fs.Path dest) throws IOException
Parameters:
dest - destination path
Throws:
IOException - IO failure

public int abortPendingUploadsUnderPath(org.apache.hadoop.fs.Path dest) throws IOException
Parameters:
dest - destination path
Throws:
IOException - IO failure

public void deleteSuccessMarker(org.apache.hadoop.fs.Path outputPath) throws IOException
Delete any existing _SUCCESS file.
Parameters:
outputPath - output directory
Throws:
IOException - IO problem

public void createSuccessMarker(org.apache.hadoop.fs.Path outputPath, SuccessData successData, boolean addMetrics) throws IOException
Save the success data to the _SUCCESS file.
Parameters:
outputPath - output directory
successData - success data to save.
addMetrics - should the FS metrics be added?
Throws:
IOException - IO problem

public void revertCommit(SinglePendingCommit commit, BulkOperationState operationState) throws IOException
Parameters:
commit - pending commit
operationState - nullable operational state for a bulk update
Throws:
IOException - failure

public SinglePendingCommit uploadFileToPendingCommit(File localFile, org.apache.hadoop.fs.Path destPath, String partition, long uploadPartSize, org.apache.hadoop.util.Progressable progress) throws IOException
Parameters:
localFile - local file (must be a file)
destPath - destination path
partition - partition/subdir. Not used.
uploadPartSize - size of each upload part
progress - progress callback
Throws:
IOException - failure

public void addFileSystemStatistics(Map<String,Long> dest)
Parameters:
dest - destination map

public void taskCompleted(boolean success)
Parameters:
success - success flag

public void jobCompleted(boolean success)
Parameters:
success - success flag

public CommitOperations.CommitContext initiateCommitOperation(org.apache.hadoop.fs.Path path) throws IOException
Parameters:
path - path for all work.
Throws:
IOException - failure.

public static Optional<Long> extractMagicFileLength(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path) throws IOException
Parameters:
fs - filesystem
path - path
Throws:
IOException - on error

Copyright © 2008–2021 Apache Software Foundation. All rights reserved.