Class ObjectLowLevelOutputStream
- java.lang.Object
-
- java.io.OutputStream
-
- alluxio.underfs.ObjectLowLevelOutputStream
-
- All Implemented Interfaces:
ContentHashable, java.io.Closeable, java.io.Flushable, java.lang.AutoCloseable
@NotThreadSafe public abstract class ObjectLowLevelOutputStream extends java.io.OutputStream implements ContentHashable
[Experimental] A stream for writing a file into object storage using streaming upload. The data transfer uses the object storage's low-level multipart upload.
Data is uploaded in partitions. On write(), the data is persisted to a temporary file, mFile, on the local disk. When the amount of data in this temporary file, mPartitionOffset, reaches mPartitionSize, the file is submitted to the upload executor, mExecutor, without waiting for the upload to finish. A new temporary file is then created for subsequent writes, and mPartitionOffset is reset to zero. This continues until all the data has been written to temporary files.
In flush(), the buffered data is uploaded if it is bigger than 5MB, and the call waits for all uploads to finish. Temporary files are deleted after they have been uploaded successfully.
In close(), the last part of the data (if any) is uploaded, the call waits for all uploads to finish, and the multipart upload is completed.
close() itself is not retried, but all multipart upload operations (init, upload, complete, and abort) are retried.
If an unrecoverable error occurs, the multipart upload is aborted. Some multipart uploads may not be completed or aborted normally and need periodic cleanup, which is enabled through PropertyKey.UNDERFS_CLEANUP_ENABLED. When a leader master starts or a cleanup interval is reached, all multipart uploads older than the clean age are cleaned up.
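The write(), flush(), and close() rules described above can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions, not the Alluxio implementation: the class PartitionedUploadSketch and its methods are hypothetical, "uploading" merely records the size of each part, and temporary files, the executor, retries, and aborts are left out.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * In-memory simulation of the partitioning rules. Hypothetical names;
 * "uploading" a part only records its size.
 */
class PartitionedUploadSketch {
  private final long partitionSize;   // plays the role of mPartitionSize
  private final long uploadThreshold; // plays the role of UPLOAD_THRESHOLD
  private long partitionOffset = 0;   // plays the role of mPartitionOffset
  private boolean completed = false;
  private final List<Long> uploadedParts = new ArrayList<>();

  PartitionedUploadSketch(long partitionSize, long uploadThreshold) {
    this.partitionSize = partitionSize;
    this.uploadThreshold = uploadThreshold;
  }

  /** Buffers len bytes, cutting a part whenever the partition fills up. */
  void write(long len) {
    while (len > 0) {
      long room = partitionSize - partitionOffset;
      long n = Math.min(room, len);
      partitionOffset += n;
      len -= n;
      if (partitionOffset == partitionSize) {
        uploadCurrentPart();
      }
    }
  }

  /** Uploads the buffered data only if it is bigger than the threshold. */
  void flush() {
    if (partitionOffset > uploadThreshold) {
      uploadCurrentPart();
    }
  }

  /** Uploads whatever is left as the last part, then marks the upload complete. */
  void close() {
    if (partitionOffset > 0) {
      uploadCurrentPart();
    }
    completed = true;
  }

  private void uploadCurrentPart() {
    uploadedParts.add(partitionOffset);
    partitionOffset = 0; // a fresh temp file would start here
  }

  List<Long> uploadedParts() {
    return uploadedParts;
  }

  boolean isCompleted() {
    return completed;
  }
}
```

With a partition size of 10 and a threshold of 5 (tiny values chosen for illustration), writing 25 bytes cuts two full parts and leaves 5 bytes buffered; flush() leaves those 5 bytes alone because they do not exceed the threshold, and close() uploads them as the last part.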
-
-
Field Summary
Fields
Modifier and Type / Field / Description
protected static org.slf4j.Logger
LOG
protected java.lang.String
mBucketName
Bucket name of the object storage bucket.
protected boolean
mClosed
Flag to indicate this stream has been closed, to ensure close is only done once.
protected java.io.File
mFile
The local temp file that will be uploaded when it reaches the partition size or when flush() is called and this file is bigger than UPLOAD_THRESHOLD.
protected java.security.MessageDigest
mHash
The MD5 hash of the file.
protected java.lang.String
mKey
Key of the file when it is uploaded to object storage.
protected java.io.OutputStream
mLocalOutputStream
The output stream to the local temp file.
protected long
mPartitionOffset
When the offset reaches the partition size, we upload the temp file.
protected long
mPartitionSize
The maximum allowed size of a partition.
protected java.util.function.Supplier<RetryPolicy>
mRetryPolicy
The retry policy of this multipart upload.
protected byte[]
mSingleCharWrite
Pre-allocated byte buffer for writing single characters.
protected java.util.List<java.lang.String>
mTmpDirs
protected static long
UPLOAD_THRESHOLD
Only parts bigger than 5MB can be uploaded through multipart upload, except the last part.
-
Constructor Summary
Constructors
Constructor / Description
ObjectLowLevelOutputStream(java.lang.String bucketName, java.lang.String key, com.google.common.util.concurrent.ListeningExecutorService executor, long streamingUploadPartitionSize, AlluxioConfiguration ufsConf)
Constructs a new stream for writing a file.
-
Method Summary
Modifier and Type / Method / Description
protected void
abortMultiPartUpload()
protected abstract void
abortMultiPartUploadInternal()
void
close()
protected abstract void
completeMultiPartUploadInternal()
protected abstract void
createEmptyObject(java.lang.String key)
void
flush()
int
getPartNumber()
Gets the part number.
protected abstract void
initMultiPartUploadInternal()
protected abstract void
putObject(java.lang.String key, java.io.File file, java.lang.String md5)
protected void
uploadPart()
Uploads the part asynchronously.
protected void
uploadPart(java.io.File file, int partNumber, boolean lastPart)
protected abstract void
uploadPartInternal(java.io.File file, int partNumber, boolean isLastPart, java.lang.String md5)
protected void
waitForAllPartsUpload()
void
write(byte[] b)
void
write(byte[] b, int off, int len)
void
write(int b)
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface alluxio.underfs.ContentHashable
getContentHash
-
-
-
-
Field Detail
-
LOG
protected static final org.slf4j.Logger LOG
-
mTmpDirs
protected final java.util.List<java.lang.String> mTmpDirs
-
UPLOAD_THRESHOLD
protected static final long UPLOAD_THRESHOLD
Only parts bigger than 5MB can be uploaded through multipart upload, except the last part.
- See Also:
Constant Field Values
-
mBucketName
protected final java.lang.String mBucketName
Bucket name of the object storage bucket.
-
mKey
protected final java.lang.String mKey
Key of the file when it is uploaded to object storage.
-
mRetryPolicy
protected final java.util.function.Supplier<RetryPolicy> mRetryPolicy
The retry policy of this multipart upload.
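The field holds a Supplier rather than a single policy, which suggests that each retried operation can obtain a fresh policy with its own attempt budget. The loop below is a minimal sketch of that idea, assuming a simplified policy with one attempt() method. AttemptPolicy, CountingPolicy, and RetrySketch are hypothetical stand-ins written for this example and are not the Alluxio RetryPolicy API.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for a retry policy; not the Alluxio interface. */
interface AttemptPolicy {
  /** Returns true if another attempt is allowed. */
  boolean attempt();
}

/** Allows a fixed number of attempts. */
class CountingPolicy implements AttemptPolicy {
  private int remaining;

  CountingPolicy(int maxAttempts) {
    remaining = maxAttempts;
  }

  @Override
  public boolean attempt() {
    return remaining-- > 0;
  }
}

class RetrySketch {
  /** A unit of work that may fail, e.g. one init, upload, complete, or abort call. */
  interface Operation {
    void run() throws Exception;
  }

  /**
   * Runs op until it succeeds or the policy refuses another attempt.
   * A fresh policy is taken from the supplier for every call.
   * Returns the number of attempts that were used.
   */
  static int runWithRetry(Supplier<AttemptPolicy> policySupplier, Operation op) {
    AttemptPolicy policy = policySupplier.get();
    Exception last = null;
    int attempts = 0;
    while (policy.attempt()) {
      attempts++;
      try {
        op.run();
        return attempts;
      } catch (Exception e) {
        last = e; // remember the failure and ask the policy again
      }
    }
    throw new IllegalStateException("Gave up after " + attempts + " attempts", last);
  }
}
```

Because the supplier is consulted on every call, a failed part upload does not consume the attempts available to the next operation.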
-
mSingleCharWrite
protected final byte[] mSingleCharWrite
Pre-allocated byte buffer for writing single characters.
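A pre-allocated one-byte buffer lets write(int) reuse the bulk write path without allocating a new array on every call, which is safe only because the class is not thread-safe. The following is a sketch of the pattern, assuming a hypothetical SingleByteWriteSketch that writes into memory instead of a temp file; the writeAll helper exists only to make the example easy to try.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

/** Hypothetical stream showing the pre-allocated single-byte buffer pattern. */
class SingleByteWriteSketch extends OutputStream {
  private final byte[] singleByte = new byte[1]; // plays the role of mSingleCharWrite
  private final ByteArrayOutputStream sink = new ByteArrayOutputStream();

  @Override
  public void write(int b) throws IOException {
    singleByte[0] = (byte) b; // keep only the low 8 bits, as OutputStream specifies
    write(singleByte, 0, 1);  // funnel into the bulk path
  }

  @Override
  public void write(byte[] b, int off, int len) throws IOException {
    sink.write(b, off, len);
  }

  byte[] written() {
    return sink.toByteArray();
  }

  /** Convenience for trying the sketch without handling IOException. */
  static byte[] writeAll(int... values) {
    SingleByteWriteSketch out = new SingleByteWriteSketch();
    try {
      for (int v : values) {
        out.write(v);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return out.written();
  }
}
```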
-
mHash
@Nullable protected java.security.MessageDigest mHash
The MD5 hash of the file.
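A MessageDigest can be updated incrementally as bytes are written, so the hash is available at close time without rereading the data. The sketch below shows that pattern with the standard java.security API. Md5Sketch is a hypothetical helper, and the Base64 encoding of the digest is an assumption made for illustration (it is the form commonly used for Content-MD5 headers); the original does not state how the md5 string passed to putObject or uploadPartInternal is encoded.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

class Md5Sketch {
  /** Creates an MD5 digest, converting the checked exception for brevity. */
  static MessageDigest newMd5() {
    try {
      return MessageDigest.getInstance("MD5");
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("MD5 not available", e);
    }
  }

  /** Feeds each chunk to the digest as it is "written", then Base64-encodes the result. */
  static String base64Md5(byte[]... chunks) {
    MessageDigest md5 = newMd5();
    for (byte[] chunk : chunks) {
      md5.update(chunk, 0, chunk.length);
    }
    return Base64.getEncoder().encodeToString(md5.digest());
  }
}
```

Hashing the chunks one at a time gives the same result as hashing the concatenated bytes in a single call.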
-
mClosed
protected boolean mClosed
Flag to indicate this stream has been closed, to ensure close is only done once.
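A closed flag makes close() idempotent, which matters when a stream is closed both explicitly and by try-with-resources. The following is a minimal sketch of the guard; CloseOnceSketch is hypothetical, and its counter stands in for the real work of uploading the last part and completing the multipart upload.

```java
import java.io.Closeable;

class CloseOnceSketch implements Closeable {
  private boolean closed = false; // plays the role of mClosed
  private int completions = 0;

  @Override
  public void close() {
    if (closed) {
      return; // second and later calls are no-ops
    }
    closed = true;
    completions++; // stands in for "upload last part, wait, complete"
  }

  int completions() {
    return completions;
  }
}
```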
-
mPartitionOffset
protected long mPartitionOffset
When the offset reaches the partition size, we upload the temp file.
-
mPartitionSize
protected final long mPartitionSize
The maximum allowed size of a partition.
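As a worked example of how the partition size determines the number of parts: a 100 MB file with a 64 MB partition size yields two parts, one of 64 MB and a last part of 36 MB. The helper below sketches that arithmetic, assuming no intermediate flush() cuts a part early; PartCountSketch is a hypothetical name.

```java
class PartCountSketch {
  /** Number of parts needed for totalBytes when every part but the last is full. */
  static long partCount(long totalBytes, long partitionSize) {
    if (totalBytes < 0 || partitionSize <= 0) {
      throw new IllegalArgumentException("totalBytes must be >= 0 and partitionSize > 0");
    }
    return (totalBytes + partitionSize - 1) / partitionSize; // ceiling division
  }

  /** Size of the final part, which may be smaller than the others. */
  static long lastPartSize(long totalBytes, long partitionSize) {
    long remainder = totalBytes % partitionSize;
    return (remainder == 0 && totalBytes > 0) ? partitionSize : remainder;
  }
}
```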
-
mFile
@Nullable protected java.io.File mFile
The local temp file that will be uploaded when it reaches the partition size or when flush() is called and this file is bigger than UPLOAD_THRESHOLD.
-
mLocalOutputStream
@Nullable protected java.io.OutputStream mLocalOutputStream
The output stream to the local temp file.
-
-
Constructor Detail
-
ObjectLowLevelOutputStream
public ObjectLowLevelOutputStream(java.lang.String bucketName, java.lang.String key, com.google.common.util.concurrent.ListeningExecutorService executor, long streamingUploadPartitionSize, AlluxioConfiguration ufsConf)
Constructs a new stream for writing a file.
- Parameters:
bucketName - the name of the bucket
key - the key of the file
streamingUploadPartitionSize - the size in bytes for partitions of streaming uploads
executor - the executor to which part uploads are submitted
ufsConf - the object store under file system configuration
-
-
Method Detail
-
write
public void write(int b) throws java.io.IOException
- Specified by:
write
in class java.io.OutputStream
- Throws:
java.io.IOException
-
write
public void write(byte[] b) throws java.io.IOException
- Overrides:
write
in class java.io.OutputStream
- Throws:
java.io.IOException
-
write
public void write(byte[] b, int off, int len) throws java.io.IOException
- Overrides:
write
in class java.io.OutputStream
- Throws:
java.io.IOException
-
flush
public void flush() throws java.io.IOException
- Specified by:
flush
in interface java.io.Flushable
- Overrides:
flush
in class java.io.OutputStream
- Throws:
java.io.IOException
-
close
public void close() throws java.io.IOException
- Specified by:
close
in interface java.lang.AutoCloseable
- Specified by:
close
in interface java.io.Closeable
- Overrides:
close
in class java.io.OutputStream
- Throws:
java.io.IOException
-
uploadPart
protected void uploadPart() throws java.io.IOException
Uploads the part asynchronously.
- Throws:
java.io.IOException
-
uploadPart
protected void uploadPart(java.io.File file, int partNumber, boolean lastPart)
-
abortMultiPartUpload
protected void abortMultiPartUpload()
-
waitForAllPartsUpload
protected void waitForAllPartsUpload() throws java.io.IOException
- Throws:
java.io.IOException
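The submit-then-wait pattern behind this method can be sketched with standard java.util.concurrent types: parts are submitted without blocking, and a later call blocks until every pending upload has finished. This is an illustration under stated assumptions, not the class's actual code: WaitForPartsSketch is hypothetical, a plain ExecutorService stands in for the Guava ListeningExecutorService the constructor accepts, and a failure is simply rethrown rather than triggering an abort.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class WaitForPartsSketch {
  /**
   * Submits one task per part without blocking, then waits for every task.
   * Returns the number of parts that finished.
   */
  static int uploadAll(List<Runnable> partUploads, int threads) {
    ExecutorService executor = Executors.newFixedThreadPool(threads);
    List<Future<?>> futures = new ArrayList<>();
    try {
      for (Runnable upload : partUploads) {
        futures.add(executor.submit(upload)); // returns immediately
      }
      int done = 0;
      for (Future<?> future : futures) {
        future.get(); // blocks until this part finishes or fails
        done++;
      }
      return done;
    } catch (ExecutionException e) {
      throw new IllegalStateException("A part upload failed", e.getCause());
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt status
      throw new IllegalStateException("Interrupted while waiting for uploads", e);
    } finally {
      executor.shutdownNow();
    }
  }
}
```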
-
getPartNumber
public int getPartNumber()
Gets the part number.
- Returns:
the part number
-
uploadPartInternal
protected abstract void uploadPartInternal(java.io.File file, int partNumber, boolean isLastPart, @Nullable java.lang.String md5) throws java.io.IOException
- Throws:
java.io.IOException
-
initMultiPartUploadInternal
protected abstract void initMultiPartUploadInternal() throws java.io.IOException
- Throws:
java.io.IOException
-
completeMultiPartUploadInternal
protected abstract void completeMultiPartUploadInternal() throws java.io.IOException
- Throws:
java.io.IOException
-
abortMultiPartUploadInternal
protected abstract void abortMultiPartUploadInternal() throws java.io.IOException
- Throws:
java.io.IOException
-
createEmptyObject
protected abstract void createEmptyObject(java.lang.String key) throws java.io.IOException
- Throws:
java.io.IOException
-
putObject
protected abstract void putObject(java.lang.String key, java.io.File file, @Nullable java.lang.String md5) throws java.io.IOException
- Throws:
java.io.IOException
-
-