Class ObjectLowLevelOutputStream

  • All Implemented Interfaces:
    ContentHashable, java.io.Closeable, java.io.Flushable, java.lang.AutoCloseable

    @NotThreadSafe
    public abstract class ObjectLowLevelOutputStream
    extends java.io.OutputStream
    implements ContentHashable
    [Experimental] A stream for writing a file into object storage using streaming upload. Data is transferred with the object store's low-level multipart upload.

    Data is uploaded in partitions. When write() is called, the data is persisted to a temporary file mFile on the local disk. When the offset mPartitionOffset in this temporary file reaches mPartitionSize, the file is submitted to the upload executor mExecutor, and we do not wait for the upload to finish. A new temp file is created for subsequent writes and mPartitionOffset is reset to zero. This continues until all the data has been written to temp files.
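
    The rollover described above can be illustrated without any object store. The following is a minimal sketch, not the Alluxio implementation: PartitionRolloverSketch and all of its members are hypothetical names, an in-memory buffer stands in for the temp file, and a list stands in for the parts handed to the upload executor. Whether the real class rolls over as soon as the partition is full or on the next write is not stated here; the sketch rolls over eagerly.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: splits written bytes into fixed-size partitions. */
class PartitionRolloverSketch {
  private final long partitionSize;   // plays the role of mPartitionSize
  private long partitionOffset = 0;   // plays the role of mPartitionOffset
  // Stands in for the local temp file mFile.
  private ByteArrayOutputStream current = new ByteArrayOutputStream();
  // Stands in for the parts submitted to the upload executor.
  final List<byte[]> submitted = new ArrayList<>();

  PartitionRolloverSketch(long partitionSize) {
    this.partitionSize = partitionSize;
  }

  void write(byte[] b, int off, int len) {
    while (len > 0) {
      // Write only as much as still fits into the current partition.
      int room = (int) Math.min(len, partitionSize - partitionOffset);
      current.write(b, off, room);
      partitionOffset += room;
      off += room;
      len -= room;
      if (partitionOffset == partitionSize) {
        // Partition is full: hand it off and start a fresh one at offset zero.
        submitted.add(current.toByteArray());
        current = new ByteArrayOutputStream();
        partitionOffset = 0;
      }
    }
  }

  /** Bytes written to the current, not yet submitted, partition. */
  long bufferedBytes() {
    return partitionOffset;
  }
}
```

    With a partition size of 4, writing 10 bytes submits two full partitions and leaves 2 bytes buffered for a later flush() or close().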

    In flush(), we upload the buffered data if it is larger than 5MB and wait for all uploads to finish. Each temp file is deleted after it has been uploaded successfully.
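
    The two halves of flush() (the size check and the wait) can be sketched as follows. This is an illustration under stated assumptions, not the class's code: FlushSketch, shouldUploadOnFlush, waitForAll and demo are hypothetical names, "5MB" is read as 5 * 1024 * 1024 bytes, and "bigger than" is read as a strict comparison.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch of the flush() decision and the wait on pending uploads. */
class FlushSketch {
  // Assumption: "5MB" means 5 * 1024 * 1024 bytes.
  static final long UPLOAD_THRESHOLD = 5L * 1024 * 1024;

  /** A partially filled partition is uploaded on flush only if it exceeds the threshold. */
  static boolean shouldUploadOnFlush(long bufferedBytes) {
    return bufferedBytes > UPLOAD_THRESHOLD;
  }

  /** Blocks until every pending upload has finished and returns how many completed. */
  static int waitForAll(List<Future<?>> pending) {
    int completed = 0;
    try {
      for (Future<?> upload : pending) {
        upload.get();
        completed++;
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for uploads", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("an upload failed", e.getCause());
    }
    pending.clear();
    return completed;
  }

  /** Submits three no-op "uploads" and then waits for them, as flush() would. */
  static int demo() {
    ExecutorService executor = Executors.newFixedThreadPool(2);
    try {
      List<Future<?>> pending = new ArrayList<>();
      for (int part = 1; part <= 3; part++) {
        pending.add(executor.submit(() -> { }));
      }
      return waitForAll(pending);
    } finally {
      executor.shutdown();
    }
  }
}
```

    Data below the threshold stays in the temp file after flush(), which is why a flush() on a small stream does not by itself make the data durable in the object store.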

    In close(), we upload the last part of the data (if any), wait for all uploads to finish, and complete the multipart upload.

    close() itself is not retried, but every multipart upload operation (init, upload, complete, and abort) is retried.
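
    Per-operation retry of this kind can be sketched as below. The example is self-contained and does not use Alluxio's RetryPolicy: CountingPolicy, IoAction, runWithRetry and demo are hypothetical stand-ins. It assumes that a fresh policy is obtained from the supplier for each operation, which is one plausible reading of the Supplier-typed mRetryPolicy field.

```java
import java.io.IOException;
import java.util.function.Supplier;

/** Hypothetical sketch of retrying a single multipart operation. */
class RetrySketch {
  /** Stand-in retry policy: attempt() returns true while tries remain. */
  static final class CountingPolicy {
    private final int maxAttempts;
    private int attempts = 0;

    CountingPolicy(int maxAttempts) {
      this.maxAttempts = maxAttempts;
    }

    boolean attempt() {
      return attempts++ < maxAttempts;
    }
  }

  /** An operation that may fail with an IOException, such as uploading one part. */
  interface IoAction {
    void run() throws IOException;
  }

  /** Runs the action until it succeeds or the policy is exhausted; returns the tries used. */
  static int runWithRetry(Supplier<CountingPolicy> policies, IoAction action)
      throws IOException {
    CountingPolicy policy = policies.get(); // fresh policy for this operation
    IOException last = null;
    int tries = 0;
    while (policy.attempt()) {
      tries++;
      try {
        action.run();
        return tries;
      } catch (IOException e) {
        last = e; // remember the failure and try again
      }
    }
    throw last != null ? last : new IOException("no attempts allowed");
  }

  /** An action that fails twice and then succeeds needs three tries. */
  static int demo() {
    int[] calls = {0};
    try {
      return runWithRetry(() -> new CountingPolicy(5), () -> {
        if (++calls[0] < 3) {
          throw new IOException("transient failure");
        }
      });
    } catch (IOException e) {
      return -1;
    }
  }
}
```

    Retrying each operation separately keeps a transient failure on one part from forcing the whole stream to be rewritten, which is consistent with close() not being retried as a unit.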

    If an unrecoverable error occurs, we abort the multipart upload. Some multipart uploads may not be completed or aborted normally and need periodic cleanup, which is enabled through PropertyKey.UNDERFS_CLEANUP_ENABLED. When a leader master starts or a cleanup interval elapses, all multipart uploads older than the clean age are cleaned up.

    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected static org.slf4j.Logger LOG  
      protected java.lang.String mBucketName
      Bucket name of the object storage bucket.
      protected boolean mClosed
      Flag to indicate this stream has been closed, to ensure close is only done once.
      protected java.io.File mFile
      The local temp file, which is uploaded when it reaches the partition size, or when flush() is called and the file is bigger than UPLOAD_THRESHOLD.
      protected java.security.MessageDigest mHash
      The MD5 hash of the file.
      protected java.lang.String mKey
      Key of the file when it is uploaded to object storage.
      protected java.io.OutputStream mLocalOutputStream
      The output stream to the local temp file.
      protected long mPartitionOffset
      The current write offset within the temp file. When the offset reaches the partition size, we upload the temp file.
      protected long mPartitionSize
      The maximum allowed size of a partition.
      protected java.util.function.Supplier<RetryPolicy> mRetryPolicy
      The retry policy of this multipart upload.
      protected byte[] mSingleCharWrite
      Pre-allocated byte buffer for writing single characters.
      protected java.util.List<java.lang.String> mTmpDirs  
      protected static long UPLOAD_THRESHOLD
      Only parts bigger than 5MB can be uploaded through multipart upload, except the last part.
    • Constructor Summary

      Constructors 
      Constructor Description
      ObjectLowLevelOutputStream(java.lang.String bucketName, java.lang.String key, com.google.common.util.concurrent.ListeningExecutorService executor, long streamingUploadPartitionSize, AlluxioConfiguration ufsConf)
      Constructs a new stream for writing a file.
    • Field Detail

      • LOG

        protected static final org.slf4j.Logger LOG
      • mTmpDirs

        protected final java.util.List<java.lang.String> mTmpDirs
      • UPLOAD_THRESHOLD

        protected static final long UPLOAD_THRESHOLD
        Only parts bigger than 5MB can be uploaded through multipart upload, except the last part.
        See Also:
        Constant Field Values
      • mBucketName

        protected final java.lang.String mBucketName
        Bucket name of the object storage bucket.
      • mKey

        protected final java.lang.String mKey
        Key of the file when it is uploaded to object storage.
      • mRetryPolicy

        protected final java.util.function.Supplier<RetryPolicy> mRetryPolicy
        The retry policy of this multipart upload.
      • mSingleCharWrite

        protected final byte[] mSingleCharWrite
        Pre-allocated byte buffer for writing single characters.
      • mHash

        @Nullable
        protected java.security.MessageDigest mHash
        The MD5 hash of the file.
      • mClosed

        protected boolean mClosed
        Flag to indicate this stream has been closed, to ensure close is only done once.
      • mPartitionOffset

        protected long mPartitionOffset
        The current write offset within the temp file. When the offset reaches the partition size, we upload the temp file.
      • mPartitionSize

        protected final long mPartitionSize
        The maximum allowed size of a partition.
      • mFile

        @Nullable
        protected java.io.File mFile
        The local temp file, which is uploaded when it reaches the partition size, or when flush() is called and the file is bigger than UPLOAD_THRESHOLD.
      • mLocalOutputStream

        @Nullable
        protected java.io.OutputStream mLocalOutputStream
        The output stream to the local temp file.
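
      How mSingleCharWrite, mHash and mLocalOutputStream might cooperate can be sketched as follows. This is a hedged illustration, not the class's code: HashedWriteSketch and its members are hypothetical, an in-memory sink replaces the temp file, and Base64 encoding of the digest is an assumption (it is the encoding the HTTP Content-MD5 header uses). Since mHash is annotated @Nullable, the real class presumably tolerates a missing digest; the sketch simply fails fast instead.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.security.DigestOutputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

/** Hypothetical sketch: single-byte writes reuse one buffer, all writes feed an MD5 digest. */
class HashedWriteSketch {
  private final byte[] singleByte = new byte[1]; // like mSingleCharWrite
  private final MessageDigest hash;              // like mHash
  private final OutputStream out;                // like mLocalOutputStream
  private final ByteArrayOutputStream sink = new ByteArrayOutputStream();

  HashedWriteSketch() {
    try {
      hash = MessageDigest.getInstance("MD5");
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e);
    }
    // Every byte written through 'out' also updates 'hash'.
    out = new DigestOutputStream(sink, hash);
  }

  /** Avoids allocating a new array for each single-byte write. */
  void write(int b) {
    singleByte[0] = (byte) b;
    write(singleByte, 0, 1);
  }

  void write(byte[] b, int off, int len) {
    try {
      out.write(b, off, len);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Returns the Base64 MD5 of the bytes written so far; digest() also resets the hash. */
  String finishPartMd5() {
    return Base64.getEncoder().encodeToString(hash.digest());
  }
}
```

      Because MessageDigest.digest() resets the digest, calling finishPartMd5() once per part yields a per-part checksum rather than a running hash of the whole file.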
    • Constructor Detail

      • ObjectLowLevelOutputStream

        public ObjectLowLevelOutputStream(java.lang.String bucketName,
                                          java.lang.String key,
                                          com.google.common.util.concurrent.ListeningExecutorService executor,
                                          long streamingUploadPartitionSize,
                                          AlluxioConfiguration ufsConf)
        Constructs a new stream for writing a file.
        Parameters:
        bucketName - the name of the bucket
        key - the key of the file
        executor - the executor service that runs the part uploads
        streamingUploadPartitionSize - the size in bytes for partitions of streaming uploads
        ufsConf - the object store under file system configuration
    • Method Detail

      • write

        public void write(int b)
                   throws java.io.IOException
        Specified by:
        write in class java.io.OutputStream
        Throws:
        java.io.IOException
      • write

        public void write(byte[] b)
                   throws java.io.IOException
        Overrides:
        write in class java.io.OutputStream
        Throws:
        java.io.IOException
      • write

        public void write(byte[] b,
                          int off,
                          int len)
                   throws java.io.IOException
        Overrides:
        write in class java.io.OutputStream
        Throws:
        java.io.IOException
      • flush

        public void flush()
                   throws java.io.IOException
        Specified by:
        flush in interface java.io.Flushable
        Overrides:
        flush in class java.io.OutputStream
        Throws:
        java.io.IOException
      • close

        public void close()
                   throws java.io.IOException
        Specified by:
        close in interface java.lang.AutoCloseable
        Specified by:
        close in interface java.io.Closeable
        Overrides:
        close in class java.io.OutputStream
        Throws:
        java.io.IOException
      • uploadPart

        protected void uploadPart()
                           throws java.io.IOException
        Uploads a part asynchronously.
        Throws:
        java.io.IOException
      • uploadPart

        protected void uploadPart(java.io.File file,
                                  int partNumber,
                                  boolean lastPart)
      • abortMultiPartUpload

        protected void abortMultiPartUpload()
      • waitForAllPartsUpload

        protected void waitForAllPartsUpload()
                                      throws java.io.IOException
        Throws:
        java.io.IOException
      • getPartNumber

        public int getPartNumber()
        Gets the part number.
        Returns:
        the part number
      • uploadPartInternal

        protected abstract void uploadPartInternal(java.io.File file,
                                                   int partNumber,
                                                   boolean isLastPart,
                                                   @Nullable
                                                   java.lang.String md5)
                                            throws java.io.IOException
        Throws:
        java.io.IOException
      • initMultiPartUploadInternal

        protected abstract void initMultiPartUploadInternal()
                                                     throws java.io.IOException
        Throws:
        java.io.IOException
      • completeMultiPartUploadInternal

        protected abstract void completeMultiPartUploadInternal()
                                                         throws java.io.IOException
        Throws:
        java.io.IOException
      • abortMultiPartUploadInternal

        protected abstract void abortMultiPartUploadInternal()
                                                      throws java.io.IOException
        Throws:
        java.io.IOException
      • createEmptyObject

        protected abstract void createEmptyObject(java.lang.String key)
                                           throws java.io.IOException
        Throws:
        java.io.IOException
      • putObject

        protected abstract void putObject(java.lang.String key,
                                          java.io.File file,
                                          @Nullable
                                          java.lang.String md5)
                                   throws java.io.IOException
        Throws:
        java.io.IOException
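
    The abstract methods above form a set of hooks that a store-specific subclass fills in. The following self-contained sketch shows that structure with a simplified analogue: MultipartTemplateSketch and RecordingStore are hypothetical classes that do not extend the real one, the hook signatures are reduced to strings, and abort handling, retries and asynchronous upload are omitted. This page does not document when createEmptyObject and putObject are called; the sketch assumes they are used on close() when no multipart upload was ever started (no data, or less than one partition).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified analogue of the hook structure; not the Alluxio class. */
abstract class MultipartTemplateSketch {
  private final int partitionSize;
  private final StringBuilder buffer = new StringBuilder(); // stands in for the temp file
  private int partNumber = 0;
  private boolean initialized = false;
  private boolean closed = false;

  MultipartTemplateSketch(int partitionSize) {
    this.partitionSize = partitionSize;
  }

  // Hooks named after the abstract methods listed above (signatures simplified).
  protected abstract void initMultiPartUploadInternal();
  protected abstract void uploadPartInternal(String data, int partNumber, boolean isLastPart);
  protected abstract void completeMultiPartUploadInternal();
  protected abstract void createEmptyObject();
  protected abstract void putObject(String data);

  void write(String s) {
    for (char c : s.toCharArray()) {
      buffer.append(c);
      if (buffer.length() == partitionSize) {
        uploadPart(false);
      }
    }
  }

  private void uploadPart(boolean last) {
    if (!initialized) {
      initMultiPartUploadInternal(); // started lazily, on the first part
      initialized = true;
    }
    uploadPartInternal(buffer.toString(), ++partNumber, last);
    buffer.setLength(0);
  }

  void close() {
    if (closed) {
      return; // close is only done once
    }
    closed = true;
    if (!initialized) {
      // Assumption: without a multipart upload, fall back to a single request.
      if (buffer.length() == 0) {
        createEmptyObject();
      } else {
        putObject(buffer.toString());
      }
      return;
    }
    if (buffer.length() > 0) {
      uploadPart(true);
    }
    completeMultiPartUploadInternal();
  }
}

/** Records which hooks were called, in order, instead of talking to a real store. */
class RecordingStore extends MultipartTemplateSketch {
  final List<String> calls = new ArrayList<>();

  RecordingStore(int partitionSize) {
    super(partitionSize);
  }

  @Override protected void initMultiPartUploadInternal() { calls.add("init"); }
  @Override protected void uploadPartInternal(String data, int partNumber, boolean isLastPart) {
    calls.add("part " + partNumber + " " + data + (isLastPart ? " (last)" : ""));
  }
  @Override protected void completeMultiPartUploadInternal() { calls.add("complete"); }
  @Override protected void createEmptyObject() { calls.add("empty"); }
  @Override protected void putObject(String data) { calls.add("put " + data); }
}
```

    Writing "abcdefg" to a RecordingStore with a partition size of 3 and then closing it records init, three parts (the last one flagged) and complete; writing "hi" with a partition size of 10 records a single put.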