class AsyncOffsetSeqLog extends OffsetSeqLog
Used to write entries to the offset log asynchronously
Linear Supertypes
- OffsetSeqLog
- HDFSMetadataLog
- Logging
- MetadataLog
- AnyRef
- Any
Instance Constructors
-  new AsyncOffsetSeqLog(sparkSession: SparkSession, path: String, executorService: ThreadPoolExecutor, offsetCommitIntervalMs: Long, clock: Clock = new SystemClock())
Value Members
-  final def !=(arg0: Any): Boolean
   - Definition Classes: AnyRef → Any

-  final def ##(): Int
   - Definition Classes: AnyRef → Any

-  final def ==(arg0: Any): Boolean
   - Definition Classes: AnyRef → Any

-  def add(batchId: Long, metadata: OffsetSeq): Boolean
   Store the metadata for the specified batchId and return true if successful. If the batchId's metadata has already been stored, this method will return false.
   - Definition Classes: HDFSMetadataLog → MetadataLog
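The store-once contract of add can be illustrated with a minimal sketch. It uses a hypothetical in-memory stand-in (`AddOnceSketch`, with String metadata) rather than Spark's file-backed log, so only the return-value behaviour is modelled.

```scala
import java.util.concurrent.ConcurrentHashMap

// Hypothetical in-memory stand-in for a metadata log; not Spark's HDFSMetadataLog.
object AddOnceSketch {
  private val store = new ConcurrentHashMap[Long, String]()

  // Returns true only when no metadata was stored for batchId before this call.
  def add(batchId: Long, metadata: String): Boolean =
    store.putIfAbsent(batchId, metadata) == null

  // Returns the stored metadata, or None when the batch is unknown.
  def get(batchId: Long): Option[String] = Option(store.get(batchId))
}
```

A second add for the same batchId returns false and leaves the first value in place.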
 
-  def addAsync(batchId: Long, metadata: OffsetSeq): CompletableFuture[(Long, Boolean)]
   Writes a new batch to the offset log asynchronously.
   - batchId: id of the batch to write
   - metadata: metadata of the batch to write
   - returns: a CompletableFuture that contains the batch id. The future is completed when the async write of the batch is completed. The future may also be completed exceptionally to indicate a write error.
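Because the returned future can complete normally or exceptionally, a caller has to handle both outcomes. The sketch below shows one way to do that with the standard `java.util.concurrent.CompletableFuture` API only. The names `AddAsyncSketch`, `await`, `succeeded` and `failed` are hypothetical helpers, and the Boolean in the tuple is treated as opaque.

```scala
import java.util.concurrent.{CompletableFuture, CompletionException}

object AddAsyncSketch {
  // Blocks until the future completes and turns the outcome into an Either.
  // join() wraps the original failure in a CompletionException, so unwrap it.
  def await(f: CompletableFuture[(Long, Boolean)]): Either[Throwable, (Long, Boolean)] =
    try Right(f.join())
    catch { case e: CompletionException => Left(e.getCause) }

  // Stand-in for a write that finished successfully.
  def succeeded(batchId: Long): CompletableFuture[(Long, Boolean)] =
    CompletableFuture.completedFuture((batchId, true))

  // Stand-in for a write that failed with the given error.
  def failed(err: Throwable): CompletableFuture[(Long, Boolean)] = {
    val f = new CompletableFuture[(Long, Boolean)]()
    f.completeExceptionally(err)
    f
  }
}
```

In non-blocking code the same distinction can be made with `whenComplete` or `handle` instead of `join`.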
 
-  def addNewBatchByStream(batchId: Long)(fn: (OutputStream) ⇒ Unit): Boolean
   Store the metadata for the specified batchId and return true if successful. This method fills the content of the metadata by executing the given function. If the function throws an exception, writing will be automatically cancelled and this method will propagate the exception. If the batchId's metadata has already been stored, this method will return false. The metadata is written by writing the batch to a temp file and then renaming it to the batch file. Multiple HDFSMetadataLog instances may use the same metadata path; although this is not valid usage, the log must still prevent them from destroying each other's files.
   - Definition Classes: HDFSMetadataLog
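The write-to-temp-then-rename pattern described above can be sketched on a local file system with `java.nio.file`. This is not Spark's implementation, which goes through a CheckpointFileManager. The object name `TempRenameSketch`, the temp-file naming, and the use of the batch id as the file name are assumptions made for illustration.

```scala
import java.io.OutputStream
import java.nio.file.{Files, Path, StandardCopyOption}

object TempRenameSketch {
  // Writes the batch content to a temp file, then renames it to the batch file.
  // Returns false if the batch file already exists. If fn throws, the temp file
  // is deleted and the exception propagates to the caller.
  def addNewBatchByStream(dir: Path, batchId: Long)(fn: OutputStream => Unit): Boolean = {
    val target = dir.resolve(batchId.toString) // assumed naming: file name = batch id
    if (Files.exists(target)) false
    else {
      val tmp = Files.createTempFile(dir, s".$batchId-", ".tmp")
      try {
        val out = Files.newOutputStream(tmp)
        try fn(out) finally out.close()
        Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE)
        true
      } finally {
        Files.deleteIfExists(tmp) // no-op after a successful rename
      }
    }
  }
}
```

Readers never observe a half-written batch file, because the final name only appears once the content is complete. The existence check and the rename are not atomic together here, so this sketch does not guard against two concurrent writers.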
 
-  def applyFnToBatchByStream[RET](batchId: Long, skipExistingCheck: Boolean = false)(fn: (InputStream) ⇒ RET): RET
   Apply the provided function to each entry in the specified batch metadata log. Unlike get, which materializes all entries into memory, this method streams the entries via READ-AND-PROCESS, which avoids memory issues with huge metadata log files. NOTE: This no longer fails early on corruption. The caller should handle the exception properly and make sure its logic is not affected by a failure in the middle.
   - Definition Classes: HDFSMetadataLog
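The difference between materializing a batch and processing it as a stream can be sketched as follows. `StreamingReadSketch`, `applyFn` and `countEntries` are hypothetical names, and the one-entry-per-line layout is an assumption. The sketch reads from an in-memory byte array rather than a checkpoint file.

```scala
import java.io.{BufferedReader, ByteArrayInputStream, InputStream, InputStreamReader}
import java.nio.charset.StandardCharsets

object StreamingReadSketch {
  // Counts non-empty lines one at a time instead of loading them all into memory.
  def countEntries(in: InputStream): Int = {
    val reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))
    var count = 0
    var line = reader.readLine()
    while (line != null) {
      if (line.nonEmpty) count += 1
      line = reader.readLine()
    }
    count
  }

  // Stand-in for the stream-based API: opens the stream, applies fn, closes the stream.
  def applyFn[RET](bytes: Array[Byte])(fn: InputStream => RET): RET = {
    val in = new ByteArrayInputStream(bytes)
    try fn(in) finally in.close()
  }
}
```

Since fn sees the data incrementally, an exception thrown partway through leaves any side effects of the already processed entries in place, which is why the note above asks callers to tolerate a failure in the middle.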
 
-  final def asInstanceOf[T0]: T0
   - Definition Classes: Any

-  val batchCache: Map[Long, OffsetSeq]
   Cache the latest two batches. StreamExecution usually accesses only the latest two batches when committing offsets, so this cache saves some file system operations.
   - Attributes: protected[sql]
   - Definition Classes: HDFSMetadataLog
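A bounded cache that keeps only the most recently inserted entries can be built on `java.util.LinkedHashMap` by overriding `removeEldestEntry`. The sketch below is one possible construction under that assumption, not a statement of how batchCache is built. The name `LatestTwoCacheSketch` is hypothetical, and the map is not synchronized.

```scala
import java.util.{LinkedHashMap => JLinkedHashMap, Map => JMap}

object LatestTwoCacheSketch {
  // Insertion-ordered map that evicts its eldest entry once it holds more than maxSize entries.
  def newCache[V](maxSize: Int = 2): JMap[Long, V] =
    new JLinkedHashMap[Long, V](maxSize + 1) {
      override def removeEldestEntry(eldest: JMap.Entry[Long, V]): Boolean =
        size() > maxSize
    }
}
```

With the default size of two, inserting batches 1, 2 and 3 leaves only batches 2 and 3 in the cache.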
 
-  val batchFilesFilter: PathFilter
   A PathFilter to filter only batch files.
   - Attributes: protected
   - Definition Classes: HDFSMetadataLog

-  def batchIdToPath(batchId: Long): Path
   - Attributes: protected
   - Definition Classes: HDFSMetadataLog

-  def clone(): AnyRef
   - Attributes: protected[lang]
   - Definition Classes: AnyRef
   - Annotations: @throws( ... ) @native()

-  def deserialize(in: InputStream): OffsetSeq
   Read and deserialize the metadata from the input stream. If this method is overridden in a subclass, the overriding method should not close the given input stream, as it will be closed in the caller.
   - Attributes: protected
   - Definition Classes: OffsetSeqLog → HDFSMetadataLog
 
-  final def eq(arg0: AnyRef): Boolean
   - Definition Classes: AnyRef

-  def equals(arg0: Any): Boolean
   - Definition Classes: AnyRef → Any

-  val fileManager: CheckpointFileManager
   - Attributes: protected
   - Definition Classes: HDFSMetadataLog

-  def finalize(): Unit
   - Attributes: protected[lang]
   - Definition Classes: AnyRef
   - Annotations: @throws( classOf[java.lang.Throwable] )

-  def get(startId: Option[Long], endId: Option[Long]): Array[(Long, OffsetSeq)]
   Return metadata for batches between startId (inclusive) and endId (inclusive). If startId is None, just return all batches before endId (inclusive).
   - Definition Classes: HDFSMetadataLog → MetadataLog
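The inclusive range semantics can be sketched over an in-memory sorted map. `RangeGetSketch` is a hypothetical name, and the result is a Seq rather than an Array to keep the example short. Treating a missing endId as "no upper bound" is an assumption; the description above only covers a missing startId.

```scala
import scala.collection.immutable.TreeMap

object RangeGetSketch {
  // Both bounds are inclusive; a missing bound leaves that side of the range open.
  def get[T](log: TreeMap[Long, T], startId: Option[Long], endId: Option[Long]): Seq[(Long, T)] =
    log.toSeq.filter { case (id, _) =>
      startId.forall(_ <= id) && endId.forall(_ >= id)
    }
}
```

For a log holding batches 1 to 4, `get(log, Some(2L), Some(3L))` yields batches 2 and 3, and `get(log, None, Some(2L))` yields batches 1 and 2.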
 
-  def get(batchId: Long): Option[OffsetSeq]
   Return the metadata for the specified batchId if it's stored. Otherwise, return None.
   - Definition Classes: HDFSMetadataLog → MetadataLog

-  def getAsyncOffsetWrite(batchId: Long): Option[CompletableFuture[Long]]
   Get an async offset write by batch id. Used to check whether a corresponding commit log entry needs to be written to durable storage as well.
   - returns: an option indicating whether an async offset write was issued for the batch with the given id
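One way to picture the bookkeeping behind this lookup is a concurrent map from batch id to the future of its write. The sketch below is an assumption about the general shape, not the class's actual internals. `AsyncWriteTrackerSketch`, `track` and `pendingCount` are hypothetical names.

```scala
import java.util.concurrent.{CompletableFuture, ConcurrentHashMap}

object AsyncWriteTrackerSketch {
  private val pending = new ConcurrentHashMap[Long, CompletableFuture[Long]]()

  // Registers a new in-flight write for batchId and returns its future.
  def track(batchId: Long): CompletableFuture[Long] = {
    val f = new CompletableFuture[Long]()
    pending.put(batchId, f)
    f
  }

  // Some(future) if a write was issued for batchId and is still tracked, else None.
  def getAsyncOffsetWrite(batchId: Long): Option[CompletableFuture[Long]] =
    Option(pending.get(batchId))

  // Stops tracking the write for batchId.
  def removeAsyncOffsetWrite(batchId: Long): Unit = {
    pending.remove(batchId)
    ()
  }

  // Number of writes currently tracked.
  def pendingCount: Int = pending.size()
}
```

A caller can use the Option to decide whether a batch was persisted at all before writing a matching commit log entry.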
 
-  final def getClass(): Class[_]
   - Definition Classes: AnyRef → Any
   - Annotations: @native()

-  def getLatest(): Option[(Long, OffsetSeq)]
   Return the latest batch id and its metadata if it exists.
   - Definition Classes: HDFSMetadataLog → MetadataLog

-  def getLatestBatchId(): Option[Long]
   Return the latest batch id without reading the file.
   - Definition Classes: HDFSMetadataLog

-  def getOrderedBatchFiles(): Array[FileStatus]
   Get an array of FileStatus referencing batch files. The array is sorted from the most recent batch file to the oldest.
   - Definition Classes: HDFSMetadataLog

-  def getPrevBatchFromStorage(batchId: Long): Option[Long]
   Get the id of the previous batch from storage.
   - batchId: the batch id whose previous batch id is requested
   - Definition Classes: HDFSMetadataLog
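A plausible reading of "previous batch" is the largest stored batch id that is strictly smaller than the given one, which matters when stored ids have gaps. The sketch below implements that reading over a plain list of ids. The interpretation and the name `PrevBatchSketch` are assumptions, not taken from the source.

```scala
object PrevBatchSketch {
  // Largest stored batch id strictly smaller than batchId, if any.
  def prevBatch(storedIds: Seq[Long], batchId: Long): Option[Long] = {
    val earlier = storedIds.filter(_ < batchId)
    if (earlier.isEmpty) None else Some(earlier.max)
  }
}
```

With stored ids 0, 3 and 6, the previous batch of 6 is 3, and batch 0 has no previous batch.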
 
-  def hashCode(): Int
   - Definition Classes: AnyRef → Any
   - Annotations: @native()

-  def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
   - Attributes: protected
   - Definition Classes: Logging

-  def initializeLogIfNecessary(isInterpreter: Boolean): Unit
   - Attributes: protected
   - Definition Classes: Logging

-  def isBatchFile(path: Path): Boolean
   - Attributes: protected
   - Definition Classes: HDFSMetadataLog

-  final def isInstanceOf[T0]: Boolean
   - Definition Classes: Any

-  def isTraceEnabled(): Boolean
   - Attributes: protected
   - Definition Classes: Logging

-  def listBatches: Array[Long]
   List the available batches on the file system.
   - Attributes: protected
   - Definition Classes: HDFSMetadataLog
 
-  def listBatchesOnDisk: Array[Long]
   List the batches persisted to storage.

-  def log: Logger
   - Attributes: protected
   - Definition Classes: Logging

-  def logDebug(msg: ⇒ String, throwable: Throwable): Unit
   - Attributes: protected
   - Definition Classes: Logging

-  def logDebug(msg: ⇒ String): Unit
   - Attributes: protected
   - Definition Classes: Logging

-  def logError(msg: ⇒ String, throwable: Throwable): Unit
   - Attributes: protected
   - Definition Classes: Logging

-  def logError(msg: ⇒ String): Unit
   - Attributes: protected
   - Definition Classes: Logging

-  def logInfo(msg: ⇒ String, throwable: Throwable): Unit
   - Attributes: protected
   - Definition Classes: Logging

-  def logInfo(msg: ⇒ String): Unit
   - Attributes: protected
   - Definition Classes: Logging

-  def logName: String
   - Attributes: protected
   - Definition Classes: Logging
 
-  def logTrace(msg: ⇒ String, throwable: Throwable): Unit
   - Attributes: protected
   - Definition Classes: Logging

-  def logTrace(msg: ⇒ String): Unit
   - Attributes: protected
   - Definition Classes: Logging

-  def logWarning(msg: ⇒ String, throwable: Throwable): Unit
   - Attributes: protected
   - Definition Classes: Logging

-  def logWarning(msg: ⇒ String): Unit
   - Attributes: protected
   - Definition Classes: Logging

-  val metadataCacheEnabled: Boolean
   - Attributes: protected
   - Definition Classes: HDFSMetadataLog

-  val metadataPath: Path
   - Definition Classes: HDFSMetadataLog
 
-  final def ne(arg0: AnyRef): Boolean
   - Definition Classes: AnyRef

-  final def notify(): Unit
   - Definition Classes: AnyRef
   - Annotations: @native()

-  final def notifyAll(): Unit
   - Definition Classes: AnyRef
   - Annotations: @native()

-  def offsetSeqMetadataForBatchId(batchId: Long): Option[OffsetSeqMetadata]
   - Definition Classes: OffsetSeqLog

-  def pathToBatchId(path: Path): Long
   - Attributes: protected
   - Definition Classes: HDFSMetadataLog
 
-  def pendingAsyncOffsetWrite(): Int

-  def purge(thresholdBatchId: Long): Unit
   Purge entries in the offset log up to thresholdBatchId.
   - Definition Classes: AsyncOffsetSeqLog → HDFSMetadataLog → MetadataLog

-  def purgeAfter(thresholdBatchId: Long): Unit
   Removes all log entries later than thresholdBatchId (exclusive).
   - Definition Classes: HDFSMetadataLog
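The "later than thresholdBatchId (exclusive)" wording of purgeAfter means the threshold batch itself is kept. A minimal sketch over an immutable in-memory map, with the hypothetical name `PurgeSketch`, makes the boundary explicit.

```scala
object PurgeSketch {
  // Keeps entries with id <= thresholdBatchId; every later entry is dropped.
  def purgeAfter[T](log: Map[Long, T], thresholdBatchId: Long): Map[Long, T] =
    log.filter { case (id, _) => id <= thresholdBatchId }
}
```

For a log holding batches 1, 2 and 3, `purgeAfter(log, 2L)` removes only batch 3.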
 
-  def removeAsyncOffsetWrite(batchId: Long): Unit
   Remove the async offset write once it no longer needs to be tracked.

-  def serialize(offsetSeq: OffsetSeq, out: OutputStream): Unit
   Serialize the metadata and write it to the output stream. If this method is overridden in a subclass, the overriding method should not close the given output stream, as it will be closed in the caller.
   - Attributes: protected
   - Definition Classes: OffsetSeqLog → HDFSMetadataLog
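The rule that serialize and deserialize overrides must leave the stream open can be sketched with a pair of functions that only write, flush and read. The one-offset-per-line text format and the name `SerializeSketch` are invented for illustration and say nothing about the on-disk format OffsetSeqLog actually uses.

```scala
import java.io.{ByteArrayOutputStream, InputStream, OutputStream}
import java.nio.charset.StandardCharsets

object SerializeSketch {
  // Writes one offset per line and flushes; closing the stream is left to the caller.
  def serialize(offsets: Seq[String], out: OutputStream): Unit = {
    out.write(offsets.mkString("\n").getBytes(StandardCharsets.UTF_8))
    out.flush()
  }

  // Reads until end of stream without closing it; closing is left to the caller.
  def deserialize(in: InputStream): Seq[String] = {
    val buf = new ByteArrayOutputStream()
    val chunk = new Array[Byte](4096)
    var n = in.read(chunk)
    while (n != -1) {
      buf.write(chunk, 0, n)
      n = in.read(chunk)
    }
    val text = new String(buf.toByteArray, StandardCharsets.UTF_8)
    if (text.isEmpty) Seq.empty else text.split("\n").toSeq
  }
}
```

Leaving the close to the caller lets the caller pair it with its own cleanup, such as cancelling a partially written file when serialization fails.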
 
-  final def synchronized[T0](arg0: ⇒ T0): T0
   - Definition Classes: AnyRef

-  def toString(): String
   - Definition Classes: AnyRef → Any

-  final def wait(): Unit
   - Definition Classes: AnyRef
   - Annotations: @throws( ... )

-  final def wait(arg0: Long, arg1: Int): Unit
   - Definition Classes: AnyRef
   - Annotations: @throws( ... )

-  final def wait(arg0: Long): Unit
   - Definition Classes: AnyRef
   - Annotations: @throws( ... ) @native()

-  def write(batchMetadataFile: Path, fn: (OutputStream) ⇒ Unit): Unit
   - Attributes: protected
   - Definition Classes: HDFSMetadataLog
 
-  val writtenToDurableStorage: ConcurrentLinkedDeque[Long]