Class MarkSweepGarbageCollector
- java.lang.Object
-
- org.apache.jackrabbit.oak.plugins.blob.MarkSweepGarbageCollector
-
- All Implemented Interfaces:
BlobGarbageCollector
public class MarkSweepGarbageCollector extends java.lang.Object implements BlobGarbageCollector
Mark and sweep garbage collector. Stores its internal state on the file system while a collection is in progress, so that very large data sets can be handled. This class is not thread-safe.
-
-
Field Summary
Fields
Modifier and Type | Field | Description
static int
DEFAULT_BATCH_COUNT
static java.lang.String
DELIM
static Logger
LOG
static java.lang.String
TEMP_DIR
-
Constructor Summary
Constructors
Constructor | Description
MarkSweepGarbageCollector(BlobReferenceRetriever marker, org.apache.jackrabbit.oak.spi.blob.GarbageCollectableBlobStore blobStore, java.util.concurrent.Executor executor, long maxLastModifiedInterval, @Nullable java.lang.String repositoryId, @Nullable org.apache.jackrabbit.oak.spi.whiteboard.Whiteboard whiteboard, @Nullable org.apache.jackrabbit.oak.stats.StatisticsProvider statisticsProvider)
Instantiates a new blob garbage collector.
MarkSweepGarbageCollector(BlobReferenceRetriever marker, org.apache.jackrabbit.oak.spi.blob.GarbageCollectableBlobStore blobStore, java.util.concurrent.Executor executor, java.lang.String root, int batchCount, long maxLastModifiedInterval, boolean checkConsistencyAfterGc, boolean sweepIfRefsPastRetention, @Nullable java.lang.String repositoryId, @Nullable org.apache.jackrabbit.oak.spi.whiteboard.Whiteboard whiteboard, @Nullable org.apache.jackrabbit.oak.stats.StatisticsProvider statisticsProvider)
Creates an instance of MarkSweepGarbageCollector.
MarkSweepGarbageCollector(BlobReferenceRetriever marker, org.apache.jackrabbit.oak.spi.blob.GarbageCollectableBlobStore blobStore, java.util.concurrent.Executor executor, java.lang.String root, int batchCount, long maxLastModifiedInterval, @Nullable java.lang.String repositoryId)
-
Method Summary
Modifier and Type | Method | Description
long
checkConsistency()
Checks the DataStore for consistency and reports the number of missing blobs still referenced.
long
checkConsistency(boolean markOnly)
Collects the blob references and consolidates references from other repositories if available in the DataStore.
void
collectGarbage(boolean markOnly)
Marks garbage blobs from the passed node store instance.
void
collectGarbage(boolean markOnly, boolean forceBlobRetrieve)
Marks garbage blobs from the passed node store instance.
OperationsStatsMBean
getConsistencyOperationStats()
Returns consistency operation statistics.
OperationsStatsMBean
getOperationStats()
Returns operation statistics.
java.util.List<GarbageCollectionRepoStats>
getStats()
Returns the stats related to GC for all repos.
protected void
iterateNodeTree(GarbageCollectorFileState fs, boolean logPath)
Iterates the complete node tree and collects all blob references.
protected void
mark(GarbageCollectorFileState fs)
Mark phase of the GC.
protected void
markAndSweep(boolean markOnly, boolean forceBlobRetrieve)
Mark and sweep.
void
setClock(org.apache.jackrabbit.oak.stats.Clock clock)
void
setTraceOutput(boolean trace)
protected long
sweep(GarbageCollectorFileState fs, long markStart, boolean forceBlobRetrieve)
Sweep phase of gc candidate deletion.
-
-
-
Field Detail
-
LOG
public static final Logger LOG
-
TEMP_DIR
public static final java.lang.String TEMP_DIR
-
DEFAULT_BATCH_COUNT
public static final int DEFAULT_BATCH_COUNT
- See Also:
- Constant Field Values
-
DELIM
public static final java.lang.String DELIM
- See Also:
- Constant Field Values
-
-
Constructor Detail
-
MarkSweepGarbageCollector
public MarkSweepGarbageCollector(BlobReferenceRetriever marker, org.apache.jackrabbit.oak.spi.blob.GarbageCollectableBlobStore blobStore, java.util.concurrent.Executor executor, java.lang.String root, int batchCount, long maxLastModifiedInterval, boolean checkConsistencyAfterGc, boolean sweepIfRefsPastRetention, @Nullable java.lang.String repositoryId, @Nullable org.apache.jackrabbit.oak.spi.whiteboard.Whiteboard whiteboard, @Nullable org.apache.jackrabbit.oak.stats.StatisticsProvider statisticsProvider) throws java.io.IOException
Creates an instance of MarkSweepGarbageCollector.
- Parameters:
marker
- BlobReferenceRetriever instance used to fetch referenced blob entries
blobStore
- the blob store instance
executor
- the executor
root
- the absolute path of the root directory under which temporary files are created
batchCount
- batch size used for saving intermediate state
maxLastModifiedInterval
- interval in milliseconds, subtracted from the timestamp of the marked references to obtain a cutoff. Only blobs with a last-modified time earlier than that cutoff are considered for GC
repositoryId
- unique repository id for this node
whiteboard
- whiteboard instance
statisticsProvider
- statistics provider instance
- Throws:
java.io.IOException
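The root and batchCount parameters relate to how intermediate state is kept on disk rather than in memory. The following is only a rough, self-contained sketch of that idea, not Oak's implementation: the class BatchedStateSketch, its methods, and the "gc-refs-" file prefix are hypothetical names chosen for illustration. It assumes ids arrive as an iterator and are flushed to a temporary file under root every batchCount entries.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration of batching state to disk; not Oak code.
final class BatchedStateSketch {

    /**
     * Writes ids to a new temporary file under root, one id per line.
     * Ids are buffered and written out every batchCount entries, so at most
     * batchCount ids are held in the buffer at any time.
     */
    static Path saveInBatches(Iterator<String> ids, Path root, int batchCount) {
        if (batchCount <= 0) {
            throw new IllegalArgumentException("batchCount must be positive");
        }
        try {
            Path file = Files.createTempFile(root, "gc-refs-", ".txt");
            List<String> batch = new ArrayList<>();
            try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
                while (ids.hasNext()) {
                    batch.add(ids.next());
                    if (batch.size() == batchCount) {
                        writeBatch(out, batch);
                    }
                }
                writeBatch(out, batch); // write the last, possibly partial, batch
            }
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads back all ids previously written by saveInBatches. */
    static List<String> readAll(Path file) {
        try {
            return Files.readAllLines(file, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes the buffered ids, flushes the writer, and empties the buffer.
    private static void writeBatch(BufferedWriter out, List<String> batch) throws IOException {
        for (String id : batch) {
            out.write(id);
            out.newLine();
        }
        out.flush();
        batch.clear();
    }
}
```

A smaller batchCount lowers peak memory use at the cost of more frequent writes; the sketch wraps IOException in UncheckedIOException only to keep the example short.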
-
MarkSweepGarbageCollector
public MarkSweepGarbageCollector(BlobReferenceRetriever marker, org.apache.jackrabbit.oak.spi.blob.GarbageCollectableBlobStore blobStore, java.util.concurrent.Executor executor, java.lang.String root, int batchCount, long maxLastModifiedInterval, @Nullable java.lang.String repositoryId) throws java.io.IOException
- Throws:
java.io.IOException
-
MarkSweepGarbageCollector
public MarkSweepGarbageCollector(BlobReferenceRetriever marker, org.apache.jackrabbit.oak.spi.blob.GarbageCollectableBlobStore blobStore, java.util.concurrent.Executor executor, long maxLastModifiedInterval, @Nullable java.lang.String repositoryId, @Nullable org.apache.jackrabbit.oak.spi.whiteboard.Whiteboard whiteboard, @Nullable org.apache.jackrabbit.oak.stats.StatisticsProvider statisticsProvider) throws java.io.IOException
Instantiates a new blob garbage collector.
- Throws:
java.io.IOException
-
-
Method Detail
-
collectGarbage
public void collectGarbage(boolean markOnly) throws java.lang.Exception
Description copied from interface: BlobGarbageCollector
Marks garbage blobs from the passed node store instance. Collects them only if markOnly is false.
- Specified by:
collectGarbage in interface BlobGarbageCollector
- Parameters:
markOnly
- whether to only mark references and skip the sweep phase of the mark and sweep operation
- Throws:
java.lang.Exception
- the exception
-
collectGarbage
public void collectGarbage(boolean markOnly, boolean forceBlobRetrieve) throws java.lang.Exception
Description copied from interface: BlobGarbageCollector
Marks garbage blobs from the passed node store instance. Collects them only if markOnly is false. Also forces retrieval of blob ids from the blob store rather than using any local tracking.
- Specified by:
collectGarbage in interface BlobGarbageCollector
- Parameters:
markOnly
- whether to only mark references and skip the sweep phase of the mark and sweep operation
forceBlobRetrieve
- whether to force retrieval of blob ids from the data store
- Throws:
java.lang.Exception
-
getStats
public java.util.List<GarbageCollectionRepoStats> getStats() throws java.lang.Exception
Returns the stats related to GC for all repos.
- Specified by:
getStats in interface BlobGarbageCollector
- Returns:
- a list of GarbageCollectionRepoStats objects
- Throws:
java.lang.Exception
-
getOperationStats
public OperationsStatsMBean getOperationStats()
Description copied from interface: BlobGarbageCollector
Returns operation statistics.
- Specified by:
getOperationStats in interface BlobGarbageCollector
- Returns:
- stats object
-
getConsistencyOperationStats
public OperationsStatsMBean getConsistencyOperationStats()
Description copied from interface: BlobGarbageCollector
Returns consistency operation statistics.
- Specified by:
getConsistencyOperationStats in interface BlobGarbageCollector
- Returns:
- stats object
-
markAndSweep
protected void markAndSweep(boolean markOnly, boolean forceBlobRetrieve) throws java.lang.Exception
Mark and sweep. Main entry method for GC.
- Parameters:
markOnly
- whether to run the mark phase only
forceBlobRetrieve
- whether to force retrieval of blob ids
- Throws:
java.lang.Exception
- the exception
-
mark
protected void mark(GarbageCollectorFileState fs) throws java.io.IOException, DataStoreException
Mark phase of the GC.
- Parameters:
fs
- the garbage collector file state
- Throws:
java.io.IOException
DataStoreException
-
sweep
protected long sweep(GarbageCollectorFileState fs, long markStart, boolean forceBlobRetrieve) throws java.lang.Exception
Sweep phase of gc candidate deletion.
Performs the following steps, depending on the type of the blob store (see SharedDataStore.Type):
- Shared
  - Merges all marked references (from mark phases run independently) available in the data store meta store, from all configured independent repositories.
  - Retrieves all available blob ids.
  - Diffs the two sets above to obtain the list of blob ids that are not used.
  - Deletes from that list only blobs last modified before (earliest timestamp of the marked references - maxLastModifiedInterval).
- Default
  - The mark phase has already run.
  - Retrieves all available blob ids.
  - Diffs the two sets above to obtain the list of blob ids that are not used.
  - Deletes from that list only blobs last modified before (timestamp of the marked references - maxLastModifiedInterval).
- Parameters:
fs
- the garbage collector file state
markStart
- the start time of the mark phase, taken as the reference for deletion
forceBlobRetrieve
- whether to force retrieval of blob ids from the blob store
- Returns:
- the number of blobs deleted
- Throws:
java.lang.Exception
- the exception
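The diff and age-filter steps of the sweep phase can be sketched in plain Java. This is a simplified in-memory illustration, not Oak's implementation: the class SweepSketch, the method sweepCandidates, and the map-based inputs are hypothetical, whereas the real collector keeps its state in files. Following the maxLastModifiedInterval parameter description, the sketch assumes a blob is eligible only when its last-modified time is earlier than markStart - maxLastModifiedInterval.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical, in-memory illustration of the sweep steps; not Oak code.
final class SweepSketch {

    /**
     * Returns the ids of blobs that are unreferenced and old enough to delete.
     *
     * @param availableBlobs blob id to last-modified time in millis, for all blobs in the store
     * @param markedRefs blob ids that are still referenced
     * @param markStart start time of the mark phase in millis
     * @param maxLastModifiedInterval interval in millis subtracted from markStart to get the cutoff
     */
    static List<String> sweepCandidates(Map<String, Long> availableBlobs,
                                        Set<String> markedRefs,
                                        long markStart,
                                        long maxLastModifiedInterval) {
        long cutoff = markStart - maxLastModifiedInterval;
        List<String> candidates = new ArrayList<>();
        // Iterate in sorted id order so the result is deterministic.
        for (Map.Entry<String, Long> e : new TreeMap<>(availableBlobs).entrySet()) {
            boolean unreferenced = !markedRefs.contains(e.getKey()); // diff of the two sets
            boolean oldEnough = e.getValue() < cutoff;               // age filter
            if (unreferenced && oldEnough) {
                candidates.add(e.getKey());
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        Map<String, Long> blobs = Map.of("a", 100L, "b", 100L, "c", 900L);
        Set<String> refs = Set.of("a");
        // "a" is referenced and "c" is too recent (900 >= 1000 - 500), so only "b" remains.
        System.out.println(sweepCandidates(blobs, refs, 1000L, 500L));
    }
}
```

The age filter is what protects blobs that were added after the mark phase collected its references: they are unreferenced from the marker's point of view but too recent to be deleted safely.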
-
iterateNodeTree
protected void iterateNodeTree(GarbageCollectorFileState fs, boolean logPath) throws java.io.IOException
Iterates the complete node tree and collects all blob references.
- Parameters:
fs
- the garbage collector file state
logPath
- whether to log the path in the file
- Throws:
java.io.IOException
-
checkConsistency
public long checkConsistency(boolean markOnly) throws java.lang.Exception
Description copied from interface: BlobGarbageCollector
Collects the blob references and consolidates references from other repositories if available in the DataStore. Adds relevant metrics.
- Specified by:
checkConsistency in interface BlobGarbageCollector
- Returns:
- Throws:
java.lang.Exception
-
checkConsistency
public long checkConsistency() throws java.lang.Exception
Checks the DataStore for consistency and reports the number of missing blobs still referenced.
- Specified by:
checkConsistency in interface BlobGarbageCollector
- Returns:
- the number of missing blobs
- Throws:
java.lang.Exception
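Conceptually, a consistency check is the reverse of the sweep diff: it looks for ids that are referenced but no longer present in the store. The sketch below illustrates that idea with in-memory sets only. It is an assumption about the general technique, not a description of Oak's implementation, and the class ConsistencySketch and method missingBlobs are hypothetical names.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration of a consistency check; not Oak code.
final class ConsistencySketch {

    /** Returns the referenced blob ids that are absent from the available set, in sorted order. */
    static Set<String> missingBlobs(Set<String> referenced, Set<String> available) {
        Set<String> missing = new TreeSet<>(referenced); // copy, so the inputs are not modified
        missing.removeAll(available);                    // keep only ids not found in the store
        return missing;
    }

    public static void main(String[] args) {
        Set<String> missing = missingBlobs(Set.of("a", "b", "c"), Set.of("a", "c", "d"));
        // Only "b" is referenced without being available, so one blob is missing.
        System.out.println(missing.size() + " missing: " + missing);
    }
}
```

A result of zero missing ids corresponds to a consistent store in this simplified model.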
-
setTraceOutput
public void setTraceOutput(boolean trace)
-
setClock
public void setClock(org.apache.jackrabbit.oak.stats.Clock clock)
-
-