Package org.archive.modules.recrawl
Class ContentDigestHistoryLoader
java.lang.Object
org.archive.modules.Processor
org.archive.modules.recrawl.ContentDigestHistoryLoader
- All Implemented Interfaces:
org.archive.checkpointing.Checkpointable, org.archive.spring.HasKeyedProperties, org.springframework.beans.factory.Aware, org.springframework.beans.factory.BeanNameAware, org.springframework.context.Lifecycle
public class ContentDigestHistoryLoader extends Processor
-
Field Summary
Fields
Modifier and Type                          Field
protected AbstractContentDigestHistory     contentDigestHistory
-
Constructor Summary
Constructors
Constructor
ContentDigestHistoryLoader()
-
Method Summary
Modifier and Type     Method and Description
protected void        innerProcess(CrawlURI curi)
                      Actually performs the process.
void                  setContentDigestHistory(AbstractContentDigestHistory contentDigestHistory)
protected boolean     shouldProcess(CrawlURI uri)
                      Determines whether the given uri should be processed by this processor.
Methods inherited from class org.archive.modules.Processor
doCheckpoint, finishCheckpoint, flattenVia, fromCheckpointJson, getBeanName, getEnabled, getKeyedProperties, getRecordedSize, getShouldProcessRule, getURICount, hasHttpAuthenticationCredential, innerProcessResult, innerRejectProcess, isRunning, isSuccess, process, report, setBeanName, setEnabled, setRecoveryCheckpoint, setShouldProcessRule, start, startCheckpoint, stop, toCheckpointJson
-
Field Details
-
contentDigestHistory
protected AbstractContentDigestHistory contentDigestHistory
-
-
Constructor Details
-
ContentDigestHistoryLoader
public ContentDigestHistoryLoader()
-
-
Method Details
-
setContentDigestHistory
public void setContentDigestHistory(AbstractContentDigestHistory contentDigestHistory)
-
shouldProcess
Description copied from class: Processor
Determines whether the given uri should be processed by this processor. For instance, a processor that only works on HTML content might reject the URI if its content type is not "text/html", if its content length is zero, and so on.
- Specified by:
shouldProcess in class Processor
- Parameters:
uri - the URI to test
- Returns:
true if this processor should process that uri; false if not
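The HTML-only case mentioned in the description could be sketched roughly as follows. This is a self-contained illustration, not Heritrix code: SketchUri and HtmlOnlySketch are hypothetical stand-ins, and the two fields on SketchUri are assumptions made for the example rather than the real CrawlURI API.

```java
// Hypothetical stand-in for CrawlURI; the real class exposes a much larger API.
final class SketchUri {
    final String contentType;   // assumed field, e.g. "text/html; charset=utf-8"
    final long contentLength;   // assumed field, body size in bytes

    SketchUri(String contentType, long contentLength) {
        this.contentType = contentType;
        this.contentLength = contentLength;
    }
}

// Hypothetical processor-style check that accepts only non-empty HTML responses.
final class HtmlOnlySketch {
    static boolean shouldProcess(SketchUri uri) {
        // Reject URIs whose content type is missing or is not HTML.
        if (uri.contentType == null || !uri.contentType.startsWith("text/html")) {
            return false;
        }
        // Reject empty bodies; there is nothing to work on.
        return uri.contentLength > 0;
    }
}
```

In a real subclass of Processor, the same kind of test would sit in an override of shouldProcess(CrawlURI) and read the corresponding values from the CrawlURI.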
-
innerProcess
Description copied from class: Processor
Actually performs the process. By the time this method is invoked, it is known that the given URI passes the Processor.getEnabled(), the Processor.getShouldProcessRule() and the Processor.shouldProcess(CrawlURI) tests.
- Specified by:
innerProcess in class Processor
- Parameters:
curi - the URI to process
- Throws:
InterruptedException - if the thread is interrupted
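The three checks named in the description can be pictured as gates in front of innerProcess. The sketch below is a simplified, self-contained model and not the actual Processor implementation: GatedProcessorSketch and CountingSketch are hypothetical names, URIs are plain strings, the rule is modeled as a Predicate, and the order of the gates is assumed from the order in which the description lists them.

```java
import java.util.function.Predicate;

// Hypothetical, simplified model of a processor with three gates.
abstract class GatedProcessorSketch {
    boolean enabled = true;                              // stands in for getEnabled()
    Predicate<String> shouldProcessRule = uri -> true;   // stands in for getShouldProcessRule()

    protected abstract boolean shouldProcess(String uri);

    protected abstract void innerProcess(String uri) throws InterruptedException;

    // Returns true only if every gate passed and innerProcess was invoked.
    final boolean process(String uri) throws InterruptedException {
        if (!enabled) {
            return false;                    // gate 1: processor disabled
        }
        if (!shouldProcessRule.test(uri)) {
            return false;                    // gate 2: rejected by the configured rule
        }
        if (!shouldProcess(uri)) {
            return false;                    // gate 3: rejected by the processor's own check
        }
        innerProcess(uri);                   // all gates passed
        return true;
    }
}

// Hypothetical subclass that counts how many URIs reached innerProcess.
final class CountingSketch extends GatedProcessorSketch {
    int processed = 0;

    @Override
    protected boolean shouldProcess(String uri) {
        return uri.startsWith("http");
    }

    @Override
    protected void innerProcess(String uri) {
        processed++;
    }
}
```

Because the gates run first, an innerProcess implementation in this model does not need to repeat any of those checks itself.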
-