Uses of Class org.archive.modules.CrawlURI (Heritrix 3: 'modules' subproject (reusable components) 3.4.0-20210527 API)

Packages that use CrawlURI
Package	Description
org.archive.crawler.util
org.archive.modules	The beginnings of a refactored settings framework.
org.archive.modules.credential	Contains html form login and basic and digest credentials used by Heritrix logging into sites.
org.archive.modules.deciderules
org.archive.modules.deciderules.recrawl
org.archive.modules.deciderules.surt
org.archive.modules.extractor
org.archive.modules.fetcher
org.archive.modules.forms
org.archive.modules.net
org.archive.modules.recrawl
org.archive.modules.seeds
org.archive.modules.warc
org.archive.modules.writer
org.archive.state

Uses of CrawlURI in org.archive.crawler.util

Methods in org.archive.crawler.util with parameters of type CrawlURI
Modifier and Type	Method and Description
`void`	CrawledBytesHistotable.`accumulate(CrawlURI curi)`

Uses of CrawlURI in org.archive.modules

Fields in org.archive.modules declared as CrawlURI
Modifier and Type	Field and Description
`protected CrawlURI`	CrawlURI.`fullVia`

Fields in org.archive.modules with type parameters of type CrawlURI
Modifier and Type	Field and Description
`protected Collection<CrawlURI>`	CrawlURI.`outLinks` All discovered outbound urls as CrawlURIs (navlinks, embeds, etc.)

Methods in org.archive.modules that return CrawlURI
Modifier and Type	Method and Description
`CrawlURI`	CrawlURI.`clearPrerequisiteUri()` Clear prerequisite, if any.
`CrawlURI`	CrawlURI.`createCrawlURI(String destination, LinkContext context, Hop hop)`
`CrawlURI`	CrawlURI.`createCrawlURI(UURI destination, LinkContext context, Hop hop)` Utility method for creating CrawlURIs that were found as outlinks from this CrawlURI.
`CrawlURI`	CrawlURI.`createCrawlURI(UURI destination, LinkContext context, Hop hop, int scheduling, boolean seed)` Utility method for creation of CrawlURIs found extracting links from this CrawlURI.
`static CrawlURI`	CrawlURI.`fromHopsViaString(String uriHopsViaContext)`
`CrawlURI`	CrawlURI.`getFullVia()`
`CrawlURI`	CrawlURI.`getPrerequisiteUri()` Get the prerequisite for this URI.
`CrawlURI`	CrawlURI.`markPrerequisite(String preq)` Do all actions associated with setting a `CrawlURI` as requiring a prerequisite.

Methods in org.archive.modules that return types with arguments of type CrawlURI
Modifier and Type	Method and Description
`Collection<CrawlURI>`	CrawlURI.`getOutLinks()` Returns discovered links.

Methods in org.archive.modules with parameters of type CrawlURI
Modifier and Type	Method and Description
`int`	CrawlURI.`compareTo(CrawlURI o)`
`static String`	Processor.`flattenVia(CrawlURI puri)`
`static long`	Processor.`getRecordedSize(CrawlURI puri)`
`static boolean`	Processor.`hasHttpAuthenticationCredential(CrawlURI puri)`
`protected void`	CrawlURI.`inheritFrom(CrawlURI ancestor)` Inherit (copy) the relevant keys-values from the ancestor.
`protected void`	ScriptedProcessor.`innerProcess(CrawlURI curi)`
`protected abstract void`	Processor.`innerProcess(CrawlURI uri)` Actually performs the process.
`protected ProcessResult`	Processor.`innerProcessResult(CrawlURI uri)`
`protected void`	Processor.`innerRejectProcess(CrawlURI uri)` Invoked after a URI has been rejected.
`static boolean`	Processor.`isSuccess(CrawlURI puri)`
`ProcessResult`	Processor.`process(CrawlURI uri)` Processes the given URI.
`void`	ProcessorChain.`process(CrawlURI curi, ProcessorChain.ChainStatusReceiver thread)`
`void`	CrawlURI.`setFullVia(CrawlURI curi)`
`void`	CrawlURI.`setPrerequisiteUri(CrawlURI pre)` Set a prerequisite for this URI.
`protected boolean`	ScriptedProcessor.`shouldProcess(CrawlURI curi)`
`protected abstract boolean`	Processor.`shouldProcess(CrawlURI uri)` Determines whether the given uri should be processed by this processor.
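
The `process`, `shouldProcess`, and `innerProcess` entries above read like a template-method arrangement: a public entry point that gates on a predicate and delegates the work to a subclass hook. The following is a minimal, self-contained sketch of that idea, not Heritrix code. `Uri`, `MiniProcessor`, and `HttpTagger` are hypothetical stand-ins, and the exact dispatch order inside the real `Processor.process(CrawlURI)` is an assumption here.

```java
import java.util.ArrayList;
import java.util.List;

// Small driver showing one URI that is processed and one that is skipped.
class ProcessorSketch {
    public static void main(String[] args) {
        MiniProcessor p = new HttpTagger();
        Uri web = new Uri("http://example.com/");
        Uri dns = new Uri("dns:example.com");
        System.out.println(p.process(web) + " " + web.annotations);
        System.out.println(p.process(dns) + " " + dns.annotations);
    }
}

// Hypothetical stand-in for CrawlURI: a URI string plus a list of annotations.
class Uri {
    final String value;
    final List<String> annotations = new ArrayList<>();

    Uri(String value) {
        this.value = value;
    }
}

// Hypothetical stand-in for Processor: process() gates on shouldProcess()
// and delegates the actual work to innerProcess().
abstract class MiniProcessor {
    protected abstract boolean shouldProcess(Uri uri);

    protected abstract void innerProcess(Uri uri);

    // Returns true if innerProcess() ran, false if the URI was skipped.
    boolean process(Uri uri) {
        if (!shouldProcess(uri)) {
            return false;
        }
        innerProcess(uri);
        return true;
    }
}

// Example subclass: handles only http/https URIs and tags each one it sees.
class HttpTagger extends MiniProcessor {
    @Override
    protected boolean shouldProcess(Uri uri) {
        return uri.value.startsWith("http://") || uri.value.startsWith("https://");
    }

    @Override
    protected void innerProcess(Uri uri) {
        uri.annotations.add("tagged");
    }
}
```

Keeping the gate (`shouldProcess`) separate from the work (`innerProcess`) lets each subclass state its applicability without repeating the dispatch logic.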

Uses of CrawlURI in org.archive.modules.credential

Methods in org.archive.modules.credential with parameters of type CrawlURI
Modifier and Type	Method and Description
`void`	Credential.`attach(CrawlURI curi)` Attach this credential's avatar to the passed `curi`.
`boolean`	Credential.`detach(CrawlURI curi)` Detach this credential from passed curi.
`boolean`	Credential.`detachAll(CrawlURI curi)` Detach all credentials of this type from passed curi.
`static HttpAuthenticationCredential`	HttpAuthenticationCredential.`getByRealm(Set<Credential> rfc2617Credentials, String realm, CrawlURI context)` Convenience method that looks up a credential in the passed set, using realm as the key.
`String`	HttpAuthenticationCredential.`getPrerequisite(CrawlURI curi)`
`String`	HtmlFormCredential.`getPrerequisite(CrawlURI curi)`
`abstract String`	Credential.`getPrerequisite(CrawlURI curi)` Return the authentication URI, either absolute or relative, that serves as prerequisite for the passed `curi`.
`boolean`	HttpAuthenticationCredential.`hasPrerequisite(CrawlURI curi)`
`boolean`	HtmlFormCredential.`hasPrerequisite(CrawlURI curi)`
`abstract boolean`	Credential.`hasPrerequisite(CrawlURI curi)`
`boolean`	HttpAuthenticationCredential.`isPrerequisite(CrawlURI curi)`
`boolean`	HtmlFormCredential.`isPrerequisite(CrawlURI curi)`
`abstract boolean`	Credential.`isPrerequisite(CrawlURI curi)`
`boolean`	Credential.`rootUriMatch(ServerCache cache, CrawlURI curi)` Test whether the passed curi matches this credential's rootUri.
`Set<Credential>`	CredentialStore.`subset(CrawlURI context, Class<?> type)` Return set made up of all credentials of the passed `type`.
`Set<Credential>`	CredentialStore.`subset(CrawlURI context, Class<?> type, String rootUri)` Return set made up of all credentials of the passed `type`.

Uses of CrawlURI in org.archive.modules.deciderules

Methods in org.archive.modules.deciderules with parameters of type CrawlURI
Modifier and Type	Method and Description
`boolean`	DecideRule.`accepts(CrawlURI uri)`
`DecideResult`	DecideRule.`decisionFor(CrawlURI uri)`
`protected void`	DecideRuleSequence.`decisionMade(CrawlURI uri, DecideRule decisiveRule, int decisiveRuleNumber, DecideResult result)`
`protected boolean`	NotMatchesRegexDecideRule.`evaluate(CrawlURI object)` Evaluate whether given object's string version does not match configured regex (by reversing the superclass's answer).
`protected boolean`	IpAddressSetDecideRule.`evaluate(CrawlURI curi)`
`protected boolean`	NotMatchesFilePatternDecideRule.`evaluate(CrawlURI uri)` Evaluate whether given object's string version does not match configured regex (by reversing the superclass's answer).
`protected boolean`	SchemeNotInSetDecideRule.`evaluate(CrawlURI uri)` Evaluate whether the given URI's scheme is not in the configured set of schemes.
`protected boolean`	FetchStatusDecideRule.`evaluate(CrawlURI uri)` Evaluate whether given object is equal to the configured status
`protected boolean`	FetchStatusNotMatchesRegexDecideRule.`evaluate(CrawlURI object)` Evaluate whether given object's FetchStatus does not match configured regex (by reversing the superclass's answer).
`protected boolean`	TransclusionDecideRule.`evaluate(CrawlURI curi)` Evaluate whether given object is within the acceptable thresholds of transitive hops.
`protected boolean`	TooManyPathSegmentsDecideRule.`evaluate(CrawlURI curi)` Evaluate whether given object is over the threshold number of path-segments.
`protected boolean`	SourceSeedDecideRule.`evaluate(CrawlURI curi)`
`protected boolean`	TooManyHopsDecideRule.`evaluate(CrawlURI uri)` Evaluate whether given object is over the threshold number of hops.
`protected boolean`	ResourceNoLongerThanDecideRule.`evaluate(CrawlURI curi)`
`protected boolean`	ExternalGeoLocationDecideRule.`evaluate(CrawlURI uri)`
`protected boolean`	MatchesListRegexDecideRule.`evaluate(CrawlURI uri)` Evaluate whether given object's string version matches configured regexes
`protected boolean`	NotMatchesListRegexDecideRule.`evaluate(CrawlURI object)` Evaluate whether given object's string version does not match configured regexes (by reversing the superclass's answer).
`protected boolean`	HopCrossesAssignmentLevelDomainDecideRule.`evaluate(CrawlURI uri)`
`protected boolean`	ContentTypeNotMatchesRegexDecideRule.`evaluate(CrawlURI o)` Evaluate whether given object's string version does not match configured regex (by reversing the superclass's answer).
`protected boolean`	ViaSurtPrefixedDecideRule.`evaluate(CrawlURI uri)` Evaluate whether given object's surt form matches one of the supplied surts
`protected boolean`	HasViaDecideRule.`evaluate(CrawlURI uri)` Evaluate whether the given URI has a via.
`protected boolean`	MatchesRegexDecideRule.`evaluate(CrawlURI uri)` Evaluate whether given object's string version matches configured regex
`protected boolean`	ResponseContentLengthDecideRule.`evaluate(CrawlURI uri)`
`protected boolean`	MatchesStatusCodeDecideRule.`evaluate(CrawlURI uri)` Returns "true" if the provided CrawlURI has a fetch status that falls within this instance's specified range.
`protected boolean`	NotMatchesStatusCodeDecideRule.`evaluate(CrawlURI uri)` Returns "true" if the provided CrawlURI has a fetch status that does not fall within this instance's specified range.
`protected abstract boolean`	PredicatedDecideRule.`evaluate(CrawlURI object)`
`protected boolean`	AddRedirectFromRootServerToScope.`evaluate(CrawlURI uri)`
`protected String`	IpAddressSetDecideRule.`getHostAddress(CrawlURI curi)` from WriterPoolProcessor
`protected String`	FetchStatusMatchesRegexDecideRule.`getString(CrawlURI uri)`
`protected String`	HopsPathMatchesRegexDecideRule.`getString(CrawlURI uri)`
`protected String`	ContentTypeMatchesRegexDecideRule.`getString(CrawlURI uri)`
`protected String`	MatchesRegexDecideRule.`getString(CrawlURI uri)`
`DecideResult`	DecideRuleSequence.`innerDecide(CrawlURI uri)`
`protected DecideResult`	SeedAcceptDecideRule.`innerDecide(CrawlURI uri)`
`protected DecideResult`	AcceptDecideRule.`innerDecide(CrawlURI uri)`
`DecideResult`	ScriptedDecideRule.`innerDecide(CrawlURI uri)`
`protected DecideResult`	RejectDecideRule.`innerDecide(CrawlURI uri)`
`protected DecideResult`	ContentLengthDecideRule.`innerDecide(CrawlURI uri)`
`DecideResult`	PrerequisiteAcceptDecideRule.`innerDecide(CrawlURI uri)`
`protected DecideResult`	PathologicalPathDecideRule.`innerDecide(CrawlURI uri)`
`protected DecideResult`	PredicatedDecideRule.`innerDecide(CrawlURI uri)`
`protected abstract DecideResult`	DecideRule.`innerDecide(CrawlURI uri)`
`DecideResult`	AcceptDecideRule.`onlyDecision(CrawlURI uri)`
`DecideResult`	RejectDecideRule.`onlyDecision(CrawlURI uri)`
`DecideResult`	PredicatedDecideRule.`onlyDecision(CrawlURI uri)`
`DecideResult`	DecideRule.`onlyDecision(CrawlURI uri)`
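
Several `evaluate` descriptions above say a rule answers "by reversing the superclass's answer", and `PredicatedDecideRule` pairs an abstract `evaluate` with an `innerDecide`. The following self-contained sketch illustrates how such a pairing could work. It is not Heritrix code: `Decision`, `PredicatedRule`, `MatchesRegexRule`, and `NotMatchesRegexRule` are hypothetical stand-ins operating on plain strings, and the "apply the configured decision if the predicate holds, otherwise no opinion" behaviour is an assumption.

```java
import java.util.regex.Pattern;

// Small driver: a REJECT rule for .gif URIs and its negated counterpart.
class DecideRuleSketch {
    public static void main(String[] args) {
        PredicatedRule gifs = new MatchesRegexRule(Decision.REJECT, ".*\\.gif");
        PredicatedRule notGifs = new NotMatchesRegexRule(Decision.REJECT, ".*\\.gif");
        String uri = "http://example.com/logo.gif";
        System.out.println(gifs.innerDecide(uri));
        System.out.println(notGifs.innerDecide(uri));
    }
}

// Hypothetical stand-in for DecideResult.
enum Decision { ACCEPT, REJECT, NONE }

// Hypothetical stand-in for PredicatedDecideRule: return the configured
// decision when evaluate() is true, otherwise express no opinion.
abstract class PredicatedRule {
    private final Decision decision;

    PredicatedRule(Decision decision) {
        this.decision = decision;
    }

    protected abstract boolean evaluate(String uri);

    Decision innerDecide(String uri) {
        return evaluate(uri) ? decision : Decision.NONE;
    }
}

// Predicate: the whole URI string matches the configured regex.
class MatchesRegexRule extends PredicatedRule {
    private final Pattern regex;

    MatchesRegexRule(Decision decision, String regex) {
        super(decision);
        this.regex = Pattern.compile(regex);
    }

    @Override
    protected boolean evaluate(String uri) {
        return regex.matcher(uri).matches();
    }
}

// Negated predicate: simply reverses the superclass's answer.
class NotMatchesRegexRule extends MatchesRegexRule {
    NotMatchesRegexRule(Decision decision, String regex) {
        super(decision, regex);
    }

    @Override
    protected boolean evaluate(String uri) {
        return !super.evaluate(uri);
    }
}
```

Expressing the negated rule as a subclass keeps the regex handling in one place; only the boolean is flipped.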

Uses of CrawlURI in org.archive.modules.deciderules.recrawl

Methods in org.archive.modules.deciderules.recrawl with parameters of type CrawlURI
Modifier and Type	Method and Description
`protected boolean`	IdenticalDigestDecideRule.`evaluate(CrawlURI curi)` Evaluate whether given CrawlURI's revisit profile has been set to identical digest
`static boolean`	IdenticalDigestDecideRule.`hasIdenticalDigest(CrawlURI curi)` Utility method for testing if a CrawlURI's revisit profile matches an identical payload digest.

Uses of CrawlURI in org.archive.modules.deciderules.surt

Methods in org.archive.modules.deciderules.surt with parameters of type CrawlURI
Modifier and Type	Method and Description
`void`	SurtPrefixedDecideRule.`addedSeed(CrawlURI curi)` If appropriate, convert seed notification into prefix-addition.
`protected boolean`	NotOnHostsDecideRule.`evaluate(CrawlURI object)` Evaluate whether given object's URI is NOT in the set of hosts -- simply reverse superclass's determination
`protected boolean`	NotSurtPrefixedDecideRule.`evaluate(CrawlURI object)` Evaluate whether given object's URI is NOT in the SURT prefix set -- simply reverse superclass's determination
`protected boolean`	NotOnDomainsDecideRule.`evaluate(CrawlURI object)` Evaluate whether given object's URI is NOT in the set of domains -- simply reverse superclass's determination
`protected boolean`	SurtPrefixedDecideRule.`evaluate(CrawlURI uri)` Evaluate whether given object's URI is covered by the SURT prefix set

Uses of CrawlURI in org.archive.modules.extractor

Fields in org.archive.modules.extractor declared as CrawlURI
Modifier and Type	Field and Description
`protected CrawlURI`	ExtractorSWF.CrawlUriSWFAction.`curi`
`CrawlURI`	StringExtractorTestBase.TestData.`expectedResult`
`CrawlURI`	StringExtractorTestBase.TestData.`uri`

Methods in org.archive.modules.extractor that return CrawlURI
Modifier and Type	Method and Description
`static CrawlURI`	Extractor.`addRelativeToBase(CrawlURI uri, int max, String newUri, LinkContext context, Hop hop)`
`static CrawlURI`	Extractor.`addRelativeToVia(CrawlURI uri, int max, String newUri, LinkContext context, Hop hop)`
`protected CrawlURI`	ContentExtractorTestBase.`defaultURI()` Returns a CrawlURI for testing purposes.

Methods in org.archive.modules.extractor with parameters of type CrawlURI
Modifier and Type	Method and Description
`static void`	Extractor.`add(CrawlURI uri, int max, String newUri, LinkContext context, Hop hop)`
`protected void`	ExtractorSWF.CrawlUriSWFAction.`addAnnotations(CrawlURI relToVia, CrawlURI relToBase)`
`protected void`	ExtractorHTTP.`addContentLocationHeaderLink(CrawlURI curi, String headerKey)`
`protected void`	ExtractorHTTP.`addHeaderLink(CrawlURI curi, String headerKey)`
`protected void`	ExtractorHTTP.`addHeaderLink(CrawlURI curi, String headerName, String url)`
`protected void`	ExtractorHTML.`addLinkFromString(CrawlURI curi, CharSequence uri, CharSequence context, Hop hop)`
`protected void`	Extractor.`addOutlink(CrawlURI curi, String uri, LinkContext context, Hop hop)` Create and add a 'Link' to the CrawlURI with given URI/context/hop-type
`protected void`	Extractor.`addOutlink(CrawlURI curi, UURI uuri, LinkContext context, Hop hop)`
`protected void`	ExtractorHTTP.`addRefreshHeaderLink(CrawlURI curi, String headerKey)`
`static CrawlURI`	Extractor.`addRelativeToBase(CrawlURI uri, int max, String newUri, LinkContext context, Hop hop)`
`static CrawlURI`	Extractor.`addRelativeToVia(CrawlURI uri, int max, String newUri, LinkContext context, Hop hop)`
`protected static void`	ContentExtractorTestBase.`assertNoSideEffects(CrawlURI uri)` Asserts that the given URI has no URI errors, no localized errors, and no annotations.
`protected void`	ExtractorMultipleRegex.`buildAndAddOutlink(CrawlURI curi, Map<String,Object> bindings)`
`protected void`	ExtractorHTML.`considerIfLikelyUri(CrawlURI curi, CharSequence candidate, CharSequence valueContext, Hop hop)` Consider whether a given string is URI-like.
`protected void`	ExtractorHTML.`considerQueryStringValues(CrawlURI curi, CharSequence queryString, CharSequence valueContext, Hop hop)` Consider a query-string-like collection of key=value[&key=value] pairs for URI-like strings in the values.
`protected boolean`	ExtractorJS.`considerString(Extractor ext, CrawlURI curi, boolean handlingJSFile, String candidate)`
`protected long`	ExtractorJS.`considerStrings(CrawlURI curi, CharSequence cs)`
`long`	ExtractorJS.`considerStrings(Extractor ext, CrawlURI curi, CharSequence cs)`
`long`	ExtractorJS.`considerStrings(Extractor ext, CrawlURI curi, CharSequence cs, boolean handlingJSFile)`
`void`	ExtractorURI.`extract(CrawlURI curi)` Perform usual extraction on a CrawlURI
`void`	ExtractorMultipleRegex.`extract(CrawlURI curi)`
`protected void`	ExtractorHTTP.`extract(CrawlURI curi)`
`protected void`	ContentExtractor.`extract(CrawlURI uri)` Extracts links
`void`	ExtractorImpliedURI.`extract(CrawlURI curi)` Perform usual extraction on a CrawlURI
`protected abstract void`	Extractor.`extract(CrawlURI uri)` Extracts links from the given URI.
`protected void`	ExtractorHTML.`extract(CrawlURI curi, CharSequence cs)` Run extractor.
`protected void`	JerichoExtractorHTML.`extract(CrawlURI curi, CharSequence cs)` Run extractor.
`protected void`	ExtractorURI.`extractLink(CrawlURI curi, CrawlURI wref)` Consider a single Link for internal URIs
`protected Charset`	ExtractorXML.`getContentDeclaredCharset(CrawlURI curi, String contentPrefix)`
`protected Charset`	ExtractorHTML.`getContentDeclaredCharset(CrawlURI curi, String contentPrefix)`
`protected boolean`	ExtractorSWF.`innerExtract(CrawlURI curi)`
`protected boolean`	ExtractorSitemap.`innerExtract(CrawlURI uri)`
`protected boolean`	TrapSuppressExtractor.`innerExtract(CrawlURI curi)`
`protected boolean`	ExtractorUniversal.`innerExtract(CrawlURI curi)`
`protected boolean`	ExtractorRobotsTxt.`innerExtract(CrawlURI curi)`
`protected boolean`	ExtractorXML.`innerExtract(CrawlURI curi)`
`protected boolean`	ExtractorDOC.`innerExtract(CrawlURI curi)` Processes a word document and extracts any hyperlinks from it.
`boolean`	ExtractorCSS.`innerExtract(CrawlURI curi)`
`protected boolean`	ExtractorPDF.`innerExtract(CrawlURI curi)`
`protected boolean`	ExtractorJS.`innerExtract(CrawlURI curi)`
`protected abstract boolean`	ContentExtractor.`innerExtract(CrawlURI uri)` Actually extracts links.
`boolean`	ExtractorHTML.`innerExtract(CrawlURI curi)`
`protected void`	HTTPContentDigest.`innerProcess(CrawlURI curi)`
`protected void`	Extractor.`innerProcess(CrawlURI uri)` Processes the given URI.
`protected boolean`	ExtractorHTML.`isHtmlExpectedHere(CrawlURI curi)` Test whether this HTML is so unexpected (eg in place of a GIF URI) that it shouldn't be scanned for links.
`protected void`	ExtractorHTML.`processEmbed(CrawlURI curi, CharSequence value, CharSequence context)`
`protected void`	ExtractorHTML.`processEmbed(CrawlURI curi, CharSequence value, CharSequence context, Hop hop)`
`protected void`	JerichoExtractorHTML.`processForm(CrawlURI curi, au.id.jericho.lib.html.Element element)`
`protected void`	ExtractorHTML.`processGeneralTag(CrawlURI curi, CharSequence element, CharSequence cs)`
`protected void`	JerichoExtractorHTML.`processGeneralTag(CrawlURI curi, au.id.jericho.lib.html.Element element, au.id.jericho.lib.html.Attributes attributes)`
`protected void`	ExtractorHTML.`processLink(CrawlURI curi, CharSequence value, CharSequence context)` Handle generic HREF cases.
`protected boolean`	ExtractorHTML.`processMeta(CrawlURI curi, CharSequence cs)` Process metadata tags.
`protected boolean`	JerichoExtractorHTML.`processMeta(CrawlURI curi, au.id.jericho.lib.html.Element element)`
`protected void`	AggressiveExtractorHTML.`processScript(CrawlURI curi, CharSequence sequence, int endOfOpenTag)`
`protected void`	ExtractorHTML.`processScript(CrawlURI curi, CharSequence sequence, int endOfOpenTag)`
`protected void`	JerichoExtractorHTML.`processScript(CrawlURI curi, au.id.jericho.lib.html.Element element)`
`protected void`	ExtractorHTML.`processScriptCode(CrawlURI curi, CharSequence cs)` Extract the (java)script source in the given CharSequence.
`protected void`	ExtractorHTML.`processStyle(CrawlURI curi, CharSequence sequence, int endOfOpenTag)` Process style text.
`protected void`	JerichoExtractorHTML.`processStyle(CrawlURI curi, au.id.jericho.lib.html.Element element)`
`static long`	ExtractorCSS.`processStyleCode(Extractor ext, CrawlURI curi, CharSequence cs)`
`static long`	ExtractorXML.`processXml(Extractor ext, CrawlURI curi, CharSequence cs)`
`protected boolean`	ExtractorSWF.`shouldExtract(CrawlURI uri)`
`protected boolean`	ExtractorSitemap.`shouldExtract(CrawlURI uri)`
`protected boolean`	TrapSuppressExtractor.`shouldExtract(CrawlURI uri)`
`protected boolean`	ExtractorUniversal.`shouldExtract(CrawlURI uri)`
`protected boolean`	ExtractorRobotsTxt.`shouldExtract(CrawlURI uri)`
`protected boolean`	ExtractorXML.`shouldExtract(CrawlURI curi)`
`protected boolean`	ExtractorDOC.`shouldExtract(CrawlURI uri)`
`protected boolean`	ExtractorCSS.`shouldExtract(CrawlURI curi)`
`protected boolean`	ExtractorPDF.`shouldExtract(CrawlURI uri)`
`protected boolean`	ExtractorJS.`shouldExtract(CrawlURI uri)`
`protected abstract boolean`	ContentExtractor.`shouldExtract(CrawlURI uri)` Determines if otherwise valid URIs should have links extracted or not.
`protected boolean`	ExtractorHTML.`shouldExtract(CrawlURI uri)`
`protected boolean`	ExtractorURI.`shouldProcess(CrawlURI uri)`
`protected boolean`	ExtractorMultipleRegex.`shouldProcess(CrawlURI uri)`
`protected boolean`	ExtractorHTTP.`shouldProcess(CrawlURI uri)`
`protected boolean`	HTTPContentDigest.`shouldProcess(CrawlURI uri)`
`protected boolean`	ContentExtractor.`shouldProcess(CrawlURI uri)` Determines if links should be extracted from the given URI.
`protected boolean`	ExtractorImpliedURI.`shouldProcess(CrawlURI uri)`

Constructors in org.archive.modules.extractor with parameters of type CrawlURI
Constructor and Description
`CrawlUriSWFAction(CrawlURI curi, Extractor ext)`
`TestData(CrawlURI uri, CrawlURI expectedResult)`

Uses of CrawlURI in org.archive.modules.fetcher

Fields in org.archive.modules.fetcher declared as CrawlURI
Modifier and Type	Field and Description
`protected CrawlURI`	FetchHTTPRequest.`curi`

Methods in org.archive.modules.fetcher with parameters of type CrawlURI
Modifier and Type	Method and Description
`protected void`	FetchHTTP.`addResponseContent(org.apache.http.HttpResponse response, CrawlURI curi)` This method populates `curi` with response status and content type.
`protected void`	FetchWhois.`addWhoisLink(CrawlURI curi, String query)`
`protected void`	FetchWhois.`addWhoisLinks(CrawlURI curi)` Adds outlinks to whois:{domain} and whois:{ipAddress}
`protected org.apache.http.HttpEntity`	FetchHTTPRequest.`buildPostRequestEntity(CrawlURI curi)`
`protected boolean`	FetchHTTP.`checkMidfetchAbort(CrawlURI curi)`
`protected void`	FetchHTTP.`cleanup(CrawlURI curi, Exception exception, String message, int status)` Cleanup after a failed method execute.
`org.apache.http.client.CookieStore`	FetchHTTPCookieStore.`cookieStoreFor(CrawlURI curi)` Returns a `CookieStore` whose `CookieStore.getCookies()` returns all the cookies that could possibly apply to `curi`.
`org.apache.http.client.CookieStore`	AbstractCookieStore.`cookieStoreFor(CrawlURI curi)`
`protected ProcessResult`	FetchWhois.`deferOrFinishGeneric(CrawlURI curi, String domainOrIp)`
`protected void`	FetchHTTP.`doAbort(CrawlURI curi, org.apache.http.client.methods.AbstractExecutionAwareRequest request, String annotation)`
`protected Map<String,String>`	FetchHTTP.`extractChallenges(org.apache.http.HttpResponse response, CrawlURI curi, org.apache.http.client.AuthenticationStrategy authStrategy)`
`protected void`	FetchHTTP.`failedExecuteCleanup(CrawlURI curi, Exception exception)` Cleanup after a failed method execute.
`protected void`	FetchWhois.`fetch(CrawlURI curi, String whoisServer, String whoisQuery)`
`protected Object`	FetchHTTP.`getAttributeEither(CrawlURI curi, String key)` Get a value either from inside the CrawlURI instance, or from settings (module attributes).
`protected Set<Credential>`	FetchHTTP.`getCredentials(CrawlURI curi, Class<?> type)`
`protected static String`	FetchHTTP.`getServerKey(CrawlURI uri)`
`protected String`	FetchWhois.`getWhoisQuery(CrawlURI curi)`
`protected String`	FetchWhois.`getWhoisServer(CrawlURI curi)`
`protected void`	FetchHTTP.`handle401(org.apache.http.HttpResponse response, CrawlURI curi)` Server is looking for basic/digest auth credentials (RFC2617).
`protected void`	FetchFTP.`innerProcess(CrawlURI curi)` Processes the given URI.
`protected void`	FetchWhois.`innerProcess(CrawlURI uri)`
`protected void`	FetchSFTP.`innerProcess(CrawlURI curi)` Processes the given URI.
`protected void`	FetchDNS.`innerProcess(CrawlURI curi)`
`protected void`	FetchHTTP.`innerProcess(CrawlURI curi)`
`protected ProcessResult`	FetchWhois.`innerProcessResult(CrawlURI curi)`
`protected boolean`	FetchDNS.`isQuadAddress(CrawlURI curi, String dnsName, CrawlHost targetHost)`
`protected boolean`	FetchHTTP.`maybeMidfetchAbort(CrawlURI curi, org.apache.http.client.methods.AbstractExecutionAwareRequest request)`
`protected void`	FetchHTTP.`promoteCredentials(CrawlURI curi)` Promote successful credential to the server.
`protected void`	FetchDNS.`recordDNS(CrawlURI curi, org.xbill.DNS.Record[] rrecordSet)`
`protected void`	FetchHTTP.`setCharacterEncoding(CrawlURI curi, Recorder rec, org.apache.http.HttpResponse response)` Set the character encoding based on the result headers or default.
`protected void`	FetchHTTP.`setOtherCodings(CrawlURI uri, Recorder rec, org.apache.http.HttpResponse response)` Set the transfer, content encodings based on headers (if necessary).
`protected void`	FetchHTTP.`setSizes(CrawlURI curi, Recorder rec)` Update CrawlURI internal sizes based on current transaction (and in the case of 304s, history)
`protected void`	FetchDNS.`setUnresolvable(CrawlURI curi, CrawlHost host)`
`protected boolean`	FetchFTP.`shouldProcess(CrawlURI curi)`
`protected boolean`	FetchWhois.`shouldProcess(CrawlURI uri)`
`protected boolean`	FetchSFTP.`shouldProcess(CrawlURI curi)`
`protected boolean`	FetchDNS.`shouldProcess(CrawlURI curi)`
`protected boolean`	FetchHTTP.`shouldProcess(CrawlURI curi)` Can this processor fetch the given CrawlURI.
`protected void`	FetchDNS.`storeDNSRecord(CrawlURI curi, String dnsName, CrawlHost targetHost, org.xbill.DNS.Record[] rrecordSet)`
`void`	FetchStats.`tally(CrawlURI curi, FetchStats.Stage stage)`
`void`	FetchStats.CollectsFetchStats.`tally(CrawlURI curi, FetchStats.Stage stage)`

Constructors in org.archive.modules.fetcher with parameters of type CrawlURI
Constructor and Description
`FetchHTTPRequest(FetchHTTP fetcher, CrawlURI curi)`

Uses of CrawlURI in org.archive.modules.forms

Methods in org.archive.modules.forms with parameters of type CrawlURI
Modifier and Type	Method and Description
`protected void`	ExtractorHTMLForms.`analyze(CrawlURI curi, CharSequence cs)` Run analysis: find form METHOD, ACTION, and all INPUT names/values. Log as configured.
`protected void`	FormLoginProcessor.`createFormSubmissionAttempt(CrawlURI curi, HTMLForm templateForm, String formProvince)`
`void`	ExtractorHTMLForms.`extract(CrawlURI curi)`
`protected String`	FormLoginProcessor.`getFormProvince(CrawlURI curi)` Get the 'form province' - either the configured (applicableSurtPrefix) or inferred (full current server) range of URIs that is considered covered by one form login
`protected void`	FormLoginProcessor.`innerProcess(CrawlURI curi)`
`protected boolean`	FormLoginProcessor.`shouldProcess(CrawlURI curi)`
`protected boolean`	ExtractorHTMLForms.`shouldProcess(CrawlURI uri)`

Uses of CrawlURI in org.archive.modules.net

Methods in org.archive.modules.net with parameters of type CrawlURI
Modifier and Type	Method and Description
`boolean`	ObeyRobotsPolicy.`allows(String userAgent, CrawlURI curi, Robotstxt robotstxt)`
`boolean`	FirstNamedRobotsPolicy.`allows(String userAgent, CrawlURI curi, Robotstxt robotstxt)`
`boolean`	CustomRobotsPolicy.`allows(String userAgent, CrawlURI curi, Robotstxt robotstxt)`
`boolean`	MostFavoredRobotsPolicy.`allows(String userAgent, CrawlURI curi, Robotstxt robotstxt)`
`abstract boolean`	RobotsPolicy.`allows(String userAgent, CrawlURI curi, Robotstxt robotstxt)`
`boolean`	IgnoreRobotsPolicy.`allows(String userAgent, CrawlURI curi, Robotstxt robotstxt)`
`String`	RobotsPolicy.`getPathQuery(CrawlURI curi)`
`void`	CrawlServer.`updateRobots(CrawlURI curi)` Update the server's robotstxt

Uses of CrawlURI in org.archive.modules.recrawl

Methods in org.archive.modules.recrawl with parameters of type CrawlURI
Modifier and Type	Method and Description
`static boolean`	FetchHistoryProcessor.`hasIdenticalDigest(CrawlURI curi)` Utility method for testing if a CrawlURI's last two history entries (one being the most recent fetch) have identical content-digest information.
`protected boolean`	AbstractPersistProcessor.`hasWriteTag(CrawlURI uri)`
`protected HashMap<String,Object>[]`	FetchHistoryProcessor.`historyRealloc(CrawlURI curi)` Get or create proper-sized history array
`protected void`	FetchHistoryProcessor.`innerProcess(CrawlURI puri)`
`protected void`	PersistStoreProcessor.`innerProcess(CrawlURI curi)`
`protected void`	PersistLoadProcessor.`innerProcess(CrawlURI curi)`
`protected void`	PersistLogProcessor.`innerProcess(CrawlURI curi)`
`protected void`	ContentDigestHistoryStorer.`innerProcess(CrawlURI curi)`
`protected void`	ContentDigestHistoryLoader.`innerProcess(CrawlURI curi)`
`void`	BdbContentDigestHistory.`load(CrawlURI curi)`
`abstract void`	AbstractContentDigestHistory.`load(CrawlURI curi)` Looks up the history by key `persistKeyFor(curi)` and loads it into `curi.getContentDigestHistory()`.
`protected String`	AbstractContentDigestHistory.`persistKeyFor(CrawlURI curi)`
`static String`	PersistProcessor.`persistKeyFor(CrawlURI curi)` Return a preferred String key for persisting the given CrawlURI's AList state.
`protected void`	FetchHistoryProcessor.`saveHeader(CrawlURI curi, Map<String,Object> map, String key)` Save a header from the given HTTP operation into the Map.
`protected boolean`	AbstractPersistProcessor.`shouldLoad(CrawlURI curi)` Whether the current CrawlURI's state should be loaded
`protected boolean`	FetchHistoryProcessor.`shouldProcess(CrawlURI curi)`
`protected boolean`	PersistStoreProcessor.`shouldProcess(CrawlURI uri)`
`protected boolean`	PersistLoadProcessor.`shouldProcess(CrawlURI uri)`
`protected boolean`	PersistLogProcessor.`shouldProcess(CrawlURI uri)`
`protected boolean`	ContentDigestHistoryStorer.`shouldProcess(CrawlURI uri)`
`protected boolean`	ContentDigestHistoryLoader.`shouldProcess(CrawlURI uri)`
`protected boolean`	AbstractPersistProcessor.`shouldStore(CrawlURI curi)` Whether the current CrawlURI's state should be persisted (to log or direct to database)
`void`	BdbContentDigestHistory.`store(CrawlURI curi)`
`abstract void`	AbstractContentDigestHistory.`store(CrawlURI curi)` Stores `curi.getContentDigestHistory()` for the key `persistKeyFor(curi)`.
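
`FetchHistoryProcessor.hasIdenticalDigest` is described above as testing whether a CrawlURI's last two history entries carry identical content-digest information. The sketch below shows one way such a comparison could look on plain collections. It is not the Heritrix implementation: the class name, the `content-digest` key, and the convention that index 0 holds the most recent fetch are all assumptions made for illustration.

```java
import java.util.List;
import java.util.Map;

class DigestHistorySketch {
    // Hypothetical key under which each history entry stores its digest.
    static final String DIGEST_KEY = "content-digest";

    // True only if the two most recent entries both carry a digest and the
    // digests are equal. Assumes history.get(0) is the most recent fetch.
    static boolean hasIdenticalDigest(List<? extends Map<String, ?>> history) {
        if (history == null || history.size() < 2) {
            return false;
        }
        Object latest = history.get(0).get(DIGEST_KEY);
        Object previous = history.get(1).get(DIGEST_KEY);
        return latest != null && latest.equals(previous);
    }

    public static void main(String[] args) {
        List<Map<String, String>> unchanged = List.of(
                Map.of(DIGEST_KEY, "sha1:AAAA"),
                Map.of(DIGEST_KEY, "sha1:AAAA"));
        List<Map<String, String>> changed = List.of(
                Map.of(DIGEST_KEY, "sha1:BBBB"),
                Map.of(DIGEST_KEY, "sha1:AAAA"));
        System.out.println(hasIdenticalDigest(unchanged));
        System.out.println(hasIdenticalDigest(changed));
    }
}
```

A history with fewer than two entries, or an entry without a digest, is treated as "not identical" in this sketch, so a first-time fetch is never mistaken for a duplicate.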

Uses of CrawlURI in org.archive.modules.seeds

Methods in org.archive.modules.seeds with parameters of type CrawlURI
Modifier and Type	Method and Description
`void`	SeedListener.`addedSeed(CrawlURI uuri)`
`void`	TextSeedModule.`addSeed(CrawlURI curi)` Add a new seed to scope.
`abstract void`	SeedModule.`addSeed(CrawlURI curi)`
`protected void`	SeedModule.`publishAddedSeed(CrawlURI curi)`

Uses of CrawlURI in org.archive.modules.warc

Methods in org.archive.modules.warc with parameters of type CrawlURI
Modifier and Type	Method and Description
`org.archive.io.warc.WARCRecordInfo`	MetadataRecordBuilder.`buildRecord(CrawlURI curi, URI concurrentTo)`
`org.archive.io.warc.WARCRecordInfo`	HttpRequestRecordBuilder.`buildRecord(CrawlURI curi, URI concurrentTo)`
`org.archive.io.warc.WARCRecordInfo`	DnsResponseRecordBuilder.`buildRecord(CrawlURI curi, URI concurrentTo)`
`org.archive.io.warc.WARCRecordInfo`	FtpControlConversationRecordBuilder.`buildRecord(CrawlURI curi, URI concurrentTo)`
`org.archive.io.warc.WARCRecordInfo`	HttpResponseRecordBuilder.`buildRecord(CrawlURI curi, URI concurrentTo)`
`org.archive.io.warc.WARCRecordInfo`	RevisitRecordBuilder.`buildRecord(CrawlURI curi, URI concurrentTo)`
`org.archive.io.warc.WARCRecordInfo`	FtpResponseRecordBuilder.`buildRecord(CrawlURI curi, URI concurrentTo)`
`org.archive.io.warc.WARCRecordInfo`	WARCRecordBuilder.`buildRecord(CrawlURI curi, URI concurrentTo)` Builds a warc record for this capture.
`org.archive.io.warc.WARCRecordInfo`	WhoisResponseRecordBuilder.`buildRecord(CrawlURI curi, URI concurrentTo)`
`protected String`	BaseWARCRecordBuilder.`getHostAddress(CrawlURI curi)` Return IP address of given URI suitable for recording (as in a classic ARC 5-field header line).
`boolean`	MetadataRecordBuilder.`shouldBuildRecord(CrawlURI curi)` If you don't want metadata records, take this class out of the chain.
`boolean`	HttpRequestRecordBuilder.`shouldBuildRecord(CrawlURI curi)`
`boolean`	DnsResponseRecordBuilder.`shouldBuildRecord(CrawlURI curi)`
`boolean`	FtpControlConversationRecordBuilder.`shouldBuildRecord(CrawlURI curi)`
`boolean`	HttpResponseRecordBuilder.`shouldBuildRecord(CrawlURI curi)`
`boolean`	RevisitRecordBuilder.`shouldBuildRecord(CrawlURI curi)`
`boolean`	FtpResponseRecordBuilder.`shouldBuildRecord(CrawlURI curi)`
`boolean`	WARCRecordBuilder.`shouldBuildRecord(CrawlURI curi)` Decides whether to build a record for the given capture.
`boolean`	WhoisResponseRecordBuilder.`shouldBuildRecord(CrawlURI curi)`

Uses of CrawlURI in org.archive.modules.writer

Methods in org.archive.modules.writer with parameters of type CrawlURI
Modifier and Type	Method and Description
`protected void`	WriterPoolProcessor.`copyForwardWriteTagIfDupe(CrawlURI curi)` If this fetch is identical to the last written (archived) fetch, then copy forward the writeTag.
`protected String`	WriterPoolProcessor.`getHostAddress(CrawlURI curi)` Deprecated. WARCRecordBuilder instances use `BaseWARCRecordBuilder.getHostAddress(CrawlURI)`.
`protected OutputStream`	Kw3WriterProcessor.`initOutputStream(CrawlURI curi)` Get the OutputStream for the file to write to.
`protected void`	MirrorWriterProcessor.`innerProcess(CrawlURI curi)`
`protected void`	Kw3WriterProcessor.`innerProcess(CrawlURI curi)`
`protected void`	WriterPoolProcessor.`innerProcess(CrawlURI puri)`
`protected ProcessResult`	WARCWriterProcessor.`innerProcessResult(CrawlURI curi)` Deprecated. Writes a CrawlURI and its associated data to a store file.
`protected ProcessResult`	WARCWriterChainProcessor.`innerProcessResult(CrawlURI curi)`
`protected ProcessResult`	ARCWriterProcessor.`innerProcessResult(CrawlURI curi)` Writes a CrawlURI and its associated data to a store file.
`protected abstract ProcessResult`	WriterPoolProcessor.`innerProcessResult(CrawlURI uri)`
`protected void`	WriterPoolProcessor.`innerRejectProcess(CrawlURI curi)`
`protected void`	WARCWriterProcessor.`saveHeader(CrawlURI curi, org.archive.util.anvl.ANVLRecord warcHeaders, String origName, String newName)` Deprecated. Saves a header from the given HTTP operation into the provided headers under a new name.
`protected boolean`	MirrorWriterProcessor.`shouldProcess(CrawlURI curi)`
`protected boolean`	Kw3WriterProcessor.`shouldProcess(CrawlURI curi)`
`protected boolean`	WriterPoolProcessor.`shouldProcess(CrawlURI curi)`
`protected boolean`	WARCWriterChainProcessor.`shouldWrite(CrawlURI curi)`
`protected boolean`	WriterPoolProcessor.`shouldWrite(CrawlURI curi)` Whether the given CrawlURI should be written to archive files.
`protected void`	BaseWARCWriterProcessor.`updateMetadataAfterWrite(CrawlURI curi, org.archive.io.warc.WARCWriter writer, long startPosition)`
`protected ProcessResult`	WARCWriterChainProcessor.`write(CrawlURI curi)`
`protected ProcessResult`	ARCWriterProcessor.`write(CrawlURI curi, long recordLength, InputStream in, String ip)`
`protected ProcessResult`	WARCWriterProcessor.`write(String lowerCaseScheme, CrawlURI curi)` Deprecated.
`protected void`	Kw3WriterProcessor.`writeArchiveInfoPart(String boundary, CrawlURI curi, ReplayInputStream ris, OutputStream out)`
`protected void`	Kw3WriterProcessor.`writeContentPart(String boundary, CrawlURI curi, ReplayInputStream ris, OutputStream out)`
`protected void`	WARCWriterProcessor.`writeDnsRecords(CrawlURI curi, org.archive.io.warc.WARCWriter w, URI baseid, String timestamp)` Deprecated.
`protected URI`	WARCWriterProcessor.`writeFtpControlConversation(org.archive.io.warc.WARCWriter w, String timestamp, URI baseid, CrawlURI curi, org.archive.util.anvl.ANVLRecord headers, String controlConversation)` Deprecated.
`protected void`	WARCWriterProcessor.`writeFtpRecords(org.archive.io.warc.WARCWriter w, CrawlURI curi, URI baseid, String timestamp)` Deprecated.
`protected void`	WARCWriterProcessor.`writeHttpRecords(CrawlURI curi, org.archive.io.warc.WARCWriter w, URI baseid, String timestamp)` Deprecated.
`protected URI`	WARCWriterProcessor.`writeMetadata(org.archive.io.warc.WARCWriter w, String timestamp, URI baseid, CrawlURI curi, org.archive.util.anvl.ANVLRecord namedFields)` Deprecated.
`protected void`	Kw3WriterProcessor.`writeMimeFile(CrawlURI curi)` The actual writing of the Kulturarw3 MIME-file.
`protected void`	WARCWriterChainProcessor.`writeRecords(CrawlURI curi, org.archive.io.warc.WARCWriter writer)`
`protected URI`	WARCWriterProcessor.`writeRequest(org.archive.io.warc.WARCWriter w, String timestamp, String mimetype, URI baseid, CrawlURI curi, org.archive.util.anvl.ANVLRecord namedFields)` Deprecated.
`protected URI`	WARCWriterProcessor.`writeResource(org.archive.io.warc.WARCWriter w, String timestamp, String mimetype, URI baseid, CrawlURI curi, org.archive.util.anvl.ANVLRecord namedFields)` Deprecated.
`protected URI`	WARCWriterProcessor.`writeResponse(org.archive.io.warc.WARCWriter w, String timestamp, String mimetype, URI baseid, CrawlURI curi, org.archive.util.anvl.ANVLRecord suppliedFields)` Deprecated.
`protected URI`	WARCWriterProcessor.`writeRevisit(org.archive.io.warc.WARCWriter w, String timestamp, String mimetype, URI baseid, CrawlURI curi, org.archive.util.anvl.ANVLRecord headers)` Deprecated.
`protected URI`	WARCWriterProcessor.`writeRevisit(org.archive.io.warc.WARCWriter w, String timestamp, String mimetype, URI baseid, CrawlURI curi, org.archive.util.anvl.ANVLRecord headers, long contentLength)` Deprecated.
`protected void`	WARCWriterProcessor.`writeWhoisRecords(org.archive.io.warc.WARCWriter w, CrawlURI curi, URI baseid, String timestamp)` Deprecated.

Uses of CrawlURI in org.archive.state

Methods in org.archive.state that return CrawlURI
Modifier and Type	Method and Description
`protected CrawlURI`	ModuleTestBase.`makeCrawlURI(String uri)`
