Package org.archive.crawler.processor


  • Classes
    Class
    Description
    A simple crawl splitter/mapper that divides CrawlURIs between crawlers by diverting some range of URIs to local log files, which can then be imported into other crawlers.
    Maps URIs to one of N crawler names by applying a hash to the URI's (possibly-transformed) classKey.
    A simple crawl splitter/mapper that divides CrawlURIs between crawlers by diverting some range of URIs to local log files, which can then be imported into other crawlers.
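
The hash-based mapping described above can be illustrated with a minimal sketch. It is not the package's implementation: the class `HashMappingSketch`, its method names, the use of CRC32 as the hash, and the crawler names "0" through "N-1" are all assumptions chosen for illustration. The idea shown is only that a stable hash of the classKey, reduced modulo N, picks the same crawler every time, and that a URI whose crawler is not the local one would be diverted rather than crawled.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Hypothetical sketch; not part of org.archive.crawler.processor. */
final class HashMappingSketch {

    /** Returns the index (0..crawlerCount-1) of the crawler that owns the key. */
    static int bucketFor(String classKey, int crawlerCount) {
        if (crawlerCount <= 0) {
            throw new IllegalArgumentException("crawlerCount must be positive");
        }
        // CRC32 is a stand-in hash (assumption); any stable hash would do.
        CRC32 crc = new CRC32();
        crc.update(classKey.getBytes(StandardCharsets.UTF_8));
        // getValue() is an unsigned 32-bit value held in a long,
        // so the remainder is never negative.
        return (int) (crc.getValue() % crawlerCount);
    }

    /** Maps the key to a crawler name; names are simply "0".."N-1" here (assumption). */
    static String crawlerNameFor(String classKey, int crawlerCount) {
        return Integer.toString(bucketFor(classKey, crawlerCount));
    }

    /** True when the URI stays with this crawler; false means it would be diverted. */
    static boolean isLocal(String classKey, int crawlerCount, String localName) {
        return crawlerNameFor(classKey, crawlerCount).equals(localName);
    }
}
```

For example, calling `HashMappingSketch.crawlerNameFor("example.com", 4)` on every crawler in a four-node setup yields the same name on each node, so exactly one of them treats the key as local and the others would write it to a diversion log.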