Package com.digitalpebble.stormcrawler.parse

Interface Summary

Interface                           Description
JSoupFilter                         Implementations of ParseFilter are responsible for extracting custom data from the crawled content.

Class Summary

Class                               Description
DocumentFragmentBuilder             Adapted from org.jsoup.helper.W3CDom but does not transfer namespaces.
DocumentFragmentBuilder.W3CBuilder  Implements the conversion by walking the input.
JSoupFilters                        Wrapper for the JSoupFilters defined in a JSON configuration.
Outlink
ParseData
ParseFilter                         Implementations of ParseFilter are responsible for extracting custom data from the crawled content.
ParseFilters                        Wrapper for the ParseFilters defined in a JSON configuration.
ParseResult
TextExtractor                       Filters the text extracted from HTML documents, used by JSoupParserBolt.