Class FileProtocol
java.lang.Object
    com.digitalpebble.stormcrawler.protocol.file.FileProtocol

Constructor Summary
Constructor: FileProtocol()

Method Summary
Modifier and Type, Method, Description:
void cleanup()
void configure(org.apache.storm.Config conf)
String getEncoding()
ProtocolResponse getProtocolOutput(String url, Metadata md)
    Fetches the content and additional metadata.
crawlercommons.robots.BaseRobotRules getRobotRules(String url)
static void main(String[] args)

Method Detail
-
configure
public void configure(org.apache.storm.Config conf)
-
getProtocolOutput
public ProtocolResponse getProtocolOutput(String url, Metadata md) throws Exception
Description copied from interface: Protocol
Fetches the content and additional metadata. IMPORTANT: the metadata returned within the response should contain only new, additional entries; there is no need to return the metadata that was passed in.
- Specified by:
getProtocolOutput in interface Protocol
- Parameters:
url - the location of the content
md - extra information
- Returns:
the content and optional metadata fetched via this protocol
- Throws:
Exception
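
The metadata contract of getProtocolOutput described above can be illustrated with a minimal sketch that uses only the JDK. This is not the StormCrawler implementation: the class FileFetchSketch, its nested Response type, the metadata keys content.length and fetch.exception, and the HTTP-style status codes 200 and 404 are all hypothetical names chosen for illustration. A plain Map stands in for Metadata, and Response stands in for ProtocolResponse.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for illustration; NOT the StormCrawler FileProtocol code.
final class FileFetchSketch {

    /** Simplified, hypothetical analogue of ProtocolResponse. */
    static final class Response {
        final byte[] content;
        final int statusCode;
        final Map<String, String> newMetadata;

        Response(byte[] content, int statusCode, Map<String, String> newMetadata) {
            this.content = content;
            this.statusCode = statusCode;
            this.newMetadata = newMetadata;
        }
    }

    /**
     * Reads the file behind a file: URL.
     *
     * @param url the location of the content, e.g. file:///tmp/page.html
     * @param md  extra information supplied by the caller; read-only here
     * @return the content plus only the metadata generated by this fetch
     */
    static Response fetch(String url, Map<String, String> md) {
        // Start from an empty map: entries of md are deliberately not copied,
        // so the response carries only new, additional metadata.
        Map<String, String> added = new LinkedHashMap<>();
        try {
            Path path = Paths.get(URI.create(url));
            byte[] bytes = Files.readAllBytes(path);
            added.put("content.length", String.valueOf(bytes.length));
            return new Response(bytes, 200, added); // assumed "success" code
        } catch (IOException | RuntimeException e) {
            // Missing file, directory, malformed URL or non-file scheme.
            added.put("fetch.exception", e.toString());
            return new Response(new byte[0], 404, added); // assumed "not found" code
        }
    }
}
```

A caller that needs the complete picture would merge newMetadata into the metadata it already holds, which is why the fetch itself has no reason to echo the incoming entries back.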
-
getRobotRules
public crawlercommons.robots.BaseRobotRules getRobotRules(String url)
- Specified by:
getRobotRules in interface Protocol
-
getEncoding
public String getEncoding()