Class SeleniumProtocol
- java.lang.Object
  - com.digitalpebble.stormcrawler.protocol.AbstractHttpProtocol
    - com.digitalpebble.stormcrawler.protocol.selenium.SeleniumProtocol
- All Implemented Interfaces: Protocol
- Direct Known Subclasses: RemoteDriverProtocol
public abstract class SeleniumProtocol extends AbstractHttpProtocol
Nested Class Summary
Nested classes/interfaces inherited from class com.digitalpebble.stormcrawler.protocol.AbstractHttpProtocol
AbstractHttpProtocol.KeyValue
Field Summary
Fields (modifier and type, field):
- protected LinkedBlockingQueue<org.openqa.selenium.remote.RemoteWebDriver> drivers
- protected static org.slf4j.Logger LOG
Fields inherited from class com.digitalpebble.stormcrawler.protocol.AbstractHttpProtocol
customHeaders, protocolMDprefix, protocolVersions, proxyManager, RESPONSE_COOKIES_HEADER, SET_HEADER_BY_REQUEST, skipRobots, storeHTTPHeaders, useCookies
Constructor Summary
Constructors:
- SeleniumProtocol()
Method Summary
Methods (modifier and type, method, description):
- void cleanup()
- void configure(org.apache.storm.Config conf)
- ProtocolResponse getProtocolOutput(String url, Metadata metadata): Fetches the content and additional metadata.
Methods inherited from class com.digitalpebble.stormcrawler.protocol.AbstractHttpProtocol
getAgentString, getRobotRules, main
Field Detail
LOG
protected static final org.slf4j.Logger LOG
drivers
protected LinkedBlockingQueue<org.openqa.selenium.remote.RemoteWebDriver> drivers
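This page documents `drivers` only by its type. A blocking queue of drivers suggests a borrow-and-return pool, and the following is a minimal sketch of that pattern under that assumption. It is not taken from the StormCrawler source. `DriverPoolSketch` and `FakeDriver` are hypothetical stand-ins (the latter for `RemoteWebDriver`) so that the example runs without Selenium.

```java
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical illustration of a borrow-and-return driver pool; not StormCrawler code.
class DriverPoolSketch {

    /** Stand-in for org.openqa.selenium.remote.RemoteWebDriver (assumption). */
    static final class FakeDriver {
        final String name;

        FakeDriver(String name) {
            this.name = name;
        }

        String fetch(String url) {
            return name + " fetched " + url;
        }
    }

    // Same shape as the documented field, using the stand-in driver type.
    final LinkedBlockingQueue<FakeDriver> drivers = new LinkedBlockingQueue<>();

    DriverPoolSketch(List<FakeDriver> initial) {
        drivers.addAll(initial);
    }

    String fetch(String url) {
        FakeDriver driver;
        try {
            // Blocks until some other caller has returned a driver.
            driver = drivers.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("Interrupted while waiting for a driver", e);
        }
        try {
            return driver.fetch(url);
        } finally {
            // Always hand the driver back, even if the fetch failed.
            drivers.add(driver);
        }
    }

    public static void main(String[] args) {
        DriverPoolSketch pool = new DriverPoolSketch(
                List.of(new FakeDriver("d1"), new FakeDriver("d2")));
        System.out.println(pool.fetch("http://example.com/"));
    }
}
```

With this layout the queue size caps the number of concurrent fetches, since a caller that finds the queue empty waits in `take()` until a driver is returned.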
Method Detail
configure
public void configure(org.apache.storm.Config conf)
- Specified by: configure in interface Protocol
- Overrides: configure in class AbstractHttpProtocol
getProtocolOutput
public ProtocolResponse getProtocolOutput(String url, Metadata metadata) throws Exception
Description copied from interface: Protocol
Fetches the content and additional metadata. IMPORTANT: the metadata returned within the response should contain only new, additional entries; there is no need to return the metadata that was passed in.
- Parameters:
  url - the location of the content
  metadata - extra information
- Returns: the content and optional metadata fetched via this protocol
- Throws: Exception
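The metadata contract above can be illustrated with a minimal sketch. Plain `Map<String, String>` objects stand in for StormCrawler's `Metadata` and `ProtocolResponse` types, and the class name, method names, and key names are all hypothetical. The sketch also assumes that the caller is the one who merges the returned metadata with what it passed in, which this page does not state.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the metadata contract; not StormCrawler code.
class MetadataContractSketch {

    // Stand-in for getProtocolOutput: returns only metadata produced by the fetch.
    static Map<String, String> fetch(String url, Map<String, String> incoming) {
        Map<String, String> added = new HashMap<>();
        added.put("fetch.url", url);      // hypothetical key name
        added.put("fetch.status", "200"); // hypothetical key name and value
        return added; // the incoming entries are deliberately not copied
    }

    // Caller side (assumption): combine what was passed in with what came back.
    static Map<String, String> merge(Map<String, String> incoming,
                                     Map<String, String> added) {
        Map<String, String> merged = new HashMap<>(incoming);
        merged.putAll(added);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> incoming = new HashMap<>();
        incoming.put("depth", "2");
        Map<String, String> added = fetch("http://example.com/", incoming);
        System.out.println(merge(incoming, added));
    }
}
```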
cleanup
public void cleanup()
- Specified by: cleanup in interface Protocol
- Overrides: cleanup in class AbstractHttpProtocol