Class AbstractHttpProtocol
- java.lang.Object
  - com.digitalpebble.stormcrawler.protocol.AbstractHttpProtocol
- All Implemented Interfaces: Protocol
- Direct Known Subclasses: HttpProtocol, HttpProtocol, SeleniumProtocol
public abstract class AbstractHttpProtocol extends Object implements Protocol
-
Nested Class Summary
Nested Classes
Modifier and Type         Class
protected static class    AbstractHttpProtocol.KeyValue
-
Field Summary
Fields
Modifier and Type                                Field
protected List<AbstractHttpProtocol.KeyValue>    customHeaders
protected String                                 protocolMDprefix
protected List<String>                           protocolVersions
ProxyManager                                     proxyManager
protected static String                          RESPONSE_COOKIES_HEADER
protected static String                          SET_HEADER_BY_REQUEST
protected boolean                                skipRobots
protected boolean                                storeHTTPHeaders
protected boolean                                useCookies
-
Constructor Summary
Constructors
AbstractHttpProtocol()
-
Method Summary
Methods
Modifier and Type                       Method
void                                    cleanup()
void                                    configure(org.apache.storm.Config conf)
static String                           getAgentString(org.apache.storm.Config conf)
crawlercommons.robots.BaseRobotRules    getRobotRules(String url)
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface com.digitalpebble.stormcrawler.protocol.Protocol
getProtocolOutput
-
Field Detail
-
skipRobots
protected boolean skipRobots
-
storeHTTPHeaders
protected boolean storeHTTPHeaders
-
useCookies
protected boolean useCookies
-
RESPONSE_COOKIES_HEADER
protected static final String RESPONSE_COOKIES_HEADER
- See Also:
- Constant Field Values
-
SET_HEADER_BY_REQUEST
protected static final String SET_HEADER_BY_REQUEST
- See Also:
- Constant Field Values
-
protocolMDprefix
protected String protocolMDprefix
-
proxyManager
public ProxyManager proxyManager
-
customHeaders
protected final List<AbstractHttpProtocol.KeyValue> customHeaders
-
Method Detail
-
configure
public void configure(org.apache.storm.Config conf)
-
getRobotRules
public crawlercommons.robots.BaseRobotRules getRobotRules(String url)
- Specified by: getRobotRules in interface Protocol
-
getAgentString
public static String getAgentString(org.apache.storm.Config conf)
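This page gives only the signature of getAgentString, so the following is a minimal, self-contained sketch of what a static helper of this shape might do, not the library's implementation. It uses a plain Map<String, Object> as a stand-in for org.apache.storm.Config (which is itself a map of configuration keys to values). The class AgentStringSketch, the two key names, and the "name/version" output format are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class AgentStringSketch {

    // Hypothetical configuration keys, chosen for illustration only.
    static final String NAME_KEY = "http.agent.name";
    static final String VERSION_KEY = "http.agent.version";

    /** Returns the value stored under key as a trimmed string, or "" if absent. */
    static String get(Map<String, Object> conf, String key) {
        Object value = conf.get(key);
        return value == null ? "" : value.toString().trim();
    }

    /**
     * Builds an agent string of the assumed form "name/version",
     * or just "name" when no version is configured.
     */
    static String agentString(Map<String, Object> conf) {
        String name = get(conf, NAME_KEY);
        String version = get(conf, VERSION_KEY);
        return version.isEmpty() ? name : name + "/" + version;
    }

    public static void main(String[] args) {
        Map<String, Object> conf = new HashMap<>();
        conf.put(NAME_KEY, "mycrawler");
        conf.put(VERSION_KEY, "1.0");
        System.out.println(agentString(conf));
    }
}
```

Because the helper is static and depends only on the configuration passed in, it can be called without first creating or configuring a protocol instance, which matches the static signature listed above.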