Interface URLBuffer
-
- All Known Implementing Classes: AbstractURLBuffer, PriorityURLBuffer, SchedulingURLBuffer, SimpleURLBuffer
public interface URLBuffer
Buffers URLs to be processed into separate queues; used by spouts. Guarantees that no URL can be put in the buffer more than once. Configured by setting
urlbuffer.class: "com.digitalpebble.stormcrawler.persistence.SimpleURLBuffer"
in the configuration.
- Since:
- 1.15
-
-
Field Summary
Modifier and Type | Field | Description
static String | bufferClassParamName | Implementation to use for URLBuffer.
-
Method Summary
Modifier and Type | Method | Description
default void | acked(String url) | Notifies the buffer that a URL has been successfully processed; used e.g. to compute an ideal delay for a host queue.
default boolean | add(String URL, Metadata m) | Stores the URL and its Metadata using the hostname as key.
boolean | add(String URL, Metadata m, String key) | Stores the URL and its Metadata under a given key.
default void | configure(Map<String,Object> stormConf) |
static @NotNull URLBuffer | createInstance(@NotNull Map<String,Object> stormConf) | Returns a URLBuffer instance based on the configuration.
static URLBuffer | getInstance(Map<String,Object> stormConf) | Deprecated.
boolean | hasNext() | Implementations of this method should be synchronised.
org.apache.storm.tuple.Values | next() | Retrieves the next available URL; guarantees that the URLs are always perfectly shuffled.
int | numQueues() | Total number of queues in the buffer.
void | setEmptyQueueListener(EmptyQueueListener l) |
int | size() | Total number of URLs in the buffer.
-
-
-
Field Detail
-
bufferClassParamName
static final String bufferClassParamName
Implementation to use for URLBuffer. Must implement the interface URLBuffer.
- See Also:
- Constant Field Values
-
-
Method Detail
-
createInstance
@NotNull static URLBuffer createInstance(@NotNull Map<String,Object> stormConf)
Returns a URLBuffer instance based on the configuration.
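A configuration-driven factory of this kind is commonly built on reflection. The following is a minimal sketch of that general pattern, not StormCrawler's actual implementation: `MiniBuffer`, `InMemoryMiniBuffer`, `MiniBufferFactory`, and the fallback to a default class are hypothetical stand-ins. Only the key name `urlbuffer.class` is taken from the description above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the URLBuffer interface (not the real API).
interface MiniBuffer {
    void configure(Map<String, Object> conf);
}

// Hypothetical default implementation, used when no class is configured.
class InMemoryMiniBuffer implements MiniBuffer {
    Map<String, Object> conf = new HashMap<>();

    @Override
    public void configure(Map<String, Object> conf) {
        this.conf = conf;
    }
}

class MiniBufferFactory {
    // Key name as given in the interface description.
    static final String BUFFER_CLASS_PARAM = "urlbuffer.class";

    static MiniBuffer createInstance(Map<String, Object> conf) {
        Object configured = conf.get(BUFFER_CLASS_PARAM);
        // Assumption: fall back to a default class when the key is absent.
        String className = configured == null
                ? InMemoryMiniBuffer.class.getName()
                : configured.toString();
        try {
            Class<?> clazz = Class.forName(className);
            // Reject classes that do not implement the expected interface.
            if (!MiniBuffer.class.isAssignableFrom(clazz)) {
                throw new IllegalArgumentException(
                        className + " does not implement MiniBuffer");
            }
            MiniBuffer buffer =
                    (MiniBuffer) clazz.getDeclaredConstructor().newInstance();
            buffer.configure(conf);
            return buffer;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + className, e);
        }
    }
}
```

The interface check before instantiation mirrors the stated requirement that the configured class must implement the buffer interface; how the real factory reports a misconfiguration is not specified here.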
-
getInstance
@Deprecated static URLBuffer getInstance(Map<String,Object> stormConf)
Deprecated. Replace with createInstance(Map).
-
add
boolean add(String URL, Metadata m, String key)
Stores the URL and its Metadata under a given key. Implementations of this method should be synchronised.
- Returns:
- false if the URL was already in the buffer, true if it wasn't and was added
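The return contract (false for a URL already in the buffer, true when it is newly added) can be illustrated with a small sketch. This is a simplified, hypothetical class, not one of the listed implementations: `DedupKeyedBuffer` is an invented name, and the Metadata argument is omitted to keep the example self-contained.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Hypothetical, simplified buffer; Metadata is left out for brevity.
class DedupKeyedBuffer {
    // Every URL currently held, used to reject duplicates.
    private final Set<String> inBuffer = new HashSet<>();
    // One FIFO queue per key, kept in insertion order.
    private final Map<String, Queue<String>> queues = new LinkedHashMap<>();

    // Returns false if the URL is already buffered, true if it was added.
    synchronized boolean add(String url, String key) {
        if (!inBuffer.add(url)) {
            return false;
        }
        queues.computeIfAbsent(key, k -> new ArrayDeque<>()).add(url);
        return true;
    }

    synchronized int size() {
        return inBuffer.size();
    }

    synchronized int numQueues() {
        return queues.size();
    }
}
```

Marking the methods `synchronized` is one straightforward way to follow the advice that implementations be synchronised, since spouts may add and read from different threads.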
-
add
default boolean add(String URL, Metadata m)
Stores the URL and its Metadata using the hostname as key. Implementations of this method should be synchronised.
- Returns:
- false if the URL was already in the buffer, true if it wasn't and was added
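Deriving a hostname key from a URL string can be done with the standard library alone. The helper below is a hypothetical sketch (`HostKey` is an invented name). How the real default method treats URLs without a resolvable host is not stated above, so returning null here is an assumption.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

// Hypothetical helper: derives a queue key from a URL's hostname.
class HostKey {
    // Returns the lower-cased host, or null if none can be determined.
    static String of(String url) {
        try {
            String host = new URI(url).getHost();
            return host == null ? null : host.toLowerCase(Locale.ROOT);
        } catch (URISyntaxException e) {
            // Assumption: treat unparsable input as having no key.
            return null;
        }
    }
}
```

Lower-casing the host keeps `Example.com` and `example.com` in the same queue, which matters when the key is used to group URLs by server.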
-
size
int size()
Total number of URLs in the buffer.
-
numQueues
int numQueues()
Total number of queues in the buffer.
-
next
org.apache.storm.tuple.Values next()
Retrieves the next available URL; guarantees that the URLs are always perfectly shuffled. Implementations of this method should be synchronised.
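One way to interleave URLs from different queues is to rotate through the queues, taking a single URL from each in turn. The sketch below shows that round-robin idea under stated assumptions: `RoundRobinBuffer` is a hypothetical class, it returns plain strings rather than Storm `Values`, and it is not claimed to match how any of the listed implementations shuffle.

```java
import java.util.ArrayDeque;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical sketch of round-robin retrieval across keyed queues.
class RoundRobinBuffer {
    // Insertion order doubles as the rotation order.
    private final Map<String, Queue<String>> queues = new LinkedHashMap<>();

    synchronized void add(String url, String key) {
        queues.computeIfAbsent(key, k -> new ArrayDeque<>()).add(url);
    }

    synchronized boolean hasNext() {
        return !queues.isEmpty();
    }

    // Takes one URL from the front queue, then moves that queue to the back.
    synchronized String next() {
        Iterator<Map.Entry<String, Queue<String>>> it =
                queues.entrySet().iterator();
        if (!it.hasNext()) {
            return null;
        }
        Map.Entry<String, Queue<String>> first = it.next();
        String key = first.getKey();
        Queue<String> queue = first.getValue();
        it.remove();
        String url = queue.poll();
        if (!queue.isEmpty()) {
            // Re-inserting places the queue at the end of the rotation.
            queues.put(key, queue);
        }
        return url;
    }
}
```

With two URLs queued for host `a` and one for host `b`, successive calls alternate between the hosts until one queue runs dry, so no single host monopolises the output.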
-
hasNext
boolean hasNext()
Implementations of this method should be synchronised
-
setEmptyQueueListener
void setEmptyQueueListener(EmptyQueueListener l)
-
acked
default void acked(String url)
Notifies the buffer that a URL has been successfully processed; used e.g. to compute an ideal delay for a host queue.
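As an illustration of how acknowledgements could feed a per-host delay estimate, the sketch below averages the time between handing a URL out and seeing it acknowledged. Everything in it is hypothetical: `AckDelayEstimator`, the `emitted` hook, the explicit host and timestamp parameters, and the averaging rule are assumptions made for the example, not the logic of any listed implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: estimates a per-host delay from emit/ack timings.
class AckDelayEstimator {
    // URL -> time it was handed out, in milliseconds.
    private final Map<String, Long> emittedAt = new HashMap<>();
    // Host -> {number of acks, total elapsed milliseconds}.
    private final Map<String, long[]> stats = new HashMap<>();

    synchronized void emitted(String url, long nowMs) {
        emittedAt.put(url, nowMs);
    }

    synchronized void acked(String url, String host, long nowMs) {
        Long start = emittedAt.remove(url);
        if (start == null) {
            return; // unknown URL: nothing to record
        }
        long[] s = stats.computeIfAbsent(host, h -> new long[2]);
        s[0]++;
        s[1] += nowMs - start;
    }

    // Average emit-to-ack time for the host, or -1 if nothing was recorded.
    synchronized long averageMs(String host) {
        long[] s = stats.get(host);
        return (s == null || s[0] == 0) ? -1 : s[1] / s[0];
    }
}
```

Passing timestamps in as arguments keeps the sketch deterministic; a real buffer would more likely read the clock itself.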
-
-