Class MemorySpout

  • All Implemented Interfaces:
    Serializable, org.apache.storm.spout.ISpout, org.apache.storm.topology.IComponent, org.apache.storm.topology.IRichSpout

    public class MemorySpout
    extends org.apache.storm.topology.base.BaseRichSpout
    Stores URLs in memory. Useful for testing and debugging in local mode or with a single worker. Uses StringTabScheme to parse each input line into a URL and its Metadata. Tuples are emitted on the default stream unless withDiscoveredStatus is set to true, in which case they carry a Status field set to DISCOVERED and are emitted on the status stream. Can be used with the MemoryStatusUpdater to receive discovered URLs and emulate a recursive crawl.
    See Also:
    Serialized Form
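    The description above says that each input line is parsed into a URL and its Metadata. As a rough illustration only, the sketch below assumes a tab-separated layout in which the first field is the URL and each following field is a key=value pair. SeedLineSketch and ParsedSeed are hypothetical names and are not part of the library; the real parsing is done by StringTabScheme and may differ in detail.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SeedLineSketch {

    /** Hypothetical holder for a parsed seed line; not a library type. */
    static final class ParsedSeed {
        final String url;
        // A key may appear more than once, so each key maps to a list of values.
        final Map<String, List<String>> metadata = new LinkedHashMap<>();

        ParsedSeed(String url) {
            this.url = url;
        }
    }

    /** Splits a line on tabs: first token is the URL, the rest are key=value pairs. */
    static ParsedSeed parse(String line) {
        String[] tokens = line.split("\t");
        ParsedSeed seed = new ParsedSeed(tokens[0].trim());
        for (int i = 1; i < tokens.length; i++) {
            String token = tokens[i];
            int eq = token.indexOf('=');
            if (eq <= 0) {
                continue; // assumption: tokens without a key are skipped
            }
            String key = token.substring(0, eq);
            String value = token.substring(eq + 1);
            seed.metadata.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        }
        return seed;
    }

    public static void main(String[] args) {
        ParsedSeed seed = parse("https://example.com/\tdepth=0\tsource=seed");
        System.out.println(seed.url + " " + seed.metadata);
    }
}
```

    Under this assumed layout, a seed string passed to the constructor can carry metadata alongside the URL by appending tab-separated key=value fields.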
    • Constructor Detail

      • MemorySpout

        public MemorySpout(String... urls)
      • MemorySpout

        public MemorySpout(boolean withDiscoveredStatus,
                           String... urls)
        Emits tuples with DISCOVERED status, which is useful when injecting seeds directly into a status updater bolt.
        Parameters:
        withDiscoveredStatus - whether the tuples generated should contain a Status field with DISCOVERED as value and be emitted on the status stream
    • Method Detail

      • add

        public static void add(String url,
                               Metadata md,
                               Date nextFetch)
        Adds a new URL with the given metadata and next fetch date.
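        This page does not say how the nextFetch date is used. One plausible reading, shown here purely as an assumption, is that URLs are kept ordered by nextFetch and a URL is not handed out before that date is reached. NextFetchQueueSketch, ScheduledUrl and pollDue are hypothetical names for illustration, and the sketch omits Metadata to stay self-contained; it is not the spout's actual implementation.

```java
import java.util.Date;
import java.util.PriorityQueue;

public class NextFetchQueueSketch {

    /** Hypothetical entry pairing a URL with the earliest date it may be fetched. */
    static final class ScheduledUrl implements Comparable<ScheduledUrl> {
        final String url;
        final Date nextFetch;

        ScheduledUrl(String url, Date nextFetch) {
            this.url = url;
            this.nextFetch = nextFetch;
        }

        @Override
        public int compareTo(ScheduledUrl other) {
            // Earliest nextFetch sorts first, so it sits at the head of the queue.
            return nextFetch.compareTo(other.nextFetch);
        }
    }

    private final PriorityQueue<ScheduledUrl> queue = new PriorityQueue<>();

    /** Schedules a URL; mirrors the shape of add(url, md, nextFetch) without Metadata. */
    void add(String url, Date nextFetch) {
        queue.add(new ScheduledUrl(url, nextFetch));
    }

    /** Returns the next URL whose nextFetch is not after 'now', or null if none is due. */
    String pollDue(Date now) {
        ScheduledUrl head = queue.peek();
        if (head == null || head.nextFetch.after(now)) {
            return null;
        }
        return queue.poll().url;
    }

    public static void main(String[] args) {
        NextFetchQueueSketch sketch = new NextFetchQueueSketch();
        Date now = new Date();
        sketch.add("https://example.com/later", new Date(now.getTime() + 60_000L));
        sketch.add("https://example.com/now", now);
        System.out.println(sketch.pollDue(now)); // the URL that is already due
        System.out.println(sketch.pollDue(now)); // nothing else is due yet
    }
}
```

        Because add is static in the signature above, it can presumably be called from outside the spout instance, for example by a status updater that feeds newly discovered URLs back in.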
      • open

        public void open(Map<String,Object> conf,
                         org.apache.storm.task.TopologyContext context,
                         org.apache.storm.spout.SpoutOutputCollector collector)
      • nextTuple

        public void nextTuple()
      • declareOutputFields

        public void declareOutputFields(org.apache.storm.topology.OutputFieldsDeclarer declarer)
      • activate

        public void activate()
        Specified by:
        activate in interface org.apache.storm.spout.ISpout
        Overrides:
        activate in class org.apache.storm.topology.base.BaseRichSpout
      • deactivate

        public void deactivate()
        Specified by:
        deactivate in interface org.apache.storm.spout.ISpout
        Overrides:
        deactivate in class org.apache.storm.topology.base.BaseRichSpout