Class FileSpout

  • All Implemented Interfaces:
    Serializable, org.apache.storm.spout.ISpout, org.apache.storm.topology.IComponent, org.apache.storm.topology.IRichSpout

    public class FileSpout
    extends org.apache.storm.topology.base.BaseRichSpout
    Reads the lines of UTF-8 files and emits them as tuples. The entire content is loaded into memory. By default, StringTabScheme parses each line into a URL and its Metadata. Tuples are emitted on the default stream unless withDiscoveredStatus is set to true, in which case they carry a Status field with the value DISCOVERED and are emitted on the status stream.
    See Also:
    Serialized Form
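The in-memory loading described above can be pictured with a minimal sketch that uses only the JDK. It is not the FileSpout source: the class name SeedLineBuffer and its methods are hypothetical, and skipping blank lines is an assumption made for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper: buffers every line of a UTF-8 file in memory. */
final class SeedLineBuffer {
    private final Deque<String> lines = new ArrayDeque<>();

    /** Reads the whole file up front; blank lines are skipped (an assumption). */
    void load(Path file) throws IOException {
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            if (!line.trim().isEmpty()) {
                lines.add(line);
            }
        }
    }

    /** Returns the next buffered line, or null once the buffer is empty. */
    String next() {
        return lines.poll();
    }

    /** Number of lines still waiting to be handed out. */
    int size() {
        return lines.size();
    }
}
```

A spout built this way would hand out one buffered line per call to nextTuple and emit nothing once next() returns null. Because the whole file is held in memory, this approach suits seed lists of moderate size.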
    • Field Detail

      • LOG

        public static final org.slf4j.Logger LOG
      • _collector

        protected org.apache.storm.spout.SpoutOutputCollector _collector
      • _scheme

        protected org.apache.storm.spout.Scheme _scheme
      • active

        protected boolean active
    • Constructor Detail

      • FileSpout

        public FileSpout(String dir,
                         String filter)
        Parameters:
        dir - the directory containing the seed files
        filter - the filter to apply to the file names
      • FileSpout

        public FileSpout(String... files)
        Parameters:
        files - the files containing the URLs
      • FileSpout

        public FileSpout(String dir,
                         String filter,
                         boolean withDiscoveredStatus)
        Parameters:
        dir - the directory containing the seed files
        filter - the filter to apply to the file names
        withDiscoveredStatus - if true, the generated tuples contain a Status field with the value DISCOVERED and are emitted on the status stream
        Since:
        1.13
      • FileSpout

        public FileSpout(boolean withDiscoveredStatus,
                         String... files)
        Parameters:
        withDiscoveredStatus - if true, the generated tuples contain a Status field with the value DISCOVERED and are emitted on the status stream
        files - the files containing the URLs
        Since:
        1.13
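The dir and filter parameters of the constructors above select seed files by name. The following sketch shows one way such a selection could work, assuming the filter is interpreted as a glob pattern such as "*.txt". The page does not state the filter syntax, so treat the glob interpretation, the class name SeedFileLister and its method as hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper: collects the files in a directory that match a glob. */
final class SeedFileLister {
    /** Returns the sorted absolute paths of entries in dir whose names match glob. */
    static List<String> list(String dir, String glob) {
        List<String> files = new ArrayList<>();
        // Assumption: the filter is a glob, as accepted by Files.newDirectoryStream.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(Paths.get(dir), glob)) {
            for (Path entry : stream) {
                files.add(entry.toAbsolutePath().toString());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Directory iteration order is unspecified, so sort for a stable result.
        Collections.sort(files);
        return files;
    }
}
```

With a directory holding a.txt, b.txt and c.csv, calling SeedFileLister.list(dir, "*.txt") in this sketch would return only the two .txt files.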
    • Method Detail

      • setScheme

        public void setScheme(org.apache.storm.spout.Scheme scheme)
        Specifies the Scheme used to parse the lines into URLs and Metadata. StringTabScheme is used by default. The Scheme must produce a String for the URL and a Metadata object.
        Since:
        1.13
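To illustrate what a scheme has to produce for each line, here is a minimal JDK-only sketch of a tab-separated parser. It assumes that the URL comes first and that each following tab-separated token is a key=value pair. That layout is an assumption about the default scheme, not something stated on this page. The class TabLineParser is hypothetical, and a plain Map stands in for the Metadata object.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a scheme: splits a line into a URL and metadata. */
final class TabLineParser {
    /** The two values a scheme is expected to produce for one line. */
    static final class Parsed {
        final String url;
        final Map<String, List<String>> metadata; // stands in for Metadata

        Parsed(String url, Map<String, List<String>> metadata) {
            this.url = url;
            this.metadata = metadata;
        }
    }

    /** Assumed layout: URL, then zero or more tab-separated key=value tokens. */
    static Parsed parse(String line) {
        String[] tokens = line.split("\t");
        Map<String, List<String>> metadata = new LinkedHashMap<>();
        for (int i = 1; i < tokens.length; i++) {
            String token = tokens[i];
            int eq = token.indexOf('=');
            // A token without '=' becomes a key with an empty value (an assumption).
            String key = eq == -1 ? token : token.substring(0, eq);
            String value = eq == -1 ? "" : token.substring(eq + 1);
            // Repeated keys accumulate their values in order of appearance.
            metadata.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        }
        return new Parsed(tokens[0], metadata);
    }
}
```

A real replacement scheme would implement org.apache.storm.spout.Scheme and be passed to setScheme before the topology is submitted. The sketch covers only the line-splitting logic such a scheme might contain.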
      • open

        public void open(Map<String,Object> conf,
                         org.apache.storm.task.TopologyContext context,
                         org.apache.storm.spout.SpoutOutputCollector collector)
      • nextTuple

        public void nextTuple()
      • declareOutputFields

        public void declareOutputFields(org.apache.storm.topology.OutputFieldsDeclarer declarer)
      • close

        public void close()
        Specified by:
        close in interface org.apache.storm.spout.ISpout
        Overrides:
        close in class org.apache.storm.topology.base.BaseRichSpout
      • activate

        public void activate()
        Specified by:
        activate in interface org.apache.storm.spout.ISpout
        Overrides:
        activate in class org.apache.storm.topology.base.BaseRichSpout
      • deactivate

        public void deactivate()
        Specified by:
        deactivate in interface org.apache.storm.spout.ISpout
        Overrides:
        deactivate in class org.apache.storm.topology.base.BaseRichSpout
      • ack

        public void ack(Object msgId)
        Specified by:
        ack in interface org.apache.storm.spout.ISpout
        Overrides:
        ack in class org.apache.storm.topology.base.BaseRichSpout
      • fail

        public void fail(Object msgId)
        Specified by:
        fail in interface org.apache.storm.spout.ISpout
        Overrides:
        fail in class org.apache.storm.topology.base.BaseRichSpout