Class FileProtocol

    • Constructor Detail

      • FileProtocol

        public FileProtocol()
    • Method Detail

      • configure

        public void configure(org.apache.storm.Config conf)
        Specified by:
        configure in interface Protocol
      • getProtocolOutput

        public ProtocolResponse getProtocolOutput(String url,
                                                  Metadata md)
                                           throws Exception
        Description copied from interface: Protocol
        Fetches the content and additional metadata.

        IMPORTANT: the metadata returned within the response should contain only newly added entries; there is no need to return the metadata that was passed in.

        Specified by:
        getProtocolOutput in interface Protocol
        Parameters:
        url - the location of the content
        md - extra information
        Returns:
        the content and optional metadata fetched via this protocol
        Throws:
        Exception
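
        The metadata contract described above can be illustrated with a minimal, self-contained sketch. It does not use the real ProtocolResponse or Metadata classes: SketchResponse, fetch, and the key names "depth" and "fetch.length" are hypothetical stand-ins chosen for illustration only, and the content is faked rather than read from a file.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class MetadataContractSketch {

    /** Hypothetical stand-in for ProtocolResponse: content plus newly added metadata. */
    public static final class SketchResponse {
        public final byte[] content;
        public final Map<String, String> newMetadata;

        public SketchResponse(byte[] content, Map<String, String> newMetadata) {
            this.content = content;
            this.newMetadata = newMetadata;
        }
    }

    /**
     * Hypothetical fetch method. The incoming metadata may be read,
     * but it is not copied into the response.
     */
    public static SketchResponse fetch(String url, Map<String, String> incoming) {
        // Assumption: fake the content instead of reading from the file system.
        byte[] content = ("content of " + url).getBytes(StandardCharsets.UTF_8);

        // Only entries produced by this fetch go into the returned metadata.
        Map<String, String> added = new HashMap<>();
        added.put("fetch.length", String.valueOf(content.length));

        return new SketchResponse(content, added);
    }

    public static void main(String[] args) {
        Map<String, String> incoming = new HashMap<>();
        incoming.put("depth", "1");

        SketchResponse response = fetch("file:///tmp/example.txt", incoming);

        // The entry passed in is not echoed back; the new entry is present.
        System.out.println(response.newMetadata.containsKey("depth"));        // false
        System.out.println(response.newMetadata.containsKey("fetch.length")); // true
    }
}
```

        In this sketch the caller remains responsible for combining the returned entries with the metadata it already holds, which is one reading of why the interface asks implementations not to return the metadata passed in.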
      • getRobotRules

        public crawlercommons.robots.BaseRobotRules getRobotRules(String url)
        Specified by:
        getRobotRules in interface Protocol
      • getEncoding

        public String getEncoding()
      • cleanup

        public void cleanup()
        Specified by:
        cleanup in interface Protocol