Interface HtmlParser


  • public interface HtmlParser
    A service that parses HTML and either emits SAX events or builds a DOM Document from it.
    • Method Detail

      • parse

        void parse(InputStream inputStream,
                   String encoding,
                   ContentHandler contentHandler)
            throws SAXException
        Parse HTML and send the resulting SAX events to the given content handler.
        Parameters:
        inputStream - The input stream
        encoding - Encoding of the input stream, or null to use the default encoding.
        contentHandler - Content handler that receives the SAX events. It may also implement the LexicalHandler interface.
        Throws:
        SAXException - Exception thrown when parsing fails.
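        Example (a minimal sketch, not taken from the library): the code below shows how a caller might pass a ContentHandler to the SAX-style parse method to collect link targets. The nested HtmlParser interface is a local copy of the signature documented above so that the sketch compiles on its own. XmlBackedParser, LinkCollector and collectLinks are hypothetical names; the stand-in parser delegates to the JDK XML parser and therefore accepts only well-formed markup, unlike a real HTML parser. How an actual HtmlParser instance is obtained is not covered here.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class SaxParseSketch {

    /** Local copy of the documented SAX method signature (assumption: kept only so the sketch is self-contained). */
    interface HtmlParser {
        void parse(InputStream inputStream, String encoding, ContentHandler contentHandler)
                throws SAXException;
    }

    /** Hypothetical stand-in implementation backed by the JDK XML parser; well-formed input only. */
    static class XmlBackedParser implements HtmlParser {
        @Override
        public void parse(InputStream inputStream, String encoding, ContentHandler contentHandler)
                throws SAXException {
            try {
                SAXParserFactory factory = SAXParserFactory.newInstance();
                factory.setNamespaceAware(true);
                XMLReader reader = factory.newSAXParser().getXMLReader();
                reader.setContentHandler(contentHandler);
                InputSource source = new InputSource(inputStream);
                if (encoding != null) {
                    source.setEncoding(encoding); // null leaves encoding detection to the parser
                }
                reader.parse(source);
            } catch (IOException | ParserConfigurationException e) {
                throw new SAXException(e);
            }
        }
    }

    /** Content handler that records the href attribute of every <a> element it sees. */
    static class LinkCollector extends DefaultHandler {
        final List<String> links = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("a".equalsIgnoreCase(qName)) {
                String href = atts.getValue("href");
                if (href != null) {
                    links.add(href);
                }
            }
        }
    }

    /** Feeds the HTML string to the parser and returns the collected link targets. */
    static List<String> collectLinks(HtmlParser parser, String html) throws SAXException {
        LinkCollector collector = new LinkCollector();
        InputStream in = new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8));
        parser.parse(in, "UTF-8", collector);
        return collector.links;
    }

    public static void main(String[] args) throws SAXException {
        String html = "<html><body><a href=\"/a\">A</a><a href=\"/b\">B</a></body></html>";
        System.out.println(collectLinks(new XmlBackedParser(), html));
    }
}
```

        Because the handler is the only place where results accumulate, the same collectLinks helper should work unchanged with any implementation of the documented method.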
      • parse

        Document parse(String systemId,
                       InputStream inputStream,
                       String encoding)
                throws IOException
        Parse HTML and return a DOM Document.
        Parameters:
        systemId - The system id
        inputStream - The input stream
        encoding - Encoding of the input stream, or null to use the default encoding.
        Returns:
        A DOM Document built from the parsed HTML, or null.
        Throws:
        IOException - Exception thrown when parsing fails.
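        Example (a minimal sketch, not taken from the library): the code below shows how a caller might use the DOM-style parse method and guard against the null return value mentioned above. The nested HtmlParser interface is a local copy of the documented signature. XmlBackedParser and titleOf are hypothetical names, and the stand-in parser uses the JDK DocumentBuilder, so it accepts only well-formed markup. The system id used in main is an arbitrary illustrative value.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class DomParseSketch {

    /** Local copy of the documented DOM method signature (assumption: kept only so the sketch is self-contained). */
    interface HtmlParser {
        Document parse(String systemId, InputStream inputStream, String encoding) throws IOException;
    }

    /** Hypothetical stand-in implementation backed by the JDK DocumentBuilder; well-formed input only. */
    static class XmlBackedParser implements HtmlParser {
        @Override
        public Document parse(String systemId, InputStream inputStream, String encoding)
                throws IOException {
            try {
                InputSource source = new InputSource(inputStream);
                source.setSystemId(systemId);
                if (encoding != null) {
                    source.setEncoding(encoding); // null leaves encoding detection to the parser
                }
                return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(source);
            } catch (SAXException | ParserConfigurationException e) {
                throw new IOException(e);
            }
        }
    }

    /** Returns the text of the first <title> element, or null if there is no document or no title. */
    static String titleOf(HtmlParser parser, String systemId, String html) throws IOException {
        try (InputStream in = new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8))) {
            Document doc = parser.parse(systemId, in, "UTF-8");
            if (doc == null) {
                return null; // the documented contract allows a null result
            }
            NodeList titles = doc.getElementsByTagName("title");
            return titles.getLength() > 0 ? titles.item(0).getTextContent() : null;
        }
    }

    public static void main(String[] args) throws IOException {
        String html = "<html><head><title>Hello</title></head><body/></html>";
        System.out.println(titleOf(new XmlBackedParser(), "file:///example.html", html));
    }
}
```

        The DOM variant suits callers that need random access to the parsed tree; for large inputs or single-pass extraction, the SAX variant avoids holding the whole document in memory.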