Interface OOXMLExtractor

    • Method Detail

      • getDocument

        org.apache.poi.ooxml.POIXMLDocument getDocument()
        Returns the opened document.
        See Also:
        POIXMLTextExtractor.getDocument()
      • getMetadataExtractor

        MetadataExtractor getMetadataExtractor()
        Returns the metadata extractor. POIXMLTextExtractor.getMetadataTextExtractor() is not yet supported for OOXML by POI.
      • getXHTML

        void getXHTML(org.xml.sax.ContentHandler handler,
                      Metadata metadata,
                      ParseContext context)
               throws org.xml.sax.SAXException,
                      XmlException,
                      java.io.IOException,
                      TikaException
        Parses the document into a sequence of XHTML SAX events sent to the given content handler.
        Throws:
        org.xml.sax.SAXException
        XmlException
        java.io.IOException
        TikaException
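
        The following is a minimal sketch, not taken from the Tika sources, of what "a sequence of XHTML SAX events sent to the given content handler" can look like from the handler's side. It uses only the JDK's org.xml.sax classes. The class XhtmlEventSketch, the nested CollectingHandler, and the method emitSampleXhtml are hypothetical names introduced for illustration; emitSampleXhtml only imitates the kind of calls an extractor might make and is not how any OOXMLExtractor implementation actually works.

```java
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

public class XhtmlEventSketch {

    // Standard XHTML namespace URI.
    static final String XHTML = "http://www.w3.org/1999/xhtml";

    /** Hypothetical handler: records opened element names and character data. */
    static class CollectingHandler extends DefaultHandler {
        final StringBuilder elements = new StringBuilder();
        final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            elements.append(localName).append(' ');
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    /**
     * Hypothetical stand-in for an extractor (assumption, not Tika code):
     * sends a tiny XHTML document to the handler as SAX events.
     */
    static void emitSampleXhtml(ContentHandler handler) throws SAXException {
        AttributesImpl noAttributes = new AttributesImpl();
        char[] chars = "Hello OOXML".toCharArray();

        handler.startDocument();
        handler.startElement(XHTML, "html", "html", noAttributes);
        handler.startElement(XHTML, "body", "body", noAttributes);
        handler.startElement(XHTML, "p", "p", noAttributes);
        handler.characters(chars, 0, chars.length);
        handler.endElement(XHTML, "p", "p");
        handler.endElement(XHTML, "body", "body");
        handler.endElement(XHTML, "html", "html");
        handler.endDocument();
    }

    public static void main(String[] args) throws SAXException {
        CollectingHandler handler = new CollectingHandler();
        emitSampleXhtml(handler);
        System.out.println(handler.elements.toString().trim()); // html body p
        System.out.println(handler.text);                       // Hello OOXML
    }
}
```

        In real use, the caller would pass its own ContentHandler to getXHTML together with a Metadata and a ParseContext, and would handle the four checked exceptions listed above. The sketch leaves those Tika types out so that it compiles against the JDK alone.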