Class OOXMLParser

java.lang.Object
org.apache.tika.parser.microsoft.AbstractOfficeParser
org.apache.tika.parser.microsoft.ooxml.OOXMLParser
All Implemented Interfaces:
Serializable, org.apache.tika.parser.Parser

public class OOXMLParser extends AbstractOfficeParser
Office Open XML (OOXML) parser.
  • Field Details

    • SIGNATURE_RELATIONSHIP

      protected static final String SIGNATURE_RELATIONSHIP
    • XPS

      protected static final org.apache.tika.mime.MediaType XPS
    • SUPPORTED_TYPES

      protected static final Set<org.apache.tika.mime.MediaType> SUPPORTED_TYPES
    • UNSUPPORTED_OOXML_TYPES

      protected static final Set<org.apache.tika.mime.MediaType> UNSUPPORTED_OOXML_TYPES
      This parser claims to support all OOXML files, but a small number of formats are not actually supported. This set is used to decline those formats, which are not yet supported by Tika and/or POI.
  • Constructor Details

    • OOXMLParser

      public OOXMLParser()
  • Method Details

    • getSupportedTypes

      public Set<org.apache.tika.mime.MediaType> getSupportedTypes(org.apache.tika.parser.ParseContext context)
    • parse

      public void parse(InputStream stream, ContentHandler handler, org.apache.tika.metadata.Metadata metadata, org.apache.tika.parser.ParseContext context) throws IOException, SAXException, org.apache.tika.exception.TikaException
      Throws:
      IOException
      SAXException
      org.apache.tika.exception.TikaException
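
Usage sketch. The following is a minimal, hedged example of calling getSupportedTypes and parse directly, assuming the Tika core and Microsoft parser modules are on the classpath. The class name OOXMLParserExample and the helper methods supportedTypes and extractText are hypothetical names introduced for illustration only. BodyContentHandler and TikaInputStream are existing Tika classes, but they are one choice among several; any SAX ContentHandler and any InputStream can be passed to parse.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Set;

import org.apache.tika.exception.TikaException;
import org.apache.tika.io.TikaInputStream;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.mime.MediaType;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.microsoft.ooxml.OOXMLParser;
import org.apache.tika.sax.BodyContentHandler;
import org.xml.sax.SAXException;

// Hypothetical example class; not part of Tika.
public class OOXMLParserExample {

    // Returns the media types the parser reports for a default ParseContext.
    static Set<MediaType> supportedTypes() {
        return new OOXMLParser().getSupportedTypes(new ParseContext());
    }

    // Extracts the plain-text body of an OOXML file (for example .docx or .xlsx).
    static String extractText(Path file)
            throws IOException, SAXException, TikaException {
        OOXMLParser parser = new OOXMLParser();
        // A write limit of -1 disables BodyContentHandler's character limit.
        BodyContentHandler handler = new BodyContentHandler(-1);
        Metadata metadata = new Metadata();
        ParseContext context = new ParseContext();
        // try-with-resources closes the stream and any temporary resources.
        try (InputStream stream = TikaInputStream.get(file)) {
            parser.parse(stream, handler, metadata, context);
        }
        return handler.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Supported types: " + supportedTypes().size());
        if (args.length > 0) {
            // The input path is supplied by the caller.
            System.out.println(extractText(Paths.get(args[0])));
        }
    }
}
```

All three checked exceptions declared by parse are propagated to the caller here. In practice, most applications reach this parser indirectly through AutoDetectParser rather than instantiating it themselves.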