Class OOXMLParser
java.lang.Object
org.apache.tika.parser.microsoft.AbstractOfficeParser
org.apache.tika.parser.microsoft.ooxml.OOXMLParser
- All Implemented Interfaces:
Serializable, org.apache.tika.parser.Parser
Office Open XML (OOXML) parser.
Field Summary
Fields
Modifier and Type | Field | Description
protected static final String | SIGNATURE_RELATIONSHIP |
protected static final Set<org.apache.tika.mime.MediaType> | SUPPORTED_TYPES |
protected static final Set<org.apache.tika.mime.MediaType> | UNSUPPORTED_OOXML_TYPES | We claim to support all OOXML files, but we actually don't support a small number of them.
protected static final org.apache.tika.mime.MediaType | XPS |
Constructor Summary
Constructors
Constructor | Description
OOXMLParser() |
Method Summary
Modifier and Type | Method | Description
Set<org.apache.tika.mime.MediaType> | getSupportedTypes(org.apache.tika.parser.ParseContext context) |
void | parse(InputStream stream, ContentHandler handler, org.apache.tika.metadata.Metadata metadata, org.apache.tika.parser.ParseContext context) |
Methods inherited from class org.apache.tika.parser.microsoft.AbstractOfficeParser
configure, getByteArrayMaxOverride, getDateFormatOverride, isConcatenatePhoneticRuns, isExtractAllAlternativesFromMSG, isExtractMacros, isIncludeDeletedContent, isIncludeHeadersAndFooters, isIncludeMoveFromContent, isIncludeShapeBasedContent, isUseSAXDocxExtractor, isUseSAXPptxExtractor, setByteArrayMaxOverride, setConcatenatePhoneticRuns, setDateFormatOverride, setExtractAllAlternativesFromMSG, setExtractMacros, setIncludeDeletedContent, setIncludeHeadersAndFooters, setIncludeMoveFromContent, setIncludeShapeBasedContent, setUseSAXDocxExtractor, setUseSAXPptxExtractor
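The inherited setters can be called on an OOXMLParser instance before parsing. The sketch below is only an illustration: it assumes the setters take a single boolean argument (this page lists method names without signatures) and that the Tika Microsoft parser module is on the classpath. The class name OoxmlConfigSketch and the chosen option values are hypothetical.

```java
import org.apache.tika.parser.microsoft.ooxml.OOXMLParser;

public class OoxmlConfigSketch {

    // Builds a parser with a few inherited options changed.
    // Assumption: each setter accepts a boolean flag.
    static OOXMLParser configuredParser() {
        OOXMLParser parser = new OOXMLParser();
        parser.setExtractMacros(true);             // ask for macros to be extracted
        parser.setIncludeDeletedContent(true);     // keep tracked-change deletions
        parser.setIncludeHeadersAndFooters(false); // skip headers and footers
        return parser;
    }

    public static void main(String[] args) {
        OOXMLParser parser = configuredParser();
        // Read the options back through the matching inherited getters.
        System.out.println("extractMacros=" + parser.isExtractMacros());
        System.out.println("includeDeletedContent=" + parser.isIncludeDeletedContent());
        System.out.println("includeHeadersAndFooters=" + parser.isIncludeHeadersAndFooters());
    }
}
```

Configuring the instance directly keeps the settings local to that parser object, which may be preferable when different documents need different options.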
-
Field Details
-
SIGNATURE_RELATIONSHIP
XPS
protected static final org.apache.tika.mime.MediaType XPS -
SUPPORTED_TYPES
-
UNSUPPORTED_OOXML_TYPES
We claim to support all OOXML files, but we actually don't support a small number of them. This list is used to decline certain formats that are not yet supported by Tika and/or POI.
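The idea of a decline list can be sketched with plain sets. This is not the OOXMLParser implementation, which this page does not show. It uses strings rather than MediaType objects, and the type names and the class name DeclineListSketch are hypothetical placeholders.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class DeclineListSketch {

    // Hypothetical stand-in for the types the parser claims to support.
    static final Set<String> CLAIMED = Collections.unmodifiableSet(new HashSet<>(
            Arrays.asList("application/x-example-a", "application/x-example-b")));

    // Hypothetical stand-in for the small subset that is declined.
    static final Set<String> DECLINED = Collections.unmodifiableSet(new HashSet<>(
            Arrays.asList("application/x-example-b")));

    // A type is handled only if it is claimed and not on the decline list.
    static boolean shouldParse(String mediaType) {
        return CLAIMED.contains(mediaType) && !DECLINED.contains(mediaType);
    }

    public static void main(String[] args) {
        System.out.println(shouldParse("application/x-example-a"));
        System.out.println(shouldParse("application/x-example-b"));
    }
}
```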
-
-
Constructor Details
-
OOXMLParser
public OOXMLParser()
-
-
Method Details
-
getSupportedTypes
public Set<org.apache.tika.mime.MediaType> getSupportedTypes(org.apache.tika.parser.ParseContext context) -
parse
public void parse(InputStream stream, ContentHandler handler, org.apache.tika.metadata.Metadata metadata, org.apache.tika.parser.ParseContext context) throws IOException, SAXException, org.apache.tika.exception.TikaException
- Throws:
IOException
SAXException
org.apache.tika.exception.TikaException
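A minimal usage sketch for getSupportedTypes and parse follows. It assumes the Tika core and Microsoft parser modules, with their dependencies, are on the classpath. BodyContentHandler and TikaInputStream are Tika classes that do not appear on this page. The class name OoxmlParseSketch and its helper methods are hypothetical, and the input path comes from the command line.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Path;
import java.nio.file.Paths;

import org.apache.tika.exception.TikaException;
import org.apache.tika.io.TikaInputStream;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.mime.MediaType;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.microsoft.ooxml.OOXMLParser;
import org.apache.tika.sax.BodyContentHandler;
import org.xml.sax.SAXException;

public class OoxmlParseSketch {

    // Reports whether the parser lists the given media type as supported.
    static boolean claimsSupport(String mediaType) {
        OOXMLParser parser = new OOXMLParser();
        return parser.getSupportedTypes(new ParseContext())
                .contains(MediaType.parse(mediaType));
    }

    // Parses one OOXML file and returns the text collected by the handler.
    static String extractText(Path file)
            throws IOException, SAXException, TikaException {
        OOXMLParser parser = new OOXMLParser();
        BodyContentHandler handler = new BodyContentHandler(-1); // -1: no write limit
        Metadata metadata = new Metadata();
        try (InputStream stream = TikaInputStream.get(file)) {
            parser.parse(stream, handler, metadata, new ParseContext());
        }
        return handler.toString();
    }

    public static void main(String[] args) throws Exception {
        Path file = Paths.get(args[0]); // path to a .docx, .xlsx, or .pptx file
        System.out.println(extractText(file));
    }
}
```

The three checked exceptions declared by parse are propagated to the caller here. A real application would likely handle them separately, since an IOException usually indicates a stream problem while a TikaException indicates a document the parser could not process.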
-