Class OOXMLParser
- java.lang.Object
  - org.apache.tika.parser.microsoft.AbstractOfficeParser
    - org.apache.tika.parser.microsoft.ooxml.OOXMLParser
-
- All Implemented Interfaces:
Serializable, org.apache.tika.parser.Parser
public class OOXMLParser extends AbstractOfficeParser
Office Open XML (OOXML) parser.
- See Also:
- Serialized Form
-
Field Summary
Fields (modifier and type, field, description)
- protected static String SIGNATURE_RELATIONSHIP
- protected static Set<org.apache.tika.mime.MediaType> SUPPORTED_TYPES
- protected static Set<org.apache.tika.mime.MediaType> UNSUPPORTED_OOXML_TYPES: We claim to support all OOXML files, but we actually don't support a small number of them.
- protected static org.apache.tika.mime.MediaType XPS
-
Constructor Summary
Constructors
- OOXMLParser()
-
Method Summary
Methods (modifier and type, method)
- Set<org.apache.tika.mime.MediaType> getSupportedTypes(org.apache.tika.parser.ParseContext context)
- void parse(InputStream stream, ContentHandler handler, org.apache.tika.metadata.Metadata metadata, org.apache.tika.parser.ParseContext context)
-
Methods inherited from class org.apache.tika.parser.microsoft.AbstractOfficeParser
configure, getByteArrayMaxOverride, getDateFormatOverride, isConcatenatePhoneticRuns, isExtractAllAlternativesFromMSG, isExtractMacros, isIncludeDeletedContent, isIncludeHeadersAndFooters, isIncludeMoveFromContent, isIncludeShapeBasedContent, isUseSAXDocxExtractor, isUseSAXPptxExtractor, isWriteSelectHeadersInBody, setByteArrayMaxOverride, setConcatenatePhoneticRuns, setDateFormatOverride, setExtractAllAlternativesFromMSG, setExtractMacros, setIncludeDeletedContent, setIncludeHeadersAndFooters, setIncludeMoveFromContent, setIncludeShapeBasedContent, setUseSAXDocxExtractor, setUseSAXPptxExtractor, setWriteSelectHeadersInBody
-
Field Detail
-
SIGNATURE_RELATIONSHIP
protected static final String SIGNATURE_RELATIONSHIP
- See Also:
- Constant Field Values
-
XPS
protected static final org.apache.tika.mime.MediaType XPS
-
SUPPORTED_TYPES
protected static final Set<org.apache.tika.mime.MediaType> SUPPORTED_TYPES
-
UNSUPPORTED_OOXML_TYPES
protected static final Set<org.apache.tika.mime.MediaType> UNSUPPORTED_OOXML_TYPES
We claim to support all OOXML files, but we actually don't support a small number of them. This list is used to decline certain formats that are not yet supported by Tika and/or POI.
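The "claim broadly, decline selectively" idea described above can be sketched with plain Java sets. This is only an illustration under stated assumptions, not Tika's actual implementation: the class name `DeclineUnsupportedSketch`, the string-based media types, and the entry `application/x-hypothetical-ooxml-variant` are all hypothetical stand-ins for `SUPPORTED_TYPES` and `UNSUPPORTED_OOXML_TYPES`, whose real contents are not listed on this page.

```java
import java.util.Set;
import java.util.TreeSet;

public class DeclineUnsupportedSketch {

    // Hypothetical stand-in for SUPPORTED_TYPES: everything the parser claims.
    static final Set<String> CLAIMED_TYPES = Set.of(
            "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
            "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
            "application/x-hypothetical-ooxml-variant");

    // Hypothetical stand-in for UNSUPPORTED_OOXML_TYPES: claimed, but declined at parse time.
    static final Set<String> DECLINED_TYPES = Set.of(
            "application/x-hypothetical-ooxml-variant");

    // True when a detected type is on the decline list and should be skipped.
    static boolean shouldDecline(String mediaType) {
        return DECLINED_TYPES.contains(mediaType);
    }

    // The claimed types minus the declined ones, sorted for stable output.
    static Set<String> effectivelySupported() {
        Set<String> result = new TreeSet<>(CLAIMED_TYPES);
        result.removeAll(DECLINED_TYPES);
        return result;
    }

    public static void main(String[] args) {
        for (String type : new TreeSet<>(CLAIMED_TYPES)) {
            System.out.println(type + " -> " + (shouldDecline(type) ? "declined" : "parsed"));
        }
    }
}
```

Keeping the decline list separate from the claimed list means a format can start being parsed later by removing a single entry, without touching the set of claimed types.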
-
-
Method Detail
-
getSupportedTypes
public Set<org.apache.tika.mime.MediaType> getSupportedTypes(org.apache.tika.parser.ParseContext context)
-
parse
public void parse(InputStream stream, ContentHandler handler, org.apache.tika.metadata.Metadata metadata, org.apache.tika.parser.ParseContext context) throws IOException, SAXException, org.apache.tika.exception.TikaException
- Throws:
IOException
SAXException
org.apache.tika.exception.TikaException
-
-