Class WordMLParser
java.lang.Object
org.apache.tika.parser.microsoft.xml.AbstractXML2003Parser
org.apache.tika.parser.microsoft.xml.WordMLParser
All Implemented Interfaces:
Serializable, org.apache.tika.parser.Parser
Parses Microsoft Word files in the WordML 2003 format. Each document is a
single XML file; the format predates OOXML.
See https://en.wikipedia.org/wiki/Microsoft_Office_XML_formats
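To illustrate what "a single XML file" means in practice, here is a minimal sketch that pulls paragraph text out of a WordML 2003 document using only the JDK's SAX parser. It does not use Tika and is not how WordMLParser is implemented. The class name WordMlTextSketch, the method extractText, and the sample document are hypothetical. The sketch assumes the WordprocessingML 2003 namespace http://schemas.microsoft.com/office/word/2003/wordml, with body text carried in w:t elements inside w:p paragraphs.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration only; not part of Apache Tika.
public class WordMlTextSketch {
    // Namespace of the WordprocessingML 2003 vocabulary (assumed, see lead-in).
    static final String WORDML_NS =
            "http://schemas.microsoft.com/office/word/2003/wordml";

    // Collects the text of every w:t element and ends each w:p with a newline.
    static String extractText(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        StringBuilder out = new StringBuilder();

        DefaultHandler handler = new DefaultHandler() {
            private boolean inText;

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (WORDML_NS.equals(uri) && "t".equals(localName)) {
                    inText = true;
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (!WORDML_NS.equals(uri)) {
                    return;
                }
                if ("t".equals(localName)) {
                    inText = false;
                } else if ("p".equals(localName)) {
                    out.append('\n');
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (inText) {
                    out.append(ch, start, length);
                }
            }
        };

        factory.newSAXParser()
               .parse(new InputSource(new StringReader(xml)), handler);
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // A tiny hand-written WordML 2003 document with one paragraph.
        String xml = "<?xml version=\"1.0\"?>"
                + "<w:wordDocument xmlns:w=\"" + WORDML_NS + "\">"
                + "<w:body><w:p>"
                + "<w:r><w:t>Hello</w:t></w:r><w:r><w:t> WordML</w:t></w:r>"
                + "</w:p></w:body></w:wordDocument>";
        System.out.print(extractText(xml));
    }
}
```

In real use, the inherited parse method of this class would be the entry point; the sketch only shows why a plain XML parser is enough to read the format, in contrast to OOXML files, which are ZIP packages.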
Constructor Summary
Constructors
WordMLParser()
Method Summary
Modifier and Type    Method    Description
protected ContentHandler    getContentHandler(ContentHandler ch, org.apache.tika.metadata.Metadata metadata, org.apache.tika.parser.ParseContext context)
Set<org.apache.tika.mime.MediaType>    getSupportedTypes(org.apache.tika.parser.ParseContext context)
void    setContentType(org.apache.tika.metadata.Metadata metadata)
Methods inherited from class org.apache.tika.parser.microsoft.xml.AbstractXML2003Parser
parse
Constructor Details
WordMLParser
public WordMLParser()
Method Details
getSupportedTypes
public Set<org.apache.tika.mime.MediaType> getSupportedTypes(org.apache.tika.parser.ParseContext context)
getContentHandler
protected ContentHandler getContentHandler(ContentHandler ch, org.apache.tika.metadata.Metadata metadata, org.apache.tika.parser.ParseContext context)
Overrides:
getContentHandler in class AbstractXML2003Parser
setContentType
public void setContentType(org.apache.tika.metadata.Metadata metadata)
Specified by:
setContentType in class AbstractXML2003Parser