Package org.apache.tika.parser.microsoft
Class EMFParser
java.lang.Object
org.apache.tika.parser.microsoft.EMFParser
- All Implemented Interfaces:
Serializable, org.apache.tika.parser.Parser
Extracts files embedded in an EMF (Enhanced Metafile) and offers a
very rough capability to extract any text stored in the EMF.
To improve text extraction, we'd have to implement
quite a bit more at the POI level. In particular, we'd want to track
font changes and use that information to identify character sets
and to insert spaces and new lines.
Field Summary
Fields
Modifier and Type                          Field            Description
static org.apache.tika.metadata.Property  EMF_ICON_ONLY
static org.apache.tika.metadata.Property  EMF_ICON_STRING
-
Constructor Summary
Constructors
Constructor  Description
EMFParser()
-
Method Summary
Modifier and Type                    Method  Description
Set<org.apache.tika.mime.MediaType>  getSupportedTypes(org.apache.tika.parser.ParseContext context)
void                                 parse(InputStream stream, ContentHandler handler, org.apache.tika.metadata.Metadata metadata, org.apache.tika.parser.ParseContext context)
-
Field Details
-
EMF_ICON_ONLY
public static org.apache.tika.metadata.Property EMF_ICON_ONLY
-
EMF_ICON_STRING
public static org.apache.tika.metadata.Property EMF_ICON_STRING
-
-
Constructor Details
-
EMFParser
public EMFParser()
-
-
Method Details
-
getSupportedTypes
public Set<org.apache.tika.mime.MediaType> getSupportedTypes(org.apache.tika.parser.ParseContext context)
- Specified by:
getSupportedTypes in interface org.apache.tika.parser.Parser
-
parse
public void parse(InputStream stream, ContentHandler handler, org.apache.tika.metadata.Metadata metadata, org.apache.tika.parser.ParseContext context) throws IOException, SAXException, org.apache.tika.exception.TikaException
- Specified by:
parse in interface org.apache.tika.parser.Parser
- Throws:
IOException
SAXException
org.apache.tika.exception.TikaException
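The following is a minimal sketch of how the two methods above might be called directly, not an official usage example. It assumes the Tika core and Microsoft parser modules are on the classpath. The class name EmfParserExample, its helper methods, and the command-line file argument are hypothetical. BodyContentHandler is used here only as one convenient ContentHandler for collecting text. This page does not describe what EMF_ICON_ONLY and EMF_ICON_STRING mean, so the sketch just prints whatever values, if any, end up in the metadata.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Set;

import org.apache.tika.exception.TikaException;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.mime.MediaType;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.microsoft.EMFParser;
import org.apache.tika.sax.BodyContentHandler;
import org.xml.sax.SAXException;

// Hypothetical example class; not part of Tika.
public class EmfParserExample {

    // Returns the media types the parser reports for a default ParseContext.
    static Set<MediaType> supportedTypes() {
        return new EMFParser().getSupportedTypes(new ParseContext());
    }

    // Parses one EMF file and returns whatever text the parser emitted.
    // The caller supplies the Metadata object so it can inspect it afterwards.
    static String extractText(Path emfFile, Metadata metadata)
            throws IOException, SAXException, TikaException {
        // A write limit of -1 disables BodyContentHandler's default character limit.
        BodyContentHandler handler = new BodyContentHandler(-1);
        try (InputStream stream = Files.newInputStream(emfFile)) {
            new EMFParser().parse(stream, handler, metadata, new ParseContext());
        }
        return handler.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Supported types: " + supportedTypes());

        if (args.length > 0) {
            Metadata metadata = new Metadata();
            String text = extractText(Paths.get(args[0]), metadata);
            System.out.println("Extracted text: " + text);
            // Either value may be null if the parser did not set the property.
            System.out.println("EMF_ICON_ONLY: " + metadata.get(EMFParser.EMF_ICON_ONLY));
            System.out.println("EMF_ICON_STRING: " + metadata.get(EMFParser.EMF_ICON_STRING));
        }
    }
}
```

Given the class description's warning that text extraction is very rough, expect the returned text to lack reliable spacing and line breaks. In most applications the parser would be reached through Tika's detection and dispatch rather than instantiated by hand as in this sketch.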
-