Class LibPstParser

  • All Implemented Interfaces:
    Serializable, org.apache.tika.config.Initializable, org.apache.tika.parser.Parser

    public class LibPstParser
    extends Object
    implements org.apache.tika.parser.Parser, org.apache.tika.config.Initializable
    This is an optional PST parser. It relies on the user installing the GPL-3 libpst/readpst command-line tool and configuring Tika, via tika-config.xml, to call that tool.
    See Also:
    Serialized Form
    • Field Detail

      • MS_OUTLOOK_PST_MIMETYPE

        public static final org.apache.tika.mime.MediaType MS_OUTLOOK_PST_MIMETYPE
    • Constructor Detail

      • LibPstParser

        public LibPstParser()
    • Method Detail

      • getSupportedTypes

        public Set<org.apache.tika.mime.MediaType> getSupportedTypes(org.apache.tika.parser.ParseContext parseContext)
        Specified by:
        getSupportedTypes in interface org.apache.tika.parser.Parser
      • parse

        public void parse(InputStream inputStream,
                          ContentHandler contentHandler,
                          org.apache.tika.metadata.Metadata metadata,
                          org.apache.tika.parser.ParseContext parseContext)
                   throws IOException,
                          SAXException,
                          org.apache.tika.exception.TikaException
        Specified by:
        parse in interface org.apache.tika.parser.Parser
        Throws:
        IOException
        SAXException
        org.apache.tika.exception.TikaException
      • initialize

        public void initialize(Map<String,org.apache.tika.config.Param> map)
                        throws org.apache.tika.exception.TikaConfigException
        Specified by:
        initialize in interface org.apache.tika.config.Initializable
        Throws:
        org.apache.tika.exception.TikaConfigException
      • checkInitialization

        public void checkInitialization(org.apache.tika.config.InitializableProblemHandler initializableProblemHandler)
                                 throws org.apache.tika.exception.TikaConfigException
        Specified by:
        checkInitialization in interface org.apache.tika.config.Initializable
        Throws:
        org.apache.tika.exception.TikaConfigException
      • checkQuietly

        public boolean checkQuietly()
      • setTimeoutSeconds

        @Field
        public void setTimeoutSeconds(long timeoutSeconds)
      • setProcessEmailAsMsg

        @Field
        public void setProcessEmailAsMsg(boolean processEmailAsMsg)
      • setIncludeDeleted

        @Field
        public void setIncludeDeleted(boolean includeDeleted)
      • setMaxEmails

        @Field
        public void setMaxEmails(int maxEmails)
      • setReadPstPath

        @Field
        public void setReadPstPath(String readPstPath)
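        Sets the directory that contains the readpst executable. The value should include the path up to, but not including, 'readpst', e.g. "C:\my_bin" where readpst is at "C:\my_bin\readpst".
        Parameters:
        readPstPath - the directory containing the readpst executable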
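
The relationship between the readPstPath directory and the executable can be illustrated with a minimal sketch. It uses only the Java standard library and is not taken from the parser's source: the class name ReadPstPathSketch, the method resolveReadPst, and the fallback to a bare "readpst" (looked up on the system PATH) when no directory is given are all assumptions made for illustration.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Tika: shows how a configured directory
// and the executable name "readpst" could be combined into one path.
public class ReadPstPathSketch {

    // Assumption: an empty or missing directory means "find readpst on the PATH".
    public static Path resolveReadPst(String readPstPath) {
        if (readPstPath == null || readPstPath.trim().isEmpty()) {
            return Paths.get("readpst");
        }
        // The directory excludes the executable name, so append it here.
        return Paths.get(readPstPath).resolve("readpst");
    }

    public static void main(String[] args) {
        // The printed separator depends on the operating system.
        System.out.println(resolveReadPst("my_bin"));
        System.out.println(resolveReadPst(""));
    }
}
```

Passing the directory rather than the full executable path, as the setter describes, means the executable name is appended in exactly one place; a caller that passes a path already ending in "readpst" would end up with the name duplicated.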