Class StringsParser

  • All Implemented Interfaces:
    Serializable, org.apache.tika.config.Initializable, org.apache.tika.parser.Parser

    public class StringsParser
    extends org.apache.tika.parser.AbstractParser
    implements org.apache.tika.config.Initializable
    Parser that uses the "strings" (or a strings-alternative) command to find the printable strings in an object file or other binary file (application/octet-stream). Useful as a "best-effort" parser for files detected as application/octet-stream.
    Author:
    gtotaro
    See Also:
    Serialized Form
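
To make concrete what "finding the printable strings" in a binary file means, here is a minimal pure-Java sketch of the idea. It is not this parser's implementation, which, per the description above, delegates to an external "strings" command. The class name `PrintableStrings`, the method `extract`, and the choice of printable ASCII plus tab as the accepted byte range are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: approximates what a "strings"-style tool reports. */
final class PrintableStrings {

    /** Returns every run of at least minLength printable ASCII bytes in data. */
    static List<String> extract(byte[] data, int minLength) {
        if (minLength < 1) {
            throw new IllegalArgumentException("minLength must be at least 1");
        }
        List<String> found = new ArrayList<>();
        StringBuilder run = new StringBuilder();
        for (byte b : data) {
            int c = b & 0xFF; // treat the byte as unsigned
            // Assumption: space through tilde, plus tab, counts as printable.
            if ((c >= 0x20 && c <= 0x7E) || c == '\t') {
                run.append((char) c);
            } else {
                flush(run, minLength, found);
            }
        }
        flush(run, minLength, found); // a run may end at the end of the input
        return found;
    }

    /** Keeps the current run if it is long enough, then resets it. */
    private static void flush(StringBuilder run, int minLength, List<String> out) {
        if (run.length() >= minLength) {
            out.add(run.toString());
        }
        run.setLength(0);
    }

    public static void main(String[] args) {
        byte[] sample = {0x7F, 'E', 'L', 'F', 0x00, 'h', 'e', 'l', 'l', 'o', 0x01, 'o', 'k'};
        System.out.println(extract(sample, 4)); // prints [hello]
    }
}
```

Runs shorter than the minimum length ("ELF" and "ok" in the sample) are dropped, which is the role a minimum-length setting typically plays in this kind of extraction.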
    • Constructor Detail

      • StringsParser

        public StringsParser()
    • Method Detail

      • getStringsProg

        public static String getStringsProg()
      • getSupportedTypes

        public Set<org.apache.tika.mime.MediaType> getSupportedTypes(org.apache.tika.parser.ParseContext context)
        Specified by:
        getSupportedTypes in interface org.apache.tika.parser.Parser
      • parse

        public void parse(InputStream stream,
                          ContentHandler handler,
                          org.apache.tika.metadata.Metadata metadata,
                          org.apache.tika.parser.ParseContext context)
                   throws IOException,
                          SAXException,
                          org.apache.tika.exception.TikaException
        Specified by:
        parse in interface org.apache.tika.parser.Parser
        Throws:
        IOException
        SAXException
        org.apache.tika.exception.TikaException
      • getStringsPath

        public String getStringsPath()
      • setStringsPath

        @Field
        public void setStringsPath(String path)
        Sets the "strings" installation folder.
        Parameters:
        path - the "strings" installation folder.
      • setEncoding

        @Field
        public void setEncoding(String encoding)
      • getMinLength

        public int getMinLength()
      • setMinLength

        @Field
        public void setMinLength(int minLength)
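
The installation folder, encoding, and minimum length settings plausibly end up as parts of the external command line. The sketch below shows one way that could look. It is a guess at the general shape and not this class's actual code. The `-n` (minimum length) and `-e` (encoding) options are those of GNU binutils `strings`; other implementations may accept different flags. The class `StringsCommandSketch` and its methods are hypothetical names.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: assembles a GNU-style "strings" command line. */
final class StringsCommandSketch {

    /**
     * Builds the argument list for one invocation.
     * encodingFlag is a GNU strings -e value such as "s" or "S"; null or empty omits it.
     */
    static List<String> buildCommand(String stringsFolder, int minLength,
                                     String encodingFlag, String inputFile) {
        List<String> cmd = new ArrayList<>();
        cmd.add(resolveExecutable(stringsFolder));
        cmd.add("-n"); // GNU strings: minimum run length to report
        cmd.add(Integer.toString(minLength));
        if (encodingFlag != null && !encodingFlag.isEmpty()) {
            cmd.add("-e"); // GNU strings: character encoding of the runs
            cmd.add(encodingFlag);
        }
        cmd.add(inputFile);
        return cmd;
    }

    /** Joins the installation folder and the executable name. */
    static String resolveExecutable(String folder) {
        if (folder == null || folder.isEmpty()) {
            return "strings"; // assumption: fall back to whatever is on the PATH
        }
        return folder.endsWith(File.separator)
                ? folder + "strings"
                : folder + File.separator + "strings";
    }

    public static void main(String[] args) {
        // The folder and file name here are made-up sample values.
        System.out.println(buildCommand("", 6, "S", "sample.bin"));
    }
}
```

Returning a list of arguments, as opposed to one concatenated string, means the result can be passed straight to `java.lang.ProcessBuilder` without any shell quoting.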
      • getTimeoutSeconds

        public int getTimeoutSeconds()
      • setTimeoutSeconds

        @Field
        public void setTimeoutSeconds(int timeoutSeconds)
      • initialize

        public void initialize(Map<String,org.apache.tika.config.Param> params)
                        throws org.apache.tika.exception.TikaConfigException
        Specified by:
        initialize in interface org.apache.tika.config.Initializable
        Throws:
        org.apache.tika.exception.TikaConfigException
      • checkInitialization

        public void checkInitialization(org.apache.tika.config.InitializableProblemHandler problemHandler)
                                 throws org.apache.tika.exception.TikaConfigException
        Specified by:
        checkInitialization in interface org.apache.tika.config.Initializable
        Throws:
        org.apache.tika.exception.TikaConfigException