All Classes and Interfaces
Class
Description
CharsetDetector provides a facility for detecting the charset or encoding of character data in an unknown format.
This class represents a charset that has been identified by a CharsetDetector as a possible encoding for a set of input data.
Parser that extracts printable Latin1 strings from arbitrary files in pure Java, without running any external process.
Configuration for the "strings" (or strings-alternative) command.
Character encoding of the strings that are to be found using the "strings" command.
Parser that uses the "strings" (or strings-alternative) command to find the printable strings in an object, or other binary, file (application/octet-stream).
Unless the TikaCoreProperties.CONTENT_TYPE_USER_OVERRIDE is set, this parser tries to assess whether the file is a text file, csv or tsv.
Plain text parser.
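
To illustrate the idea behind the pure-Java Latin1 strings parser listed above, here is a minimal sketch of scanning a byte array for runs of printable Latin1 characters. It is not the library's implementation: the class name Latin1Strings, the extract method, the minLength parameter, and the definition of "printable" (bytes 0x20-0x7E and 0xA0-0xFF) are all assumptions made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the documented API.
public class Latin1Strings {

    // Assumption: "printable" means ASCII 0x20-0x7E or Latin1 0xA0-0xFF.
    public static boolean isPrintable(int b) {
        return (b >= 0x20 && b <= 0x7E) || (b >= 0xA0 && b <= 0xFF);
    }

    // Returns every run of printable bytes that is at least minLength bytes long.
    public static List<String> extract(byte[] data, int minLength) {
        List<String> found = new ArrayList<>();
        int start = -1; // index where the current printable run began, or -1 if none
        // Iterate one step past the end so a run that reaches the last byte is flushed.
        for (int i = 0; i <= data.length; i++) {
            boolean printable = i < data.length && isPrintable(data[i] & 0xFF);
            if (printable) {
                if (start < 0) {
                    start = i;
                }
            } else if (start >= 0) {
                if (i - start >= minLength) {
                    found.add(new String(data, start, i - start, StandardCharsets.ISO_8859_1));
                }
                start = -1;
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Three runs separated by control bytes: "Tika", "ok", and "caf" plus 0xE9.
        byte[] sample = {0x00, 'T', 'i', 'k', 'a', 0x01, 'o', 'k', 0x02, 'c', 'a', 'f', (byte) 0xE9};
        // With minLength 4, the two-byte run "ok" is dropped.
        System.out.println(extract(sample, 4));
    }
}
```

The minimum-length threshold plays the same role as the length option of the Unix "strings" command: it filters out short runs that are usually noise in binary data.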