A B C D E F G H I J K L M N O P Q R S T U V W X Z _
All Classes All Packages
A
- ABS_PEAK_AUDIO_FILE_PATH - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The absolute path to the file's peak audio file.
- AbstractEncodingDetectorParser - Class in org.apache.tika.parser
-
Abstract base class for parsers that use the AutoDetectReader and need to use the EncodingDetector configured by TikaConfig
- AbstractEncodingDetectorParser() - Constructor for class org.apache.tika.parser.AbstractEncodingDetectorParser
- AbstractEncodingDetectorParser(EncodingDetector) - Constructor for class org.apache.tika.parser.AbstractEncodingDetectorParser
- AbstractParser - Class in org.apache.tika.parser
-
Abstract base class for new parsers.
- AbstractParser() - Constructor for class org.apache.tika.parser.AbstractParser
- AbstractRecursiveParserWrapperHandler - Class in org.apache.tika.sax
-
This is a special handler to be used only with the RecursiveParserWrapper.
- AbstractRecursiveParserWrapperHandler(ContentHandlerFactory) - Constructor for class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
- AbstractRecursiveParserWrapperHandler(ContentHandlerFactory, int) - Constructor for class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
- AccessPermissionException - Exception in org.apache.tika.exception
-
Exception to be thrown when a document does not allow content extraction.
- AccessPermissionException() - Constructor for exception org.apache.tika.exception.AccessPermissionException
- AccessPermissionException(String) - Constructor for exception org.apache.tika.exception.AccessPermissionException
- AccessPermissionException(String, Throwable) - Constructor for exception org.apache.tika.exception.AccessPermissionException
- AccessPermissionException(Throwable) - Constructor for exception org.apache.tika.exception.AccessPermissionException
- AccessPermissions - Interface in org.apache.tika.metadata
-
Until we can find a common standard, we'll use these options.
- ACKNOWLEDGEMENT - Static variable in interface org.apache.tika.metadata.ClimateForcast
- ACRONYM_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- ACTION_TRIGGER - Static variable in interface org.apache.tika.metadata.PDF
-
This specifies where an action or destination would be found/triggered in the document: on document open, before close, etc.
- add(String) - Method in class org.apache.tika.language.LanguageProfile
-
Deprecated. Adds a single occurrence of the given ngram to this profile.
- add(StringBuffer) - Method in class org.apache.tika.language.LanguageProfilerBuilder
-
Deprecated. Adds ngrams from a single word to this profile
- add(String, long) - Method in class org.apache.tika.language.LanguageProfile
-
Deprecated. Adds multiple occurrences of the given ngram to this profile.
- add(String, String) - Method in class org.apache.tika.metadata.Metadata
-
Add a metadata name/value mapping.
- add(Property, int) - Method in class org.apache.tika.metadata.Metadata
-
Adds the integer value of the identified metadata property.
- add(Property, String) - Method in class org.apache.tika.metadata.Metadata
-
Add a metadata property/value mapping.
- addAlias(MediaType, MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
- addData(byte[], int, int) - Method in class org.apache.tika.detect.TextStatistics
- addingService(ServiceReference) - Method in class org.apache.tika.config.TikaActivator
- ADDITIONAL_MODEL_INFO - Static variable in interface org.apache.tika.metadata.IPTC
-
Information about the ethnicity and other facets of the model(s) in a model-released image.
- addPattern(MimeType, String) - Method in class org.apache.tika.mime.MimeTypes
-
Adds a file name pattern for the given media type.
- addPattern(MimeType, String, boolean) - Method in class org.apache.tika.mime.MimeTypes
-
Adds a file name pattern for the given media type.
- addPrefix(String, String) - Method in class org.apache.tika.sax.xpath.XPathParser
- addProfile(String, LanguageProfile) - Static method in class org.apache.tika.language.LanguageIdentifier
-
Deprecated. Adds a single language profile
- addResource(Closeable) - Method in class org.apache.tika.io.TemporaryResources
-
Adds a new resource to the set of tracked resources that will all be closed when the TemporaryResources.close() method is called.
- addSuperType(MediaType, MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
- addText(char[], int, int) - Method in class org.apache.tika.language.detect.LanguageDetector
-
Add statistics about this text for the current document.
- addText(CharSequence) - Method in class org.apache.tika.language.detect.LanguageDetector
-
Add the given text to the statistics being accumulated for the current document.
- addType(MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
- afterRead(int) - Method in class org.apache.tika.io.ProxyInputStream
-
Invoked by the read methods after the proxied call has returned successfully.
- afterRead(int) - Method in class org.apache.tika.io.TikaInputStream
- ALBUM - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The name of the album."
- ALBUM_ARTIST - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The name of the album artist or group for compilation albums."
- ALIAS_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- ALIAS_TYPE_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- ALT - org.apache.tika.metadata.Property.PropertyType
-
An ordered array with some sort of criteria
- ALT_TAPE_NAME - Static variable in interface org.apache.tika.metadata.XMPDM
-
"An alternative tape name, set via the project window or timecode dialog in Premiere.
- ALTITUDE - Static variable in interface org.apache.tika.metadata.Geographic
-
The WGS84 Altitude of the Point
- ALTITUDE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- analyze(StringBuilder) - Method in class org.apache.tika.language.LanguageProfilerBuilder
-
Deprecated. Analyzes a piece of text
- AnnotationUtils - Class in org.apache.tika.utils
-
This class contains utilities for dealing with tika annotations
- AnnotationUtils() - Constructor for class org.apache.tika.utils.AnnotationUtils
- APP_VERSION - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- application(String) - Static method in class org.apache.tika.mime.MediaType
- APPLICATION - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- APPLICATION_NAME - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- APPLICATION_VERSION - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- APPLICATION_XML - Static variable in class org.apache.tika.mime.MediaType
- APPLICATION_ZIP - Static variable in class org.apache.tika.mime.MediaType
- ARTIST - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The name of the artist or artists."
- ARTWORK_OR_OBJECT - Static variable in interface org.apache.tika.metadata.IPTC
-
A set of metadata about artwork or an object in the item
- ARTWORK_OR_OBJECT_DETAIL_COPYRIGHT_NOTICE - Static variable in interface org.apache.tika.metadata.IPTC
-
Contains any necessary copyright notice for claiming the intellectual property for artwork or an object in the image and should identify the current owner of the copyright of this work with associated intellectual property rights.
- ARTWORK_OR_OBJECT_DETAIL_CREATOR - Static variable in interface org.apache.tika.metadata.IPTC
-
Contains the name of the artist who has created artwork or an object in the image.
- ARTWORK_OR_OBJECT_DETAIL_DATE_CREATED - Static variable in interface org.apache.tika.metadata.IPTC
-
Designates the date and optionally the time the artwork or object in the image was created.
- ARTWORK_OR_OBJECT_DETAIL_SOURCE - Static variable in interface org.apache.tika.metadata.IPTC
-
The organisation or body holding and registering the artwork or object in the image for inventory purposes.
- ARTWORK_OR_OBJECT_DETAIL_SOURCE_INVENTORY_NUMBER - Static variable in interface org.apache.tika.metadata.IPTC
-
The inventory number issued by the organisation or body holding and registering the artwork or object in the image.
- ARTWORK_OR_OBJECT_DETAIL_TITLE - Static variable in interface org.apache.tika.metadata.IPTC
-
A reference for the artwork or object in the image.
- asInputSource() - Method in class org.apache.tika.detect.AutoDetectReader
- ASSEMBLE_DOCUMENT - Static variable in interface org.apache.tika.metadata.AccessPermissions
-
Can the user insert/rotate/delete pages.
- assignFieldParams(Object, Map<String, Param>) - Static method in class org.apache.tika.utils.AnnotationUtils
-
Assigns the param values to bean
- assignValue(Object, Object) - Method in class org.apache.tika.config.ParamField
-
Sets given value to the annotated field of bean
- attachExternalParsers(List<ExternalParser>, TikaConfig) - Static method in class org.apache.tika.parser.external.ExternalParsersFactory
- attachExternalParsers(TikaConfig) - Static method in class org.apache.tika.parser.external.ExternalParsersFactory
- ATTACHMENT - org.apache.tika.metadata.TikaCoreProperties.EmbeddedResourceType
- AttributeMatcher - Class in org.apache.tika.sax.xpath
-
Final evaluation state of a .../@* XPath expression.
- AttributeMatcher() - Constructor for class org.apache.tika.sax.xpath.AttributeMatcher
- audio(String) - Static method in class org.apache.tika.mime.MediaType
- AUDIO_CHANNEL_TYPE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The audio channel type."
- AUDIO_COMPRESSOR - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The audio compression used.
- AUDIO_MOD_DATE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The date and time when the audio was last modified."
- AUDIO_SAMPLE_RATE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The audio sample rate.
- AUDIO_SAMPLE_TYPE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The audio sample type."
- AUTHOR - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- AUTHOR - Static variable in interface org.apache.tika.metadata.Office
-
Name of the principal author(s) of a document
- AUTHORS_POSITION - Static variable in interface org.apache.tika.metadata.Photoshop
- AutoDetectParser - Class in org.apache.tika.parser
- AutoDetectParser() - Constructor for class org.apache.tika.parser.AutoDetectParser
-
Creates an auto-detecting parser instance using the default Tika configuration.
- AutoDetectParser(TikaConfig) - Constructor for class org.apache.tika.parser.AutoDetectParser
- AutoDetectParser(Detector) - Constructor for class org.apache.tika.parser.AutoDetectParser
- AutoDetectParser(Detector, Parser...) - Constructor for class org.apache.tika.parser.AutoDetectParser
- AutoDetectParser(Parser...) - Constructor for class org.apache.tika.parser.AutoDetectParser
-
Creates an auto-detecting parser instance using the specified set of parsers.
- AutoDetectParserFactory - Class in org.apache.tika.parser
-
Factory for an AutoDetectParser
- AutoDetectParserFactory(Map<String, String>) - Constructor for class org.apache.tika.parser.AutoDetectParserFactory
- AutoDetectReader - Class in org.apache.tika.detect
-
An input stream reader that automatically detects the character encoding to be used for converting bytes to characters.
- AutoDetectReader(InputStream) - Constructor for class org.apache.tika.detect.AutoDetectReader
- AutoDetectReader(InputStream, Metadata) - Constructor for class org.apache.tika.detect.AutoDetectReader
- AutoDetectReader(InputStream, Metadata, ServiceLoader) - Constructor for class org.apache.tika.detect.AutoDetectReader
- AutoDetectReader(InputStream, Metadata, EncodingDetector) - Constructor for class org.apache.tika.detect.AutoDetectReader
- available() - Method in class org.apache.tika.io.LookaheadInputStream
- available() - Method in class org.apache.tika.io.NullInputStream
-
Return the number of bytes that can be read.
- available() - Method in class org.apache.tika.io.ProxyInputStream
-
Invokes the delegate's available() method.
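The beforeRead(int) and afterRead(int) entries listed above for ProxyInputStream describe hooks that wrap each proxied read call. The sketch below illustrates that hook pattern using only java.io. It is not Tika's ProxyInputStream: the class name CountingProxySketch, the getCount() accessor, and the countAll helper are hypothetical names introduced for illustration, and the byte counting is just one plausible use of such a hook.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical proxy stream that counts bytes through an afterRead-style hook. */
class CountingProxySketch extends FilterInputStream {
    private long count = 0;

    CountingProxySketch(InputStream in) {
        super(in);
    }

    /** Hook invoked after each proxied read; n is -1 at end of stream. */
    protected void afterRead(int n) {
        if (n > 0) {
            count += n;
        }
    }

    @Override
    public int read() throws IOException {
        int b = in.read();
        // A single-byte read delivers either one byte or end of stream.
        afterRead(b == -1 ? -1 : 1);
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = in.read(buf, off, len);
        afterRead(n);
        return n;
    }

    long getCount() {
        return count;
    }

    /** Drains an in-memory stream and returns the number of bytes seen by the hook. */
    static long countAll(byte[] data) {
        try (CountingProxySketch s = new CountingProxySketch(new ByteArrayInputStream(data))) {
            byte[] buf = new byte[4];
            while (s.read(buf, 0, buf.length) != -1) {
                // Keep reading until end of stream.
            }
            return s.getCount();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling CountingProxySketch.countAll(new byte[10]) drains a ten-byte stream, and the hook sees all ten bytes. A subclass could override afterRead to enforce a size limit or report progress instead.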
B
- BAG - org.apache.tika.metadata.Property.PropertyType
-
An un-ordered array
- BasicContentHandlerFactory - Class in org.apache.tika.sax
-
Basic factory for creating common types of ContentHandlers
- BasicContentHandlerFactory(BasicContentHandlerFactory.HANDLER_TYPE, int) - Constructor for class org.apache.tika.sax.BasicContentHandlerFactory
- BasicContentHandlerFactory.HANDLER_TYPE - Enum in org.apache.tika.sax
-
Common handler types for content.
- beforeRead(int) - Method in class org.apache.tika.io.ProxyInputStream
-
Invoked by the read methods before the call is proxied.
- BITS_PER_SAMPLE - Static variable in interface org.apache.tika.metadata.TIFF
-
"Number of bits per component in each channel."
- BODY - org.apache.tika.sax.BasicContentHandlerFactory.HANDLER_TYPE
- BodyContentHandler - Class in org.apache.tika.sax
-
Content handler decorator that only passes everything inside the XHTML <body/> tag to the underlying handler.
- BodyContentHandler() - Constructor for class org.apache.tika.sax.BodyContentHandler
-
Creates a content handler that writes XHTML body character events to an internal string buffer.
- BodyContentHandler(int) - Constructor for class org.apache.tika.sax.BodyContentHandler
-
Creates a content handler that writes XHTML body character events to an internal string buffer.
- BodyContentHandler(OutputStream) - Constructor for class org.apache.tika.sax.BodyContentHandler
-
Creates a content handler that writes XHTML body character events to the given output stream using the default encoding.
- BodyContentHandler(Writer) - Constructor for class org.apache.tika.sax.BodyContentHandler
-
Creates a content handler that writes XHTML body character events to the given writer.
- BodyContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.BodyContentHandler
-
Creates a content handler that passes all XHTML body events to the given underlying content handler.
- BOOLEAN - org.apache.tika.metadata.Property.ValueType
- BoundedInputStream - Class in org.apache.tika.io
-
Very slight modification of Commons' BoundedInputStream so that we can figure out if this hit the bound or not.
- BoundedInputStream(long, InputStream) - Constructor for class org.apache.tika.io.BoundedInputStream
- BufferUnderrunException() - Constructor for exception org.apache.tika.io.EndianUtils.BufferUnderrunException
- build() - Method in class org.apache.tika.detect.NNTrainedModelBuilder
- build() - Method in class org.apache.tika.fork.ParserFactoryFactory
- build() - Method in class org.apache.tika.parser.AutoDetectParserFactory
- build() - Method in class org.apache.tika.parser.ParserFactory
- build() - Method in class org.apache.tika.sax.StandardReference.StandardReferenceBuilder
- BUILD - Static variable in interface org.apache.tika.metadata.QuattroPro
-
Build.
- build2() - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
-
Initialize the MimeTypes with this builder instance
- buildDOM(InputStream) - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Builds a Document with a DocumentBuilder from the pool
- buildDOM(InputStream, ParseContext) - Static method in class org.apache.tika.utils.XMLReaderUtils
-
This checks context for a user specified DocumentBuilder.
- buildDOM(String) - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Builds a Document with a DocumentBuilder from the pool
- buildDOM(Path) - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Builds a Document with a DocumentBuilder from the pool
- Builder() - Constructor for class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
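The BodyContentHandler entry above describes a decorator that passes on only the events inside the XHTML body element. The sketch below illustrates that filtering idea with the JDK's own SAX classes. It is a simplification under stated assumptions, not Tika's implementation: the class name BodyTextSketch and the bodyText helper are hypothetical, the handler matches on the unqualified element name "body" with a non-namespace-aware parser, and it collects text into a string rather than forwarding events to an underlying handler.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical SAX handler that keeps only character data found inside body. */
class BodyTextSketch extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    // Nesting depth relative to the body element; 0 means "outside body".
    private int bodyDepth = 0;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (bodyDepth > 0 || "body".equals(qName)) {
            bodyDepth++;
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (bodyDepth > 0) {
            bodyDepth--;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (bodyDepth > 0) {
            text.append(ch, start, length);
        }
    }

    /** Parses a well-formed XHTML string and returns the text inside body. */
    static String bodyText(String xhtml) {
        try {
            BodyTextSketch handler = new BodyTextSketch();
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8)), handler);
            return handler.text.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse input", e);
        }
    }
}
```

For the input `<html><head><title>T</title></head><body><p>Hello</p></body></html>`, bodyText returns "Hello", and the title text is dropped because it sits outside body.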
C
- CAN_MODIFY - Static variable in interface org.apache.tika.metadata.AccessPermissions
-
Can any modifications be made to the document
- CAN_MODIFY_ANNOTATIONS - Static variable in interface org.apache.tika.metadata.AccessPermissions
-
Can the user modify annotations
- CAN_PRINT - Static variable in interface org.apache.tika.metadata.AccessPermissions
-
Can the user print the document
- CAN_PRINT_DEGRADED - Static variable in interface org.apache.tika.metadata.AccessPermissions
-
Can the user print an image-degraded version of the document.
- CAPTION_WRITER - Static variable in interface org.apache.tika.metadata.Photoshop
- cast(InputStream) - Static method in class org.apache.tika.io.TikaInputStream
-
Returns the given stream cast to a TikaInputStream, or null if the stream is not a TikaInputStream.
- CATEGORY - Static variable in interface org.apache.tika.metadata.IPTC
-
Deprecated.
- CATEGORY - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- CATEGORY - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLCore
-
A categorization of the content of this package.
- CATEGORY - Static variable in interface org.apache.tika.metadata.Photoshop
- CERTIFICATE - Static variable in interface org.apache.tika.metadata.XMPRights
-
A Web URL for a rights management certificate.
- ChannelTypePropertyConverter() - Constructor for class org.apache.tika.metadata.XMPDM.ChannelTypePropertyConverter
-
Deprecated.
- CHARACTER_COUNT - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- CHARACTER_COUNT - Static variable in interface org.apache.tika.metadata.Office
-
The number of Characters in the document
- CHARACTER_COUNT_WITH_SPACES - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- CHARACTER_COUNT_WITH_SPACES - Static variable in interface org.apache.tika.metadata.Office
-
The number of Characters in the document, including spaces
- characters - Variable in class org.apache.tika.mime.MimeTypesReader
- characters(char[], int, int) - Method in class org.apache.tika.mime.MimeTypesReader
- characters(char[], int, int) - Method in class org.apache.tika.sax.ContentHandlerDecorator
- characters(char[], int, int) - Method in class org.apache.tika.sax.DIFContentHandler
- characters(char[], int, int) - Method in class org.apache.tika.sax.ExpandedTitleContentHandler
- characters(char[], int, int) - Method in class org.apache.tika.sax.LinkContentHandler
- characters(char[], int, int) - Method in class org.apache.tika.sax.PhoneExtractingContentHandler
-
The characters method is called whenever a Parser wants to pass raw...
- characters(char[], int, int) - Method in class org.apache.tika.sax.SafeContentHandler
- characters(char[], int, int) - Method in class org.apache.tika.sax.SecureContentHandler
- characters(char[], int, int) - Method in class org.apache.tika.sax.StandardsExtractingContentHandler
-
The characters method is called whenever a Parser wants to pass raw characters to the ContentHandler.
- characters(char[], int, int) - Method in class org.apache.tika.sax.TeeContentHandler
- characters(char[], int, int) - Method in class org.apache.tika.sax.TextContentHandler
- characters(char[], int, int) - Method in class org.apache.tika.sax.ToTextContentHandler
-
Writes the given characters to the given character stream.
- characters(char[], int, int) - Method in class org.apache.tika.sax.ToXMLContentHandler
- characters(char[], int, int) - Method in class org.apache.tika.sax.WriteOutContentHandler
-
Writes the given characters to the given character stream.
- characters(char[], int, int) - Method in class org.apache.tika.sax.XHTMLContentHandler
- characters(char[], int, int) - Method in class org.apache.tika.sax.xpath.MatchingContentHandler
- characters(String) - Method in class org.apache.tika.sax.XHTMLContentHandler
- CHARACTERS_PER_PAGE - Static variable in interface org.apache.tika.metadata.PDF
- CharsetUtils - Class in org.apache.tika.utils
- CharsetUtils() - Constructor for class org.apache.tika.utils.CharsetUtils
- check(String[], int...) - Static method in class org.apache.tika.embedder.ExternalEmbedder
-
Checks to see if the command can be run.
- check(String[], int...) - Static method in class org.apache.tika.parser.external.ExternalParser
- check(String, int...) - Static method in class org.apache.tika.embedder.ExternalEmbedder
-
Checks to see if the command can be run.
- check(String, int...) - Static method in class org.apache.tika.parser.external.ExternalParser
-
Checks to see if the command can be run.
- CHECK_TAG - Static variable in interface org.apache.tika.parser.external.ExternalParsersConfigReaderMetKeys
- checkInitialization(InitializableProblemHandler) - Method in interface org.apache.tika.config.Initializable
- ChildMatcher - Class in org.apache.tika.sax.xpath
-
Intermediate evaluation state of a .../*... XPath expression.
- ChildMatcher(Matcher) - Constructor for class org.apache.tika.sax.xpath.ChildMatcher
- CITY - Static variable in interface org.apache.tika.metadata.IPTC
-
Name of the city the content is focussing on -- either the place shown in visual media or referenced by text or audio media.
- CITY - Static variable in interface org.apache.tika.metadata.Photoshop
- clean(String) - Static method in class org.apache.tika.sax.CleanPhoneText
- clean(String) - Static method in class org.apache.tika.utils.CharsetUtils
-
Handle various common charset name errors, and return something that will be considered valid (and is normalized)
- CleanPhoneText - Class in org.apache.tika.sax
-
Class to help de-obfuscate phone numbers in text.
- CleanPhoneText() - Constructor for class org.apache.tika.sax.CleanPhoneText
- cleanSubstitutions - Static variable in class org.apache.tika.sax.CleanPhoneText
- clearProfiles() - Static method in class org.apache.tika.language.LanguageIdentifier
-
Deprecated. Clears the current map of language profiles
- ClimateForcast - Interface in org.apache.tika.metadata
-
Met keys from NCAR CCSM files in the Climate Forecast Convention.
- cloneMetadata(Metadata) - Static method in class org.apache.tika.utils.ParserUtils
-
Does a deep clone of a Metadata object.
- close() - Method in class org.apache.tika.fork.ForkParser
- close() - Method in class org.apache.tika.io.CloseShieldInputStream
-
Replaces the underlying input stream with a ClosedInputStream sentinel.
- close() - Method in class org.apache.tika.io.LookaheadInputStream
- close() - Method in class org.apache.tika.io.NullInputStream
-
Close this input stream - resets the internal state to the initial values.
- close() - Method in class org.apache.tika.io.ProxyInputStream
-
Invokes the delegate's close() method.
- close() - Method in class org.apache.tika.io.TemporaryResources
-
Closes all tracked resources.
- close() - Method in class org.apache.tika.io.TikaInputStream
- close() - Method in class org.apache.tika.language.detect.LanguageWriter
-
Ignored.
- close() - Method in class org.apache.tika.language.ProfilingWriter
-
Deprecated.
- close() - Method in class org.apache.tika.parser.ParsingReader
-
Closes the read end of the pipe.
- close() - Method in class org.apache.tika.utils.RereadableInputStream
-
Closes the input stream and removes the temporary file if one was created.
- CLOSED_CHOICE - org.apache.tika.metadata.Property.ValueType
- ClosedInputStream - Class in org.apache.tika.io
-
Closed input stream.
- ClosedInputStream() - Constructor for class org.apache.tika.io.ClosedInputStream
- closeQuietly(InputStream) - Static method in class org.apache.tika.io.IOUtils
-
Unconditionally close an InputStream.
- closeQuietly(OutputStream) - Static method in class org.apache.tika.io.IOUtils
-
Unconditionally close an OutputStream.
- closeQuietly(Reader) - Static method in class org.apache.tika.io.IOUtils
-
Unconditionally close a Reader.
- closeQuietly(Writer) - Static method in class org.apache.tika.io.IOUtils
-
Unconditionally close a Writer.
- closeQuietly(Channel) - Static method in class org.apache.tika.io.IOUtils
-
Unconditionally close a Channel.
- CloseShieldInputStream - Class in org.apache.tika.io
-
Proxy stream that prevents the underlying input stream from being closed.
- CloseShieldInputStream(InputStream) - Constructor for class org.apache.tika.io.CloseShieldInputStream
-
Creates a proxy that shields the given input stream from being closed.
- COLOR_MODE - Static variable in interface org.apache.tika.metadata.Photoshop
- COLUMN_COUNT - Static variable in interface org.apache.tika.metadata.Database
- COLUMN_NAME - Static variable in interface org.apache.tika.metadata.Database
- COMMAND_LINE - Static variable in interface org.apache.tika.metadata.ClimateForcast
- COMMAND_TAG - Static variable in interface org.apache.tika.parser.external.ExternalParsersConfigReaderMetKeys
- COMMENT - Static variable in interface org.apache.tika.metadata.ClimateForcast
- COMMENT_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- COMMENTS - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- COMMENTS - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- COMMENTS - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- COMPANY - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- COMPANY - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- compareTo(Property) - Method in class org.apache.tika.metadata.Property
- compareTo(MediaType) - Method in class org.apache.tika.mime.MediaType
- compareTo(MimeType) - Method in class org.apache.tika.mime.MimeType
- COMPILATION - Static variable in interface org.apache.tika.metadata.XMPDM
-
"An album created by various artists."
- COMPOSER - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The composer's name."
- composite(Property, Property[]) - Static method in class org.apache.tika.metadata.Property
-
Constructs a new composite property from the given primary and array of secondary properties.
- COMPOSITE - org.apache.tika.metadata.Property.PropertyType
-
Multiple child properties
- CompositeDetector - Class in org.apache.tika.detect
-
Content type detector that combines multiple different detection mechanisms.
- CompositeDetector(List<Detector>) - Constructor for class org.apache.tika.detect.CompositeDetector
- CompositeDetector(Detector...) - Constructor for class org.apache.tika.detect.CompositeDetector
- CompositeDetector(MediaTypeRegistry, List<Detector>) - Constructor for class org.apache.tika.detect.CompositeDetector
- CompositeDetector(MediaTypeRegistry, List<Detector>, Collection<Class<? extends Detector>>) - Constructor for class org.apache.tika.detect.CompositeDetector
- CompositeDigester - Class in org.apache.tika.parser.digest
- CompositeDigester(DigestingParser.Digester...) - Constructor for class org.apache.tika.parser.digest.CompositeDigester
- CompositeEncodingDetector - Class in org.apache.tika.detect
- CompositeEncodingDetector(List<EncodingDetector>) - Constructor for class org.apache.tika.detect.CompositeEncodingDetector
- CompositeEncodingDetector(List<EncodingDetector>, Collection<Class<? extends EncodingDetector>>) - Constructor for class org.apache.tika.detect.CompositeEncodingDetector
- CompositeExternalParser - Class in org.apache.tika.parser.external
-
A Composite Parser that wraps up all the available External Parsers, and provides an easy way to access them.
- CompositeExternalParser() - Constructor for class org.apache.tika.parser.external.CompositeExternalParser
- CompositeExternalParser(MediaTypeRegistry) - Constructor for class org.apache.tika.parser.external.CompositeExternalParser
- CompositeMatcher - Class in org.apache.tika.sax.xpath
-
Composite XPath evaluation state.
- CompositeMatcher(Matcher, Matcher) - Constructor for class org.apache.tika.sax.xpath.CompositeMatcher
- CompositeParser - Class in org.apache.tika.parser
-
Composite parser that delegates parsing tasks to a component parser based on the declared content type of the incoming document.
- CompositeParser() - Constructor for class org.apache.tika.parser.CompositeParser
- CompositeParser(MediaTypeRegistry, List<Parser>) - Constructor for class org.apache.tika.parser.CompositeParser
- CompositeParser(MediaTypeRegistry, List<Parser>, Collection<Class<? extends Parser>>) - Constructor for class org.apache.tika.parser.CompositeParser
- CompositeParser(MediaTypeRegistry, Parser...) - Constructor for class org.apache.tika.parser.CompositeParser
- ConcurrentUtils - Class in org.apache.tika.utils
-
Utility Class for Concurrency in Tika
- ConcurrentUtils() - Constructor for class org.apache.tika.utils.ConcurrentUtils
- ConfigurableThreadPoolExecutor - Interface in org.apache.tika.concurrent
-
Allows Thread Pool to be Configurable.
- consume(String) - Method in interface org.apache.tika.parser.external.ExternalParser.LineConsumer
-
Consume a line
- CONTACT - Static variable in interface org.apache.tika.metadata.ClimateForcast
- CONTACT_INFO_ADDRESS - Static variable in interface org.apache.tika.metadata.IPTC
-
The contact information address part.
- CONTACT_INFO_CITY - Static variable in interface org.apache.tika.metadata.IPTC
-
The contact information city part.
- CONTACT_INFO_COUNTRY - Static variable in interface org.apache.tika.metadata.IPTC
-
The contact information country part.
- CONTACT_INFO_EMAIL - Static variable in interface org.apache.tika.metadata.IPTC
-
The contact information email address part.
- CONTACT_INFO_PHONE - Static variable in interface org.apache.tika.metadata.IPTC
-
The contact information phone number part.
- CONTACT_INFO_POSTAL_CODE - Static variable in interface org.apache.tika.metadata.IPTC
-
The contact information part denoting the local postal code.
- CONTACT_INFO_STATE_PROVINCE - Static variable in interface org.apache.tika.metadata.IPTC
-
The contact information part denoting regional information such as state or province.
- CONTACT_INFO_WEB_URL - Static variable in interface org.apache.tika.metadata.IPTC
-
The contact information web address part.
- CONTAINER_EXCEPTION - Static variable in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
- ContainerExtractor - Interface in org.apache.tika.extractor
-
Tika container extractor interface.
- CONTENT_DISPOSITION - Static variable in interface org.apache.tika.metadata.HttpHeaders
- CONTENT_ENCODING - Static variable in interface org.apache.tika.metadata.HttpHeaders
- CONTENT_LANGUAGE - Static variable in interface org.apache.tika.metadata.HttpHeaders
- CONTENT_LENGTH - Static variable in interface org.apache.tika.metadata.HttpHeaders
- CONTENT_LOCATION - Static variable in interface org.apache.tika.metadata.HttpHeaders
- CONTENT_MD5 - Static variable in interface org.apache.tika.metadata.HttpHeaders
- CONTENT_STATUS - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- CONTENT_STATUS - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLCore
-
The status of the content.
- CONTENT_TYPE - Static variable in interface org.apache.tika.metadata.HttpHeaders
- CONTENT_TYPE_HINT - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
This is currently used to identify Content-Type that may be included within a document, such as in html documents (e.g.
- CONTENT_TYPE_OVERRIDE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- contentEquals(InputStream, InputStream) - Static method in class org.apache.tika.io.IOUtils
-
Compare the contents of two Streams to determine if they are equal or not.
- contentEquals(Reader, Reader) - Static method in class org.apache.tika.io.IOUtils
-
Compare the contents of two Readers to determine if they are equal or not.
- ContentHandlerDecorator - Class in org.apache.tika.sax
-
Decorator base class for the ContentHandler interface.
- ContentHandlerDecorator() - Constructor for class org.apache.tika.sax.ContentHandlerDecorator
-
Creates a decorator that by default forwards incoming SAX events to a dummy content handler that simply ignores all the events.
- ContentHandlerDecorator(ContentHandler) - Constructor for class org.apache.tika.sax.ContentHandlerDecorator
-
Creates a decorator for the given SAX event handler.
- ContentHandlerFactory - Interface in org.apache.tika.sax
-
Interface to allow easier injection of code for getting a new ContentHandler
- CONTRIBUTOR - Static variable in interface org.apache.tika.metadata.DublinCore
-
An entity responsible for making contributions to the content of the resource.
- CONTRIBUTOR - Static variable in class org.apache.tika.metadata.Metadata
-
Deprecated. Use TikaCoreProperties#CONTRIBUTOR
- CONTRIBUTOR - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- CONTROLLED_VOCABULARY_TERM - Static variable in interface org.apache.tika.metadata.IPTC
-
A term to describe the content of the image by a value from a Controlled Vocabulary.
- CONVENTIONS - Static variable in interface org.apache.tika.metadata.ClimateForcast
- convert(Object) - Static method in class org.apache.tika.metadata.XMPDM.ChannelTypePropertyConverter
-
Deprecated. How a standalone converter might work
- convertAndSet(Metadata, Object) - Static method in class org.apache.tika.metadata.XMPDM.ChannelTypePropertyConverter
-
Deprecated. How convert+set might work
- copy(InputStream, OutputStream) - Static method in class org.apache.tika.io.IOUtils
-
Copy bytes from an InputStream to an OutputStream.
- copy(InputStream, Writer) - Static method in class org.apache.tika.io.IOUtils
-
Copy bytes from an InputStream to chars on a Writer using the default character encoding of the platform.
- copy(InputStream, Writer, String) - Static method in class org.apache.tika.io.IOUtils
-
Copy bytes from an InputStream to chars on a Writer using the specified character encoding.
- copy(Reader, OutputStream) - Static method in class org.apache.tika.io.IOUtils
-
Copy chars from a Reader to bytes on an OutputStream using the default character encoding of the platform, and calling flush.
- copy(Reader, OutputStream, String) - Static method in class org.apache.tika.io.IOUtils
-
Copy chars from a Reader to bytes on an OutputStream using the specified character encoding, and calling flush.
- copy(Reader, Writer) - Static method in class org.apache.tika.io.IOUtils
-
Copy chars from a Reader to a Writer.
- copyLarge(InputStream, OutputStream) - Static method in class org.apache.tika.io.IOUtils
-
Copy bytes from a large (over 2GB) InputStream to an OutputStream.
- copyLarge(Reader, Writer) - Static method in class org.apache.tika.io.IOUtils
-
Copy chars from a large (over 2GB) Reader to a Writer.
- COPYRIGHT - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The copyright information."
- COPYRIGHT_NOTICE - Static variable in interface org.apache.tika.metadata.IPTC
-
Contains any necessary copyright notice for claiming the intellectual property for this item and should identify the current owner of the copyright for the item.
- COPYRIGHT_OWNER - Static variable in interface org.apache.tika.metadata.IPTC
-
Owner or owners of the copyright in the licensed image.
- COPYRIGHT_OWNER_ID - Static variable in interface org.apache.tika.metadata.IPTC
-
The ID of the owner or owners of the copyright in the licensed image.
- COPYRIGHT_OWNER_ID_WRONG_CASE - Static variable in interface org.apache.tika.metadata.IPTC
-
Deprecated.
- COPYRIGHT_OWNER_NAME - Static variable in interface org.apache.tika.metadata.IPTC
-
The name of the owner or owners of the copyright in the licensed image.
- CorruptedFileException - Exception in org.apache.tika.exception
-
This exception should be thrown when the parse absolutely, positively has to stop.
- CorruptedFileException(String) - Constructor for exception org.apache.tika.exception.CorruptedFileException
- CorruptedFileException(String, Throwable) - Constructor for exception org.apache.tika.exception.CorruptedFileException
- count() - Method in class org.apache.tika.detect.TextStatistics
-
Returns the total number of bytes seen so far.
- count(int) - Method in class org.apache.tika.detect.TextStatistics
-
Returns the number of occurrences of the given byte.
- countControl() - Method in class org.apache.tika.detect.TextStatistics
-
Counts control characters (i.e.
- countEightBit() - Method in class org.apache.tika.detect.TextStatistics
-
Counts eight bit characters, i.e.
- CountingInputStream - Class in org.apache.tika.io
-
A decorating input stream that counts the number of bytes that have passed through the stream so far.
- CountingInputStream(InputStream) - Constructor for class org.apache.tika.io.CountingInputStream
-
Constructs a new CountingInputStream.
- COUNTRY - Static variable in interface org.apache.tika.metadata.IPTC
-
Full name of the country the content is focussing on -- either the country shown in visual media or referenced in text or audio media.
- COUNTRY - Static variable in interface org.apache.tika.metadata.Photoshop
- COUNTRY_CODE - Static variable in interface org.apache.tika.metadata.IPTC
-
Code of the country the content is focussing on -- either the country shown in visual media or referenced in text or audio media.
- countSafeAscii() - Method in class org.apache.tika.detect.TextStatistics
-
Counts "safe" (i.e.
- COVERAGE - Static variable in interface org.apache.tika.metadata.DublinCore
-
The extent or scope of the content of the resource.
- COVERAGE - Static variable in class org.apache.tika.metadata.Metadata
-
Deprecated. Use TikaCoreProperties#COVERAGE
- COVERAGE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- create() - Static method in class org.apache.tika.mime.MimeTypesFactory
-
Creates an empty instance; same as calling new MimeTypes().
- create() - Static method in class org.apache.tika.parser.external.ExternalParsersFactory
- create(InputStream) - Static method in class org.apache.tika.mime.MimeTypesFactory
- create(InputStream...) - Static method in class org.apache.tika.mime.MimeTypesFactory
-
Creates and returns a MimeTypes instance from the specified input stream.
- create(String) - Static method in class org.apache.tika.mime.MimeTypesFactory
-
Creates and returns a MimeTypes instance from the specified file path, as interpreted by the class loader in getResource().
- create(String, InputStream, String) - Static method in class org.apache.tika.language.LanguageProfilerBuilder
-
Deprecated. Creates a new language profile from a text file (preferably a large one, 5-10k lines).
- create(String, String) - Static method in class org.apache.tika.mime.MimeTypesFactory
-
Creates and returns a MimeTypes instance.
- create(String, String, ClassLoader) - Static method in class org.apache.tika.mime.MimeTypesFactory
-
Creates and returns a MimeTypes instance.
- create(String, ServiceLoader) - Static method in class org.apache.tika.parser.external.ExternalParsersFactory
- create(URL) - Static method in class org.apache.tika.mime.MimeTypesFactory
- create(URL...) - Static method in class org.apache.tika.mime.MimeTypesFactory
-
Creates and returns a MimeTypes instance from the resource at the location specified by the URL.
- create(URL...) - Static method in class org.apache.tika.parser.external.ExternalParsersFactory
- create(ServiceLoader) - Static method in class org.apache.tika.parser.external.ExternalParsersFactory
- create(Document) - Static method in class org.apache.tika.mime.MimeTypesFactory
-
Creates and returns a MimeTypes instance from the specified document.
- CREATE_DATE - Static variable in interface org.apache.tika.metadata.XMP
-
The date and time the resource was created.
- CREATED - Static variable in interface org.apache.tika.metadata.DublinCore
-
Date of creation of the resource.
- CREATED - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- createTempFile() - Method in class org.apache.tika.io.TemporaryResources
-
Creates a temporary file that will automatically be deleted when the TemporaryResources.close() method is called, returning its path.
- createTemporaryFile() - Method in class org.apache.tika.io.TemporaryResources
-
Creates and returns a temporary file that will automatically be deleted when the TemporaryResources.close() method is called.
- CREATION_DATE - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- CREATION_DATE - Static variable in interface org.apache.tika.metadata.Office
-
When was the document created?
- CreativeCommons - Interface in org.apache.tika.metadata
-
A collection of Creative Commons property names.
- CREATOR - Static variable in interface org.apache.tika.metadata.DublinCore
-
An entity primarily responsible for making the content of the resource.
- CREATOR - Static variable in interface org.apache.tika.metadata.IPTC
-
Contains the name of the person who created the content of this item, a photographer for photos, a graphic artist for graphics, or a writer for textual news, but in cases where the photographer should not be identified the name of a company or organisation may be appropriate.
- CREATOR - Static variable in class org.apache.tika.metadata.Metadata
-
Deprecated. Use TikaCoreProperties#CREATOR
- CREATOR - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- CREATOR_TOOL - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- CREATOR_TOOL - Static variable in interface org.apache.tika.metadata.XMP
-
The name of the first known tool used to create the resource.
- CREATORS_CONTACT_INFO - Static variable in interface org.apache.tika.metadata.IPTC
-
The creator's contact information provides all necessary information to get in contact with the creator of this item and comprises a set of sub-properties for proper addressing.
- CREATORS_JOB_TITLE - Static variable in interface org.apache.tika.metadata.IPTC
-
Contains the job title of the person who created the content of this item.
- CREDIT - Static variable in interface org.apache.tika.metadata.Photoshop
- CREDIT_LINE - Static variable in interface org.apache.tika.metadata.IPTC
-
The credit to person(s) and/or organisation(s) required by the supplier of the item to be used when published.
- CryptoParser - Class in org.apache.tika.parser
-
Decrypts the incoming document stream and delegates further parsing to another parser instance.
- CryptoParser(String, Provider, Set<MediaType>) - Constructor for class org.apache.tika.parser.CryptoParser
- CryptoParser(String, Set<MediaType>) - Constructor for class org.apache.tika.parser.CryptoParser
- CURRENT - org.apache.tika.config.TikaConfigSerializer.Mode
-
Current config, roughly as loaded
- CUSTOM_MIMES_SYS_PROP - Static variable in class org.apache.tika.mime.MimeTypesFactory
-
System property to set a path to an additional external custom mimetypes XML file to be loaded.
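The CountingInputStream entry above describes a decorating stream that counts the bytes passing through it. As a rough illustration of that decorator idea only, the sketch below builds on the JDK's FilterInputStream. The class name CountingSketch and its drain helper are hypothetical and are not part of Tika; the real class may behave differently, for example in how it treats skipped bytes, which this sketch does not count.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical illustration of a counting decorator; not Tika's CountingInputStream. */
class CountingSketch extends FilterInputStream {
    private long count = 0;

    CountingSketch(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            count++; // one byte was delivered to the caller
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        // read(byte[]) in FilterInputStream delegates here, so bytes are counted once.
        int n = super.read(buf, off, len);
        if (n > 0) {
            count += n;
        }
        return n;
    }

    /** Number of bytes read through this stream so far. */
    long getCount() {
        return count;
    }

    /** Assumed helper: reads the whole array through the decorator and returns the count. */
    static long drain(byte[] data) {
        try (CountingSketch in = new CountingSketch(new ByteArrayInputStream(data))) {
            byte[] buf = new byte[4];
            while (in.read(buf) != -1) {
                // discard the data; only the count matters here
            }
            return in.getCount();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(drain(new byte[] {1, 2, 3, 4, 5}));
    }
}
```

Wrapping an existing stream this way leaves the caller's reading code unchanged, which is the usual reason to prefer a decorator over counting at each call site.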
D
- Database - Interface in org.apache.tika.metadata
- DATE - org.apache.tika.metadata.Property.ValueType
- DATE - Static variable in interface org.apache.tika.metadata.DublinCore
-
A date associated with an event in the life cycle of the resource.
- DATE - Static variable in class org.apache.tika.metadata.Metadata
-
Deprecated. Use TikaCoreProperties#CREATED
- DATE_CREATED - Static variable in interface org.apache.tika.metadata.IPTC
-
Designates the date and optionally the time the intellectual content was created rather than the date of the creation of the physical representation.
- DATE_CREATED - Static variable in interface org.apache.tika.metadata.Photoshop
- DateUtils - Class in org.apache.tika.utils
-
Date related utility methods and constants
- DateUtils() - Constructor for class org.apache.tika.utils.DateUtils
- decode(char[]) - Static method in class org.apache.tika.mime.HexCoDec
-
Decode an array of hex chars
- decode(char[], int, int) - Static method in class org.apache.tika.mime.HexCoDec
-
Decode an array of hex chars.
- decode(String) - Static method in class org.apache.tika.mime.HexCoDec
-
Decode a hex string
- DEFAULT - Static variable in interface org.apache.tika.config.InitializableProblemHandler
- DEFAULT - Static variable in class org.apache.tika.config.ParamField
- DEFAULT_MAX_ENTITY_EXPANSIONS - Static variable in class org.apache.tika.utils.XMLReaderUtils
- DEFAULT_NGRAM_LENGTH - Static variable in class org.apache.tika.language.LanguageProfile
-
Deprecated.
- DEFAULT_POOL_SIZE - Static variable in class org.apache.tika.utils.XMLReaderUtils
-
Default size for the pool of SAX Parsers and the pool of DOM builders
- DefaultDetector - Class in org.apache.tika.detect
-
A composite detector based on all the Detector implementations available through the service provider mechanism.
- DefaultDetector() - Constructor for class org.apache.tika.detect.DefaultDetector
- DefaultDetector(ClassLoader) - Constructor for class org.apache.tika.detect.DefaultDetector
- DefaultDetector(MimeTypes) - Constructor for class org.apache.tika.detect.DefaultDetector
- DefaultDetector(MimeTypes, ClassLoader) - Constructor for class org.apache.tika.detect.DefaultDetector
- DefaultDetector(MimeTypes, ServiceLoader) - Constructor for class org.apache.tika.detect.DefaultDetector
- DefaultDetector(MimeTypes, ServiceLoader, Collection<Class<? extends Detector>>) - Constructor for class org.apache.tika.detect.DefaultDetector
- DefaultEncodingDetector - Class in org.apache.tika.detect
-
A composite encoding detector based on all the EncodingDetector implementations available through the service provider mechanism.
- DefaultEncodingDetector() - Constructor for class org.apache.tika.detect.DefaultEncodingDetector
- DefaultEncodingDetector(ServiceLoader) - Constructor for class org.apache.tika.detect.DefaultEncodingDetector
- DefaultEncodingDetector(ServiceLoader, Collection<Class<? extends EncodingDetector>>) - Constructor for class org.apache.tika.detect.DefaultEncodingDetector
- DefaultParser - Class in org.apache.tika.parser
-
A composite parser based on all the Parser implementations available through the service provider mechanism.
- DefaultParser() - Constructor for class org.apache.tika.parser.DefaultParser
- DefaultParser(ClassLoader) - Constructor for class org.apache.tika.parser.DefaultParser
- DefaultParser(MediaTypeRegistry) - Constructor for class org.apache.tika.parser.DefaultParser
- DefaultParser(MediaTypeRegistry, ClassLoader) - Constructor for class org.apache.tika.parser.DefaultParser
- DefaultParser(MediaTypeRegistry, ServiceLoader) - Constructor for class org.apache.tika.parser.DefaultParser
- DefaultParser(MediaTypeRegistry, ServiceLoader, Collection<Class<? extends Parser>>) - Constructor for class org.apache.tika.parser.DefaultParser
- DefaultParser(MediaTypeRegistry, ServiceLoader, Collection<Class<? extends Parser>>, EncodingDetector) - Constructor for class org.apache.tika.parser.DefaultParser
- DefaultParser(MediaTypeRegistry, ServiceLoader, EncodingDetector) - Constructor for class org.apache.tika.parser.DefaultParser
- DefaultProbDetector - Class in org.apache.tika.detect
-
A version of DefaultDetector for probabilistic mime detectors, which use statistical techniques to blend the results of differing underlying detectors when attempting to detect the type of a given file.
- DefaultProbDetector() - Constructor for class org.apache.tika.detect.DefaultProbDetector
- DefaultProbDetector(ClassLoader) - Constructor for class org.apache.tika.detect.DefaultProbDetector
- DefaultProbDetector(MimeTypes) - Constructor for class org.apache.tika.detect.DefaultProbDetector
- DefaultProbDetector(ProbabilisticMimeDetectionSelector, ClassLoader) - Constructor for class org.apache.tika.detect.DefaultProbDetector
- DefaultProbDetector(ProbabilisticMimeDetectionSelector, ServiceLoader) - Constructor for class org.apache.tika.detect.DefaultProbDetector
- DefaultTranslator - Class in org.apache.tika.language.translate
-
A translator which picks the first Translator implementation available through the service provider mechanism.
- DefaultTranslator() - Constructor for class org.apache.tika.language.translate.DefaultTranslator
- DefaultTranslator(ServiceLoader) - Constructor for class org.apache.tika.language.translate.DefaultTranslator
- DelegatingParser - Class in org.apache.tika.parser
-
Base class for parser implementations that want to delegate parts of the task of parsing an input document to another parser.
- DelegatingParser() - Constructor for class org.apache.tika.parser.DelegatingParser
- DERIVED_FROM_DOCUMENTID - Static variable in interface org.apache.tika.metadata.XMPMM
-
Document id for the document that this document was derived from
- DERIVED_FROM_INSTANCEID - Static variable in interface org.apache.tika.metadata.XMPMM
-
Instance id for the document instance that this document was derived from
- descend(String, String) - Method in class org.apache.tika.sax.xpath.ChildMatcher
- descend(String, String) - Method in class org.apache.tika.sax.xpath.CompositeMatcher
- descend(String, String) - Method in class org.apache.tika.sax.xpath.Matcher
-
Returns the XPath evaluation state that results from descending to a child element with the given name.
- descend(String, String) - Method in class org.apache.tika.sax.xpath.NamedElementMatcher
- descend(String, String) - Method in class org.apache.tika.sax.xpath.SubtreeMatcher
- DESCRIPTION - Static variable in interface org.apache.tika.metadata.DublinCore
-
An account of the content of the resource.
- DESCRIPTION - Static variable in interface org.apache.tika.metadata.IPTC
-
A textual description, including captions, of the item's content, particularly used where the object is not text.
- DESCRIPTION - Static variable in class org.apache.tika.metadata.Metadata
-
Deprecated. Use TikaCoreProperties#DESCRIPTION
- DESCRIPTION - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- DESCRIPTION_WRITER - Static variable in interface org.apache.tika.metadata.IPTC
-
Identifier or the name of the person involved in writing, editing or correcting the description of the content.
- detect() - Method in class org.apache.tika.language.detect.LanguageDetector
- detect(byte[]) - Method in class org.apache.tika.Tika
-
Detects the media type of the given document.
- detect(byte[], String) - Method in class org.apache.tika.Tika
-
Detects the media type of the given document.
- detect(File) - Method in class org.apache.tika.Tika
-
Detects the media type of the given file.
- detect(InputStream) - Method in class org.apache.tika.Tika
-
Detects the media type of the given document.
- detect(InputStream, String) - Method in class org.apache.tika.Tika
-
Detects the media type of the given document.
- detect(InputStream, Metadata) - Method in class org.apache.tika.detect.CompositeDetector
- detect(InputStream, Metadata) - Method in class org.apache.tika.detect.CompositeEncodingDetector
- detect(InputStream, Metadata) - Method in interface org.apache.tika.detect.Detector
-
Detects the content type of the given input document.
- detect(InputStream, Metadata) - Method in class org.apache.tika.detect.EmptyDetector
- detect(InputStream, Metadata) - Method in interface org.apache.tika.detect.EncodingDetector
-
Detects the character encoding of the given text document, or returns null if the encoding of the document cannot be detected.
- detect(InputStream, Metadata) - Method in class org.apache.tika.detect.MagicDetector
- detect(InputStream, Metadata) - Method in class org.apache.tika.detect.NameDetector
-
Detects the content type of an input document based on the document name given in the input metadata.
- detect(InputStream, Metadata) - Method in class org.apache.tika.detect.NonDetectingEncodingDetector
- detect(InputStream, Metadata) - Method in class org.apache.tika.detect.OverrideDetector
- detect(InputStream, Metadata) - Method in class org.apache.tika.detect.TextDetector
-
Looks at the beginning of the document input stream to determine whether the document is text or not.
- detect(InputStream, Metadata) - Method in class org.apache.tika.detect.TrainedModelDetector
- detect(InputStream, Metadata) - Method in class org.apache.tika.detect.TypeDetector
-
Detects the content type of an input document based on a type hint given in the input metadata.
- detect(InputStream, Metadata) - Method in class org.apache.tika.detect.ZeroSizeFileDetector
- detect(InputStream, Metadata) - Method in class org.apache.tika.mime.MimeTypes
-
Automatically detects the MIME type of a document based on magic markers in the stream prefix and any given metadata hints.
- detect(InputStream, Metadata) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector
- detect(InputStream, Metadata) - Method in class org.apache.tika.Tika
-
Detects the media type of the given document.
- detect(CharSequence) - Method in class org.apache.tika.language.detect.LanguageDetector
- detect(String) - Method in class org.apache.tika.Tika
-
Detects the media type of a document with the given file name.
- detect(URL) - Method in class org.apache.tika.Tika
-
Detects the media type of the resource at the given URL.
- detect(Path) - Method in class org.apache.tika.Tika
-
Detects the media type of the file at the given path.
- detectAll() - Method in class org.apache.tika.language.detect.LanguageDetector
-
Detect languages based on previously submitted text (via addText calls).
- detectAll(String) - Method in class org.apache.tika.language.detect.LanguageDetector
-
Utility wrapper that detects the language of a given chunk of text.
- Detector - Interface in org.apache.tika.detect
-
Content type detector.
- DIFContentHandler - Class in org.apache.tika.sax
- DIFContentHandler(ContentHandler, Metadata) - Constructor for class org.apache.tika.sax.DIFContentHandler
- digest(InputStream, Metadata, ParseContext) - Method in class org.apache.tika.parser.digest.CompositeDigester
- digest(InputStream, Metadata, ParseContext) - Method in class org.apache.tika.parser.digest.InputStreamDigester
- digest(InputStream, Metadata, ParseContext) - Method in interface org.apache.tika.parser.DigestingParser.Digester
-
Digests an InputStream and sets the appropriate value(s) in the metadata.
- DigestingParser - Class in org.apache.tika.parser
- DigestingParser(Parser, DigestingParser.Digester) - Constructor for class org.apache.tika.parser.DigestingParser
-
Creates a decorator for the given parser.
- DigestingParser.Digester - Interface in org.apache.tika.parser
-
Interface for digester.
- DigestingParser.Encoder - Interface in org.apache.tika.parser
-
Encodes byte array from a MessageDigest to String
- DIGITAL_IMAGE_GUID - Static variable in interface org.apache.tika.metadata.IPTC
-
Globally unique identifier for the item.
- DIGITAL_SOURCE_FILE_TYPE - Static variable in interface org.apache.tika.metadata.IPTC
-
Deprecated.
- DIGITAL_SOURCE_TYPE - Static variable in interface org.apache.tika.metadata.IPTC
-
The type of the source of this digital image
- DISC_NUMBER - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The disc number for part of an album set."
- dispose() - Method in class org.apache.tika.io.TemporaryResources
-
Calls the TemporaryResources.close() method and wraps the potential IOException into a TikaException for convenience when used within Tika.
- distance(LanguageProfile) - Method in class org.apache.tika.language.LanguageProfile
-
Deprecated. Calculates the geometric distance between this and the given other language profile.
- DOC_INFO_CREATED - Static variable in interface org.apache.tika.metadata.PDF
- DOC_INFO_CREATOR - Static variable in interface org.apache.tika.metadata.PDF
- DOC_INFO_CREATOR_TOOL - Static variable in interface org.apache.tika.metadata.PDF
- DOC_INFO_KEY_WORDS - Static variable in interface org.apache.tika.metadata.PDF
- DOC_INFO_MODIFICATION_DATE - Static variable in interface org.apache.tika.metadata.PDF
- DOC_INFO_PRODUCER - Static variable in interface org.apache.tika.metadata.PDF
- DOC_INFO_SUBJECT - Static variable in interface org.apache.tika.metadata.PDF
- DOC_INFO_TITLE - Static variable in interface org.apache.tika.metadata.PDF
- DOC_INFO_TRAPPED - Static variable in interface org.apache.tika.metadata.PDF
- DOC_SECURITY - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- DOCUMENTID - Static variable in interface org.apache.tika.metadata.XMPMM
-
The common identifier for all versions and renditions of a resource.
- DocumentSelector - Interface in org.apache.tika.extractor
-
Interface for different document selection strategies for purposes like embedded document extraction by a ContainerExtractor instance.
- DublinCore - Interface in org.apache.tika.metadata
-
A collection of Dublin Core metadata names.
- DURATION - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The duration of the media file."
E
- EDIT_TIME - Static variable in interface org.apache.tika.metadata.MSOffice
-
How long has been spent editing the document?
- element(String, String) - Method in class org.apache.tika.sax.XHTMLContentHandler
-
Emits an XHTML element with the given text content.
- ElementMappingContentHandler - Class in org.apache.tika.sax
-
Content handler decorator that maps element QNames using a Map.
- ElementMappingContentHandler(ContentHandler, Map<QName, ElementMappingContentHandler.TargetElement>) - Constructor for class org.apache.tika.sax.ElementMappingContentHandler
- ElementMappingContentHandler.TargetElement - Class in org.apache.tika.sax
- ElementMatcher - Class in org.apache.tika.sax.xpath
-
Final evaluation state of an XPath expression that targets an element.
- ElementMatcher() - Constructor for class org.apache.tika.sax.xpath.ElementMatcher
- EMB_APP_VERSION - Static variable in interface org.apache.tika.metadata.RTFMetadata
-
If an application and version are given as part of the embedded object, this is the literal string.
- EMB_CLASS - Static variable in interface org.apache.tika.metadata.RTFMetadata
- EMB_ITEM - Static variable in interface org.apache.tika.metadata.RTFMetadata
- EMB_TOPIC - Static variable in interface org.apache.tika.metadata.RTFMetadata
- embed(Metadata, InputStream, OutputStream, ParseContext) - Method in interface org.apache.tika.embedder.Embedder
-
Embeds related document metadata from the given metadata object into the given output stream.
- embed(Metadata, InputStream, OutputStream, ParseContext) - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Executes the configured external command and passes the given document stream as a simple XHTML document to the given SAX content handler.
- EMBEDDED_DEPTH - Static variable in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
- EMBEDDED_EXCEPTION - Static variable in class org.apache.tika.parser.RecursiveParserWrapper
-
Deprecated.
- EMBEDDED_EXCEPTION - Static variable in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
- EMBEDDED_EXCEPTION - Static variable in class org.apache.tika.utils.ParserUtils
- EMBEDDED_PARSER - Static variable in class org.apache.tika.utils.ParserUtils
- EMBEDDED_RELATIONSHIP_ID - Static variable in interface org.apache.tika.metadata.TikaMetadataKeys
- EMBEDDED_RESOURCE_LIMIT_REACHED - Static variable in class org.apache.tika.parser.RecursiveParserWrapper
- EMBEDDED_RESOURCE_LIMIT_REACHED - Static variable in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
- EMBEDDED_RESOURCE_PATH - Static variable in class org.apache.tika.parser.RecursiveParserWrapper
-
Deprecated.
- EMBEDDED_RESOURCE_PATH - Static variable in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
- EMBEDDED_RESOURCE_TYPE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
Embedded resource type property
- EMBEDDED_RESOURCE_TYPE - Static variable in interface org.apache.tika.metadata.TikaMetadataKeys
- EMBEDDED_RESOURCE_TYPE_KEY - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- EMBEDDED_STORAGE_CLASS_ID - Static variable in interface org.apache.tika.metadata.TikaMetadataKeys
- EmbeddedContentHandler - Class in org.apache.tika.sax
-
Content handler decorator that prevents the EmbeddedContentHandler.startDocument() and EmbeddedContentHandler.endDocument() events from reaching the decorated handler.
- EmbeddedContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.EmbeddedContentHandler
-
Creates a decorator that prevents the given handler from receiving EmbeddedContentHandler.startDocument() and EmbeddedContentHandler.endDocument() events.
- EmbeddedDocumentExtractor - Interface in org.apache.tika.extractor
- EmbeddedDocumentUtil - Class in org.apache.tika.extractor
-
Utility class to handle common issues with embedded documents.
- EmbeddedDocumentUtil(ParseContext) - Constructor for class org.apache.tika.extractor.EmbeddedDocumentUtil
- EmbeddedResourceHandler - Interface in org.apache.tika.extractor
-
Tika container extractor callback interface.
- Embedder - Interface in org.apache.tika.embedder
-
Tika embedder interface
- EMPTY - Static variable in class org.apache.tika.mime.MediaType
- EmptyDetector - Class in org.apache.tika.detect
-
Dummy detector that returns application/octet-stream for all documents.
- EmptyDetector() - Constructor for class org.apache.tika.detect.EmptyDetector
- EmptyParser - Class in org.apache.tika.parser
-
Dummy parser that always produces an empty XHTML document without even attempting to parse the given document stream.
- EmptyParser() - Constructor for class org.apache.tika.parser.EmptyParser
- EmptyTranslator - Class in org.apache.tika.language.translate
-
Dummy translator that always declines to give any text.
- EmptyTranslator() - Constructor for class org.apache.tika.language.translate.EmptyTranslator
- encode(byte[]) - Static method in class org.apache.tika.mime.HexCoDec
-
Hex encode an array of bytes
- encode(byte[]) - Method in interface org.apache.tika.parser.DigestingParser.Encoder
- encode(byte[], int, int) - Static method in class org.apache.tika.mime.HexCoDec
-
Hex encode an array of bytes
- EncodingDetector - Interface in org.apache.tika.detect
-
Character encoding detector.
- ENCRYPTED - Static variable in interface org.apache.tika.metadata.WordPerfect
-
Is encrypted?
- EncryptedDocumentException - Exception in org.apache.tika.exception
- EncryptedDocumentException() - Constructor for exception org.apache.tika.exception.EncryptedDocumentException
- EncryptedDocumentException(String) - Constructor for exception org.apache.tika.exception.EncryptedDocumentException
- EncryptedDocumentException(String, Throwable) - Constructor for exception org.apache.tika.exception.EncryptedDocumentException
- EncryptedDocumentException(Throwable) - Constructor for exception org.apache.tika.exception.EncryptedDocumentException
- endDescription() - Method in class org.apache.tika.sax.XMPContentHandler
- endDocument() - Method in class org.apache.tika.sax.ContentHandlerDecorator
- endDocument() - Method in class org.apache.tika.sax.DIFContentHandler
- endDocument() - Method in class org.apache.tika.sax.EmbeddedContentHandler
-
Ignored.
- endDocument() - Method in class org.apache.tika.sax.EndDocumentShieldingContentHandler
- endDocument() - Method in class org.apache.tika.sax.PhoneExtractingContentHandler
-
This method is called whenever the Parser is done parsing the file.
- endDocument() - Method in class org.apache.tika.sax.SafeContentHandler
- endDocument() - Method in class org.apache.tika.sax.StandardsExtractingContentHandler
-
This method is called whenever the Parser is done parsing the file.
- endDocument() - Method in class org.apache.tika.sax.TeeContentHandler
- endDocument() - Method in class org.apache.tika.sax.TextContentHandler
- endDocument() - Method in class org.apache.tika.sax.ToTextContentHandler
-
Flushes the character stream so that no characters are forgotten in internal buffers.
- endDocument() - Method in class org.apache.tika.sax.XHTMLContentHandler
-
Ends the XHTML document by writing the following footer and clearing the namespace mappings:
- endDocument() - Method in class org.apache.tika.sax.XMPContentHandler
-
Ends the XMP document by writing the following footer and clearing the namespace mappings:
- endDocument(ContentHandler, Metadata) - Method in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
-
This is called after the full parse has completed.
- endDocument(ContentHandler, Metadata) - Method in class org.apache.tika.sax.RecursiveParserWrapperHandler
- EndDocumentShieldingContentHandler - Class in org.apache.tika.sax
-
A wrapper around a ContentHandler which will ignore normal SAX calls to EndDocumentShieldingContentHandler.endDocument(), and only fire them later.
- EndDocumentShieldingContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.EndDocumentShieldingContentHandler
-
Creates a decorator for the given SAX event handler.
- endElement(String) - Method in class org.apache.tika.sax.XHTMLContentHandler
- endElement(String, String, String) - Method in class org.apache.tika.mime.MimeTypesReader
- endElement(String, String, String) - Method in class org.apache.tika.sax.ContentHandlerDecorator
- endElement(String, String, String) - Method in class org.apache.tika.sax.DIFContentHandler
- endElement(String, String, String) - Method in class org.apache.tika.sax.ElementMappingContentHandler
- endElement(String, String, String) - Method in class org.apache.tika.sax.ExpandedTitleContentHandler
- endElement(String, String, String) - Method in class org.apache.tika.sax.LinkContentHandler
- endElement(String, String, String) - Method in class org.apache.tika.sax.SafeContentHandler
- endElement(String, String, String) - Method in class org.apache.tika.sax.SecureContentHandler
- endElement(String, String, String) - Method in class org.apache.tika.sax.TeeContentHandler
- endElement(String, String, String) - Method in class org.apache.tika.sax.ToHTMLContentHandler
- endElement(String, String, String) - Method in class org.apache.tika.sax.ToTextContentHandler
- endElement(String, String, String) - Method in class org.apache.tika.sax.ToXMLContentHandler
- endElement(String, String, String) - Method in class org.apache.tika.sax.XHTMLContentHandler
-
Ends the given element.
- endElement(String, String, String) - Method in class org.apache.tika.sax.xpath.MatchingContentHandler
- endEmbeddedDocument(ContentHandler, Metadata) - Method in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
-
This is called after parsing each embedded document.
- endEmbeddedDocument(ContentHandler, Metadata) - Method in class org.apache.tika.sax.RecursiveParserWrapperHandler
-
This is called after parsing an embedded document.
- EndianUtils - Class in org.apache.tika.io
-
General Endian Related Utilities.
- EndianUtils() - Constructor for class org.apache.tika.io.EndianUtils
- EndianUtils.BufferUnderrunException - Exception in org.apache.tika.io
- ENDLINE - Static variable in class org.apache.tika.sax.XHTMLContentHandler
-
The elements that get appended with the XHTMLContentHandler.NL character.
- endPrefixMapping(String) - Method in class org.apache.tika.sax.ContentHandlerDecorator
- endPrefixMapping(String) - Method in class org.apache.tika.sax.TeeContentHandler
- ENGINEER - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The engineer's name."
- equals(Object) - Method in class org.apache.tika.metadata.Metadata
- equals(Object) - Method in class org.apache.tika.metadata.Property
- equals(Object) - Method in class org.apache.tika.mime.MediaType
- equals(Object) - Method in class org.apache.tika.mime.MimeType
- equals(String, String) - Static method in class org.apache.tika.language.detect.LanguageNames
- EQUIPMENT_MAKE - Static variable in interface org.apache.tika.metadata.TIFF
-
"Manufacturer of the recording equipment."
- EQUIPMENT_MODEL - Static variable in interface org.apache.tika.metadata.TIFF
-
"Model name or number of the recording equipment."
- ERROR_CODES_TAG - Static variable in interface org.apache.tika.parser.external.ExternalParsersConfigReaderMetKeys
- ErrorParser - Class in org.apache.tika.parser
-
Dummy parser that always throws a TikaException without even attempting to parse the given document stream.
- ErrorParser() - Constructor for class org.apache.tika.parser.ErrorParser
- escapeCommandLine(String) - Static method in class org.apache.tika.utils.ProcessUtils
-
This should correctly put double-quotes around an argument if ProcessBuilder doesn't seem to work (as it doesn't on paths with spaces on Windows)
- EVENT - Static variable in interface org.apache.tika.metadata.IPTC
-
Names or describes the specific event the content relates to.
- ExceptionUtils - Class in org.apache.tika.utils
- ExceptionUtils() - Constructor for class org.apache.tika.utils.ExceptionUtils
- execute(ParseContext, Runnable) - Static method in class org.apache.tika.utils.ConcurrentUtils
-
Execute a runnable using an ExecutorService from the ParseContext if possible.
- EXIF_PAGE_COUNT - Static variable in interface org.apache.tika.metadata.TIFF
- ExpandedTitleContentHandler - Class in org.apache.tika.sax
-
Content handler decorator which wraps a TransformerHandler in order to allow the TITLE tag to render as <title></title> rather than <title/> which is accomplished by calling the ContentHandler.characters(char[], int, int) method with a length of 1 but a zero length char array.
- ExpandedTitleContentHandler() - Constructor for class org.apache.tika.sax.ExpandedTitleContentHandler
- ExpandedTitleContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.ExpandedTitleContentHandler
- EXPERIMENT_ID - Static variable in interface org.apache.tika.metadata.ClimateForcast
- EXPOSURE_TIME - Static variable in interface org.apache.tika.metadata.TIFF
-
"Exposure time in seconds."
- extension_neg(float) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
- extension_trust(float) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
- EXTERNAL_PARSERS_TAG - Static variable in interface org.apache.tika.parser.external.ExternalParsersConfigReaderMetKeys
- externalBoolean(String) - Static method in class org.apache.tika.metadata.Property
- externalClosedChoise(String, String...) - Static method in class org.apache.tika.metadata.Property
- externalDate(String) - Static method in class org.apache.tika.metadata.Property
- ExternalEmbedder - Class in org.apache.tika.embedder
-
Embedder that uses an external program (like sed or exiftool) to embed text content and metadata into a given document.
- ExternalEmbedder() - Constructor for class org.apache.tika.embedder.ExternalEmbedder
- externalInteger(String) - Static method in class org.apache.tika.metadata.Property
- externalOpenChoise(String, String...) - Static method in class org.apache.tika.metadata.Property
- ExternalParser - Class in org.apache.tika.parser.external
-
Parser that uses an external program (like catdoc or pdf2txt) to extract text content and metadata from a given document.
- ExternalParser() - Constructor for class org.apache.tika.parser.external.ExternalParser
- ExternalParser.LineConsumer - Interface in org.apache.tika.parser.external
-
Consumer contract
- ExternalParsersConfigReader - Class in org.apache.tika.parser.external
-
Builds up ExternalParser instances based on XML file(s) which define what to run, for what, and how to process any output metadata.
- ExternalParsersConfigReader() - Constructor for class org.apache.tika.parser.external.ExternalParsersConfigReader
- ExternalParsersConfigReaderMetKeys - Interface in org.apache.tika.parser.external
-
Met Keys used by the ExternalParsersConfigReader.
- ExternalParsersFactory - Class in org.apache.tika.parser.external
-
Creates instances of ExternalParser based on XML configuration files.
- ExternalParsersFactory() - Constructor for class org.apache.tika.parser.external.ExternalParsersFactory
- externalReal(String) - Static method in class org.apache.tika.metadata.Property
- externalText(String) - Static method in class org.apache.tika.metadata.Property
- externalTextBag(String) - Static method in class org.apache.tika.metadata.Property
- extract(TikaInputStream, ContainerExtractor, EmbeddedResourceHandler) - Method in interface org.apache.tika.extractor.ContainerExtractor
-
Processes a container file, and extracts all the embedded resources from within it.
- extract(TikaInputStream, ContainerExtractor, EmbeddedResourceHandler) - Method in class org.apache.tika.extractor.ParserContainerExtractor
- EXTRACT_CONTENT - Static variable in interface org.apache.tika.metadata.AccessPermissions
-
Should content be extracted, generally.
- EXTRACT_FOR_ACCESSIBILITY - Static variable in interface org.apache.tika.metadata.AccessPermissions
-
Should content be extracted for the purposes of accessibility.
- extractLinks(String) - Static method in class org.apache.tika.utils.RegexUtils
-
Extract urls from plain text.
- extractPhoneNumbers(String) - Static method in class org.apache.tika.sax.CleanPhoneText
- extractRootElement(byte[]) - Method in class org.apache.tika.detect.XmlRootExtractor
- extractRootElement(InputStream) - Method in class org.apache.tika.detect.XmlRootExtractor
- extractStandardReferences(String, double) - Static method in class org.apache.tika.sax.StandardsText
-
Extracts the standard references found within the given text.
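
Note on the EndianUtils entry above: the sketch below shows what reading a 4-byte integer in big-endian versus little-endian order means. It uses only java.nio.ByteBuffer from the JDK and is not Tika's implementation; the class name EndianSketch and the methods readIntBE and readIntLE are hypothetical names chosen for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration class; not part of Apache Tika.
final class EndianSketch {

    // Reads 4 bytes starting at offset, most significant byte first.
    static int readIntBE(byte[] data, int offset) {
        return ByteBuffer.wrap(data, offset, 4)
                .order(ByteOrder.BIG_ENDIAN)
                .getInt();
    }

    // Reads 4 bytes starting at offset, least significant byte first.
    static int readIntLE(byte[] data, int offset) {
        return ByteBuffer.wrap(data, offset, 4)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getInt();
    }

    public static void main(String[] args) {
        byte[] data = {0x00, 0x00, 0x01, 0x02};
        // Big-endian interprets the bytes as 0x00000102.
        System.out.println(readIntBE(data, 0));
        // Little-endian interprets the same bytes as 0x02010000.
        System.out.println(readIntLE(data, 0));
    }
}
```

The same four bytes yield different values depending on byte order, which is why file-format code has to state the order explicitly for each field it reads.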
F
- F_NUMBER - Static variable in interface org.apache.tika.metadata.TIFF
-
"F-Number." The f-number is the focal length divided by the "effective" aperture diameter.
- FAIL - Static variable in class org.apache.tika.sax.xpath.Matcher
-
State of a failed XPath evaluation, where nothing is matched.
- Field - Annotation Type in org.apache.tika.config
-
Field annotation is a contract for binding Param value from Tika Configuration to an object.
- FILE_DATA_RATE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The file data rate in megabytes per second.
- FILE_ID - Static variable in interface org.apache.tika.metadata.WordPerfect
-
File identifier.
- FILE_SIZE - Static variable in interface org.apache.tika.metadata.WordPerfect
-
File size as defined in document header.
- FILE_TYPE - Static variable in interface org.apache.tika.metadata.WordPerfect
-
File type.
- FilenameUtils - Class in org.apache.tika.io
- FilenameUtils() - Constructor for class org.apache.tika.io.FilenameUtils
- FILL_IN_FORM - Static variable in interface org.apache.tika.metadata.AccessPermissions
-
Can the user fill in a form
- findDuplicateParsers(ParseContext) - Method in class org.apache.tika.parser.CompositeParser
-
Utility method that goes through all the component parsers and finds all media types for which more than one parser declares support.
- findServiceResources(String) - Method in class org.apache.tika.config.ServiceLoader
-
Returns all the available service resources matching the given pattern, such as all instances of tika-mimetypes.xml on the classpath, or all org.apache.tika.parser.Parser service files.
- FLASH_FIRED - Static variable in interface org.apache.tika.metadata.TIFF
-
Did the Flash fire when taking this image?
- flush() - Method in class org.apache.tika.language.detect.LanguageWriter
-
Ignored.
- flush() - Method in class org.apache.tika.language.ProfilingWriter
-
Deprecated. Ignored.
- FOCAL_LENGTH - Static variable in interface org.apache.tika.metadata.TIFF
-
"Focal length of the lens, in millimeters."
- Font - Interface in org.apache.tika.metadata
- FONT_NAME - Static variable in interface org.apache.tika.metadata.Font
-
Basic name of a font used in a file
- ForkParser - Class in org.apache.tika.fork
- ForkParser() - Constructor for class org.apache.tika.fork.ForkParser
- ForkParser(ClassLoader) - Constructor for class org.apache.tika.fork.ForkParser
- ForkParser(ClassLoader, Parser) - Constructor for class org.apache.tika.fork.ForkParser
- ForkParser(Path, ParserFactoryFactory) - Constructor for class org.apache.tika.fork.ForkParser
-
If you have a directory with, say, tika-app.jar and you want the child process/server to build a parser and run it from that, so that you can keep all of those dependencies out of your client code, use this initializer.
- ForkParser(Path, ParserFactoryFactory, ClassLoader) - Constructor for class org.apache.tika.fork.ForkParser
-
EXPERT
- ForkProxy - Interface in org.apache.tika.fork
- ForkResource - Interface in org.apache.tika.fork
- FORMAT - Static variable in interface org.apache.tika.metadata.DublinCore
-
Typically, Format may include the media-type or dimensions of the resource.
- FORMAT - Static variable in class org.apache.tika.metadata.Metadata
-
Deprecated. Use TikaCoreProperties#FORMAT
- FORMAT - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- formatDate(Calendar) - Static method in class org.apache.tika.utils.DateUtils
-
Returns an ISO 8601 representation of the given date.
- formatDate(Date) - Static method in class org.apache.tika.utils.DateUtils
-
Returns an ISO 8601 representation of the given date.
- formatDateUnknownTimezone(Date) - Static method in class org.apache.tika.utils.DateUtils
-
Returns an ISO 8601 representation of the given date, which is in an unknown timezone.
- forName(String) - Method in class org.apache.tika.mime.MimeTypes
-
Returns the registered media type with the given name (or alias).
- forName(String) - Static method in class org.apache.tika.utils.CharsetUtils
-
Returns Charset impl, if one exists.
- freeBuffer(ByteBuffer) - Static method in class org.apache.tika.io.MappedBufferCleaner
-
If a cleaner is available, this buffer will be cleaned.
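
Note on the formatDate entries above: the sketch below shows one way to produce an ISO 8601 string from a java.util.Date using only the JDK's java.time API. It is an illustration under stated assumptions, not Tika's DateUtils implementation, and the exact output format of DateUtils.formatDate may differ (for example in how fractional seconds are handled). The class name IsoDateSketch and the method toIso8601 are hypothetical.

```java
import java.time.Instant;
import java.time.format.DateTimeFormatter;
import java.util.Date;

// Hypothetical illustration class; not part of Apache Tika.
final class IsoDateSketch {

    // Formats the date as an ISO 8601 instant in UTC, e.g. 1970-01-01T00:00:00Z.
    static String toIso8601(Date date) {
        Instant instant = date.toInstant();
        return DateTimeFormatter.ISO_INSTANT.format(instant);
    }

    public static void main(String[] args) {
        // The Unix epoch, expressed in UTC.
        System.out.println(toIso8601(new Date(0L)));
    }
}
```

Formatting through an Instant keeps the result independent of the JVM's default time zone, which is usually what is wanted when the string is stored as metadata.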
G
- GENRE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The name of the genre."
- Geographic - Interface in org.apache.tika.metadata
-
Geographic schema.
- get(byte[]) - Static method in class org.apache.tika.io.TikaInputStream
-
Creates a TikaInputStream from the given array of bytes.
- get(byte[], Metadata) - Static method in class org.apache.tika.io.TikaInputStream
-
Creates a TikaInputStream from the given array of bytes.
- get(File) - Static method in class org.apache.tika.io.TikaInputStream
-
Deprecated. Use TikaInputStream.get(Path). In Tika 2.0, this will be removed or modified to throw an IOException.
- get(File, Metadata) - Static method in class org.apache.tika.io.TikaInputStream
-
Deprecated. Use TikaInputStream.get(Path, Metadata). In Tika 2.0, this will be removed or modified to throw an IOException.
- get(InputStream) - Static method in class org.apache.tika.io.TaggedInputStream
-
Casts or wraps the given stream to a TaggedInputStream instance.
- get(InputStream) - Static method in class org.apache.tika.io.TikaInputStream
-
Casts or wraps the given stream to a TikaInputStream instance.
- get(InputStream, TemporaryResources) - Static method in class org.apache.tika.io.TikaInputStream
-
Casts or wraps the given stream to a TikaInputStream instance.
- get(Class<T>) - Method in class org.apache.tika.parser.ParseContext
-
Returns the object in this context that implements the given interface.
- get(Class<T>, T) - Method in class org.apache.tika.parser.ParseContext
-
Returns the object in this context that implements the given interface, or the given default value if such an object is not found.
- get(String) - Method in class org.apache.tika.metadata.Metadata
-
Get the value associated to a metadata name.
- get(String) - Static method in class org.apache.tika.metadata.Property
-
Retrieve the property object that corresponds to the given key
- get(URI) - Static method in class org.apache.tika.io.TikaInputStream
-
Creates a TikaInputStream from the resource at the given URI.
- get(URI, Metadata) - Static method in class org.apache.tika.io.TikaInputStream
-
Creates a TikaInputStream from the resource at the given URI.
- get(URL) - Static method in class org.apache.tika.io.TikaInputStream
-
Creates a TikaInputStream from the resource at the given URL.
- get(URL, Metadata) - Static method in class org.apache.tika.io.TikaInputStream
-
Creates a TikaInputStream from the resource at the given URL.
- get(Path) - Static method in class org.apache.tika.io.TikaInputStream
-
Creates a TikaInputStream from the file at the given path.
- get(Path, Metadata) - Static method in class org.apache.tika.io.TikaInputStream
-
Creates a TikaInputStream from the file at the given path.
- get(Blob) - Static method in class org.apache.tika.io.TikaInputStream
-
Creates a TikaInputStream from the given database BLOB.
- get(Blob, Metadata) - Static method in class org.apache.tika.io.TikaInputStream
-
Creates a TikaInputStream from the given database BLOB.
- get(Property) - Method in class org.apache.tika.metadata.Metadata
-
Returns the value (if any) of the identified metadata property.
- getAcronym() - Method in class org.apache.tika.mime.MimeType
-
Returns an acronym for this mime type.
- getAliases(MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
-
Returns the set of known aliases of the given canonical media type.
- getAllComponentParsers() - Method in class org.apache.tika.parser.CompositeParser
-
Returns all parsers registered with the Composite Parser, including ones which may not currently be active.
- getAllComponentParsers() - Method in class org.apache.tika.parser.DefaultParser
- getAttributesMapping() - Method in class org.apache.tika.sax.ElementMappingContentHandler.TargetElement
- getAttrValue(String, Attributes) - Static method in class org.apache.tika.utils.XMLReaderUtils
- getBaseType() - Method in class org.apache.tika.mime.MediaType
-
Returns the base form of the MediaType, excluding any parameters, such as "text/plain" for "text/plain; charset=utf-8"
- getByteCount() - Method in class org.apache.tika.io.CountingInputStream
-
The number of bytes that have passed through this stream.
- getCause() - Method in exception org.apache.tika.io.TaggedIOException
-
Returns the wrapped exception.
- getCause() - Method in exception org.apache.tika.sax.TaggedSAXException
-
Returns the wrapped exception.
- getCharset() - Method in class org.apache.tika.detect.AutoDetectReader
- getCharset() - Method in class org.apache.tika.detect.NonDetectingEncodingDetector
- getChildTypes(MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
-
Returns the set of known children of the given canonical media type
- getChoices() - Method in class org.apache.tika.metadata.Property
-
Returns the (immutable) set of choices for the values of this property.
- getCommand() - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Gets the command to be run.
- getCommand() - Method in class org.apache.tika.parser.external.ExternalParser
- getCommandAppendOperator() - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Gets the operator to append rather than replace a value for the command line tool, i.e.
- getCommandAssignmentDelimeter() - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Gets the delimiter for multiple assignments for the command line tool, i.e.
- getCommandAssignmentOperator() - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Gets the assignment operator for the command line tool, i.e.
- getCommandMetadataSegments(Metadata) - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Constructs a collection of command line arguments responsible for setting individual metadata fields based on the given metadata.
- getConfidence() - Method in class org.apache.tika.language.detect.LanguageResult
- getConfig() - Method in class org.apache.tika.extractor.EmbeddedDocumentUtil
-
Deprecated. As of 1.17, use EmbeddedDocumentUtil.getTikaConfig() instead.
- getContentHandlerFactory() - Method in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
- getCount() - Method in class org.apache.tika.io.CountingInputStream
-
The number of bytes that have passed through this stream.
- getCount() - Method in class org.apache.tika.language.LanguageProfile
-
Deprecated.
- getCount(String) - Method in class org.apache.tika.language.LanguageProfile
-
Deprecated.
- getDate(Property) - Method in class org.apache.tika.metadata.Metadata
-
Returns the value of the identified Date based metadata property.
- getDecorationName() - Method in class org.apache.tika.parser.ParserDecorator
- getDefaultConfig() - Static method in class org.apache.tika.config.TikaConfig
-
Provides a default configuration (TikaConfig).
- getDefaultDetector(MimeTypes, ServiceLoader) - Static method in class org.apache.tika.config.TikaConfig
- getDefaultEncodingDetector(ServiceLoader) - Static method in class org.apache.tika.config.TikaConfig
- getDefaultLanguageDetector() - Static method in class org.apache.tika.language.detect.LanguageDetector
- getDefaultMimeTypes() - Static method in class org.apache.tika.mime.MimeTypes
-
Get the default MimeTypes.
- getDefaultMimeTypes(ClassLoader) - Static method in class org.apache.tika.mime.MimeTypes
-
Get the default MimeTypes.
- getDefaultRegistry() - Static method in class org.apache.tika.mime.MediaTypeRegistry
-
Returns the built-in media type registry included in Tika.
- getDelegateParser(ParseContext) - Method in class org.apache.tika.parser.DelegatingParser
-
Returns the parser instance to which parsing tasks should be delegated.
- getDescription() - Method in class org.apache.tika.mime.MimeType
-
Returns the description of this media type.
- getDetector() - Method in class org.apache.tika.config.TikaConfig
-
Returns the configured detector instance.
- getDetector() - Method in class org.apache.tika.extractor.EmbeddedDocumentUtil
- getDetector() - Method in class org.apache.tika.language.detect.LanguageHandler
-
Returns the language detector used by this content handler.
- getDetector() - Method in class org.apache.tika.language.detect.LanguageWriter
-
Returns the language detector used by this writer.
- getDetector() - Method in class org.apache.tika.parser.AutoDetectParser
-
Returns the type detector used by this parser to auto-detect the type of a document.
- getDetector() - Method in class org.apache.tika.Tika
-
Returns the detector instance used by this facade.
- getDetectors() - Method in class org.apache.tika.detect.CompositeDetector
-
Returns the component detectors.
- getDetectors() - Method in class org.apache.tika.detect.CompositeEncodingDetector
- getDetectors() - Method in class org.apache.tika.detect.DefaultDetector
- getDetectors() - Method in class org.apache.tika.detect.DefaultProbDetector
- getDocumentBuilder() - Method in class org.apache.tika.parser.ParseContext
-
Returns the DOM builder specified in this parsing context.
- getDocumentBuilder() - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Returns the DOM builder specified in this parsing context.
- getDocumentBuilderFactory() - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Returns the DOM builder factory specified in this parsing context.
- getEmbeddedDocumentExtractor(ParseContext) - Static method in class org.apache.tika.extractor.EmbeddedDocumentUtil
-
This offers a uniform way to get an EmbeddedDocumentExtractor from a ParseContext.
- getEncodingDetector() - Method in class org.apache.tika.config.TikaConfig
-
Returns the configured encoding detector instance
- getEncodingDetector() - Method in class org.apache.tika.parser.AbstractEncodingDetectorParser
- getEncodingDetector(ParseContext) - Method in class org.apache.tika.parser.AbstractEncodingDetectorParser
-
Look for an EncodingDetector in the ParseContext.
- getEndDocumentWasCalled() - Method in class org.apache.tika.sax.EndDocumentShieldingContentHandler
- getErrors() - Static method in class org.apache.tika.language.LanguageIdentifier
-
Deprecated. Returns a string of error messages related to initializing language profiles
- getExecutorService() - Method in class org.apache.tika.config.TikaConfig
- getExtension() - Method in class org.apache.tika.mime.MimeType
-
Returns the preferred file extension of this type, or an empty string if no extensions are known.
- getExtension(TikaInputStream, Metadata) - Method in class org.apache.tika.extractor.EmbeddedDocumentUtil
- getExtensions() - Method in class org.apache.tika.mime.MimeType
-
Returns the list of all known file extensions of this media type.
- getFallback() - Method in class org.apache.tika.parser.CompositeParser
-
Returns the fallback parser.
- getField() - Method in class org.apache.tika.config.ParamField
- getFile() - Method in class org.apache.tika.io.TikaInputStream
- getFileChannel() - Method in class org.apache.tika.io.TikaInputStream
- getFilteredStackTrace(Throwable) - Static method in class org.apache.tika.utils.ExceptionUtils
-
Simple util to get stack trace.
- getIdentifier() - Method in class org.apache.tika.sax.StandardReference
- getIgnoredLineConsumer() - Method in class org.apache.tika.parser.external.ExternalParser
-
Gets lines consumer
- getInitializableProblemHandler() - Method in class org.apache.tika.config.ServiceLoader
-
Returns the handler for problems with initializables
- getInt(Property) - Method in class org.apache.tika.metadata.Metadata
-
Returns the value of the identified Integer based metadata property.
- getIntBE(byte[]) - Static method in class org.apache.tika.io.EndianUtils
-
Get a BE int value from the beginning of a byte array
- getIntBE(byte[], int) - Static method in class org.apache.tika.io.EndianUtils
-
Get a BE int value from a byte array
- getIntLE(byte[]) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE int value from the beginning of a byte array
- getIntLE(byte[], int) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE int value from a byte array
- getIntValues(Property) - Method in class org.apache.tika.metadata.Metadata
-
Gets the array of ints of the identified "seq" integer metadata property.
- getJavaCommand() - Method in class org.apache.tika.fork.ForkParser
-
Deprecated. Since 1.8
- getJavaCommandAsList() - Method in class org.apache.tika.fork.ForkParser
-
Returns the command used to start the forked server process.
- getLanguage() - Method in class org.apache.tika.language.detect.LanguageHandler
-
Returns the detected language based on text handled thus far.
- getLanguage() - Method in class org.apache.tika.language.detect.LanguageResult
-
The ISO 639-1 language code (plus optional country code)
- getLanguage() - Method in class org.apache.tika.language.detect.LanguageWriter
-
Returns the detected language based on text written thus far.
- getLanguage() - Method in class org.apache.tika.language.LanguageIdentifier
-
Deprecated. Gets the identified language
- getLanguage() - Method in class org.apache.tika.language.ProfilingHandler
-
Deprecated. Returns the language that best matches the current state of the language profile.
- getLanguage() - Method in class org.apache.tika.language.ProfilingWriter
-
Deprecated. Returns the language that best matches the current state of the language profile.
- getLanguageDetectors() - Static method in class org.apache.tika.language.detect.LanguageDetector
- getLanguageDetectors(ServiceLoader) - Static method in class org.apache.tika.language.detect.LanguageDetector
- getLength() - Method in class org.apache.tika.detect.MagicDetector
- getLength() - Method in class org.apache.tika.io.TikaInputStream
-
Returns the length (in bytes) of this stream.
- getLinks() - Method in class org.apache.tika.mime.MimeType
-
Get a list of links to help document this mime type
- getLinks() - Method in class org.apache.tika.sax.LinkContentHandler
-
Returns the list of collected links.
- getLoader() - Method in class org.apache.tika.config.ServiceLoader
- getLoadErrorHandler() - Method in class org.apache.tika.config.ServiceLoader
-
Returns the load error handler used by this loader.
- getLongLE(byte[], int) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE long value from a byte array
- getMacroLanguage(String) - Static method in class org.apache.tika.language.detect.LanguageNames
-
If language is a specific variant of a macro language (e.g.
- getMainOrganizationAcronym() - Method in class org.apache.tika.sax.StandardReference
- getMappedTagName() - Method in class org.apache.tika.sax.ElementMappingContentHandler.TargetElement
- getMaxEntityExpansions() - Static method in class org.apache.tika.utils.XMLReaderUtils
- getMaximumCompressionRatio() - Method in class org.apache.tika.sax.SecureContentHandler
-
Returns the maximum compression ratio.
- getMaximumDepth() - Method in class org.apache.tika.sax.SecureContentHandler
-
Returns the maximum XML element nesting level.
- getMaximumPackageEntryDepth() - Method in class org.apache.tika.sax.SecureContentHandler
-
Returns the maximum package entry nesting level.
- getMaxStringLength() - Method in class org.apache.tika.Tika
-
Returns the maximum length of strings returned by the parseToString methods.
- getMediaTypeRegistry() - Method in class org.apache.tika.config.TikaConfig
- getMediaTypeRegistry() - Method in class org.apache.tika.mime.MimeTypes
- getMediaTypeRegistry() - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector
- getMediaTypeRegistry() - Method in class org.apache.tika.parser.CompositeParser
-
Returns the media type registry used to infer type relationships.
- getMetadata() - Method in class org.apache.tika.parser.RecursiveParserWrapper
-
Deprecated. Use a RecursiveParserWrapperHandler instead.
- getMetadataCommandArguments() - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Gets the map of Metadata keys to command line parameters.
- getMetadataExtractionPatterns() - Method in class org.apache.tika.parser.external.ExternalParser
- getMetadataList() - Method in class org.apache.tika.sax.RecursiveParserWrapperHandler
- getMimeRepository() - Method in class org.apache.tika.config.TikaConfig
- getMimeType(File) - Method in class org.apache.tika.mime.MimeTypes
-
Deprecated. Use Tika.detect(File) instead.
- getMimeType(String) - Method in class org.apache.tika.mime.MimeTypes
-
Deprecated. Use Tika.detect(String) instead.
- getMimeTypes() - Method in class org.apache.tika.extractor.EmbeddedDocumentUtil
- getMinLength() - Method in class org.apache.tika.detect.TrainedModelDetector
- getMinLength() - Method in class org.apache.tika.mime.MimeTypes
-
Return the minimum length of data to provide to analyzing methods based on the document's content in order to check all the known MimeTypes.
- getName() - Method in class org.apache.tika.config.Param
- getName() - Method in class org.apache.tika.config.ParamField
- getName() - Method in class org.apache.tika.language.LanguageProfilerBuilder
-
Deprecated.
- getName() - Method in class org.apache.tika.metadata.Property
- getName() - Method in class org.apache.tika.mime.MimeType
-
Returns the name of this media type.
- getName(String) - Static method in class org.apache.tika.io.FilenameUtils
-
This is a duplication of the algorithm and functionality available in commons io FilenameUtils.
- getNewContentHandler() - Method in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
- getNewContentHandler() - Method in class org.apache.tika.sax.BasicContentHandlerFactory
- getNewContentHandler() - Method in interface org.apache.tika.sax.ContentHandlerFactory
- getNewContentHandler(OutputStream, String) - Method in class org.apache.tika.sax.BasicContentHandlerFactory
- getNewContentHandler(OutputStream, String) - Method in interface org.apache.tika.sax.ContentHandlerFactory
- getNewContentHandler(OutputStream, Charset) - Method in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
- getNewContentHandler(OutputStream, Charset) - Method in class org.apache.tika.sax.BasicContentHandlerFactory
- getNewContentHandler(OutputStream, Charset) - Method in interface org.apache.tika.sax.ContentHandlerFactory
- getNumOfHidden() - Method in class org.apache.tika.detect.NNTrainedModelBuilder
- getNumOfInputs() - Method in class org.apache.tika.detect.NNTrainedModelBuilder
- getNumOfOutputs() - Method in class org.apache.tika.detect.NNTrainedModelBuilder
- getOpenContainer() - Method in class org.apache.tika.io.TikaInputStream
-
Returns the open container object, such as a POIFS FileSystem in the event of an OLE2 document being detected and processed by the OLE2 detector.
- getOrganizations() - Static method in class org.apache.tika.sax.StandardOrganizations
-
Returns the map containing the collection of the most important technical standard organizations.
- getOrganzationsRegex() - Static method in class org.apache.tika.sax.StandardOrganizations
-
Returns the regular expression containing the most important technical standard organizations.
- getOutputThreshold() - Method in class org.apache.tika.sax.SecureContentHandler
-
Returns the configured output threshold.
- getParameters() - Method in class org.apache.tika.mime.MediaType
-
Returns an immutable sorted map of the parameters of this media type.
- getParams() - Method in class org.apache.tika.detect.NNTrainedModelBuilder
- getParser() - Method in class org.apache.tika.config.TikaConfig
-
Returns the configured parser instance.
- getParser() - Method in class org.apache.tika.Tika
-
Returns the parser instance used by this facade.
- getParser(Metadata) - Method in class org.apache.tika.parser.CompositeParser
-
Returns the parser that best matches the given metadata.
- getParser(Metadata, ParseContext) - Method in class org.apache.tika.parser.CompositeParser
- getParser(MediaType) - Method in class org.apache.tika.config.TikaConfig
-
Deprecated. Use the TikaConfig.getParser() method instead.
- getParserClassname(Parser) - Static method in class org.apache.tika.utils.ParserUtils
-
Identifies the real class name of the Parser, unwrapping any ParserDecorator decorations on top of it.
- getParsers() - Method in class org.apache.tika.parser.CompositeParser
-
Returns the component parsers.
- getParsers(ParseContext) - Method in class org.apache.tika.parser.CompositeParser
- getParsers(ParseContext) - Method in class org.apache.tika.parser.DefaultParser
- getPassword(Metadata) - Method in interface org.apache.tika.parser.PasswordProvider
-
Looks up the password for a document with the given metadata, and returns it for the Parser.
- getPasswordProvider() - Method in class org.apache.tika.extractor.EmbeddedDocumentUtil
- getPath() - Method in class org.apache.tika.io.TikaInputStream
-
If the user created this TikaInputStream with a file, the original file will be returned.
- getPath(int) - Method in class org.apache.tika.io.TikaInputStream
- getPoolSize() - Method in class org.apache.tika.fork.ForkParser
-
Returns the size of the process pool.
- getPoolSize() - Static method in class org.apache.tika.utils.XMLReaderUtils
- getPosition() - Method in class org.apache.tika.io.NullInputStream
-
Return the current position.
- getPosition() - Method in class org.apache.tika.io.TikaInputStream
-
Returns the current position within the stream.
- getPrimaryProperty() - Method in class org.apache.tika.metadata.Property
-
Gets the primary property for a composite property
- getProfile() - Method in class org.apache.tika.language.ProfilingHandler
-
Deprecated. Returns the language profile being built by this content handler.
- getProfile() - Method in class org.apache.tika.language.ProfilingWriter
-
Deprecated. Returns the language profile being built by this writer.
- getProperties(String) - Static method in class org.apache.tika.metadata.Property
- getPropertyType() - Method in class org.apache.tika.metadata.Property
- getPropertyType(String) - Static method in class org.apache.tika.metadata.Property
-
Get the type of a property
- getProvider() - Method in class org.apache.tika.parser.digest.InputStreamDigester
-
When subclassing this, be careful to ensure that your provider is thread-safe (not likely) or return a new provider with each call.
- getQNameAsString(QName) - Static method in class org.apache.tika.sax.ElementMappingContentHandler
- getRawScore() - Method in class org.apache.tika.language.detect.LanguageResult
- getRegisteredMimeType(String) - Method in class org.apache.tika.mime.MimeTypes
-
Returns the registered, normalised media type with the given name (or alias).
- getRel() - Method in class org.apache.tika.sax.Link
- getResource(Class<T>) - Method in class org.apache.tika.io.TemporaryResources
-
Returns the latest of the tracked resources that implements or extends the given interface or class.
- getResourceAsStream(String) - Method in class org.apache.tika.config.ServiceLoader
-
Returns an input stream for reading the specified resource from the configured class loader.
- getSAXParser() - Method in class org.apache.tika.parser.ParseContext
-
Returns the SAX parser specified in this parsing context.
- getSAXParser() - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Returns the SAX parser specified in this parsing context.
- getSAXParserFactory() - Method in class org.apache.tika.parser.ParseContext
-
Returns the SAX parser factory specified in this parsing context.
- getSAXParserFactory() - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Returns the SAX parser factory specified in this parsing context.
- getScore() - Method in class org.apache.tika.sax.StandardReference
- getSecondaryExtractProperties() - Method in class org.apache.tika.metadata.Property
-
Gets the secondary properties for a composite property
- getSecondOrganizationAcronym() - Method in class org.apache.tika.sax.StandardReference
- getSeparator() - Method in class org.apache.tika.sax.StandardReference
- getServiceClass(Class<T>, String) - Method in class org.apache.tika.config.ServiceLoader
-
Loads and returns the named service class that's expected to implement the given interface.
- getServiceLoader() - Method in class org.apache.tika.config.TikaConfig
- getSetter() - Method in class org.apache.tika.config.ParamField
- getShortBE(byte[]) - Static method in class org.apache.tika.io.EndianUtils
-
Get a BE short value from the beginning of a byte array
- getShortBE(byte[], int) - Static method in class org.apache.tika.io.EndianUtils
-
Get a BE short value from a byte array
- getShortLE(byte[]) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE short value from the beginning of a byte array
- getShortLE(byte[], int) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE short value from a byte array
- getSimilarity(LanguageProfilerBuilder) - Method in class org.apache.tika.language.LanguageProfilerBuilder
-
Deprecated. Calculates a score of how well NGramProfiles match each other.
- getSize() - Method in class org.apache.tika.io.NullInputStream
-
Return the size this InputStream emulates.
- getSize() - Method in class org.apache.tika.utils.RereadableInputStream
-
Returns the number of bytes read from the original stream.
- getSorted() - Method in class org.apache.tika.language.LanguageProfilerBuilder
-
Deprecated. Returns a sorted list of ngrams (sort done by 1.
- getStackTrace(Throwable) - Static method in class org.apache.tika.utils.ExceptionUtils
-
Get the full stacktrace as a string
- getSubtype() - Method in class org.apache.tika.mime.MediaType
-
Return the Sub-Type of the MediaType, such as "plain" for "text/plain"
- getSupertype(MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
-
Returns the supertype of the given type.
- getSupportedEmbedTypes() - Method in class org.apache.tika.embedder.ExternalEmbedder
- getSupportedEmbedTypes(ParseContext) - Method in interface org.apache.tika.embedder.Embedder
-
Returns the set of media types supported by this embedder when used with the given parse context.
- getSupportedEmbedTypes(ParseContext) - Method in class org.apache.tika.embedder.ExternalEmbedder
- getSupportedLanguages() - Static method in class org.apache.tika.language.LanguageIdentifier
-
Deprecated. Returns what languages are supported for language identification.
- getSupportedTypes() - Method in class org.apache.tika.parser.external.ExternalParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.fork.ForkParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.CompositeParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.CryptoParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.DelegatingParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.EmptyParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.ErrorParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.external.ExternalParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.NetworkParser
- getSupportedTypes(ParseContext) - Method in interface org.apache.tika.parser.Parser
-
Returns the set of media types supported by this parser when used with the given parse context.
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.ParserDecorator
-
Delegates the method call to the decorated parser.
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.RecursiveParserWrapper
- getTag() - Method in exception org.apache.tika.io.TaggedIOException
-
Returns the object reference used as the tag of this exception.
- getTag() - Method in exception org.apache.tika.sax.TaggedSAXException
-
Returns the object reference used as the tag of this exception.
- getTail() - Method in class org.apache.tika.io.TailStream
-
Returns an array with the last data read from the underlying stream.
- getText() - Method in class org.apache.tika.sax.Link
- getThreshold() - Method in class org.apache.tika.sax.StandardsExtractingContentHandler
-
Gets the threshold to be used for selecting the standard references found within the text based on their score.
- getTikaConfig() - Method in class org.apache.tika.extractor.EmbeddedDocumentUtil
- getTitle() - Method in class org.apache.tika.sax.Link
- getTransformer() - Method in class org.apache.tika.parser.ParseContext
-
Returns the transformer specified in this parsing context.
- getTransformer() - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Returns a new transformer
- getTranslator() - Method in class org.apache.tika.config.TikaConfig
-
Returns the configured translator instance.
- getTranslator() - Method in class org.apache.tika.language.translate.DefaultTranslator
-
Returns the current translator
- getTranslator() - Method in class org.apache.tika.Tika
-
Returns the translator instance used by this facade.
- getTranslators() - Method in class org.apache.tika.language.translate.DefaultTranslator
-
Returns all available translators
- getType() - Method in class org.apache.tika.config.Param
- getType() - Method in class org.apache.tika.config.ParamField
- getType() - Method in class org.apache.tika.detect.NNTrainedModelBuilder
- getType() - Method in class org.apache.tika.mime.MediaType
-
Return the Type of the MediaType, such as "text" for "text/plain"
- getType() - Method in class org.apache.tika.mime.MimeType
-
Returns the normalized media type name.
- getType() - Method in class org.apache.tika.sax.BasicContentHandlerFactory
- getType() - Method in class org.apache.tika.sax.Link
- getTypes() - Method in class org.apache.tika.mime.MediaTypeRegistry
-
Returns the set of all known canonical media types.
- getTypeString() - Method in class org.apache.tika.config.Param
- getUByte(byte[], int) - Static method in class org.apache.tika.io.EndianUtils
-
Get the unsigned value of a byte.
- getUIntBE(byte[]) - Static method in class org.apache.tika.io.EndianUtils
-
Get a BE unsigned int value from a byte array
- getUIntBE(byte[], int) - Static method in class org.apache.tika.io.EndianUtils
-
Get a BE unsigned int value from a byte array
- getUIntLE(byte[]) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE unsigned int value from a byte array
- getUIntLE(byte[], int) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE unsigned int value from a byte array
- getUniformTypeIdentifier() - Method in class org.apache.tika.mime.MimeType
-
Get the UTI for this mime type.
- getUri() - Method in class org.apache.tika.sax.Link
- getUShortBE(byte[]) - Static method in class org.apache.tika.io.EndianUtils
-
Get a BE unsigned short value from the beginning of a byte array
- getUShortBE(byte[], int) - Static method in class org.apache.tika.io.EndianUtils
-
Get a BE unsigned short value from a byte array
- getUShortLE(byte[]) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE unsigned short value from the beginning of a byte array
- getUShortLE(byte[], int) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE unsigned short value from a byte array
- getValue() - Method in class org.apache.tika.config.Param
- getValues(String) - Method in class org.apache.tika.metadata.Metadata
-
Get the values associated with a metadata name.
- getValues(Property) - Method in class org.apache.tika.metadata.Metadata
-
Get the values associated with a metadata name.
- getValueType() - Method in class org.apache.tika.metadata.Property
- getWrappedParser() - Method in class org.apache.tika.parser.ParserDecorator
-
Gets the parser wrapped by this ParserDecorator
- getXMLInputFactory() - Method in class org.apache.tika.parser.ParseContext
-
Returns the StAX input factory specified in this parsing context.
- getXMLInputFactory() - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Returns the StAX input factory specified in this parsing context.
- getXMLReader() - Method in class org.apache.tika.parser.ParseContext
-
Returns the XMLReader specified in this parsing context.
- getXMLReader() - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Returns the XMLReader specified in this parsing context.
- GLOB_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
H
- handle(String, MediaType, InputStream) - Method in interface org.apache.tika.extractor.EmbeddedResourceHandler
-
Called to process an embedded resource within the container.
- handleException(SAXException) - Method in class org.apache.tika.sax.ContentHandlerDecorator
-
Handle any exceptions thrown by methods in this class.
- handleException(SAXException) - Method in class org.apache.tika.sax.TaggedContentHandler
-
Tags any SAXExceptions thrown, wrapping and re-throwing.
- handleGlobError(MimeType, String, MimeTypeException, String, Attributes) - Method in class org.apache.tika.mime.MimeTypesReader
- handleInitializableProblem(String, String) - Method in interface org.apache.tika.config.InitializableProblemHandler
- handleIOException(IOException) - Method in class org.apache.tika.io.ProxyInputStream
-
Handle any IOExceptions thrown.
- handleIOException(IOException) - Method in class org.apache.tika.io.TaggedInputStream
-
Tags any IOExceptions thrown, wrapping and re-throwing.
- handleLoadError(String, Throwable) - Method in interface org.apache.tika.config.LoadErrorHandler
-
Handles a problem encountered when trying to load the specified service class.
- handleMimeError(String, MimeTypeException, String, Attributes) - Method in class org.apache.tika.mime.MimeTypesReader
- HAS_ACROFORM_FIELDS - Static variable in interface org.apache.tika.metadata.PDF
-
Has > 0 AcroForm fields
- HAS_SIGNATURE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- HAS_XFA - Static variable in interface org.apache.tika.metadata.PDF
-
Has XFA
- HAS_XMP - Static variable in interface org.apache.tika.metadata.PDF
-
Has XMP, whether or not it is valid
- hasEnoughText() - Method in class org.apache.tika.language.detect.LanguageDetector
-
Tell the caller whether more text is required for the current document before the language can be reliably detected.
- hasErrors() - Static method in class org.apache.tika.language.LanguageIdentifier
-
Deprecated. Tests whether there were errors initializing language config.
- hasFile() - Method in class org.apache.tika.io.TikaInputStream
- hashCode() - Method in class org.apache.tika.metadata.Metadata
- hashCode() - Method in class org.apache.tika.metadata.Property
- hashCode() - Method in class org.apache.tika.mime.MediaType
- hashCode() - Method in class org.apache.tika.mime.MimeType
- hasHitBound() - Method in class org.apache.tika.io.BoundedInputStream
- hasHitMaximumEmbeddedResources() - Method in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
- hasLength() - Method in class org.apache.tika.io.TikaInputStream
- hasMacroLanguage(String) - Static method in class org.apache.tika.language.detect.LanguageNames
- hasMagic() - Method in class org.apache.tika.mime.MimeType
- hasModel(String) - Method in class org.apache.tika.language.detect.LanguageDetector
-
Provide information about whether a model exists for a specific language.
- hasParameters() - Method in class org.apache.tika.mime.MediaType
-
Checks whether this media type contains parameters.
- HEADLINE - Static variable in interface org.apache.tika.metadata.IPTC
-
A brief synopsis of the caption.
- HEADLINE - Static variable in interface org.apache.tika.metadata.Photoshop
- HexCoDec - Class in org.apache.tika.mime
-
A set of Hex encoding and decoding utility methods.
- HexCoDec() - Constructor for class org.apache.tika.mime.HexCoDec
- HIDDEN_SLIDES - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- HIGH - org.apache.tika.language.detect.LanguageConfidence
- HISTORY - Static variable in interface org.apache.tika.metadata.ClimateForcast
- HISTORY_ACTION - Static variable in interface org.apache.tika.metadata.XMPMM
-
Action in the XMPMM's history section
- HISTORY_EVENT_INSTANCEID - Static variable in interface org.apache.tika.metadata.XMPMM
-
Instance id in the XMPMM's history section
- HISTORY_SOFTWARE_AGENT - Static variable in interface org.apache.tika.metadata.XMPMM
-
Software agent that created the action in the XMPMM's history section
- HISTORY_WHEN - Static variable in interface org.apache.tika.metadata.XMPMM
-
When the action occurred in the XMPMM's history section
- HTML - org.apache.tika.sax.BasicContentHandlerFactory.HANDLER_TYPE
- HTML - Interface in org.apache.tika.metadata
- HttpHeaders - Interface in org.apache.tika.metadata
-
A collection of HTTP header names.
I
- ID - Static variable in interface org.apache.tika.metadata.QuattroPro
-
ID.
- IDENTIFIER - Static variable in interface org.apache.tika.metadata.DublinCore
-
Recommended best practice is to identify the resource by means of a string or number conforming to a formal identification system.
- IDENTIFIER - Static variable in class org.apache.tika.metadata.Metadata
-
Deprecated. Use TikaCoreProperties#IDENTIFIER
- IDENTIFIER - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- IDENTIFIER - Static variable in interface org.apache.tika.metadata.XMP
-
An unordered array of text strings that unambiguously identify the resource within a given context.
- identifyStaticServiceProviders(Class<T>) - Method in class org.apache.tika.config.ServiceLoader
-
Returns the defined static service providers of the given type, without attempting to load them.
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.ContentHandlerDecorator
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.DIFContentHandler
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.LinkContentHandler
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.SafeContentHandler
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.SecureContentHandler
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.TeeContentHandler
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.TextContentHandler
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.ToTextContentHandler
-
Writes the given ignorable characters to the given character stream.
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.WriteOutContentHandler
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.xpath.MatchingContentHandler
- IGNORE - org.apache.tika.sax.BasicContentHandlerFactory.HANDLER_TYPE
- IGNORE - Static variable in interface org.apache.tika.config.InitializableProblemHandler
-
Strategy that simply ignores all problems.
- IGNORE - Static variable in interface org.apache.tika.config.LoadErrorHandler
-
Strategy that simply ignores all problems.
- image(String) - Static method in class org.apache.tika.mime.MediaType
- IMAGE_COUNT - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- IMAGE_COUNT - Static variable in interface org.apache.tika.metadata.Office
-
The number of Images in the document
- IMAGE_CREATOR - Static variable in interface org.apache.tika.metadata.IPTC
-
Creator or creators of the image.
- IMAGE_CREATOR_ID - Static variable in interface org.apache.tika.metadata.IPTC
-
The ID of the creator or creators of the image.
- IMAGE_CREATOR_ID_WRONG_CASE - Static variable in interface org.apache.tika.metadata.IPTC
-
Deprecated.
- IMAGE_CREATOR_NAME - Static variable in interface org.apache.tika.metadata.IPTC
-
The name of the creator or creators of the image.
- IMAGE_LENGTH - Static variable in interface org.apache.tika.metadata.TIFF
-
"Image height in pixels."
- IMAGE_REGISTRY_ENTRY - Static variable in interface org.apache.tika.metadata.IPTC
-
Both a Registry Item Id and a Registry Organisation Id to record any registration of this item with a registry.
- IMAGE_SUPPLIER - Static variable in interface org.apache.tika.metadata.IPTC
-
Identifies the most recent supplier of the item, who is not necessarily its owner or creator.
- IMAGE_SUPPLIER_ID - Static variable in interface org.apache.tika.metadata.IPTC
-
Identifies the most recent supplier of the item, who is not necessarily its owner or creator.
- IMAGE_SUPPLIER_ID_WRONG_CASE - Static variable in interface org.apache.tika.metadata.IPTC
-
Deprecated.
- IMAGE_SUPPLIER_IMAGE_ID - Static variable in interface org.apache.tika.metadata.IPTC
-
Optional identifier assigned by the Image Supplier to the image.
- IMAGE_SUPPLIER_NAME - Static variable in interface org.apache.tika.metadata.IPTC
-
Identifies the most recent supplier of the item, who is not necessarily its owner or creator.
- IMAGE_WIDTH - Static variable in interface org.apache.tika.metadata.TIFF
-
"Image width in pixels."
- INFO - Static variable in interface org.apache.tika.config.InitializableProblemHandler
-
Strategy that logs warnings of all problems using a Logger created using the given class name.
- init(DataInputStream, DataOutputStream) - Method in interface org.apache.tika.fork.ForkProxy
- INITIAL_AUTHOR - Static variable in interface org.apache.tika.metadata.Office
-
Name of the initial creator/author of a document
- Initializable - Interface in org.apache.tika.config
-
Components that must do special processing across multiple fields at initialization time should implement this interface.
- InitializableProblemHandler - Interface in org.apache.tika.config
-
This is to be used to handle potential recoverable problems that might arise during initialization.
- initialize(Map<String, Param>) - Method in interface org.apache.tika.config.Initializable
- initProfiles() - Static method in class org.apache.tika.language.LanguageIdentifier
-
Deprecated. Builds the language profiles.
- initProfiles(Map<String, LanguageProfile>) - Static method in class org.apache.tika.language.LanguageIdentifier
-
Deprecated. Initializes the language profiles from a user-supplied initialized Map.
- INLINE - org.apache.tika.metadata.TikaCoreProperties.EmbeddedResourceType
- INPUT_FILE_TOKEN - Static variable in class org.apache.tika.parser.external.ExternalParser
-
The token, which if present in the Command string, will be replaced with the input filename.
- InputStreamDigester - Class in org.apache.tika.parser.digest
- InputStreamDigester(int, String, String, DigestingParser.Encoder) - Constructor for class org.apache.tika.parser.digest.InputStreamDigester
- InputStreamDigester(int, String, DigestingParser.Encoder) - Constructor for class org.apache.tika.parser.digest.InputStreamDigester
- INSTANCE - Static variable in class org.apache.tika.detect.EmptyDetector
-
Singleton instance of this class.
- INSTANCE - Static variable in class org.apache.tika.parser.EmptyParser
-
Singleton instance of this class.
- INSTANCE - Static variable in class org.apache.tika.parser.ErrorParser
-
Singleton instance of this class.
- INSTANCE - Static variable in class org.apache.tika.sax.xpath.AttributeMatcher
- INSTANCE - Static variable in class org.apache.tika.sax.xpath.ElementMatcher
- INSTANCE - Static variable in class org.apache.tika.sax.xpath.NodeMatcher
- INSTANCE - Static variable in class org.apache.tika.sax.xpath.TextMatcher
- INSTANCEID - Static variable in interface org.apache.tika.metadata.XMPMM
-
An identifier for a specific incarnation of a resource, updated each time a file is saved.
- inStartElement - Variable in class org.apache.tika.sax.ToXMLContentHandler
- INSTITUTION - Static variable in interface org.apache.tika.metadata.ClimateForcast
- INSTRUCTIONS - Static variable in interface org.apache.tika.metadata.IPTC
-
Any of a number of instructions from the provider or creator to the receiver of the item.
- INSTRUCTIONS - Static variable in interface org.apache.tika.metadata.Photoshop
- INSTRUMENT - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The musical instrument."
- INTEGER - org.apache.tika.metadata.Property.ValueType
- INTELLECTUAL_GENRE - Static variable in interface org.apache.tika.metadata.IPTC
-
Describes the nature, intellectual, artistic or journalistic characteristic of an item, not specifically its content.
- internalBoolean(String) - Static method in class org.apache.tika.metadata.Property
- internalClosedChoise(String, String...) - Static method in class org.apache.tika.metadata.Property
- internalDate(String) - Static method in class org.apache.tika.metadata.Property
- internalInteger(String) - Static method in class org.apache.tika.metadata.Property
- internalIntegerSequence(String) - Static method in class org.apache.tika.metadata.Property
- internalOpenChoise(String, String...) - Static method in class org.apache.tika.metadata.Property
- internalRational(String) - Static method in class org.apache.tika.metadata.Property
- internalReal(String) - Static method in class org.apache.tika.metadata.Property
- internalText(String) - Static method in class org.apache.tika.metadata.Property
- internalTextBag(String) - Static method in class org.apache.tika.metadata.Property
- internalURI(String) - Static method in class org.apache.tika.metadata.Property
- INTERPRETED_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- IOExceptionWithCause - Exception in org.apache.tika.io
-
Subclasses IOException with the Throwable constructors missing before Java 6.
- IOExceptionWithCause(String, Throwable) - Constructor for exception org.apache.tika.io.IOExceptionWithCause
-
Constructs a new instance with the given message and cause.
- IOExceptionWithCause(Throwable) - Constructor for exception org.apache.tika.io.IOExceptionWithCause
-
Constructs a new instance with the given cause.
- IOUtils - Class in org.apache.tika.io
-
General IO stream manipulation utilities.
- IOUtils() - Constructor for class org.apache.tika.io.IOUtils
-
Instances should NOT be constructed in standard programming.
- IPTC - Interface in org.apache.tika.metadata
-
IPTC photo metadata schema.
- IPTC_LAST_EDITED - Static variable in interface org.apache.tika.metadata.IPTC
-
The date and optionally time when any of the IPTC photo metadata fields was last edited
- IS_ENCRYPTED - Static variable in interface org.apache.tika.metadata.PDF
- IS_OS_AIX - Static variable in class org.apache.tika.utils.SystemUtils
- IS_OS_HP_UX - Static variable in class org.apache.tika.utils.SystemUtils
- IS_OS_IRIX - Static variable in class org.apache.tika.utils.SystemUtils
- IS_OS_LINUX - Static variable in class org.apache.tika.utils.SystemUtils
- IS_OS_MAC - Static variable in class org.apache.tika.utils.SystemUtils
- IS_OS_MAC_OSX - Static variable in class org.apache.tika.utils.SystemUtils
- IS_OS_OS2 - Static variable in class org.apache.tika.utils.SystemUtils
- IS_OS_SOLARIS - Static variable in class org.apache.tika.utils.SystemUtils
- IS_OS_SUN_OS - Static variable in class org.apache.tika.utils.SystemUtils
- IS_OS_UNIX - Static variable in class org.apache.tika.utils.SystemUtils
- IS_OS_WINDOWS - Static variable in class org.apache.tika.utils.SystemUtils
- isAnchor() - Method in class org.apache.tika.sax.Link
- isAvailable() - Method in class org.apache.tika.language.translate.DefaultTranslator
- isAvailable() - Method in class org.apache.tika.language.translate.EmptyTranslator
- isAvailable() - Method in interface org.apache.tika.language.translate.Translator
- isCauseOf(IOException) - Method in class org.apache.tika.io.TaggedInputStream
-
Tests if the given exception was caused by this stream.
- isCauseOf(SAXException) - Method in class org.apache.tika.sax.TaggedContentHandler
-
Tests if the given exception was caused by this handler.
- isDynamic() - Method in class org.apache.tika.config.ServiceLoader
-
Returns whether the service loader is static or dynamic.
- isExternal() - Method in class org.apache.tika.metadata.Property
- isIframe() - Method in class org.apache.tika.sax.Link
- isImage() - Method in class org.apache.tika.sax.Link
- isInstanceOf(String, MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
-
Parses and normalises the given media type string and checks whether the result equals the given base type or is a specialization of it.
- isInstanceOf(MediaType, MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
-
Checks whether the given media type equals the given base type or is a specialization of it.
- isInternal() - Method in class org.apache.tika.metadata.Property
- isInvalid(int) - Method in class org.apache.tika.sax.SafeContentHandler
-
Checks whether the given Unicode character is an invalid XML character and should be replaced for output.
- isInvalid(int) - Method in class org.apache.tika.sax.XHTMLContentHandler
- isLanguage(String) - Method in class org.apache.tika.language.detect.LanguageResult
-
Return true if the target language matches the detected language.
- isLink() - Method in class org.apache.tika.sax.Link
- isMacroLanguage(String) - Static method in class org.apache.tika.language.detect.LanguageNames
- isMixedLanguages() - Method in class org.apache.tika.language.detect.LanguageDetector
- isMostlyAscii() - Method in class org.apache.tika.detect.TextStatistics
-
Checks whether at least one byte was seen and that the bytes that were seen were mostly plain text (i.e.
- isMultiValued(String) - Method in class org.apache.tika.metadata.Metadata
-
Returns true if named value is multivalued.
- isMultiValued(Property) - Method in class org.apache.tika.metadata.Metadata
-
Returns true if named value is multivalued.
- isMultiValuePermitted() - Method in class org.apache.tika.metadata.Property
-
Is the PropertyType one which accepts multiple values?
- ISO_SPEED_RATINGS - Static variable in interface org.apache.tika.metadata.TIFF
-
"ISO Speed and ISO Latitude of the input device as specified in ISO 12232"
- isQuoteAssignmentValues() - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Gets whether or not to quote assignment values, i.e.
- isReasonablyCertain() - Method in class org.apache.tika.language.detect.LanguageResult
- isReasonablyCertain() - Method in class org.apache.tika.language.LanguageIdentifier
-
Deprecated. Tries to judge whether the identification is certain enough to be trusted.
- ISREGEX_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- isRequired() - Method in class org.apache.tika.config.ParamField
- isScript() - Method in class org.apache.tika.sax.Link
- isShortText() - Method in class org.apache.tika.language.detect.LanguageDetector
- isSpecializationOf(MediaType, MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
-
Checks whether the given media type a is a specialization of a more generic type b.
- isSupported(String) - Static method in class org.apache.tika.utils.CharsetUtils
-
Safely return whether the named charset is supported, without throwing exceptions.
- isSupported(TikaInputStream) - Method in interface org.apache.tika.extractor.ContainerExtractor
-
Is this Container Extractor able to process the supplied container?
- isSupported(TikaInputStream) - Method in class org.apache.tika.extractor.ParserContainerExtractor
- isTikaInputStream(InputStream) - Static method in class org.apache.tika.io.TikaInputStream
-
Checks whether the given stream is a TikaInputStream instance.
- isUnknown() - Method in class org.apache.tika.language.detect.LanguageResult
- isValid(String) - Static method in class org.apache.tika.mime.MimeType
-
Checks that the given string is a valid Internet media type name based on rules from RFC 2045 section 5.1.
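A validation of this kind can be sketched as "token / token", assuming the MIME token grammar (printable US-ASCII excluding space, control characters, and the tspecials). The class and method names below are hypothetical, and the real MimeType.isValid may differ in detail, for example in how it treats null input.

```java
/** Hypothetical sketch of a media type name check; not Tika's implementation. */
final class MediaTypeNameCheck {
    // The MIME "tspecials" characters, which may not appear in a token.
    private static final String TSPECIALS = "()<>@,;:\\\"/[]?=";

    private MediaTypeNameCheck() {}

    /** A token is one or more printable ASCII characters, none of them space or tspecials. */
    static boolean isToken(String s) {
        if (s.isEmpty()) return false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c <= 0x20 || c >= 0x7F || TSPECIALS.indexOf(c) >= 0) return false;
        }
        return true;
    }

    /** A name is valid if it has the form token "/" token. */
    static boolean isValidName(String name) {
        int slash = name.indexOf('/');
        if (slash < 0) return false;
        return isToken(name.substring(0, slash)) && isToken(name.substring(slash + 1));
    }
}
```

Because "/" is itself one of the tspecials, a name with a second slash is rejected by the subtype check without any extra code.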
- isWriteLimitReached(Throwable) - Method in class org.apache.tika.sax.WriteOutContentHandler
-
Checks whether the given exception (or any of its root causes) was thrown by this handler as a signal of reaching the write limit.
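The cause-chain test described for isWriteLimitReached can be sketched as below. This is only an illustration of the pattern: the LimitReached exception type, its source field, and the isLimitSignal method are hypothetical names, and WriteOutContentHandler may tag its exceptions differently.

```java
/** Hypothetical sketch of "was this thrown by me?" detection; not Tika's code. */
final class WriteLimitSketch {
    private WriteLimitSketch() {}

    /** Signal exception that remembers which object raised it. */
    static final class LimitReached extends RuntimeException {
        final Object source;

        LimitReached(Object source) {
            super("write limit reached");
            this.source = source;
        }
    }

    /** Walks the cause chain looking for a LimitReached raised by the given source. */
    static boolean isLimitSignal(Throwable t, Object source) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof LimitReached && ((LimitReached) c).source == source) {
                return true;
            }
        }
        return false;
    }
}
```

Comparing the source by identity lets a caller tell its own limit signal apart from one raised by another handler further down the chain, even after the exception has been wrapped.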
J
- JOB_ID - Static variable in interface org.apache.tika.metadata.IPTC
-
Number or identifier for the purpose of improved workflow handling.
K
- KEY - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The audio's musical key."
- KEYWORDS - Static variable in interface org.apache.tika.metadata.IPTC
-
Keywords to express the subject of the content.
- KEYWORDS - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- KEYWORDS - Static variable in interface org.apache.tika.metadata.Office
-
Keywords pertaining to a document.
- KEYWORDS - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
DublinCore.SUBJECT; should include both subject and keywords if a document format has both.
L
- LABEL - Static variable in interface org.apache.tika.metadata.XMP
-
A word or short phrase that identifies a resource as a member of a user-defined collection.
- LANGUAGE - Static variable in interface org.apache.tika.metadata.DublinCore
-
A language of the intellectual content of the resource.
- LANGUAGE - Static variable in class org.apache.tika.metadata.Metadata
-
Deprecated. Use TikaCoreProperties#LANGUAGE
- LANGUAGE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- LanguageConfidence - Enum in org.apache.tika.language.detect
- LanguageDetector - Class in org.apache.tika.language.detect
- LanguageDetector() - Constructor for class org.apache.tika.language.detect.LanguageDetector
- LanguageHandler - Class in org.apache.tika.language.detect
-
SAX content handler that updates a language detector based on all the received character content.
- LanguageHandler() - Constructor for class org.apache.tika.language.detect.LanguageHandler
- LanguageHandler(LanguageDetector) - Constructor for class org.apache.tika.language.detect.LanguageHandler
- LanguageHandler(LanguageWriter) - Constructor for class org.apache.tika.language.detect.LanguageHandler
- LanguageIdentifier - Class in org.apache.tika.language
-
Deprecated. Use a concrete class of LanguageDetector
- LanguageIdentifier(String) - Constructor for class org.apache.tika.language.LanguageIdentifier
-
Deprecated. Constructs a language identifier based on a String of text content
- LanguageIdentifier(LanguageProfile) - Constructor for class org.apache.tika.language.LanguageIdentifier
-
Deprecated. Constructs a language identifier based on a LanguageProfile
- LanguageNames - Class in org.apache.tika.language.detect
-
Support for language tags (as defined by https://tools.ietf.org/html/bcp47). See https://en.wikipedia.org/wiki/List_of_ISO_639-3_codes for a list of three-character language codes.
- LanguageNames() - Constructor for class org.apache.tika.language.detect.LanguageNames
- LanguageProfile - Class in org.apache.tika.language
-
Deprecated.
- LanguageProfile() - Constructor for class org.apache.tika.language.LanguageProfile
-
Deprecated.
- LanguageProfile(int) - Constructor for class org.apache.tika.language.LanguageProfile
-
Deprecated.
- LanguageProfile(String) - Constructor for class org.apache.tika.language.LanguageProfile
-
Deprecated.
- LanguageProfile(String, int) - Constructor for class org.apache.tika.language.LanguageProfile
-
Deprecated.
- LanguageProfilerBuilder - Class in org.apache.tika.language
-
Deprecated.
- LanguageProfilerBuilder(String) - Constructor for class org.apache.tika.language.LanguageProfilerBuilder
-
Deprecated. Constructs a new ngram profile where minlen=3, maxlen=3
- LanguageProfilerBuilder(String, int, int) - Constructor for class org.apache.tika.language.LanguageProfilerBuilder
-
Deprecated. Constructs a new ngram profile
- LanguageResult - Class in org.apache.tika.language.detect
- LanguageResult(String, LanguageConfidence, float) - Constructor for class org.apache.tika.language.detect.LanguageResult
- LanguageWriter - Class in org.apache.tika.language.detect
-
Writer that builds a language profile based on all the written content.
- LanguageWriter(LanguageDetector) - Constructor for class org.apache.tika.language.detect.LanguageWriter
- LAST_AUTHOR - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- LAST_AUTHOR - Static variable in interface org.apache.tika.metadata.Office
-
Name of the last (most recent) author of a document
- LAST_MODIFIED - Static variable in interface org.apache.tika.metadata.HttpHeaders
- LAST_MODIFIED_BY - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLCore
-
The user who performed the last modification.
- LAST_PRINTED - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- LAST_PRINTED - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLCore
-
The date and time of the last printing.
- LAST_SAVED - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- LATITUDE - Static variable in interface org.apache.tika.metadata.Geographic
-
The WGS84 Latitude of the Point
- LATITUDE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- LICENSE_LOCATION - Static variable in interface org.apache.tika.metadata.CreativeCommons
- LICENSE_URL - Static variable in interface org.apache.tika.metadata.CreativeCommons
- LICENSOR - Static variable in interface org.apache.tika.metadata.IPTC
-
A person or company that should be contacted to obtain a licence for using the item or who has licensed the item.
- LICENSOR_CITY - Static variable in interface org.apache.tika.metadata.IPTC
-
The city of a person or company that should be contacted to obtain a licence for using the item or who has licensed the item.
- LICENSOR_COUNTRY - Static variable in interface org.apache.tika.metadata.IPTC
-
The country of a person or company that should be contacted to obtain a licence for using the item or who has licensed the item.
- LICENSOR_EMAIL - Static variable in interface org.apache.tika.metadata.IPTC
-
The email of a person or company that should be contacted to obtain a licence for using the item or who has licensed the item.
- LICENSOR_EXTENDED_ADDRESS - Static variable in interface org.apache.tika.metadata.IPTC
-
The extended address of a person or company that should be contacted to obtain a licence for using the item or who has licensed the item.
- LICENSOR_ID - Static variable in interface org.apache.tika.metadata.IPTC
-
The ID of the person or company that should be contacted to obtain a licence for using the item or who has licensed the item.
- LICENSOR_ID_WRONG_CASE - Static variable in interface org.apache.tika.metadata.IPTC
-
Deprecated. Use IPTC.LICENSOR_ID
- LICENSOR_NAME - Static variable in interface org.apache.tika.metadata.IPTC
-
The name of the person or company that should be contacted to obtain a licence for using the item or who has licensed the item.
- LICENSOR_POSTAL_CODE - Static variable in interface org.apache.tika.metadata.IPTC
-
The postal code of a person or company that should be contacted to obtain a licence for using the item or who has licensed the item.
- LICENSOR_REGION - Static variable in interface org.apache.tika.metadata.IPTC
-
The region of a person or company that should be contacted to obtain a licence for using the item or who has licensed the item.
- LICENSOR_STREET_ADDRESS - Static variable in interface org.apache.tika.metadata.IPTC
-
The street address of a person or company that should be contacted to obtain a licence for using the item or who has licensed the item.
- LICENSOR_TELEPHONE_1 - Static variable in interface org.apache.tika.metadata.IPTC
-
The phone number of a person or company that should be contacted to obtain a licence for using the item or who has licensed the item.
- LICENSOR_TELEPHONE_2 - Static variable in interface org.apache.tika.metadata.IPTC
-
The phone number of a person or company that should be contacted to obtain a licence for using the item or who has licensed the item.
- LICENSOR_URL - Static variable in interface org.apache.tika.metadata.IPTC
-
The URL of a person or company that should be contacted to obtain a licence for using the item or who has licensed the item.
- LINE_COUNT - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- LINE_COUNT - Static variable in interface org.apache.tika.metadata.Office
-
The number of lines in the document
- Link - Class in org.apache.tika.sax
- Link(String, String, String, String) - Constructor for class org.apache.tika.sax.Link
- Link(String, String, String, String, String) - Constructor for class org.apache.tika.sax.Link
- LinkContentHandler - Class in org.apache.tika.sax
-
Content handler that collects links from an XHTML document.
- LinkContentHandler() - Constructor for class org.apache.tika.sax.LinkContentHandler
-
Default constructor
- LinkContentHandler(boolean) - Constructor for class org.apache.tika.sax.LinkContentHandler
-
Default constructor
- load(InputStream) - Static method in class org.apache.tika.config.Param
- load(InputStream) - Method in class org.apache.tika.language.LanguageProfilerBuilder
-
Deprecated. Loads an ngram profile from an InputStream (assumes UTF-8 encoded content)
- load(Node) - Static method in class org.apache.tika.config.Param
- loadDefaultModels(File) - Method in class org.apache.tika.detect.TrainedModelDetector
- loadDefaultModels(InputStream) - Method in class org.apache.tika.detect.NNExampleModelDetector
- loadDefaultModels(InputStream) - Method in class org.apache.tika.detect.TrainedModelDetector
- loadDefaultModels(ClassLoader) - Method in class org.apache.tika.detect.NNExampleModelDetector
-
This method is overridden to load and register neural network models.
- loadDefaultModels(ClassLoader) - Method in class org.apache.tika.detect.TrainedModelDetector
- loadDefaultModels(Path) - Method in class org.apache.tika.detect.TrainedModelDetector
- loadDynamicServiceProviders(Class<T>) - Method in class org.apache.tika.config.ServiceLoader
-
Returns the available dynamic service providers of the given type.
- LoadErrorHandler - Interface in org.apache.tika.config
-
Interface for error handling strategies in service class loading.
- loadModels() - Method in class org.apache.tika.language.detect.LanguageDetector
-
Load (or re-load) all available language models.
- loadModels(Set<String>) - Method in class org.apache.tika.language.detect.LanguageDetector
-
Load (or re-load) the models specified in the given set.
- loadServiceProviders(Class<T>) - Method in class org.apache.tika.config.ServiceLoader
-
Returns all the available service providers of the given type.
- loadStaticServiceProviders(Class<T>) - Method in class org.apache.tika.config.ServiceLoader
-
Returns the available static service providers of the given type.
- LOCAL_NAME_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- LOCALE - org.apache.tika.metadata.Property.ValueType
- LOCATION - Static variable in interface org.apache.tika.metadata.HttpHeaders
- LOCATION_CREATED - Static variable in interface org.apache.tika.metadata.IPTC
-
The location the content of the item was created.
- LOCATION_CREATED_CITY - Static variable in interface org.apache.tika.metadata.IPTC
-
Name of the city of a location.
- LOCATION_CREATED_COUNTRY_CODE - Static variable in interface org.apache.tika.metadata.IPTC
-
The ISO code of a country of a location.
- LOCATION_CREATED_COUNTRY_NAME - Static variable in interface org.apache.tika.metadata.IPTC
-
The name of a country of a location.
- LOCATION_CREATED_PROVINCE_OR_STATE - Static variable in interface org.apache.tika.metadata.IPTC
-
The name of a subregion of a country - a province or state - of a location.
- LOCATION_CREATED_SUBLOCATION - Static variable in interface org.apache.tika.metadata.IPTC
-
Name of a sublocation.
- LOCATION_CREATED_WORLD_REGION - Static variable in interface org.apache.tika.metadata.IPTC
-
The name of a world region of a location.
- LOCATION_SHOWN - Static variable in interface org.apache.tika.metadata.IPTC
-
A location the content of the item is about.
- LOCATION_SHOWN_CITY - Static variable in interface org.apache.tika.metadata.IPTC
-
Name of the city of a location.
- LOCATION_SHOWN_COUNTRY_CODE - Static variable in interface org.apache.tika.metadata.IPTC
-
The ISO code of a country of a location.
- LOCATION_SHOWN_COUNTRY_NAME - Static variable in interface org.apache.tika.metadata.IPTC
-
The name of a country of a location.
- LOCATION_SHOWN_PROVINCE_OR_STATE - Static variable in interface org.apache.tika.metadata.IPTC
-
The name of a subregion of a country - a province or state - of a location.
- LOCATION_SHOWN_SUBLOCATION - Static variable in interface org.apache.tika.metadata.IPTC
-
Name of a sublocation.
- LOCATION_SHOWN_WORLD_REGION - Static variable in interface org.apache.tika.metadata.IPTC
-
The name of a world region of a location.
- LOG_COMMENT - Static variable in interface org.apache.tika.metadata.XMPDM
-
"User's log comments."
- LONGITUDE - Static variable in interface org.apache.tika.metadata.Geographic
-
The WGS84 Longitude of the Point
- LONGITUDE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- LookaheadInputStream - Class in org.apache.tika.io
-
Stream wrapper that makes it easy to read up to n bytes ahead from a stream that supports the mark feature.
- LookaheadInputStream(InputStream, int) - Constructor for class org.apache.tika.io.LookaheadInputStream
-
Creates a lookahead wrapper for the given input stream.
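The underlying idea, reading ahead and then rewinding, can be sketched with the standard mark/reset calls of java.io.InputStream. The class LookaheadSketch and its peek method are hypothetical names for this example and do not reproduce the LookaheadInputStream API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

/** Hypothetical sketch of lookahead via mark/reset; not Tika's LookaheadInputStream. */
final class LookaheadSketch {
    private LookaheadSketch() {}

    /** Reads up to n bytes ahead, then rewinds the stream to where it started. */
    static byte[] peek(InputStream in, int n) {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(n); // remember the current position for up to n bytes
        try {
            byte[] buf = new byte[n];
            int total = 0;
            while (total < n) {
                int read = in.read(buf, total, n - total);
                if (read < 0) break; // end of stream before n bytes
                total += read;
            }
            return Arrays.copyOf(buf, total);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            try {
                in.reset(); // rewind so the caller sees the stream unchanged
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }
}
```

A stream that does not support mark can be wrapped in a java.io.BufferedInputStream before calling peek.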
- looksLikeUTF8() - Method in class org.apache.tika.detect.TextStatistics
-
Checks whether the observed byte stream looks like UTF-8 encoded text.
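For comparison, a stricter and different check than a statistical one is to run the bytes through a UTF-8 decoder that reports errors instead of replacing them. The sketch below uses only java.nio.charset; the class Utf8Check and its method are hypothetical names, and this is not how TextStatistics reaches its answer.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

/** Hypothetical strict UTF-8 check; a different technique from TextStatistics. */
final class Utf8Check {
    private Utf8Check() {}

    /** Returns true only if the whole array decodes as well-formed UTF-8. */
    static boolean decodesAsUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false; // malformed or truncated sequence
        }
    }
}
```

A strict decode rejects a buffer that was cut in the middle of a multi-byte sequence, so it suits complete inputs better than a prefix sampled from a longer stream.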
- LOOP - Static variable in interface org.apache.tika.metadata.XMPDM
-
"When true, the clip can be looped seamlessly."
- LOW - org.apache.tika.language.detect.LanguageConfidence
- LOWEST_VERSION - Static variable in interface org.apache.tika.metadata.QuattroPro
-
Lowest version.
M
- MACRO - org.apache.tika.metadata.TikaCoreProperties.EmbeddedResourceType
- magic_neg(float) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
- MAGIC_PRIORITY_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- MAGIC_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- magic_trust(float) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
- MagicDetector - Class in org.apache.tika.detect
-
Content type detection based on magic bytes, i.e.
- MagicDetector(MediaType, byte[]) - Constructor for class org.apache.tika.detect.MagicDetector
-
Creates a detector for input documents that have the exact given byte pattern at the beginning of the document stream.
- MagicDetector(MediaType, byte[], byte[], boolean, boolean, int, int) - Constructor for class org.apache.tika.detect.MagicDetector
-
Creates a detector for input documents that meet the specified magic match.
- MagicDetector(MediaType, byte[], byte[], boolean, int, int) - Constructor for class org.apache.tika.detect.MagicDetector
-
Creates a detector for input documents that meet the specified magic match.
- MagicDetector(MediaType, byte[], byte[], int, int) - Constructor for class org.apache.tika.detect.MagicDetector
-
Creates a detector for input documents that meet the specified magic match.
- MagicDetector(MediaType, byte[], int) - Constructor for class org.apache.tika.detect.MagicDetector
-
Creates a detector for input documents that have the exact given byte pattern at the given offset of the document stream.
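The simplest case listed above, an exact byte pattern at a fixed offset, can be sketched as a plain array comparison. The class MagicMatchSketch and its matchesAt method are hypothetical; masks, offset ranges, and the other options of the longer MagicDetector constructors are not modeled here.

```java
/** Hypothetical sketch of exact magic-byte matching; not Tika's MagicDetector. */
final class MagicMatchSketch {
    private MagicMatchSketch() {}

    /** Returns true if pattern occurs in data starting exactly at offset. */
    static boolean matchesAt(byte[] data, byte[] pattern, int offset) {
        if (offset < 0 || data.length - offset < pattern.length) {
            return false; // not enough bytes available at that offset
        }
        for (int i = 0; i < pattern.length; i++) {
            if (data[offset + i] != pattern[i]) {
                return false;
            }
        }
        return true;
    }
}
```

For example, a buffer starting with the ASCII bytes of "%PDF-" matches that pattern at offset 0.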
- main(String[]) - Static method in class org.apache.tika.language.LanguageProfilerBuilder
-
Deprecated. Main method used for testing only
- MAJOR_VERSION - Static variable in interface org.apache.tika.metadata.WordPerfect
-
Major version.
- makeName(String, String, String) - Static method in class org.apache.tika.language.detect.LanguageNames
- MANAGER - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- MANAGER - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- mapAttributes(Attributes) - Method in class org.apache.tika.sax.ElementMappingContentHandler.TargetElement
- MAPI_FROM_REPRESENTING_EMAIL - Static variable in interface org.apache.tika.metadata.Office
- MAPI_FROM_REPRESENTING_NAME - Static variable in interface org.apache.tika.metadata.Office
- MAPI_MESSAGE_CLASS - Static variable in interface org.apache.tika.metadata.Office
-
MAPI message class.
- MAPI_MESSAGE_CLIENT_SUBMIT_TIME - Static variable in interface org.apache.tika.metadata.Office
- MAPI_SENT_BY_SERVER_TYPE - Static variable in interface org.apache.tika.metadata.Office
- MappedBufferCleaner - Class in org.apache.tika.io
-
Copied/pasted from the Apache Lucene/Solr project.
- MappedBufferCleaner() - Constructor for class org.apache.tika.io.MappedBufferCleaner
- mark(int) - Method in class org.apache.tika.io.BoundedInputStream
- mark(int) - Method in class org.apache.tika.io.LookaheadInputStream
- mark(int) - Method in class org.apache.tika.io.NullInputStream
-
Mark the current position.
- mark(int) - Method in class org.apache.tika.io.ProxyInputStream
-
Invokes the delegate's mark(int) method.
- mark(int) - Method in class org.apache.tika.io.TailStream
-
This implementation saves the internal state including the content of the tail buffer so that it can be restored when reset() is called later.
- mark(int) - Method in class org.apache.tika.io.TikaInputStream
- MARKED - Static variable in interface org.apache.tika.metadata.XMPRights
-
When true, indicates that this is a rights-managed resource.
- markSupported() - Method in class org.apache.tika.io.LookaheadInputStream
- markSupported() - Method in class org.apache.tika.io.NullInputStream
-
Indicates whether mark is supported.
- markSupported() - Method in class org.apache.tika.io.ProxyInputStream
-
Invokes the delegate's markSupported() method.
- markSupported() - Method in class org.apache.tika.io.TikaInputStream
- MATCH_MASK_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- MATCH_OFFSET_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- MATCH_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- MATCH_TYPE_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- MATCH_VALUE_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- Matcher - Class in org.apache.tika.sax.xpath
-
XPath element matcher.
- Matcher() - Constructor for class org.apache.tika.sax.xpath.Matcher
- matches(byte[]) - Method in class org.apache.tika.mime.MimeType
- matchesAttribute(String, String) - Method in class org.apache.tika.sax.xpath.AttributeMatcher
- matchesAttribute(String, String) - Method in class org.apache.tika.sax.xpath.CompositeMatcher
- matchesAttribute(String, String) - Method in class org.apache.tika.sax.xpath.Matcher
-
Returns true if the XPath expression matches the named attribute of the element associated with this evaluation state.
- matchesAttribute(String, String) - Method in class org.apache.tika.sax.xpath.NamedAttributeMatcher
- matchesAttribute(String, String) - Method in class org.apache.tika.sax.xpath.NodeMatcher
- matchesAttribute(String, String) - Method in class org.apache.tika.sax.xpath.SubtreeMatcher
- matchesElement() - Method in class org.apache.tika.sax.xpath.CompositeMatcher
- matchesElement() - Method in class org.apache.tika.sax.xpath.ElementMatcher
- matchesElement() - Method in class org.apache.tika.sax.xpath.Matcher
-
Returns true if the XPath expression matches the element associated with this evaluation state.
- matchesElement() - Method in class org.apache.tika.sax.xpath.NodeMatcher
- matchesElement() - Method in class org.apache.tika.sax.xpath.SubtreeMatcher
- matchesMagic(byte[]) - Method in class org.apache.tika.mime.MimeType
- matchesText() - Method in class org.apache.tika.sax.xpath.CompositeMatcher
- matchesText() - Method in class org.apache.tika.sax.xpath.Matcher
-
Returns true if the XPath expression matches all text nodes whose parent is the element associated with this evaluation state.
- matchesText() - Method in class org.apache.tika.sax.xpath.NodeMatcher
- matchesText() - Method in class org.apache.tika.sax.xpath.SubtreeMatcher
- matchesText() - Method in class org.apache.tika.sax.xpath.TextMatcher
- MatchingContentHandler - Class in org.apache.tika.sax.xpath
-
Content handler decorator that only passes the elements, attributes, and text nodes that match the given XPath expression.
- MatchingContentHandler(ContentHandler, Matcher) - Constructor for class org.apache.tika.sax.xpath.MatchingContentHandler
- MAX_AVAIL_HEIGHT - Static variable in interface org.apache.tika.metadata.IPTC
-
The maximum available height in pixels of the original photo from which this photo has been derived by downsizing.
- MAX_AVAIL_WIDTH - Static variable in interface org.apache.tika.metadata.IPTC
-
The maximum available width in pixels of the original photo from which this photo has been derived by downsizing.
- MediaType - Class in org.apache.tika.mime
-
Internet media type.
- MediaType(String, String) - Constructor for class org.apache.tika.mime.MediaType
- MediaType(String, String, Map<String, String>) - Constructor for class org.apache.tika.mime.MediaType
- MediaType(MediaType, String, String) - Constructor for class org.apache.tika.mime.MediaType
-
Creates a media type by adding a parameter to a base type.
- MediaType(MediaType, Charset) - Constructor for class org.apache.tika.mime.MediaType
-
Creates a media type by adding the "charset" parameter to a base type.
- MediaType(MediaType, Map<String, String>) - Constructor for class org.apache.tika.mime.MediaType
- MediaTypeRegistry - Class in org.apache.tika.mime
-
Registry of known Internet media types.
- MediaTypeRegistry() - Constructor for class org.apache.tika.mime.MediaTypeRegistry
- MEDIUM - org.apache.tika.language.detect.LanguageConfidence
- Message - Interface in org.apache.tika.metadata
-
A collection of Message related property names.
- MESSAGE_BCC - Static variable in interface org.apache.tika.metadata.Message
- MESSAGE_BCC_DISPLAY_NAME - Static variable in interface org.apache.tika.metadata.Message
- MESSAGE_BCC_EMAIL - Static variable in interface org.apache.tika.metadata.Message
-
Where possible, this records the email value in the bcc field.
- MESSAGE_BCC_NAME - Static variable in interface org.apache.tika.metadata.Message
-
In Outlook messages, there are sometimes separate fields for "bcc-name" and "bcc-display-name" name.
- MESSAGE_CC - Static variable in interface org.apache.tika.metadata.Message
- MESSAGE_CC_DISPLAY_NAME - Static variable in interface org.apache.tika.metadata.Message
- MESSAGE_CC_EMAIL - Static variable in interface org.apache.tika.metadata.Message
-
Where possible, this records the email value in the cc field.
- MESSAGE_CC_NAME - Static variable in interface org.apache.tika.metadata.Message
-
In Outlook messages, there are sometimes separate fields for "cc-name" and "cc-display-name" name.
- MESSAGE_FROM - Static variable in interface org.apache.tika.metadata.Message
- MESSAGE_FROM_EMAIL - Static variable in interface org.apache.tika.metadata.Message
-
Where possible, this records the email value in the from field.
- MESSAGE_FROM_NAME - Static variable in interface org.apache.tika.metadata.Message
-
Where possible, this records the value from the name field.
- MESSAGE_PREFIX - Static variable in interface org.apache.tika.metadata.Message
- MESSAGE_RAW_HEADER_PREFIX - Static variable in interface org.apache.tika.metadata.Message
- MESSAGE_RECIPIENT_ADDRESS - Static variable in interface org.apache.tika.metadata.Message
- MESSAGE_TO - Static variable in interface org.apache.tika.metadata.Message
- MESSAGE_TO_DISPLAY_NAME - Static variable in interface org.apache.tika.metadata.Message
- MESSAGE_TO_EMAIL - Static variable in interface org.apache.tika.metadata.Message
-
Where possible, this records the email value in the to field.
- MESSAGE_TO_NAME - Static variable in interface org.apache.tika.metadata.Message
-
In Outlook messages, there are sometimes separate fields for "to-name" and "to-display-name" name.
- meta_neg(float) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
- meta_trust(float) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
- metadata(Metadata) - Method in class org.apache.tika.sax.XMPContentHandler
- Metadata - Class in org.apache.tika.metadata
-
A multi-valued metadata container.
- Metadata() - Constructor for class org.apache.tika.metadata.Metadata
-
Constructs a new, empty metadata.
- METADATA_COMMAND_ARGUMENTS_SERIALIZED_TOKEN - Static variable in class org.apache.tika.embedder.ExternalEmbedder
-
Token to be replaced with a String array of metadata assignment command arguments
- METADATA_COMMAND_ARGUMENTS_TOKEN - Static variable in class org.apache.tika.embedder.ExternalEmbedder
-
Token to be replaced with a String array of metadata assignment command arguments
- METADATA_DATE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- METADATA_DATE - Static variable in interface org.apache.tika.metadata.XMP
-
The date and time that any metadata for this resource was last changed.
- METADATA_KEY_ATTR - Static variable in interface org.apache.tika.parser.external.ExternalParsersConfigReaderMetKeys
- METADATA_MATCH_TAG - Static variable in interface org.apache.tika.parser.external.ExternalParsersConfigReaderMetKeys
- METADATA_MOD_DATE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The date and time when the metadata was last modified."
- METADATA_TAG - Static variable in interface org.apache.tika.parser.external.ExternalParsersConfigReaderMetKeys
- metadataList - Variable in class org.apache.tika.sax.RecursiveParserWrapperHandler
- MIDDAY - Static variable in class org.apache.tika.utils.DateUtils
-
Custom time zone used to interpret date values without a time component in a way that most likely falls within the same day regardless of in which time zone it is later interpreted.
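One way to get the effect described for MIDDAY is to anchor a date-only value at noon UTC, so that most zone offsets still land on the same calendar day. The sketch below uses java.time and is an assumption about the idea only: the class MiddaySketch and its methods are hypothetical, and this index does not show how DateUtils.MIDDAY is actually defined.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.LocalTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

/** Hypothetical sketch of a "midday" anchor for date-only values; not Tika's DateUtils. */
final class MiddaySketch {
    private MiddaySketch() {}

    /** Interprets a date without a time component as 12:00 UTC on that day. */
    static Instant atUtcNoon(LocalDate date) {
        return date.atTime(LocalTime.NOON).toInstant(ZoneOffset.UTC);
    }

    /** Returns the calendar date of the instant as seen in the given zone. */
    static LocalDate dateIn(Instant instant, String zoneId) {
        return instant.atZone(ZoneId.of(zoneId)).toLocalDate();
    }
}
```

With a noon UTC anchor the date is preserved for offsets from UTC-12:00 up to just under UTC+12:00, whereas a midnight UTC anchor would already show the previous day anywhere west of UTC.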
- MIME_INFO_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- MIME_TYPE - org.apache.tika.metadata.Property.ValueType
- MIME_TYPE_MAGIC - Static variable in interface org.apache.tika.metadata.TikaMimeKeys
- MIME_TYPE_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- MIME_TYPE_TYPE_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- MimeType - Class in org.apache.tika.mime
-
Internet media type.
- MIMETYPE_TAG - Static variable in interface org.apache.tika.parser.external.ExternalParsersConfigReaderMetKeys
- MimeTypeException - Exception in org.apache.tika.mime
-
A class to encapsulate MimeType related exceptions.
- MimeTypeException(String) - Constructor for exception org.apache.tika.mime.MimeTypeException
-
Constructs a MimeTypeException with the specified detail message.
- MimeTypeException(String, Throwable) - Constructor for exception org.apache.tika.mime.MimeTypeException
-
Constructs a MimeTypeException with the specified detail message and root cause.
- MimeTypes - Class in org.apache.tika.mime
-
This class is a MimeType repository.
- MimeTypes() - Constructor for class org.apache.tika.mime.MimeTypes
- MIMETYPES_TAG - Static variable in interface org.apache.tika.parser.external.ExternalParsersConfigReaderMetKeys
- MimeTypesFactory - Class in org.apache.tika.mime
-
Creates instances of MimeTypes.
- MimeTypesFactory() - Constructor for class org.apache.tika.mime.MimeTypesFactory
- MimeTypesReader - Class in org.apache.tika.mime
-
A reader for XML files compliant with the freedesktop MIME-info DTD.
- MimeTypesReader(MimeTypes) - Constructor for class org.apache.tika.mime.MimeTypesReader
- MimeTypesReaderMetKeys - Interface in org.apache.tika.mime
-
Met Keys used by the MimeTypesReader.
- MINIMAL - org.apache.tika.config.TikaConfigSerializer.Mode
-
Minimal version of the config, defaults where possible
- MINOR_MODEL_AGE_DISCLOSURE - Static variable in interface org.apache.tika.metadata.IPTC
-
Age of the youngest model pictured in the image, at the time that the image was made.
- MINOR_VERSION - Static variable in interface org.apache.tika.metadata.WordPerfect
-
Minor version.
- mixedLanguages - Variable in class org.apache.tika.language.detect.LanguageDetector
- MODEL_AGE - Static variable in interface org.apache.tika.metadata.IPTC
-
Age of the human model(s) at the time this image was taken in a model released image.
- MODEL_NAME_ENGLISH - Static variable in interface org.apache.tika.metadata.ClimateForcast
- MODEL_RELEASE_ID - Static variable in interface org.apache.tika.metadata.IPTC
-
Optional identifier associated with each Model Release.
- MODEL_RELEASE_STATUS - Static variable in interface org.apache.tika.metadata.IPTC
-
Summarizes the availability and scope of model releases authorizing usage of the likenesses of persons appearing in the photograph.
- MODIFIED - Static variable in interface org.apache.tika.metadata.DublinCore
-
Date on which the resource was changed.
- MODIFIED - Static variable in class org.apache.tika.metadata.Metadata
-
Deprecated. Use TikaCoreProperties#MODIFIED
- MODIFIED - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- modifiedService(ServiceReference, Object) - Method in class org.apache.tika.config.TikaActivator
- MODIFIER - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- MODIFY_DATE - Static variable in interface org.apache.tika.metadata.XMP
-
The date and time the resource was last modified.
- MSOffice - Interface in org.apache.tika.metadata
-
A collection of Microsoft Office and Open Document property names.
- MULTIPART_BOUNDARY - Static variable in interface org.apache.tika.metadata.Message
- MULTIPART_SUBTYPE - Static variable in interface org.apache.tika.metadata.Message
N
- N_PAGES - Static variable in interface org.apache.tika.metadata.PagedText
-
"The number of pages in the document (including any in contained documents)."
- name() - Method in annotation type org.apache.tika.config.Field
- NamedAttributeMatcher - Class in org.apache.tika.sax.xpath
-
Final evaluation state of a .../@name XPath expression.
- NamedAttributeMatcher(String, String) - Constructor for class org.apache.tika.sax.xpath.NamedAttributeMatcher
- NamedElementMatcher - Class in org.apache.tika.sax.xpath
-
Intermediate evaluation state of a .../name... XPath expression.
- NamedElementMatcher(String, String, Matcher) - Constructor for class org.apache.tika.sax.xpath.NamedElementMatcher
- NameDetector - Class in org.apache.tika.detect
-
Content type detection based on the resource name.
- NameDetector(Map<Pattern, MediaType>) - Constructor for class org.apache.tika.detect.NameDetector
-
Creates a new content type detector based on the given name patterns.
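Name-based detection over a map of patterns can be sketched with java.util.regex alone. In this illustration the media type is a plain String rather than a MediaType, and the class NamePatternSketch and its detect method are hypothetical; how NameDetector normalizes the resource name or resolves overlapping patterns is not shown in this index.

```java
import java.util.Map;
import java.util.Optional;
import java.util.regex.Pattern;

/** Hypothetical sketch of pattern-based name detection; not Tika's NameDetector. */
final class NamePatternSketch {
    private NamePatternSketch() {}

    /** Returns the type of the first pattern that matches the whole resource name. */
    static Optional<String> detect(Map<Pattern, String> patterns, String resourceName) {
        for (Map.Entry<Pattern, String> entry : patterns.entrySet()) {
            if (entry.getKey().matcher(resourceName).matches()) {
                return Optional.of(entry.getValue());
            }
        }
        return Optional.empty();
    }
}
```

Passing a java.util.LinkedHashMap keeps the iteration order predictable when more than one pattern could match the same name.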
- names() - Method in class org.apache.tika.metadata.Metadata
-
Returns an array of the names contained in the metadata.
- NAMESPACE_PREFIX_DELIMITER - Static variable in class org.apache.tika.metadata.Metadata
-
The common delimiter used between the namespace abbreviation and the property name
- NAMESPACE_URI - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLCore
- NAMESPACE_URI - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- NAMESPACE_URI - Static variable in interface org.apache.tika.metadata.XMP
- NAMESPACE_URI - Static variable in interface org.apache.tika.metadata.XMPIdq
- NAMESPACE_URI - Static variable in interface org.apache.tika.metadata.XMPMM
- NAMESPACE_URI_DC - Static variable in interface org.apache.tika.metadata.DublinCore
- NAMESPACE_URI_DC_TERMS - Static variable in interface org.apache.tika.metadata.DublinCore
- NAMESPACE_URI_DOC_META - Static variable in interface org.apache.tika.metadata.Office
- NAMESPACE_URI_IPTC_CORE - Static variable in interface org.apache.tika.metadata.IPTC
- NAMESPACE_URI_IPTC_EXT - Static variable in interface org.apache.tika.metadata.IPTC
- NAMESPACE_URI_PHOTOSHOP - Static variable in interface org.apache.tika.metadata.Photoshop
- NAMESPACE_URI_PLUS - Static variable in interface org.apache.tika.metadata.IPTC
- NAMESPACE_URI_XMP_RIGHTS - Static variable in interface org.apache.tika.metadata.XMPRights
- namespaces - Variable in class org.apache.tika.sax.ToXMLContentHandler
- NetworkParser - Class in org.apache.tika.parser
- NetworkParser(URI) - Constructor for class org.apache.tika.parser.NetworkParser
- NetworkParser(URI, Set<MediaType>) - Constructor for class org.apache.tika.parser.NetworkParser
- newInstance(String) - Static method in class org.apache.tika.utils.ServiceLoaderUtils
-
Loads a class and instantiates it
- newInstance(String, ClassLoader) - Static method in class org.apache.tika.utils.ServiceLoaderUtils
-
Loads a class and instantiates it
- newline() - Method in class org.apache.tika.sax.XHTMLContentHandler
- NNExampleModelDetector - Class in org.apache.tika.detect
- NNExampleModelDetector() - Constructor for class org.apache.tika.detect.NNExampleModelDetector
- NNExampleModelDetector(File) - Constructor for class org.apache.tika.detect.NNExampleModelDetector
- NNExampleModelDetector(Path) - Constructor for class org.apache.tika.detect.NNExampleModelDetector
- NNTrainedModel - Class in org.apache.tika.detect
- NNTrainedModel(int, int, int, float[]) - Constructor for class org.apache.tika.detect.NNTrainedModel
- NNTrainedModelBuilder - Class in org.apache.tika.detect
- NNTrainedModelBuilder() - Constructor for class org.apache.tika.detect.NNTrainedModelBuilder
- NodeMatcher - Class in org.apache.tika.sax.xpath
-
Final evaluation state of a .../node() XPath expression.
- NodeMatcher() - Constructor for class org.apache.tika.sax.xpath.NodeMatcher
- NonDetectingEncodingDetector - Class in org.apache.tika.detect
-
Always returns the charset passed in via the initializer
- NonDetectingEncodingDetector() - Constructor for class org.apache.tika.detect.NonDetectingEncodingDetector
-
Sets charset to UTF-8.
- NonDetectingEncodingDetector(Charset) - Constructor for class org.apache.tika.detect.NonDetectingEncodingDetector
- NONE - org.apache.tika.language.detect.LanguageConfidence
- normalize() - Method in class org.apache.tika.language.LanguageProfilerBuilder
-
Deprecated. Normalizes the profile (calculates the n-gram frequencies)
- normalize(String) - Static method in class org.apache.tika.io.FilenameUtils
-
Scans the given file name for reserved characters on different OSs and file systems and returns a sanitized version of the name with the reserved chars replaced by their hexadecimal value.
- normalize(MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
- normalizeName(String) - Static method in class org.apache.tika.language.detect.LanguageNames
- NOTES - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- NOTES - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- NS_URI_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- NULL - Static variable in class org.apache.tika.language.detect.LanguageResult
- NULL - Static variable in interface org.apache.tika.parser.external.ExternalParser.LineConsumer
-
A null consumer
- NULL_OUTPUT_STREAM - Static variable in class org.apache.tika.io.NullOutputStream
-
A singleton.
- NullInputStream - Class in org.apache.tika.io
-
A functional, lightweight InputStream that emulates a stream of a specified size.
- NullInputStream(long) - Constructor for class org.apache.tika.io.NullInputStream
-
Create an InputStream that emulates a specified size which supports marking and does not throw EOFException.
- NullInputStream(long, boolean, boolean) - Constructor for class org.apache.tika.io.NullInputStream
-
Create an InputStream that emulates a specified size with option settings.
- NullOutputStream - Class in org.apache.tika.io
-
This OutputStream writes all data to the famous /dev/null.
- NullOutputStream() - Constructor for class org.apache.tika.io.NullOutputStream
- NUMBER_OF_BEATS - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The number of beats."
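The `normalize(String)` entry for `FilenameUtils` above describes replacing reserved file-name characters with their hexadecimal value. The following is a minimal sketch of that idea using only the JDK; it is not Tika's implementation. The class name `FilenameSanitizerSketch`, the reserved-character set, and the `%XX` output form are all assumptions made for illustration.

```java
import java.util.Locale;

/** Hypothetical sketch; not Tika's FilenameUtils implementation. */
final class FilenameSanitizerSketch {

    // Assumed reserved set: characters commonly rejected by Windows or POSIX
    // file systems, plus '%' so that the escape character itself is escaped.
    private static final String RESERVED = "<>:\"/\\|?*%";

    /** Replaces each reserved or control character with '%' plus its two-digit hex code. */
    static String sanitize(String name) {
        if (name == null) {
            throw new IllegalArgumentException("name must not be null");
        }
        StringBuilder out = new StringBuilder(name.length());
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c < 0x20 || RESERVED.indexOf(c) >= 0) {
                out.append('%').append(String.format(Locale.ROOT, "%02X", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(sanitize("report:final?.txt"));
    }
}
```

Escaping `%` as well keeps the mapping unambiguous, since an escaped name can then be told apart from a name that already contained a percent sign.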
O
- OBJECT_COUNT - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- OBJECT_COUNT - Static variable in interface org.apache.tika.metadata.Office
-
The number of Objects in the document.
- OCTET_STREAM - Static variable in class org.apache.tika.mime.MediaType
- OCTET_STREAM - Static variable in class org.apache.tika.mime.MimeTypes
-
Name of the root type, application/octet-stream.
- Office - Interface in org.apache.tika.metadata
-
Office Document properties collection.
- OfficeOpenXMLCore - Interface in org.apache.tika.metadata
-
Core properties as defined in the Office Open XML specification part Two that are not in the DublinCore namespace.
- OfficeOpenXMLExtended - Interface in org.apache.tika.metadata
-
Extended properties as defined in the Office Open XML specification part Four.
- OfflineContentHandler - Class in org.apache.tika.sax
-
Content handler decorator that always returns an empty stream from the OfflineContentHandler.resolveEntity(String, String) method to prevent potential network or other external resources from being accessed by an XML parser.
- OfflineContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.OfflineContentHandler
- OPEN_CHOICE - org.apache.tika.metadata.Property.ValueType
- org.apache.tika - package org.apache.tika
-
Apache Tika.
- org.apache.tika.concurrent - package org.apache.tika.concurrent
- org.apache.tika.config - package org.apache.tika.config
-
Tika configuration tools.
- org.apache.tika.detect - package org.apache.tika.detect
-
Media type detection.
- org.apache.tika.embedder - package org.apache.tika.embedder
- org.apache.tika.exception - package org.apache.tika.exception
-
Tika exception.
- org.apache.tika.extractor - package org.apache.tika.extractor
-
Extraction of component documents.
- org.apache.tika.fork - package org.apache.tika.fork
-
Forked parser.
- org.apache.tika.io - package org.apache.tika.io
-
IO utilities.
- org.apache.tika.language - package org.apache.tika.language
- org.apache.tika.language.detect - package org.apache.tika.language.detect
- org.apache.tika.language.translate - package org.apache.tika.language.translate
- org.apache.tika.metadata - package org.apache.tika.metadata
-
Multi-valued metadata container, and set of constant metadata fields.
- org.apache.tika.mime - package org.apache.tika.mime
-
Media type information.
- org.apache.tika.parser - package org.apache.tika.parser
-
Tika parsers.
- org.apache.tika.parser.digest - package org.apache.tika.parser.digest
- org.apache.tika.parser.external - package org.apache.tika.parser.external
-
External parser process.
- org.apache.tika.sax - package org.apache.tika.sax
-
SAX utilities.
- org.apache.tika.sax.xpath - package org.apache.tika.sax.xpath
-
XPath utilities
- org.apache.tika.utils - package org.apache.tika.utils
-
Utilities.
- ORGANISATION_CODE - Static variable in interface org.apache.tika.metadata.IPTC
-
A set of metadata about artwork or an object in the item
- ORGANISATION_NAME - Static variable in interface org.apache.tika.metadata.IPTC
-
Name of the organisation or company which is featured in the content.
- ORIENTATION - Static variable in interface org.apache.tika.metadata.TIFF
-
"The Orientation of the image." 1 = 0th row at top, 0th column at left; 2 = 0th row at top, 0th column at right; 3 = 0th row at bottom, 0th column at right; 4 = 0th row at bottom, 0th column at left; 5 = 0th row at left, 0th column at top; 6 = 0th row at right, 0th column at top; 7 = 0th row at right, 0th column at bottom; 8 = 0th row at left, 0th column at bottom
- ORIGINAL_DATE - Static variable in interface org.apache.tika.metadata.TIFF
-
"Date and time when original image was generated"
- ORIGINAL_DOCUMENTID - Static variable in interface org.apache.tika.metadata.XMPMM
-
The common identifier for the original resource from which the current resource is derived.
- ORIGINAL_RESOURCE_NAME - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
Some file formats can store information about their original file name/location or about their attachment's original file name/location.
- OS_NAME - Static variable in class org.apache.tika.utils.SystemUtils
- OS_VERSION - Static variable in class org.apache.tika.utils.SystemUtils
- OUTPUT_FILE_TOKEN - Static variable in class org.apache.tika.parser.external.ExternalParser
-
The token, which if present in the Command string, will be replaced with the output filename.
- OverrideDetector - Class in org.apache.tika.detect
-
Use this to force a content type detection via the TikaCoreProperties.CONTENT_TYPE_OVERRIDE key in the metadata object.
- OverrideDetector() - Constructor for class org.apache.tika.detect.OverrideDetector
- OWNER - Static variable in interface org.apache.tika.metadata.XMPRights
-
A list of legal owners of the resource.
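The eight values listed under the `ORIENTATION` entry above can be kept in a small lookup table. The sketch below uses only the JDK and restates those descriptions verbatim; the class name `TiffOrientationSketch` and both method names are hypothetical and are not part of Tika.

```java
/** Hypothetical helper; the descriptions restate the ORIENTATION entry in this index. */
final class TiffOrientationSketch {

    // Index i holds the description for orientation value i + 1.
    private static final String[] DESCRIPTIONS = {
        "0th row at top, 0th column at left",
        "0th row at top, 0th column at right",
        "0th row at bottom, 0th column at right",
        "0th row at bottom, 0th column at left",
        "0th row at left, 0th column at top",
        "0th row at right, 0th column at top",
        "0th row at right, 0th column at bottom",
        "0th row at left, 0th column at bottom"
    };

    /** Returns the description for an orientation value in the range 1 to 8. */
    static String describe(int value) {
        if (value < 1 || value > DESCRIPTIONS.length) {
            throw new IllegalArgumentException("orientation must be 1..8, got " + value);
        }
        return DESCRIPTIONS[value - 1];
    }

    /** True for values 5 to 8, where the 0th row lies on the left or right edge. */
    static boolean swapsDimensions(int value) {
        describe(value); // validates the range
        return value >= 5;
    }

    public static void main(String[] args) {
        for (int v = 1; v <= 8; v++) {
            System.out.println(v + " = " + describe(v));
        }
    }
}
```

A caller that renders images might use `swapsDimensions` to decide whether width and height should be exchanged before display.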
P
- PAGE_COUNT - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- PAGE_COUNT - Static variable in interface org.apache.tika.metadata.Office
-
The number of pages in the (paged) document
- PagedText - Interface in org.apache.tika.metadata
-
XMP Paged-text schema.
- PARAGRAPH_COUNT - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- PARAGRAPH_COUNT - Static variable in interface org.apache.tika.metadata.Office
-
The number of individual Paragraphs in the document
- Param<T> - Class in org.apache.tika.config
-
This is a serializable model class for parameters from configuration file.
- Param() - Constructor for class org.apache.tika.config.Param
- Param(String, Class<T>, T) - Constructor for class org.apache.tika.config.Param
- Param(String, T) - Constructor for class org.apache.tika.config.Param
- ParamField - Class in org.apache.tika.config
- ParamField(AccessibleObject) - Constructor for class org.apache.tika.config.ParamField
-
Creates a ParamField object
- parse(File) - Method in class org.apache.tika.Tika
-
Parses the given file and returns the extracted text content.
- parse(File, Metadata) - Method in class org.apache.tika.Tika
-
Parses the given file and returns the extracted text content.
- parse(InputStream) - Method in class org.apache.tika.Tika
-
Parses the given document and returns the extracted text content.
- parse(InputStream, Metadata) - Method in class org.apache.tika.Tika
-
Parses the given document and returns the extracted text content.
- parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.AbstractParser
-
Deprecated. Use the Parser.parse(InputStream, ContentHandler, Metadata, ParseContext) method instead.
- parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.AutoDetectParser
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.fork.ForkParser
-
This sends the objects to the server for parsing, and the server via the proxies acts on the handler as if it were updating it directly.
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.AutoDetectParser
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.CompositeParser
-
Delegates the call to the matching component parser.
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.CryptoParser
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.DelegatingParser
-
Looks up the delegate parser from the parsing context and delegates the parse operation to it.
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.DigestingParser
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.EmptyParser
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.ErrorParser
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.external.ExternalParser
-
Executes the configured external command and passes the given document stream as a simple XHTML document to the given SAX content handler.
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.NetworkParser
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in interface org.apache.tika.parser.Parser
-
Parses a document stream into a sequence of XHTML SAX events.
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.ParserDecorator
-
Delegates the method call to the decorated parser.
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.ParserPostProcessor
-
Forwards the call to the delegated parser and post-processes the results as described above.
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.RecursiveParserWrapper
-
Acts like a regular parser except it ignores the ContentHandler and it automatically sets/overwrites the embedded Parser in the ParseContext object.
- parse(String) - Static method in class org.apache.tika.mime.MediaType
-
Parses the given string to a media type.
- parse(String) - Method in class org.apache.tika.sax.xpath.XPathParser
-
Parses the given simple XPath expression to an evaluation state initialized at the document node.
- parse(URL) - Method in class org.apache.tika.Tika
-
Parses the resource at the given URL and returns the extracted text content.
- parse(Path) - Method in class org.apache.tika.Tika
-
Parses the file at the given path and returns the extracted text content.
- parse(Path, Metadata) - Method in class org.apache.tika.Tika
-
Parses the file at the given path and returns the extracted text content.
- parse(MediaType, String, String, String, String) - Static method in class org.apache.tika.detect.MagicDetector
- PARSE_TIME_MILLIS - Static variable in class org.apache.tika.parser.RecursiveParserWrapper
-
Deprecated.
- PARSE_TIME_MILLIS - Static variable in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
- ParseContext - Class in org.apache.tika.parser
-
Parse context.
- ParseContext() - Constructor for class org.apache.tika.parser.ParseContext
- parseEmbedded(InputStream, ContentHandler, Metadata, boolean) - Method in interface org.apache.tika.extractor.EmbeddedDocumentExtractor
-
Processes the supplied embedded resource, calling the delegating parser with the appropriate details.
- parseEmbedded(InputStream, ContentHandler, Metadata, boolean) - Method in class org.apache.tika.extractor.EmbeddedDocumentUtil
- parseEmbedded(InputStream, ContentHandler, Metadata, boolean) - Method in class org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor
- parseHandlerType(String, BasicContentHandlerFactory.HANDLER_TYPE) - Static method in class org.apache.tika.sax.BasicContentHandlerFactory
-
Tries to parse string into handler type.
- Parser - Interface in org.apache.tika.parser
-
Tika parser interface.
- PARSER_TAG - Static variable in interface org.apache.tika.parser.external.ExternalParsersConfigReaderMetKeys
- ParserContainerExtractor - Class in org.apache.tika.extractor
-
An implementation of ContainerExtractor powered by the regular Parser API.
- ParserContainerExtractor() - Constructor for class org.apache.tika.extractor.ParserContainerExtractor
- ParserContainerExtractor(TikaConfig) - Constructor for class org.apache.tika.extractor.ParserContainerExtractor
- ParserContainerExtractor(Parser, Detector) - Constructor for class org.apache.tika.extractor.ParserContainerExtractor
- ParserDecorator - Class in org.apache.tika.parser
-
Decorator base class for the Parser interface.
- ParserDecorator(Parser) - Constructor for class org.apache.tika.parser.ParserDecorator
-
Creates a decorator for the given parser.
- ParserFactory - Class in org.apache.tika.parser
- ParserFactory(Map<String, String>) - Constructor for class org.apache.tika.parser.ParserFactory
- ParserFactoryFactory - Class in org.apache.tika.fork
-
Lightweight, easily serializable class that contains enough information to build a ParserFactory
- ParserFactoryFactory(String, Map<String, String>) - Constructor for class org.apache.tika.fork.ParserFactoryFactory
- ParserPostProcessor - Class in org.apache.tika.parser
-
Parser decorator that post-processes the results from a decorated parser.
- ParserPostProcessor(Parser) - Constructor for class org.apache.tika.parser.ParserPostProcessor
-
Creates a post-processing decorator for the given parser.
- ParserUtils - Class in org.apache.tika.utils
-
Helper util methods for Parsers themselves.
- ParserUtils() - Constructor for class org.apache.tika.utils.ParserUtils
- parseSAX(InputStream, DefaultHandler, ParseContext) - Static method in class org.apache.tika.utils.XMLReaderUtils
-
This checks context for a user specified SAXParser.
- parseToString(File) - Method in class org.apache.tika.Tika
-
Parses the given file and returns the extracted text content.
- parseToString(InputStream) - Method in class org.apache.tika.Tika
-
Parses the given document and returns the extracted text content.
- parseToString(InputStream, Metadata) - Method in class org.apache.tika.Tika
-
Parses the given document and returns the extracted text content.
- parseToString(InputStream, Metadata, int) - Method in class org.apache.tika.Tika
-
Parses the given document and returns the extracted text content.
- parseToString(URL) - Method in class org.apache.tika.Tika
-
Parses the resource at the given URL and returns the extracted text content.
- parseToString(Path) - Method in class org.apache.tika.Tika
-
Parses the file at the given path and returns the extracted text content.
- ParsingEmbeddedDocumentExtractor - Class in org.apache.tika.extractor
-
Helper class for parsers of package archives or other compound document formats that support embedded or attached component documents.
- ParsingEmbeddedDocumentExtractor(ParseContext) - Constructor for class org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor
- ParsingReader - Class in org.apache.tika.parser
-
Reader for the text content from a given binary stream.
- ParsingReader(File) - Constructor for class org.apache.tika.parser.ParsingReader
-
Creates a reader for the text content of the given file.
- ParsingReader(InputStream) - Constructor for class org.apache.tika.parser.ParsingReader
-
Creates a reader for the text content of the given binary stream.
- ParsingReader(InputStream, String) - Constructor for class org.apache.tika.parser.ParsingReader
-
Creates a reader for the text content of the given binary stream with the given name.
- ParsingReader(Path) - Constructor for class org.apache.tika.parser.ParsingReader
-
Creates a reader for the text content of the file at the given path.
- ParsingReader(Parser, InputStream, Metadata, ParseContext) - Constructor for class org.apache.tika.parser.ParsingReader
-
Creates a reader for the text content of the given binary stream with the given document metadata.
- ParsingReader(Parser, InputStream, Metadata, ParseContext, Executor) - Constructor for class org.apache.tika.parser.ParsingReader
-
Creates a reader for the text content of the given binary stream with the given document metadata.
- PasswordProvider - Interface in org.apache.tika.parser
-
Interface for providing a password to a Parser for handling Encrypted and Password Protected Documents.
- PATTERN_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- PDF - Interface in org.apache.tika.metadata
-
PDF properties collection.
- PDF_DOC_INFO_CUSTOM_PREFIX - Static variable in interface org.apache.tika.metadata.PDF
- PDF_DOC_INFO_PREFIX - Static variable in interface org.apache.tika.metadata.PDF
-
Prefix to be used for properties that record what was stored in the docinfo section (as opposed to XMP)
- PDF_EXTENSION_VERSION - Static variable in interface org.apache.tika.metadata.PDF
- PDF_PREFIX - Static variable in interface org.apache.tika.metadata.PDF
- PDF_VERSION - Static variable in interface org.apache.tika.metadata.PDF
- PDFA_PREFIX - Static variable in interface org.apache.tika.metadata.PDF
- PDFA_VERSION - Static variable in interface org.apache.tika.metadata.PDF
- PDFAID_CONFORMANCE - Static variable in interface org.apache.tika.metadata.PDF
- PDFAID_PART - Static variable in interface org.apache.tika.metadata.PDF
- PDFAID_PREFIX - Static variable in interface org.apache.tika.metadata.PDF
- peek(byte[]) - Method in class org.apache.tika.io.TikaInputStream
-
Fills the given buffer with upcoming bytes from this stream without advancing the current stream position.
- PERSON - Static variable in interface org.apache.tika.metadata.IPTC
-
Name of a person the content of the item is about.
- PhoneExtractingContentHandler - Class in org.apache.tika.sax
-
Class used to extract phone numbers while parsing.
- PhoneExtractingContentHandler() - Constructor for class org.apache.tika.sax.PhoneExtractingContentHandler
-
Creates a decorator that by default forwards incoming SAX events to a dummy content handler that simply ignores all the events.
- PhoneExtractingContentHandler(ContentHandler, Metadata) - Constructor for class org.apache.tika.sax.PhoneExtractingContentHandler
-
Creates a decorator for the given SAX event handler and Metadata object.
- Photoshop - Interface in org.apache.tika.metadata
-
XMP Photoshop metadata schema.
- PLAIN_TEXT - Static variable in class org.apache.tika.mime.MimeTypes
-
Name of the text type, text/plain.
- PLUS_VERSION - Static variable in interface org.apache.tika.metadata.IPTC
-
The version number of the PLUS standards in place at the time of the transaction.
- predict(double[]) - Method in class org.apache.tika.detect.NNTrainedModel
- predict(double[]) - Method in class org.apache.tika.detect.TrainedModel
- predict(float[]) - Method in class org.apache.tika.detect.NNTrainedModel
-
Given an unseen input vector of dimension m = (256 + 1) by n = 1, this returns a prediction probability
- predict(float[]) - Method in class org.apache.tika.detect.TrainedModel
- PREFIX - Static variable in interface org.apache.tika.metadata.AccessPermissions
- PREFIX - Static variable in interface org.apache.tika.metadata.Database
- PREFIX - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLCore
- PREFIX - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- PREFIX - Static variable in interface org.apache.tika.metadata.XMP
- PREFIX - Static variable in interface org.apache.tika.metadata.XMPIdq
- PREFIX - Static variable in interface org.apache.tika.metadata.XMPMM
- PREFIX_ - Static variable in interface org.apache.tika.metadata.XMP
-
The xmp prefix followed by the colon delimiter
- PREFIX_ - Static variable in interface org.apache.tika.metadata.XMPIdq
-
The xmpidq prefix followed by the colon delimiter
- PREFIX_ - Static variable in interface org.apache.tika.metadata.XMPMM
-
The xmpMM prefix followed by the colon delimiter
- PREFIX_ - Static variable in interface org.apache.tika.metadata.XMPRights
-
The xmpRights prefix followed by the colon delimiter
- PREFIX_DC - Static variable in interface org.apache.tika.metadata.DublinCore
- PREFIX_DC_TERMS - Static variable in interface org.apache.tika.metadata.DublinCore
- PREFIX_DOC_META - Static variable in interface org.apache.tika.metadata.Office
- PREFIX_FONT_META - Static variable in interface org.apache.tika.metadata.Font
- PREFIX_HTML_META - Static variable in interface org.apache.tika.metadata.HTML
- PREFIX_IPTC_CORE - Static variable in interface org.apache.tika.metadata.IPTC
- PREFIX_IPTC_EXT - Static variable in interface org.apache.tika.metadata.IPTC
- PREFIX_PHOTOSHOP - Static variable in interface org.apache.tika.metadata.Photoshop
- PREFIX_PLUS - Static variable in interface org.apache.tika.metadata.IPTC
- PREFIX_RTF_META - Static variable in interface org.apache.tika.metadata.RTFMetadata
- PREFIX_XMP_RIGHTS - Static variable in interface org.apache.tika.metadata.XMPRights
- PRESENTATION_FORMAT - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- PRESENTATION_FORMAT - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- PRINT_DATE - Static variable in interface org.apache.tika.metadata.Office
-
When was the document last printed?
- PRINT_DATE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- priorExtensionFileType(float) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
- priority - Variable in class org.apache.tika.mime.MimeTypesReader
- priorMagicFileType(float) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
- priorMetaFileType(float) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
- ProbabilisticMimeDetectionSelector - Class in org.apache.tika.mime
-
Selector for combining different mime detection results based on probability
- ProbabilisticMimeDetectionSelector() - Constructor for class org.apache.tika.mime.ProbabilisticMimeDetectionSelector
- ProbabilisticMimeDetectionSelector(MimeTypes) - Constructor for class org.apache.tika.mime.ProbabilisticMimeDetectionSelector
- ProbabilisticMimeDetectionSelector(MimeTypes, ProbabilisticMimeDetectionSelector.Builder) - Constructor for class org.apache.tika.mime.ProbabilisticMimeDetectionSelector
- ProbabilisticMimeDetectionSelector(ProbabilisticMimeDetectionSelector.Builder) - Constructor for class org.apache.tika.mime.ProbabilisticMimeDetectionSelector
- ProbabilisticMimeDetectionSelector.Builder - Class in org.apache.tika.mime
-
Builder class for setting the probability parameters
- process(DataInputStream, DataOutputStream) - Method in interface org.apache.tika.fork.ForkResource
- processByte() - Method in class org.apache.tika.io.NullInputStream
-
Return a byte value for the read() method.
- processBytes(byte[], int, int) - Method in class org.apache.tika.io.NullInputStream
-
Process the bytes for the read(byte[], offset, length) method.
- processingInstruction(String, String) - Method in class org.apache.tika.sax.ContentHandlerDecorator
- processingInstruction(String, String) - Method in class org.apache.tika.sax.TeeContentHandler
- processingInstruction(String, String) - Method in class org.apache.tika.sax.xpath.MatchingContentHandler
- ProcessUtils - Class in org.apache.tika.utils
- ProcessUtils() - Constructor for class org.apache.tika.utils.ProcessUtils
- PRODUCT_TYPE - Static variable in interface org.apache.tika.metadata.WordPerfect
-
Product type.
- ProfilingHandler - Class in org.apache.tika.language
-
Deprecated. Use LanguageHandler
- ProfilingHandler() - Constructor for class org.apache.tika.language.ProfilingHandler
-
Deprecated.
- ProfilingHandler(LanguageProfile) - Constructor for class org.apache.tika.language.ProfilingHandler
-
Deprecated.
- ProfilingHandler(ProfilingWriter) - Constructor for class org.apache.tika.language.ProfilingHandler
-
Deprecated.
- ProfilingWriter - Class in org.apache.tika.language
-
Deprecated. Use LanguageWriter
- ProfilingWriter() - Constructor for class org.apache.tika.language.ProfilingWriter
-
Deprecated.
- ProfilingWriter(LanguageProfile) - Constructor for class org.apache.tika.language.ProfilingWriter
-
Deprecated.
- PROGRAM_ID - Static variable in interface org.apache.tika.metadata.ClimateForcast
- PROJECT_ID - Static variable in interface org.apache.tika.metadata.ClimateForcast
- PROPER_NAME - org.apache.tika.metadata.Property.ValueType
- property(String, String) - Method in class org.apache.tika.sax.XMPContentHandler
- Property - Class in org.apache.tika.metadata
-
XMP property definition.
- PROPERTY - org.apache.tika.metadata.Property.ValueType
- PROPERTY_GROUP_IPTC_CORE - Static variable in interface org.apache.tika.metadata.IPTC
- PROPERTY_GROUP_IPTC_EXT - Static variable in interface org.apache.tika.metadata.IPTC
- PROPERTY_RELEASE_ID - Static variable in interface org.apache.tika.metadata.IPTC
-
Optional identifier associated with each Property Release.
- PROPERTY_RELEASE_STATUS - Static variable in interface org.apache.tika.metadata.IPTC
-
Summarises the availability and scope of property releases authorizing usage of the properties appearing in the photograph.
- Property.PropertyType - Enum in org.apache.tika.metadata
- Property.ValueType - Enum in org.apache.tika.metadata
- PropertyTypeException - Exception in org.apache.tika.metadata
-
XMP property definition violation exception.
- PropertyTypeException(String) - Constructor for exception org.apache.tika.metadata.PropertyTypeException
- PropertyTypeException(Property.PropertyType) - Constructor for exception org.apache.tika.metadata.PropertyTypeException
- PropertyTypeException(Property.PropertyType, Property.PropertyType) - Constructor for exception org.apache.tika.metadata.PropertyTypeException
- PropertyTypeException(Property.ValueType, Property.ValueType) - Constructor for exception org.apache.tika.metadata.PropertyTypeException
- PROTECTED - Static variable in interface org.apache.tika.metadata.TikaMetadataKeys
- PROVINCE_OR_STATE - Static variable in interface org.apache.tika.metadata.IPTC
-
Name of the subregion of a country -- either called province or state or anything else -- the content is focussing on -- either the subregion shown in visual media or referenced by text or audio media.
- ProxyInputStream - Class in org.apache.tika.io
-
A Proxy stream which acts as expected, that is it passes the method calls on to the proxied stream and doesn't change which methods are being called.
- ProxyInputStream(InputStream) - Constructor for class org.apache.tika.io.ProxyInputStream
-
Constructs a new ProxyInputStream.
- PUBLISHER - Static variable in interface org.apache.tika.metadata.DublinCore
-
An entity responsible for making the resource available.
- PUBLISHER - Static variable in class org.apache.tika.metadata.Metadata
-
Deprecated. Use TikaCoreProperties#PUBLISHER
- PUBLISHER - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- PULL_DOWN - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The sampling phase of film to be converted to video (pull-down)."
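The `peek(byte[])` entry for `TikaInputStream` above describes filling a buffer with upcoming bytes without advancing the stream position. One common way to get that behaviour from a plain JDK stream is `mark` and `reset`, sketched below. This is an illustration only and does not show how `TikaInputStream` is implemented; the class name `PeekSketch` is hypothetical, and wrapping `IOException` in `UncheckedIOException` is a choice made here to keep the example short.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of a peek operation built on mark/reset. */
final class PeekSketch {

    /**
     * Copies up to buffer.length upcoming bytes into buffer and restores the
     * stream position afterwards. Returns the number of bytes copied.
     */
    static int peek(InputStream in, byte[] buffer) {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        try {
            in.mark(buffer.length);
            int total = 0;
            try {
                while (total < buffer.length) {
                    int n = in.read(buffer, total, buffer.length - total);
                    if (n == -1) {
                        break; // end of stream before the buffer was full
                    }
                    total += n;
                }
            } finally {
                in.reset(); // rewind to the marked position
            }
            return total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "%PDF-1.7".getBytes(StandardCharsets.US_ASCII);
        ByteArrayInputStream in = new ByteArrayInputStream(data);
        byte[] head = new byte[4];
        int n = peek(in, head);
        System.out.println(n + " bytes peeked: " + new String(head, StandardCharsets.US_ASCII));
    }
}
```

Streams that do not support marking can be wrapped in a `java.io.BufferedInputStream` before being passed to such a helper.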
Q
- QuattroPro - Interface in org.apache.tika.metadata
-
QuattroPro properties collection.
- QUATTROPRO_METADATA_NAME_PREFIX - Static variable in interface org.apache.tika.metadata.QuattroPro
R
- RATING - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- RATING - Static variable in interface org.apache.tika.metadata.XMP
-
A user-assigned rating for this file.
- RATIONAL - org.apache.tika.metadata.Property.ValueType
- RDF - Static variable in class org.apache.tika.sax.XMPContentHandler
-
The RDF namespace URI
- read() - Method in class org.apache.tika.io.BoundedInputStream
- read() - Method in class org.apache.tika.io.ClosedInputStream
-
Returns -1 to indicate that the stream is closed.
- read() - Method in class org.apache.tika.io.CountingInputStream
-
Reads the next byte of data adding to the count of bytes received if a byte is successfully read.
- read() - Method in class org.apache.tika.io.LookaheadInputStream
- read() - Method in class org.apache.tika.io.NullInputStream
-
Read a byte.
- read() - Method in class org.apache.tika.io.ProxyInputStream
-
Invokes the delegate's read() method.
- read() - Method in class org.apache.tika.io.TailStream
-
This implementation adds the read byte to the internal tail buffer.
- read() - Method in class org.apache.tika.utils.RereadableInputStream
-
Reads a byte from the stream, saving it in the store if it is being read from the original stream.
- read(byte[]) - Method in class org.apache.tika.io.BoundedInputStream
-
Invokes the delegate's read(byte[]) method.
- read(byte[]) - Method in class org.apache.tika.io.CountingInputStream
-
Reads a number of bytes into the byte array, keeping count of the number read.
- read(byte[]) - Method in class org.apache.tika.io.NullInputStream
-
Read some bytes into the specified array.
- read(byte[]) - Method in class org.apache.tika.io.ProxyInputStream
-
Invokes the delegate's read(byte[]) method.
- read(byte[]) - Method in class org.apache.tika.io.TailStream
-
This implementation delegates to the underlying stream and then adds the correct portion of the read buffer to the internal tail buffer.
- read(byte[], int, int) - Method in class org.apache.tika.io.BoundedInputStream
-
Invokes the delegate's read(byte[], int, int) method.
- read(byte[], int, int) - Method in class org.apache.tika.io.CountingInputStream
-
Reads a number of bytes into the byte array at a specific offset, keeping count of the number read.
- read(byte[], int, int) - Method in class org.apache.tika.io.LookaheadInputStream
- read(byte[], int, int) - Method in class org.apache.tika.io.NullInputStream
-
Read the specified number of bytes into an array.
- read(byte[], int, int) - Method in class org.apache.tika.io.ProxyInputStream
-
Invokes the delegate's read(byte[], int, int) method.
- read(byte[], int, int) - Method in class org.apache.tika.io.TailStream
-
This implementation delegates to the underlying stream and then adds the correct portion of the read buffer to the internal tail buffer.
- read(char[], int, int) - Method in class org.apache.tika.parser.ParsingReader
-
Reads parsed text from the pipe connected to the parsing thread.
- read(InputStream) - Method in class org.apache.tika.mime.MimeTypesReader
- read(InputStream) - Static method in class org.apache.tika.parser.external.ExternalParsersConfigReader
- read(InputStream, byte[], int, int) - Static method in class org.apache.tika.io.IOUtils
-
Reads bytes from an input stream.
- read(Document) - Method in class org.apache.tika.mime.MimeTypesReader
- read(Document) - Static method in class org.apache.tika.parser.external.ExternalParsersConfigReader
- read(Element) - Static method in class org.apache.tika.parser.external.ExternalParsersConfigReader
- readByteFrequencies(InputStream) - Method in class org.apache.tika.detect.TrainedModelDetector
-
Read the inputstream and build a byte frequency histogram
- readIntBE(InputStream) - Static method in class org.apache.tika.io.EndianUtils
-
Get a BE int value from an InputStream
- readIntLE(InputStream) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE int value from an InputStream
- readLines(InputStream) - Static method in class org.apache.tika.io.IOUtils
-
Get the contents of an InputStream as a list of Strings, one entry per line, using the default character encoding of the platform.
- readLines(InputStream, String) - Static method in class org.apache.tika.io.IOUtils
-
Get the contents of an InputStream as a list of Strings, one entry per line, using the specified character encoding.
- readLines(Reader) - Static method in class org.apache.tika.io.IOUtils
-
Get the contents of a Reader as a list of Strings, one entry per line.
- readLongBE(InputStream) - Static method in class org.apache.tika.io.EndianUtils
-
Get a BE long value from an InputStream
- readLongLE(InputStream) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE long value from an InputStream
- readShortBE(InputStream) - Static method in class org.apache.tika.io.EndianUtils
-
Get a BE short value from an InputStream
- readShortLE(InputStream) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE short value from an InputStream
- readUE7(InputStream) - Static method in class org.apache.tika.io.EndianUtils
-
Gets the integer value that is stored in UTF-8 like fashion, in Big Endian but with the high bit on each number indicating if it continues or not
- readUIntBE(InputStream) - Static method in class org.apache.tika.io.EndianUtils
-
Get a BE unsigned int value from an InputStream
- readUIntLE(InputStream) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE unsigned int value from an InputStream
- readUShortBE(InputStream) - Static method in class org.apache.tika.io.EndianUtils
- readUShortLE(InputStream) - Static method in class org.apache.tika.io.EndianUtils
- REAL - org.apache.tika.metadata.Property.ValueType
- REALIZATION - Static variable in interface org.apache.tika.metadata.ClimateForcast
- reallyEndDocument() - Method in class org.apache.tika.sax.EndDocumentShieldingContentHandler
- recordEmbeddedStreamException(Throwable, Metadata) - Static method in class org.apache.tika.extractor.EmbeddedDocumentUtil
- recordException(Throwable, Metadata) - Static method in class org.apache.tika.extractor.EmbeddedDocumentUtil
- recordParserDetails(Parser, Metadata) - Static method in class org.apache.tika.utils.ParserUtils
- recordParserFailure(Parser, Throwable, Metadata) - Static method in class org.apache.tika.utils.ParserUtils
- RecursiveParserWrapper - Class in org.apache.tika.parser
-
This is a helper class that wraps a parser in a recursive handler.
- RecursiveParserWrapper(Parser) - Constructor for class org.apache.tika.parser.RecursiveParserWrapper
-
Initialize the wrapper with RecursiveParserWrapper.catchEmbeddedExceptions set to true as default.
- RecursiveParserWrapper(Parser, boolean) - Constructor for class org.apache.tika.parser.RecursiveParserWrapper
- RecursiveParserWrapper(Parser, ContentHandlerFactory) - Constructor for class org.apache.tika.parser.RecursiveParserWrapper
-
Deprecated.
- RecursiveParserWrapper(Parser, ContentHandlerFactory, boolean) - Constructor for class org.apache.tika.parser.RecursiveParserWrapper
-
Deprecated.
- RecursiveParserWrapperHandler - Class in org.apache.tika.sax
-
This is the default implementation of AbstractRecursiveParserWrapperHandler.
- RecursiveParserWrapperHandler(ContentHandlerFactory) - Constructor for class org.apache.tika.sax.RecursiveParserWrapperHandler
-
Create a handler with no limit on the number of embedded resources
- RecursiveParserWrapperHandler(ContentHandlerFactory, int) - Constructor for class org.apache.tika.sax.RecursiveParserWrapperHandler
-
Create a handler that limits the number of embedded resources that will be parsed
- REFERENCES - Static variable in interface org.apache.tika.metadata.ClimateForcast
- RegexUtils - Class in org.apache.tika.utils
-
Inspired by the Nutch class OutlinkExtractor.
- RegexUtils() - Constructor for class org.apache.tika.utils.RegexUtils
- registerModels(MediaType, TrainedModel) - Method in class org.apache.tika.detect.TrainedModelDetector
- REGISTRY_ENTRY_CREATED_ITEM_ID - Static variable in interface org.apache.tika.metadata.IPTC
-
A unique identifier created by a registry and applied by the creator of the item.
- REGISTRY_ENTRY_CREATED_ORGANISATION_ID - Static variable in interface org.apache.tika.metadata.IPTC
-
An identifier for the registry which issued the corresponding Registry Image Id.
- RELATION - Static variable in interface org.apache.tika.metadata.DublinCore
-
A reference to a related resource.
- RELATION - Static variable in class org.apache.tika.metadata.Metadata
-
Deprecated. Use TikaCoreProperties#RELATION
- RELATION - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- RELATIVE_PEAK_AUDIO_FILE_PATH - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The relative path to the file's peak audio file.
- RELEASE_DATE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The date the title was released."
- remove(String) - Method in class org.apache.tika.metadata.Metadata
-
Remove a metadata and all its associated values.
- removedService(ServiceReference, Object) - Method in class org.apache.tika.config.TikaActivator
- RENDITION_CLASS - Static variable in interface org.apache.tika.metadata.XMPMM
-
The rendition class name for this resource.
- RENDITION_PARAMS - Static variable in interface org.apache.tika.metadata.XMPMM
-
Can be used to provide additional rendition parameters that are too complex or verbose to encode in xmpMM:RenditionClass
- required() - Method in annotation type org.apache.tika.config.Field
- RereadableInputStream - Class in org.apache.tika.utils
-
Wraps an input stream, reading it only once, but making it available for rereading an arbitrary number of times.
- RereadableInputStream(InputStream, int, boolean, boolean) - Constructor for class org.apache.tika.utils.RereadableInputStream
-
Creates a rereadable input stream.
- RESERVED_FILENAME_CHARACTERS - Static variable in class org.apache.tika.io.FilenameUtils
-
Reserved characters
- reset() - Method in class org.apache.tika.io.BoundedInputStream
- reset() - Method in class org.apache.tika.io.LookaheadInputStream
- reset() - Method in class org.apache.tika.io.NullInputStream
-
Reset the stream to the point when mark was last called.
- reset() - Method in class org.apache.tika.io.ProxyInputStream
-
Invokes the delegate's reset() method.
- reset() - Method in class org.apache.tika.io.TailStream
-
This implementation restores this stream's state to the state when mark() was called the last time.
- reset() - Method in class org.apache.tika.io.TikaInputStream
- reset() - Method in class org.apache.tika.language.detect.LanguageDetector
-
Reset statistics about the current document being processed
- reset() - Method in class org.apache.tika.language.detect.LanguageWriter
- reset() - Method in class org.apache.tika.parser.RecursiveParserWrapper
-
Deprecated. Use a RecursiveParserWrapperHandler instead
- resetByteCount() - Method in class org.apache.tika.io.CountingInputStream
-
Set the byte count back to 0.
- resetCount() - Method in class org.apache.tika.io.CountingInputStream
-
Set the byte count back to 0.
- RESOLUTION_HORIZONTAL - Static variable in interface org.apache.tika.metadata.TIFF
-
"Horizontal resolution in pixels per unit."
- RESOLUTION_UNIT - Static variable in interface org.apache.tika.metadata.TIFF
-
"Units used for Horizontal and Vertical Resolutions." One of "Inch" or "cm"
- RESOLUTION_VERTICAL - Static variable in interface org.apache.tika.metadata.TIFF
-
"Vertical resolution in pixels per unit."
- resolveEntity(String, String) - Method in class org.apache.tika.mime.MimeTypesReader
- resolveEntity(String, String) - Method in class org.apache.tika.sax.OfflineContentHandler
-
Returns an empty stream.
- RESOURCE_NAME_KEY - Static variable in interface org.apache.tika.metadata.TikaMetadataKeys
- REVISION - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLCore
-
The revision number.
- REVISION_NUMBER - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- rewind() - Method in class org.apache.tika.utils.RereadableInputStream
-
"Rewinds" the stream to the beginning for rereading.
- RichTextContentHandler - Class in org.apache.tika.sax
-
Content handler for rich text; it extracts the alt attribute of XHTML <img/> tags and the name attribute of XHTML <a/> tags into the output.
- RichTextContentHandler(Writer) - Constructor for class org.apache.tika.sax.RichTextContentHandler
-
Creates a content handler that writes XHTML body character events to the given writer.
- RIGHTS - Static variable in interface org.apache.tika.metadata.DublinCore
-
Information about rights held in and over the resource.
- RIGHTS - Static variable in class org.apache.tika.metadata.Metadata
-
Deprecated. Use TikaCoreProperties#RIGHTS
- RIGHTS - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- RIGHTS_USAGE_TERMS - Static variable in interface org.apache.tika.metadata.IPTC
-
The licensing parameters of the item expressed in free-text.
- ROOT_XML_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- ROW_COUNT - Static variable in interface org.apache.tika.metadata.Database
- RTF_PICT_META_PREFIX - Static variable in interface org.apache.tika.metadata.RTFMetadata
- RTFMetadata - Interface in org.apache.tika.metadata
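The EndianUtils readers listed above (readIntBE, readIntLE, readUE7) differ only in how bytes are combined into a value. The sketch below illustrates the three byte orders using JDK classes only. It follows the one-line descriptions in this index and is not Tika's source. The class name EndianSketch is hypothetical, and throwing EOFException on truncated input is an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helpers mirroring the index descriptions; not Tika's EndianUtils.
final class EndianSketch {
    private EndianSketch() {
    }

    // Big-endian: the first byte read is the most significant.
    public static int readIntBE(InputStream in) throws IOException {
        int b1 = in.read(), b2 = in.read(), b3 = in.read(), b4 = in.read();
        if ((b1 | b2 | b3 | b4) < 0) {
            throw new EOFException(); // assumption: fail on truncated input
        }
        return (b1 << 24) | (b2 << 16) | (b3 << 8) | b4;
    }

    // Little-endian: the first byte read is the least significant.
    public static int readIntLE(InputStream in) throws IOException {
        int b1 = in.read(), b2 = in.read(), b3 = in.read(), b4 = in.read();
        if ((b1 | b2 | b3 | b4) < 0) {
            throw new EOFException(); // assumption: fail on truncated input
        }
        return (b4 << 24) | (b3 << 16) | (b2 << 8) | b1;
    }

    // Variable-length value: 7 payload bits per byte, most significant group first;
    // a set high bit means another byte follows.
    public static long readUE7(InputStream in) throws IOException {
        long value = 0;
        int b;
        while ((b = in.read()) >= 0) {
            value = (value << 7) | (b & 0x7F);
            if ((b & 0x80) == 0) {
                break; // high bit clear: this was the last byte
            }
        }
        return value;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {0x00, 0x00, 0x01, 0x02};
        System.out.println(readIntBE(new ByteArrayInputStream(data)));
        System.out.println(readIntLE(new ByteArrayInputStream(data)));
        System.out.println(readUE7(new ByteArrayInputStream(new byte[] {(byte) 0x81, 0x00})));
    }
}
```

The same four bytes yield different integers depending on the order chosen, which is why the index lists a BE and an LE variant for each width.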
S
- SafeContentHandler - Class in org.apache.tika.sax
-
Content handler decorator that makes sure that the character events (SafeContentHandler.characters(char[], int, int) or SafeContentHandler.ignorableWhitespace(char[], int, int)) passed to the decorated content handler contain only valid XML characters.
- SafeContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.SafeContentHandler
- SafeContentHandler.Output - Interface in org.apache.tika.sax
-
Internal interface that allows both character and ignorable whitespace content to be filtered the same way.
- SAMPLES_PER_PIXEL - Static variable in interface org.apache.tika.metadata.TIFF
-
"Number of components per pixel."
- save(OutputStream) - Method in class org.apache.tika.config.Param
- save(OutputStream) - Method in class org.apache.tika.language.LanguageProfilerBuilder
-
Deprecated. Writes NGramProfile content into OutputStream; content is output with UTF-8 encoding
- save(Node) - Method in class org.apache.tika.config.Param
- SAVE_DATE - Static variable in interface org.apache.tika.metadata.Office
-
When was the document last saved?
- SCALE_TYPE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The musical scale used in the music.
- SCENE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The name of the scene."
- SCENE_CODE - Static variable in interface org.apache.tika.metadata.IPTC
-
Describes the scene of a news content.
- SCHEME - Static variable in interface org.apache.tika.metadata.XMPIdq
-
A qualifier providing the name of the formal identification scheme used for an item in the xmp:Identifier array.
- SCRIPT_SOURCE - Static variable in interface org.apache.tika.metadata.HTML
-
If a script element contains a src value, this value is set in the embedded document's metadata
- SecureContentHandler - Class in org.apache.tika.sax
-
Content handler decorator that attempts to prevent denial of service attacks against Tika parsers.
- SecureContentHandler(ContentHandler, TikaInputStream) - Constructor for class org.apache.tika.sax.SecureContentHandler
-
Decorates the given content handler with zip bomb prevention based on the count of bytes read from the given counting input stream.
- SECURITY - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- select(Metadata) - Method in interface org.apache.tika.extractor.DocumentSelector
-
Checks if a document with the given metadata matches the specified selection criteria.
- SEQ - org.apache.tika.metadata.Property.PropertyType
-
An ordered array
- serialize(TikaConfig, TikaConfigSerializer.Mode, Writer, Charset) - Static method in class org.apache.tika.config.TikaConfigSerializer
- serializeMetadata(List<String>) - Static method in class org.apache.tika.embedder.ExternalEmbedder
-
Serializes a collection of metadata command line arguments into a single string.
- ServiceLoader - Class in org.apache.tika.config
-
Internal utility class that Tika uses to look up service providers.
- ServiceLoader() - Constructor for class org.apache.tika.config.ServiceLoader
- ServiceLoader(ClassLoader) - Constructor for class org.apache.tika.config.ServiceLoader
- ServiceLoader(ClassLoader, LoadErrorHandler) - Constructor for class org.apache.tika.config.ServiceLoader
- ServiceLoader(ClassLoader, LoadErrorHandler, boolean) - Constructor for class org.apache.tika.config.ServiceLoader
- ServiceLoader(ClassLoader, LoadErrorHandler, InitializableProblemHandler, boolean) - Constructor for class org.apache.tika.config.ServiceLoader
- ServiceLoaderUtils - Class in org.apache.tika.utils
-
Service Loading and Ordering related utils
- ServiceLoaderUtils() - Constructor for class org.apache.tika.utils.ServiceLoaderUtils
- set(Class<T>, T) - Method in class org.apache.tika.parser.ParseContext
-
Adds the given value to the context as an implementation of the given interface.
- set(String...) - Static method in class org.apache.tika.mime.MediaType
-
Convenience method that parses the given media type strings and returns an unmodifiable set that contains all the parsed types.
- set(String, String) - Method in class org.apache.tika.metadata.Metadata
-
Set metadata name/value.
- set(Property, double) - Method in class org.apache.tika.metadata.Metadata
-
Sets the real or rational value of the identified metadata property.
- set(Property, int) - Method in class org.apache.tika.metadata.Metadata
-
Sets the integer value of the identified metadata property.
- set(Property, String) - Method in class org.apache.tika.metadata.Metadata
-
Sets the value of the identified metadata property.
- set(Property, String[]) - Method in class org.apache.tika.metadata.Metadata
-
Sets the values of the identified metadata property.
- set(Property, Calendar) - Method in class org.apache.tika.metadata.Metadata
-
Sets the date value of the identified metadata property.
- set(Property, Date) - Method in class org.apache.tika.metadata.Metadata
-
Sets the date value of the identified metadata property.
- set(MediaType...) - Static method in class org.apache.tika.mime.MediaType
-
Convenience method that returns an unmodifiable set that contains all the given media types.
- setAll(Properties) - Method in class org.apache.tika.metadata.Metadata
-
Copy all key-value pairs from properties.
- setCommand(String...) - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Sets the command to be run.
- setCommand(String...) - Method in class org.apache.tika.parser.external.ExternalParser
-
Sets the command to be run.
- setCommandAppendOperator(String) - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Sets the operator to append rather than replace a value for the command line tool, i.e.
- setCommandAssignmentDelimeter(String) - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Sets the delimiter for multiple assignments for the command line tool, i.e.
- setCommandAssignmentOperator(String) - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Sets the assignment operator for the command line tool, i.e.
- setContentHandler(ContentHandler) - Method in class org.apache.tika.sax.ContentHandlerDecorator
-
Sets the underlying content handler.
- setContextClassLoader(ClassLoader) - Static method in class org.apache.tika.config.ServiceLoader
-
Sets the context class loader to use for all threads that access this class.
- setCorePoolSize(int) - Method in interface org.apache.tika.concurrent.ConfigurableThreadPoolExecutor
- setDescription(String) - Method in class org.apache.tika.mime.MimeType
-
Set the description of this media type.
- setDetector(Detector) - Method in class org.apache.tika.parser.AutoDetectParser
-
Sets the type detector used by this parser to auto-detect the type of a document.
- setDocumentLocator(Locator) - Method in class org.apache.tika.sax.ContentHandlerDecorator
- setDocumentLocator(Locator) - Method in class org.apache.tika.sax.DIFContentHandler
- setDocumentLocator(Locator) - Method in class org.apache.tika.sax.TeeContentHandler
- setDocumentLocator(Locator) - Method in class org.apache.tika.sax.TextContentHandler
- setEncodingDetector(EncodingDetector) - Method in class org.apache.tika.parser.AbstractEncodingDetectorParser
- setFallback(Parser) - Method in class org.apache.tika.parser.CompositeParser
-
Sets the fallback parser.
- setIdentifier(String) - Method in class org.apache.tika.sax.StandardReference
- setIgnoredLineConsumer(ExternalParser.LineConsumer) - Method in class org.apache.tika.parser.external.ExternalParser
-
Set a consumer for the lines ignored by the parse functions
- setJavaCommand(String) - Method in class org.apache.tika.fork.ForkParser
-
Deprecated. Since 1.8
- setJavaCommand(List<String>) - Method in class org.apache.tika.fork.ForkParser
-
Sets the command used to start the forked server process.
- setMainOrganizationAcronym(String) - Method in class org.apache.tika.sax.StandardReference
- setMaxEmbeddedResources(int) - Method in class org.apache.tika.parser.RecursiveParserWrapper
-
Deprecated. Set this on a RecursiveParserWrapperHandler
- setMaxEntityExpansions(int) - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Set the maximum number of entity expansions allowable in SAX/DOM/StAX parsing.
- setMaxFilesProcessedPerServer(int) - Method in class org.apache.tika.fork.ForkParser
-
If there is a slowly building memory leak in one of the parsers, it is useful to set a limit on the number of files processed by a server before it is shut down and restarted.
- setMaximumCompressionRatio(long) - Method in class org.apache.tika.sax.SecureContentHandler
-
Sets the ratio between output characters and input bytes.
- setMaximumDepth(int) - Method in class org.apache.tika.sax.SecureContentHandler
-
Sets the maximum XML element nesting level.
- setMaximumPackageEntryDepth(int) - Method in class org.apache.tika.sax.SecureContentHandler
-
Sets the maximum package entry nesting level.
- setMaximumPoolSize(int) - Method in interface org.apache.tika.concurrent.ConfigurableThreadPoolExecutor
- setMaxStringLength(int) - Method in class org.apache.tika.Tika
-
Sets the maximum length of strings returned by the parseToString methods.
- setMediaTypeRegistry(MediaTypeRegistry) - Method in class org.apache.tika.parser.CompositeParser
-
Sets the media type registry used to infer type relationships.
- setMetadataCommandArguments(Map<Property, String[]>) - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Sets the map of Metadata keys to command line parameters.
- setMetadataExtractionPatterns(Map<Pattern, String>) - Method in class org.apache.tika.parser.external.ExternalParser
-
Sets the map of regular expression patterns and Metadata keys.
- setMixedLanguages(boolean) - Method in class org.apache.tika.language.detect.LanguageDetector
- setName(String) - Method in class org.apache.tika.config.Param
- setNumOfHidden(int) - Method in class org.apache.tika.detect.NNTrainedModelBuilder
- setNumOfInputs(int) - Method in class org.apache.tika.detect.NNTrainedModelBuilder
- setNumOfOutputs(int) - Method in class org.apache.tika.detect.NNTrainedModelBuilder
- setOpenContainer(Object) - Method in class org.apache.tika.io.TikaInputStream
-
Stores the open container object against the stream, eg after a Zip contents detector has loaded the file to decide what it contains.
- setOutputThreshold(long) - Method in class org.apache.tika.sax.SecureContentHandler
-
Sets the threshold for output characters before the zip bomb prevention is activated.
- setParams(float[]) - Method in class org.apache.tika.detect.NNTrainedModelBuilder
- setParsers(Map<MediaType, Parser>) - Method in class org.apache.tika.parser.CompositeParser
-
Sets the component parsers.
- setPoolSize(int) - Method in class org.apache.tika.fork.ForkParser
-
Sets the size of the process pool.
- setPoolSize(int) - Static method in class org.apache.tika.mime.MimeTypesReader
-
Set the pool size for cached XML parsers.
- setPoolSize(int) - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Set the pool size for cached XML parsers.
- setPriors(Map<String, Float>) - Method in class org.apache.tika.language.detect.LanguageDetector
-
Set the a-priori probabilities for these languages.
- setQuoteAssignmentValues(boolean) - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Sets whether or not to quote assignment values, i.e.
- setScore(double) - Method in class org.apache.tika.sax.StandardReference
- setScore(double) - Method in class org.apache.tika.sax.StandardReference.StandardReferenceBuilder
- setSecondOrganization(String, String) - Method in class org.apache.tika.sax.StandardReference.StandardReferenceBuilder
- setSecondOrganizationAcronym(String) - Method in class org.apache.tika.sax.StandardReference
- setSeparator(String) - Method in class org.apache.tika.sax.StandardReference
- setServerParseTimeoutMillis(long) - Method in class org.apache.tika.fork.ForkParser
-
The maximum amount of time allowed for the server to try to parse a file.
- setServerPulseMillis(long) - Method in class org.apache.tika.fork.ForkParser
-
The amount of time in milliseconds that the server should wait before checking to see if the parse has timed out or if the wait has timed out. The default is 5 seconds.
- setServerWaitTimeoutMillis(long) - Method in class org.apache.tika.fork.ForkParser
-
The maximum amount of time allowed for the server to wait for a new request to parse a file.
- setShortText(boolean) - Method in class org.apache.tika.language.detect.LanguageDetector
- setSuperType(MimeType, MediaType) - Method in class org.apache.tika.mime.MimeTypes
- setSupportedEmbedTypes(Set<MediaType>) - Method in class org.apache.tika.embedder.ExternalEmbedder
- setSupportedTypes(Set<MediaType>) - Method in class org.apache.tika.parser.external.ExternalParser
- setTemporaryFileDirectory(File) - Method in class org.apache.tika.io.TemporaryResources
-
Sets the directory to be used for the temporary files created by the TemporaryResources.createTempFile() method.
- setTemporaryFileDirectory(Path) - Method in class org.apache.tika.io.TemporaryResources
-
Sets the directory to be used for the temporary files created by the TemporaryResources.createTempFile() method.
- setThreshold(double) - Method in class org.apache.tika.sax.StandardsExtractingContentHandler
-
Sets the score to be used as threshold.
- setType(Class<T>) - Method in class org.apache.tika.config.Param
- setType(MediaType) - Method in class org.apache.tika.detect.NNTrainedModelBuilder
- setTypeString(String) - Method in class org.apache.tika.config.Param
- shortText - Variable in class org.apache.tika.language.detect.LanguageDetector
- SHOT_DATE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The date and time when the video was shot."
- SHOT_LOCATION - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The name of the location where the video was shot.
- SHOT_NAME - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The name of the shot or take."
- shouldParseEmbedded(Metadata) - Method in interface org.apache.tika.extractor.EmbeddedDocumentExtractor
- shouldParseEmbedded(Metadata) - Method in class org.apache.tika.extractor.EmbeddedDocumentUtil
- shouldParseEmbedded(Metadata) - Method in class org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor
- SIMPLE - org.apache.tika.metadata.Property.PropertyType
-
A single value
- SimpleThreadPoolExecutor - Class in org.apache.tika.concurrent
-
Simple Thread Pool Executor
- SimpleThreadPoolExecutor() - Constructor for class org.apache.tika.concurrent.SimpleThreadPoolExecutor
- size() - Method in class org.apache.tika.metadata.Metadata
-
Returns the number of metadata names in this metadata.
- skip(long) - Method in class org.apache.tika.io.BoundedInputStream
-
Invokes the delegate's skip(long) method.
- skip(long) - Method in class org.apache.tika.io.CountingInputStream
-
Skips the stream over the specified number of bytes, adding the skipped amount to the count.
- skip(long) - Method in class org.apache.tika.io.LookaheadInputStream
- skip(long) - Method in class org.apache.tika.io.NullInputStream
-
Skip a specified number of bytes.
- skip(long) - Method in class org.apache.tika.io.ProxyInputStream
-
Invokes the delegate's skip(long) method.
- skip(long) - Method in class org.apache.tika.io.TailStream
-
This implementation delegates to the read() method to ensure that the tail buffer is also filled if data is skipped.
- skip(long) - Method in class org.apache.tika.io.TikaInputStream
- skippedEntity(String) - Method in class org.apache.tika.sax.ContentHandlerDecorator
- skippedEntity(String) - Method in class org.apache.tika.sax.TeeContentHandler
- skippedEntity(String) - Method in class org.apache.tika.sax.xpath.MatchingContentHandler
- SLIDE_COUNT - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- SLIDE_COUNT - Static variable in interface org.apache.tika.metadata.Office
-
The number of slides in the (presentation) document
- SOFTWARE - Static variable in interface org.apache.tika.metadata.TIFF
-
"Software or firmware used to generate the image."
- sortLoadedClasses(List<T>) - Static method in class org.apache.tika.utils.ServiceLoaderUtils
-
Sorts a list of loaded classes, so that non-Tika ones come before Tika ones, and otherwise in reverse alphabetical order
- SOURCE - Static variable in interface org.apache.tika.metadata.ClimateForcast
- SOURCE - Static variable in interface org.apache.tika.metadata.DublinCore
-
A reference to a resource from which the present resource is derived.
- SOURCE - Static variable in interface org.apache.tika.metadata.IPTC
-
Identifies the original owner of the copyright for the intellectual content of the item.
- SOURCE - Static variable in class org.apache.tika.metadata.Metadata
-
Deprecated. Use TikaCoreProperties#SOURCE
- SOURCE - Static variable in interface org.apache.tika.metadata.Photoshop
- SOURCE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- SPEAKER_PLACEMENT - Static variable in interface org.apache.tika.metadata.XMPDM
-
"A description of the speaker angles from center front in degrees.
- STANDARD_REFERENCES - Static variable in class org.apache.tika.sax.StandardsExtractingContentHandler
- StandardOrganizations - Class in org.apache.tika.sax
-
This class provides a collection of the most important technical standard organizations.
- StandardOrganizations() - Constructor for class org.apache.tika.sax.StandardOrganizations
- StandardReference - Class in org.apache.tika.sax
-
Class that represents a standard reference.
- StandardReference.StandardReferenceBuilder - Class in org.apache.tika.sax
- StandardReferenceBuilder(String, String) - Constructor for class org.apache.tika.sax.StandardReference.StandardReferenceBuilder
- StandardsExtractingContentHandler - Class in org.apache.tika.sax
-
StandardsExtractingContentHandler is a Content Handler used to extract standard references while parsing.
- StandardsExtractingContentHandler() - Constructor for class org.apache.tika.sax.StandardsExtractingContentHandler
-
Creates a decorator that by default forwards incoming SAX events to a dummy content handler that simply ignores all the events.
- StandardsExtractingContentHandler(ContentHandler, Metadata) - Constructor for class org.apache.tika.sax.StandardsExtractingContentHandler
-
Creates a decorator for the given SAX event handler and Metadata object.
- StandardsText - Class in org.apache.tika.sax
-
StandardsText relies on regular expressions to extract standard references from text.
- StandardsText() - Constructor for class org.apache.tika.sax.StandardsText
- start(BundleContext) - Method in class org.apache.tika.config.TikaActivator
- startDescription(String, String, String) - Method in class org.apache.tika.sax.XMPContentHandler
- startDocument() - Method in class org.apache.tika.sax.ContentHandlerDecorator
- startDocument() - Method in class org.apache.tika.sax.DIFContentHandler
- startDocument() - Method in class org.apache.tika.sax.EmbeddedContentHandler
-
Ignored.
- startDocument() - Method in class org.apache.tika.sax.ExpandedTitleContentHandler
- startDocument() - Method in class org.apache.tika.sax.TeeContentHandler
- startDocument() - Method in class org.apache.tika.sax.TextContentHandler
- startDocument() - Method in class org.apache.tika.sax.ToHTMLContentHandler
- startDocument() - Method in class org.apache.tika.sax.ToXMLContentHandler
-
Writes the XML prefix.
- startDocument() - Method in class org.apache.tika.sax.XHTMLContentHandler
-
Starts an XHTML document by setting up the namespace mappings when called for the first time.
- startDocument() - Method in class org.apache.tika.sax.XMPContentHandler
-
Starts an XMP document by setting up the namespace mappings and writing out the following header:
- startElement(String) - Method in class org.apache.tika.sax.XHTMLContentHandler
- startElement(String, String, String) - Method in class org.apache.tika.sax.XHTMLContentHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.mime.MimeTypesReader
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.ContentHandlerDecorator
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.DIFContentHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.ElementMappingContentHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.ExpandedTitleContentHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.LinkContentHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.RichTextContentHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.SafeContentHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.SecureContentHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.TeeContentHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.TextContentHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.ToTextContentHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.ToXMLContentHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.XHTMLContentHandler
-
Starts the given element.
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.xpath.MatchingContentHandler
- startElement(String, AttributesImpl) - Method in class org.apache.tika.sax.XHTMLContentHandler
- startEmbeddedDocument(ContentHandler, Metadata) - Method in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
-
This is called before parsing each embedded document.
- startEmbeddedDocument(ContentHandler, Metadata) - Method in class org.apache.tika.sax.RecursiveParserWrapperHandler
-
This is called before parsing an embedded document
- startPrefixMapping(String, String) - Method in class org.apache.tika.sax.ContentHandlerDecorator
- startPrefixMapping(String, String) - Method in class org.apache.tika.sax.TeeContentHandler
- startPrefixMapping(String, String) - Method in class org.apache.tika.sax.ToXMLContentHandler
- STATE - Static variable in interface org.apache.tika.metadata.Photoshop
- STATIC - org.apache.tika.config.TikaConfigSerializer.Mode
-
Static version of the config, with explicit lists of parsers/decorators/etc
- STATIC_FULL - org.apache.tika.config.TikaConfigSerializer.Mode
-
Static version of the config, with explicit lists of decorators etc, and all parsers given with their detected supported mime types
- stop(BundleContext) - Method in class org.apache.tika.config.TikaActivator
- STRETCH_MODE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The audio stretch mode."
- STRUCTURE - org.apache.tika.metadata.Property.PropertyType
- SUB_CLASS_OF_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- SUB_CLASS_TYPE_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- SUBJECT - Static variable in interface org.apache.tika.metadata.DublinCore
-
The topic of the content of the resource.
- SUBJECT - Static variable in class org.apache.tika.metadata.Metadata
-
Deprecated. Use TikaCoreProperties#KEYWORDS
- SUBJECT - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLCore
-
The document's subject.
- SUBJECT_CODE - Static variable in interface org.apache.tika.metadata.IPTC
-
Specifies one or more Subjects from the IPTC Subject-NewsCodes taxonomy to categorise the content.
- SUBLOCATION - Static variable in interface org.apache.tika.metadata.IPTC
-
Name of a sublocation the content is focussing on -- either the location shown in visual media or referenced by text or audio media.
- SubtreeMatcher - Class in org.apache.tika.sax.xpath
-
Evaluation state of a ...//... XPath expression.
- SubtreeMatcher(Matcher) - Constructor for class org.apache.tika.sax.xpath.SubtreeMatcher
- SUPPLEMENTAL_CATEGORIES - Static variable in interface org.apache.tika.metadata.IPTC
-
Deprecated.
- SUPPLEMENTAL_CATEGORIES - Static variable in interface org.apache.tika.metadata.Photoshop
- SystemUtils - Class in org.apache.tika.utils
-
Copied from commons-lang to avoid requiring the dependency
- SystemUtils() - Constructor for class org.apache.tika.utils.SystemUtils
T
- TABLE_COUNT - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- TABLE_COUNT - Static variable in interface org.apache.tika.metadata.Office
-
The number of Tables in the document
- TABLE_ID - Static variable in interface org.apache.tika.metadata.ClimateForcast
- TABLE_NAME - Static variable in interface org.apache.tika.metadata.Database
- TaggedContentHandler - Class in org.apache.tika.sax
-
A content handler decorator that tags potential exceptions so that the handler that caused the exception can easily be identified.
- TaggedContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.TaggedContentHandler
-
Creates a tagging decorator for the given content handler.
- TaggedInputStream - Class in org.apache.tika.io
-
An input stream decorator that tags potential exceptions so that the stream that caused the exception can easily be identified.
- TaggedInputStream(InputStream) - Constructor for class org.apache.tika.io.TaggedInputStream
-
Creates a tagging decorator for the given input stream.
- TaggedIOException - Exception in org.apache.tika.io
-
An IOException wrapper that tags the wrapped exception with a given object reference.
- TaggedIOException(IOException, Object) - Constructor for exception org.apache.tika.io.TaggedIOException
-
Creates a tagged wrapper for the given exception.
- TaggedSAXException - Exception in org.apache.tika.sax
-
A SAXException wrapper that tags the wrapped exception with a given object reference.
- TaggedSAXException(SAXException, Object) - Constructor for exception org.apache.tika.sax.TaggedSAXException
-
Creates a tagged wrapper for the given exception.
- TailStream - Class in org.apache.tika.io
-
A specialized input stream implementation which records the last portion read from an underlying stream.
- TailStream(InputStream, int) - Constructor for class org.apache.tika.io.TailStream
-
Creates a new instance of TailStream.
- TAPE_NAME - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The name of the tape from which the clip was captured, as set during the capture process."
- TargetElement(String, String) - Constructor for class org.apache.tika.sax.ElementMappingContentHandler.TargetElement
-
A shortcut that automatically creates the QName object
- TargetElement(String, String, Map<QName, QName>) - Constructor for class org.apache.tika.sax.ElementMappingContentHandler.TargetElement
-
A shortcut that automatically creates the QName object
- TargetElement(QName) - Constructor for class org.apache.tika.sax.ElementMappingContentHandler.TargetElement
-
Creates a TargetElement with no attributes; all attributes will be deleted from the SAX stream.
- TargetElement(QName, Map<QName, QName>) - Constructor for class org.apache.tika.sax.ElementMappingContentHandler.TargetElement
-
Creates a TargetElement; attributes of this element will be mapped as specified.
- TeeContentHandler - Class in org.apache.tika.sax
-
Content handler proxy that forwards the received SAX events to zero or more underlying content handlers.
- TeeContentHandler(ContentHandler...) - Constructor for class org.apache.tika.sax.TeeContentHandler
- TEMPLATE - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- TEMPLATE - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- TEMPO - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The audio's tempo."
- TemporaryResources - Class in org.apache.tika.io
-
Utility class for tracking and ultimately closing or otherwise disposing a collection of temporary resources.
- TemporaryResources() - Constructor for class org.apache.tika.io.TemporaryResources
- text(String) - Static method in class org.apache.tika.mime.MediaType
- TEXT - org.apache.tika.metadata.Property.ValueType
- TEXT - org.apache.tika.sax.BasicContentHandlerFactory.HANDLER_TYPE
- TEXT_HTML - Static variable in class org.apache.tika.mime.MediaType
- TEXT_PLAIN - Static variable in class org.apache.tika.mime.MediaType
- TextContentHandler - Class in org.apache.tika.sax
-
Content handler decorator that only passes the TextContentHandler.characters(char[], int, int) and TextContentHandler.ignorableWhitespace(char[], int, int) (plus TextContentHandler.startDocument() and TextContentHandler.endDocument()) events to the decorated content handler.
- TextContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.TextContentHandler
- TextContentHandler(ContentHandler, boolean) - Constructor for class org.apache.tika.sax.TextContentHandler
- TextDetector - Class in org.apache.tika.detect
-
Content type detection of plain text documents.
- TextDetector() - Constructor for class org.apache.tika.detect.TextDetector
-
Constructs a TextDetector which will look at the default number of bytes from the beginning of the document.
- TextDetector(int) - Constructor for class org.apache.tika.detect.TextDetector
-
Constructs a TextDetector which will look at a given number of bytes from the beginning of the document.
- TextMatcher - Class in org.apache.tika.sax.xpath
-
Final evaluation state of a .../text() XPath expression.
- TextMatcher() - Constructor for class org.apache.tika.sax.xpath.TextMatcher
- TextStatistics - Class in org.apache.tika.detect
-
Utility class for computing a histogram of the bytes seen in a stream.
- TextStatistics() - Constructor for class org.apache.tika.detect.TextStatistics
- threshold(float) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
- THROW - Static variable in interface org.apache.tika.config.InitializableProblemHandler
- THROW - Static variable in interface org.apache.tika.config.LoadErrorHandler
-
Strategy that throws a RuntimeException with the given throwable as the root cause, thus interrupting the entire service loading operation.
- throwIfCauseOf(Exception) - Method in class org.apache.tika.io.TaggedInputStream
-
Re-throws the original exception thrown by this stream.
- throwIfCauseOf(Exception) - Method in class org.apache.tika.sax.TaggedContentHandler
-
Re-throws the original exception thrown by this handler.
- throwIfCauseOf(SAXException) - Method in class org.apache.tika.sax.SecureContentHandler
-
Converts the given SAXException to a corresponding TikaException if it's caused by this instance detecting a zip bomb.
- THUMBNAIL - Static variable in interface org.apache.tika.metadata.RTFMetadata
-
if set to true, this means that an image file is probably a "thumbnail" any time a pict/emf/wmf is in an object
- TIFF - Interface in org.apache.tika.metadata
-
XMP Exif TIFF schema.
- Tika - Class in org.apache.tika
-
Facade class for accessing Tika functionality.
- Tika() - Constructor for class org.apache.tika.Tika
-
Creates a Tika facade using the default configuration.
- Tika(TikaConfig) - Constructor for class org.apache.tika.Tika
-
Creates a Tika facade using the given configuration.
- Tika(Detector) - Constructor for class org.apache.tika.Tika
-
Creates a Tika facade using the given detector instance, the default parser configuration, and the default Translator.
- Tika(Detector, Parser) - Constructor for class org.apache.tika.Tika
-
Creates a Tika facade using the given detector and parser instances, but the default Translator.
- Tika(Detector, Parser, Translator) - Constructor for class org.apache.tika.Tika
-
Creates a Tika facade using the given detector, parser, and translator instances.
- TIKA_CONFIG_PATH - Static variable in class org.apache.tika.parser.AutoDetectParserFactory
-
Path to a tika-config file.
- TIKA_CONTENT - Static variable in class org.apache.tika.parser.RecursiveParserWrapper
-
Deprecated.
- TIKA_CONTENT - Static variable in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
- TIKA_CONTENT_HANDLER - Static variable in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
-
Simple class name of the content handler
- TIKA_LINK_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- TIKA_META_EXCEPTION_EMBEDDED_STREAM - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
Use this to store exceptions caught while trying to read the stream of an embedded resource.
- TIKA_META_EXCEPTION_PREFIX - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
Use this to store parse exception information in the Metadata object.
- TIKA_META_EXCEPTION_WARNING - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
Use this to store exceptions caught during a parse that are non-fatal, e.g.
- TIKA_META_PREFIX - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
Use this to prefix metadata properties that store information about the parsing process.
- TIKA_MIME_FILE - Static variable in interface org.apache.tika.metadata.TikaMimeKeys
- TIKA_UTI_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- TikaActivator - Class in org.apache.tika.config
-
Bundle activator that adjusts the class loading mechanism of the ServiceLoader class to work correctly in an OSGi environment.
- TikaActivator() - Constructor for class org.apache.tika.config.TikaActivator
- TikaConfig - Class in org.apache.tika.config
-
Parse xml config file.
- TikaConfig() - Constructor for class org.apache.tika.config.TikaConfig
-
Creates a default Tika configuration.
- TikaConfig(File) - Constructor for class org.apache.tika.config.TikaConfig
- TikaConfig(File, ServiceLoader) - Constructor for class org.apache.tika.config.TikaConfig
- TikaConfig(InputStream) - Constructor for class org.apache.tika.config.TikaConfig
- TikaConfig(ClassLoader) - Constructor for class org.apache.tika.config.TikaConfig
-
Creates a Tika configuration from the built-in media type rules and all the Parser implementations available through the service provider mechanism in the given class loader.
- TikaConfig(String) - Constructor for class org.apache.tika.config.TikaConfig
- TikaConfig(URL) - Constructor for class org.apache.tika.config.TikaConfig
- TikaConfig(URL, ClassLoader) - Constructor for class org.apache.tika.config.TikaConfig
- TikaConfig(URL, ServiceLoader) - Constructor for class org.apache.tika.config.TikaConfig
- TikaConfig(Path) - Constructor for class org.apache.tika.config.TikaConfig
- TikaConfig(Path, ServiceLoader) - Constructor for class org.apache.tika.config.TikaConfig
- TikaConfig(Document) - Constructor for class org.apache.tika.config.TikaConfig
- TikaConfig(Document, ServiceLoader) - Constructor for class org.apache.tika.config.TikaConfig
- TikaConfig(Element) - Constructor for class org.apache.tika.config.TikaConfig
- TikaConfig(Element, ClassLoader) - Constructor for class org.apache.tika.config.TikaConfig
- TikaConfigException - Exception in org.apache.tika.exception
-
Tika Config Exception is thrown when there is an error in the Tika config file and/or one or more of the parsers failed to initialize from that erroneous config.
- TikaConfigException(String) - Constructor for exception org.apache.tika.exception.TikaConfigException
-
Creates an instance of the exception.
- TikaConfigException(String, Throwable) - Constructor for exception org.apache.tika.exception.TikaConfigException
- TikaConfigSerializer - Class in org.apache.tika.config
- TikaConfigSerializer() - Constructor for class org.apache.tika.config.TikaConfigSerializer
- TikaConfigSerializer.Mode - Enum in org.apache.tika.config
- TikaCoreProperties - Interface in org.apache.tika.metadata
-
Contains a core set of basic Tika metadata properties, which all parsers will attempt to supply (where the file format permits).
- TikaCoreProperties.EmbeddedResourceType - Enum in org.apache.tika.metadata
-
A file might contain different types of embedded documents.
- TikaException - Exception in org.apache.tika.exception
-
Tika exception
- TikaException(String) - Constructor for exception org.apache.tika.exception.TikaException
- TikaException(String, Throwable) - Constructor for exception org.apache.tika.exception.TikaException
- TikaInputStream - Class in org.apache.tika.io
-
Input stream with extended capabilities.
- TikaMemoryLimitException - Exception in org.apache.tika.exception
- TikaMemoryLimitException(String) - Constructor for exception org.apache.tika.exception.TikaMemoryLimitException
- TikaMetadataKeys - Interface in org.apache.tika.metadata
-
Contains keys to properties in Metadata instances.
- TikaMimeKeys - Interface in org.apache.tika.metadata
-
A collection of Tika metadata keys used in Mime Type resolution
- TIME_SIGNATURE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The time signature of the music."
- TIMES_INSTANTIATED - Static variable in class org.apache.tika.config.TikaConfig
- TITLE - Static variable in interface org.apache.tika.metadata.DublinCore
-
A name given to the resource.
- TITLE - Static variable in interface org.apache.tika.metadata.IPTC
-
A shorthand reference for the item.
- TITLE - Static variable in class org.apache.tika.metadata.Metadata
-
Deprecated. Use TikaCoreProperties#TITLE
- TITLE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- toByteArray(InputStream) - Static method in class org.apache.tika.io.IOUtils
-
Get the contents of an InputStream as a byte[].
- toByteArray(Reader) - Static method in class org.apache.tika.io.IOUtils
-
Get the contents of a Reader as a byte[] using the default character encoding of the platform.
- toByteArray(Reader, String) - Static method in class org.apache.tika.io.IOUtils
-
Get the contents of a Reader as a byte[] using the specified character encoding.
- toByteArray(String) - Static method in class org.apache.tika.io.IOUtils
-
Deprecated.
- toCharArray(InputStream) - Static method in class org.apache.tika.io.IOUtils
-
Get the contents of an InputStream as a character array using the default character encoding of the platform.
- toCharArray(InputStream, String) - Static method in class org.apache.tika.io.IOUtils
-
Get the contents of an InputStream as a character array using the specified character encoding.
- toCharArray(Reader) - Static method in class org.apache.tika.io.IOUtils
-
Get the contents of a Reader as a character array.
- ToHTMLContentHandler - Class in org.apache.tika.sax
-
SAX event handler that serializes the HTML document to a character stream.
- ToHTMLContentHandler() - Constructor for class org.apache.tika.sax.ToHTMLContentHandler
- ToHTMLContentHandler(OutputStream, String) - Constructor for class org.apache.tika.sax.ToHTMLContentHandler
- toInputStream(CharSequence) - Static method in class org.apache.tika.io.IOUtils
-
Convert the specified CharSequence to an input stream, encoded as bytes using the default character encoding of the platform.
- toInputStream(CharSequence, String) - Static method in class org.apache.tika.io.IOUtils
-
Convert the specified CharSequence to an input stream, encoded as bytes using the specified character encoding.
- toInputStream(String) - Static method in class org.apache.tika.io.IOUtils
-
Convert the specified string to an input stream, encoded as bytes using the default character encoding of the platform.
- toInputStream(String, String) - Static method in class org.apache.tika.io.IOUtils
-
Convert the specified string to an input stream, encoded as bytes using the specified character encoding.
- toString() - Method in class org.apache.tika.config.Param
- toString() - Method in class org.apache.tika.config.ParamField
- toString() - Method in class org.apache.tika.detect.MagicDetector
-
Returns a string representation of the Detection Rule.
- toString() - Method in class org.apache.tika.io.CountingInputStream
- toString() - Method in class org.apache.tika.io.TaggedInputStream
- toString() - Method in class org.apache.tika.io.TikaInputStream
- toString() - Method in class org.apache.tika.language.detect.LanguageResult
- toString() - Method in class org.apache.tika.language.LanguageIdentifier
-
Deprecated.
- toString() - Method in class org.apache.tika.language.LanguageProfile
-
Deprecated.
- toString() - Method in class org.apache.tika.language.LanguageProfilerBuilder
-
Deprecated.
- toString() - Method in class org.apache.tika.metadata.Metadata
- toString() - Method in class org.apache.tika.mime.MediaType
- toString() - Method in class org.apache.tika.mime.MimeType
-
Returns the name of this media type.
- toString() - Method in class org.apache.tika.sax.ContentHandlerDecorator
- toString() - Method in class org.apache.tika.sax.DIFContentHandler
- toString() - Method in class org.apache.tika.sax.Link
- toString() - Method in class org.apache.tika.sax.StandardReference
- toString() - Method in class org.apache.tika.sax.TextContentHandler
- toString() - Method in class org.apache.tika.sax.ToTextContentHandler
-
Returns the contents of the internal string buffer where all the received characters have been collected.
- toString() - Method in class org.apache.tika.Tika
- toString(byte[]) - Static method in class org.apache.tika.io.IOUtils
-
Deprecated. Use String(byte[])
- toString(byte[], String) - Static method in class org.apache.tika.io.IOUtils
-
Deprecated.
- toString(InputStream) - Static method in class org.apache.tika.io.IOUtils
-
Get the contents of an InputStream as a String using the default character encoding of the platform.
- toString(InputStream, String) - Static method in class org.apache.tika.io.IOUtils
-
Get the contents of an InputStream as a String using the specified character encoding.
- toString(Reader) - Static method in class org.apache.tika.io.IOUtils
-
Get the contents of a Reader as a String.
- TOTAL_TIME - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- TOTAL_TIME - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- ToTextContentHandler - Class in org.apache.tika.sax
-
SAX event handler that writes all character content out to a character stream.
- ToTextContentHandler() - Constructor for class org.apache.tika.sax.ToTextContentHandler
-
Creates a content handler that writes character events to an internal string buffer.
- ToTextContentHandler(OutputStream) - Constructor for class org.apache.tika.sax.ToTextContentHandler
-
Creates a content handler that writes character events to the given output stream using the platform default encoding.
- ToTextContentHandler(OutputStream, String) - Constructor for class org.apache.tika.sax.ToTextContentHandler
-
Creates a content handler that writes character events to the given output stream using the given encoding.
- ToTextContentHandler(Writer) - Constructor for class org.apache.tika.sax.ToTextContentHandler
-
Creates a content handler that writes character events to the given writer.
- ToXMLContentHandler - Class in org.apache.tika.sax
-
SAX event handler that serializes the XML document to a character stream.
- ToXMLContentHandler() - Constructor for class org.apache.tika.sax.ToXMLContentHandler
- ToXMLContentHandler(OutputStream, String) - Constructor for class org.apache.tika.sax.ToXMLContentHandler
-
Creates an XML serializer that writes to the given byte stream using the given character encoding.
- ToXMLContentHandler(String) - Constructor for class org.apache.tika.sax.ToXMLContentHandler
- TRACK_NUMBER - Static variable in interface org.apache.tika.metadata.XMPDM
-
"A numeric value indicating the order of the audio file within its original recording."
- TrainedModel - Class in org.apache.tika.detect
- TrainedModel() - Constructor for class org.apache.tika.detect.TrainedModel
- TrainedModelDetector - Class in org.apache.tika.detect
- TrainedModelDetector() - Constructor for class org.apache.tika.detect.TrainedModelDetector
- TRANSITION_KEYWORDS_TO_DC_SUBJECT - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
Deprecated. Use TikaCoreProperties#KEYWORDS
- TRANSITION_SUBJECT_TO_DC_DESCRIPTION - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
Deprecated. Use TikaCoreProperties#DESCRIPTION
- TRANSITION_SUBJECT_TO_DC_TITLE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
Deprecated. Use TikaCoreProperties#TITLE
- TRANSITION_SUBJECT_TO_OO_SUBJECT - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
Deprecated. Use OfficeOpenXMLCore#SUBJECT
- translate(InputStream, String) - Method in class org.apache.tika.Tika
-
Translate the given text InputStream to the given language, attempting to auto-detect the source language.
- translate(InputStream, String, String) - Method in class org.apache.tika.Tika
-
Translate the given text InputStream to and from the given languages.
- translate(String, String) - Method in class org.apache.tika.language.translate.DefaultTranslator
-
Translate, using the first available service-loaded translator
- translate(String, String) - Method in class org.apache.tika.language.translate.EmptyTranslator
- translate(String, String) - Method in interface org.apache.tika.language.translate.Translator
-
Translate text to the given language. This method attempts to auto-detect the source language of the text.
- translate(String, String) - Method in class org.apache.tika.Tika
-
Translate the given text String to the given language, attempting to auto-detect the source language.
- translate(String, String, String) - Method in class org.apache.tika.language.translate.DefaultTranslator
-
Translate, using the first available service-loaded translator
- translate(String, String, String) - Method in class org.apache.tika.language.translate.EmptyTranslator
- translate(String, String, String) - Method in interface org.apache.tika.language.translate.Translator
-
Translate text between given languages.
- translate(String, String, String) - Method in class org.apache.tika.Tika
-
Translate the given text String to and from the given languages.
- Translator - Interface in org.apache.tika.language.translate
-
Interface for Translator services.
- TRANSMISSION_REFERENCE - Static variable in interface org.apache.tika.metadata.Photoshop
- trimMessage(String) - Static method in class org.apache.tika.utils.ExceptionUtils
-
Utility method to trim the message from a stack trace string.
- tryToFindExistingLeafParser(Class, ParseContext) - Static method in class org.apache.tika.extractor.EmbeddedDocumentUtil
-
Tries to find an existing parser within the ParseContext.
- tryToParse(String) - Method in class org.apache.tika.utils.DateUtils
-
Tries to parse the date string; returns null if no parse was possible.
- type - Variable in class org.apache.tika.mime.MimeTypesReader
-
Current type
- TYPE - Static variable in interface org.apache.tika.metadata.DublinCore
-
The nature or genre of the content of the resource.
- TYPE - Static variable in class org.apache.tika.metadata.Metadata
-
Deprecated. Use TikaCoreProperties#TYPE
- TYPE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- TypeDetector - Class in org.apache.tika.detect
-
Content type detection based on a content type hint.
- TypeDetector() - Constructor for class org.apache.tika.detect.TypeDetector
- types - Variable in class org.apache.tika.mime.MimeTypesReader
U
- ubyteToInt(byte) - Static method in class org.apache.tika.io.EndianUtils
-
Convert an 'unsigned' byte to an integer.
- unescapeCommandLine(String) - Static method in class org.apache.tika.utils.ProcessUtils
- UNMAP_NOT_SUPPORTED_REASON - Static variable in class org.apache.tika.io.MappedBufferCleaner
-
If MappedBufferCleaner.UNMAP_SUPPORTED is false, this contains the reason why unmapping is not supported.
- UNMAP_SUPPORTED - Static variable in class org.apache.tika.io.MappedBufferCleaner
-
true, if this platform supports unmapping mmapped files.
- UNMAPPED_UNICODE_CHARS_PER_PAGE - Static variable in interface org.apache.tika.metadata.PDF
- UnsupportedFormatException - Exception in org.apache.tika.exception
-
Parsers should throw this exception when they encounter a file format that they do not support.
- UnsupportedFormatException(String) - Constructor for exception org.apache.tika.exception.UnsupportedFormatException
- URGENCY - Static variable in interface org.apache.tika.metadata.IPTC
-
Deprecated.
- URGENCY - Static variable in interface org.apache.tika.metadata.Photoshop
- URI - org.apache.tika.metadata.Property.ValueType
- URL - org.apache.tika.metadata.Property.ValueType
- USAGE_TERMS - Static variable in interface org.apache.tika.metadata.XMPRights
-
A word or short phrase that identifies a resource as a member of a user-defined collection.
- useInterleaved - Static variable in class org.apache.tika.language.LanguageProfile
-
Deprecated.
- USER_DEFINED_METADATA_NAME_PREFIX - Static variable in interface org.apache.tika.metadata.MSOffice
-
For user defined metadata entries in the document, what prefix should be attached to the key names.
- USER_DEFINED_METADATA_NAME_PREFIX - Static variable in interface org.apache.tika.metadata.Office
-
For user defined metadata entries in the document, what prefix should be attached to the key names.
- UTC - Static variable in class org.apache.tika.utils.DateUtils
-
The UTC time zone.
- UTF_8 - Static variable in class org.apache.tika.io.IOUtils
V
- valueOf(String) - Static method in enum org.apache.tika.config.TikaConfigSerializer.Mode
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.language.detect.LanguageConfidence
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.metadata.Property.PropertyType
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.metadata.Property.ValueType
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.metadata.TikaCoreProperties.EmbeddedResourceType
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.sax.BasicContentHandlerFactory.HANDLER_TYPE
-
Returns the enum constant of this type with the specified name.
- values() - Static method in enum org.apache.tika.config.TikaConfigSerializer.Mode
-
Returns an array containing the constants of this enum type, in the order they are declared.
- values() - Static method in enum org.apache.tika.language.detect.LanguageConfidence
-
Returns an array containing the constants of this enum type, in the order they are declared.
- values() - Static method in enum org.apache.tika.metadata.Property.PropertyType
-
Returns an array containing the constants of this enum type, in the order they are declared.
- values() - Static method in enum org.apache.tika.metadata.Property.ValueType
-
Returns an array containing the constants of this enum type, in the order they are declared.
- values() - Static method in enum org.apache.tika.metadata.TikaCoreProperties.EmbeddedResourceType
-
Returns an array containing the constants of this enum type, in the order they are declared.
- values() - Static method in enum org.apache.tika.sax.BasicContentHandlerFactory.HANDLER_TYPE
-
Returns an array containing the constants of this enum type, in the order they are declared.
- VERSION - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- VERSION - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLCore
-
The version number.
- VERSION - Static variable in interface org.apache.tika.metadata.QuattroPro
-
Version.
- video(String) - Static method in class org.apache.tika.mime.MediaType
- VIDEO_ALPHA_MODE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The alpha mode."
- VIDEO_ALPHA_UNITY_IS_TRANSPARENT - Static variable in interface org.apache.tika.metadata.XMPDM
-
"When true, unity is clear, when false, it is opaque."
- VIDEO_COLOR_SPACE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The color space."
- VIDEO_COMPRESSOR - Static variable in interface org.apache.tika.metadata.XMPDM
-
"Video compression used.
- VIDEO_FIELD_ORDER - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The field order for video."
- VIDEO_FRAME_RATE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The video frame rate."
- VIDEO_MOD_DATE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The date and time when the video was last modified."
- VIDEO_PIXEL_ASPECT_RATIO - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The aspect ratio, expressed as wd/ht.
- VIDEO_PIXEL_DEPTH - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The size in bits of each color component of a pixel.
W
- WARN - Static variable in interface org.apache.tika.config.InitializableProblemHandler
-
Strategy that logs warnings of all problems using a Logger created using the given class name.
- WARN - Static variable in interface org.apache.tika.config.LoadErrorHandler
-
Strategy that logs warnings of all problems using a Logger created using the given class name.
- WEB_STATEMENT - Static variable in interface org.apache.tika.metadata.XMPRights
-
A Web URL for a statement of the ownership and usage rights for this resource.
- withFallbacks(Collection<? extends Parser>, Set<MediaType>) - Static method in class org.apache.tika.parser.ParserDecorator
-
Deprecated. Do not use until the TODOs are resolved, see TIKA-1509
- withoutTypes(Parser, Set<MediaType>) - Static method in class org.apache.tika.parser.ParserDecorator
-
Decorates the given parser so that it never claims to support parsing of the given media types, but will work for all others.
- withTypes(Parser, Set<MediaType>) - Static method in class org.apache.tika.parser.ParserDecorator
-
Decorates the given parser so that it always claims to support parsing of the given media types.
- WORD_COUNT - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- WORD_COUNT - Static variable in interface org.apache.tika.metadata.Office
-
The number of words in the document.
- WORD_PROCESSING_NAMESPACE_URI - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- WORD_PROCESSING_PREFIX - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- WordPerfect - Interface in org.apache.tika.metadata
-
WordPerfect properties collection.
- WORDPERFECT_METADATA_NAME_PREFIX - Static variable in interface org.apache.tika.metadata.WordPerfect
- WORK_TYPE - Static variable in interface org.apache.tika.metadata.CreativeCommons
- write(byte[]) - Method in class org.apache.tika.io.NullOutputStream
-
Does nothing - output to /dev/null.
- write(byte[], int, int) - Method in class org.apache.tika.io.NullOutputStream
-
Does nothing - output to /dev/null.
- write(byte[], OutputStream) - Static method in class org.apache.tika.io.IOUtils
-
Writes bytes from a byte[] to an OutputStream.
- write(byte[], Writer) - Static method in class org.apache.tika.io.IOUtils
-
Writes bytes from a byte[] to chars on a Writer using the default character encoding of the platform.
- write(byte[], Writer, String) - Static method in class org.apache.tika.io.IOUtils
-
Writes bytes from a byte[] to chars on a Writer using the specified character encoding.
- write(char) - Method in class org.apache.tika.sax.ToXMLContentHandler
-
Writes the given character as-is.
- write(char[], int, int) - Method in class org.apache.tika.language.detect.LanguageWriter
- write(char[], int, int) - Method in class org.apache.tika.language.ProfilingWriter
-
Deprecated.
- write(char[], int, int) - Method in interface org.apache.tika.sax.SafeContentHandler.Output
- write(char[], OutputStream) - Static method in class org.apache.tika.io.IOUtils
-
Writes chars from a char[] to bytes on an OutputStream.
- write(char[], OutputStream, String) - Static method in class org.apache.tika.io.IOUtils
-
Writes chars from a char[] to bytes on an OutputStream using the specified character encoding.
- write(char[], Writer) - Static method in class org.apache.tika.io.IOUtils
-
Writes chars from a char[] to a Writer using the default character encoding of the platform.
- write(int) - Method in class org.apache.tika.io.NullOutputStream
-
Does nothing - output to /dev/null.
- write(CharSequence, OutputStream) - Static method in class org.apache.tika.io.IOUtils
-
Writes chars from a CharSequence to bytes on an OutputStream using the default character encoding of the platform.
- write(CharSequence, OutputStream, String) - Static method in class org.apache.tika.io.IOUtils
-
Writes chars from a CharSequence to bytes on an OutputStream using the specified character encoding.
- write(CharSequence, Writer) - Static method in class org.apache.tika.io.IOUtils
-
Writes chars from a CharSequence to a Writer.
- write(String) - Method in class org.apache.tika.sax.ToXMLContentHandler
-
Writes the given string of characters as-is.
- write(StringBuffer, OutputStream) - Static method in class org.apache.tika.io.IOUtils
-
Deprecated. Replaced by write(CharSequence, OutputStream)
- write(StringBuffer, OutputStream, String) - Static method in class org.apache.tika.io.IOUtils
-
Deprecated. Replaced by write(CharSequence, OutputStream, String)
- write(StringBuffer, Writer) - Static method in class org.apache.tika.io.IOUtils
-
Deprecated. Replaced by write(CharSequence, Writer)
- write(String, OutputStream) - Static method in class org.apache.tika.io.IOUtils
-
Writes chars from a String to bytes on an OutputStream using the default character encoding of the platform.
- write(String, OutputStream, String) - Static method in class org.apache.tika.io.IOUtils
-
Writes chars from a String to bytes on an OutputStream using the specified character encoding.
- write(String, Writer) - Static method in class org.apache.tika.io.IOUtils
-
Writes chars from a String to a Writer.
- WRITE_LIMIT_REACHED - Static variable in class org.apache.tika.parser.RecursiveParserWrapper
-
Deprecated.
- WRITE_LIMIT_REACHED - Static variable in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
- WriteOutContentHandler - Class in org.apache.tika.sax
-
SAX event handler that writes content up to an optional write limit out to a character stream or other decorated handler.
- WriteOutContentHandler() - Constructor for class org.apache.tika.sax.WriteOutContentHandler
-
Creates a content handler that writes character events to an internal string buffer.
- WriteOutContentHandler(int) - Constructor for class org.apache.tika.sax.WriteOutContentHandler
-
Creates a content handler that writes character events to an internal string buffer.
- WriteOutContentHandler(OutputStream) - Constructor for class org.apache.tika.sax.WriteOutContentHandler
-
Creates a content handler that writes character events to the given output stream using the default encoding.
- WriteOutContentHandler(Writer) - Constructor for class org.apache.tika.sax.WriteOutContentHandler
-
Creates a content handler that writes character events to the given writer.
- WriteOutContentHandler(Writer, int) - Constructor for class org.apache.tika.sax.WriteOutContentHandler
-
Creates a content handler that writes content up to the given write limit to the given character stream.
- WriteOutContentHandler(ContentHandler, int) - Constructor for class org.apache.tika.sax.WriteOutContentHandler
-
Creates a content handler that writes content up to the given write limit to the given content handler.
- writeReplacement(SafeContentHandler.Output) - Method in class org.apache.tika.sax.SafeContentHandler
-
Outputs the replacement for an invalid character.
X
- X_PARSED_BY - Static variable in class org.apache.tika.utils.ParserUtils
- XHTML - Static variable in class org.apache.tika.sax.XHTMLContentHandler
-
The XHTML namespace URI
- XHTMLContentHandler - Class in org.apache.tika.sax
-
Content handler decorator that simplifies the task of producing XHTML events for Tika content parsers.
- XHTMLContentHandler(ContentHandler, Metadata) - Constructor for class org.apache.tika.sax.XHTMLContentHandler
- XML - org.apache.tika.sax.BasicContentHandlerFactory.HANDLER_TYPE
- XML - Static variable in class org.apache.tika.mime.MimeTypes
-
Name of the xml type, application/xml.
- XMLReaderUtils - Class in org.apache.tika.utils
-
Utility functions for reading XML.
- XMLReaderUtils() - Constructor for class org.apache.tika.utils.XMLReaderUtils
- XmlRootExtractor - Class in org.apache.tika.detect
-
Utility class that uses a SAXParser to determine the namespace URI and local name of the root element of an XML file.
- XmlRootExtractor() - Constructor for class org.apache.tika.detect.XmlRootExtractor
- XMP - Interface in org.apache.tika.metadata
- XMP - Static variable in class org.apache.tika.sax.XMPContentHandler
-
The XMP namespace URI
- XMPContentHandler - Class in org.apache.tika.sax
-
Content handler decorator that simplifies the task of producing XMP output.
- XMPContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.XMPContentHandler
- XMPDM - Interface in org.apache.tika.metadata
-
XMP Dynamic Media schema.
- XMPDM.ChannelTypePropertyConverter - Class in org.apache.tika.metadata
-
Deprecated. Experimental method, will change shortly
- XMPIdq - Interface in org.apache.tika.metadata
- XMPMM - Interface in org.apache.tika.metadata
- XMPRights - Interface in org.apache.tika.metadata
-
XMP Rights management schema.
- XPATH - org.apache.tika.metadata.Property.ValueType
- XPathParser - Class in org.apache.tika.sax.xpath
-
Parser for a very simple XPath subset.
- XPathParser() - Constructor for class org.apache.tika.sax.xpath.XPathParser
- XPathParser(String, String) - Constructor for class org.apache.tika.sax.xpath.XPathParser
Z
- ZeroByteFileException - Exception in org.apache.tika.exception
-
Exception thrown by the AutoDetectParser when a file contains zero bytes.
- ZeroByteFileException(String) - Constructor for exception org.apache.tika.exception.ZeroByteFileException
- ZeroSizeFileDetector - Class in org.apache.tika.detect
-
Detector to identify zero length files as application/x-zerovalue
- ZeroSizeFileDetector() - Constructor for class org.apache.tika.detect.ZeroSizeFileDetector
_
- _COLOR_MODE_CHOICES_INDEXED - Static variable in interface org.apache.tika.metadata.Photoshop