A C D E F G H I L M N P R S T U V 

A

AbstractDetector - Class in com.yahoo.language.detect
 
AbstractDetector() - Constructor for class com.yahoo.language.detect.AbstractDetector
 
accentDrop(String, Language) - Method in interface com.yahoo.language.process.Transformer
Remove accents from input text.
add(int, String) - Method in class com.yahoo.language.process.StemList
 

C

CharacterClasses - Class in com.yahoo.language.process
Determines the class of a given character.
CharacterClasses() - Constructor for class com.yahoo.language.process.CharacterClasses
 
characterClasses - Variable in class com.yahoo.language.process.GramSplitter
 
characterClasses - Variable in class com.yahoo.language.process.GramSplitter.GramSplitterIterator
 
code - Variable in enum com.yahoo.language.Language
 
com.yahoo.language - package com.yahoo.language
 
com.yahoo.language.detect - package com.yahoo.language.detect
 
com.yahoo.language.process - package com.yahoo.language.process
 
Component() - Constructor for enum com.yahoo.language.Linguistics.Component
 
country - Variable in class com.yahoo.language.detect.Hint
 

D

detect(String, Hint) - Method in class com.yahoo.language.detect.AbstractDetector
 
detect(ByteBuffer, Hint) - Method in class com.yahoo.language.detect.AbstractDetector
 
detect(byte[], int, int, Hint) - Method in interface com.yahoo.language.detect.Detector
Detects language and encoding of the supplied byte array, possibly using a language/encoding hint.
detect(ByteBuffer, Hint) - Method in interface com.yahoo.language.detect.Detector
Detects language and encoding of the supplied ByteBuffer, possibly using a language/encoding hint.
detect(String, Hint) - Method in interface com.yahoo.language.detect.Detector
Detects language of the supplied String, possibly using a language hint.
Detection - Class in com.yahoo.language.detect
 
Detection(Language, String, boolean) - Constructor for class com.yahoo.language.detect.Detection
 
DetectionException - Exception in com.yahoo.language.detect
Exception that is thrown when detection fails.
DetectionException(String) - Constructor for exception com.yahoo.language.detect.DetectionException
 
Detector - Interface in com.yahoo.language.detect
Interface implemented by all Detectors used for language and encoding detection.

E

encodingName - Variable in class com.yahoo.language.detect.Detection
 
equals(Object) - Method in class com.yahoo.language.process.GramSplitter.Gram
 
extractFrom(String) - Method in class com.yahoo.language.process.GramSplitter.Gram
Returns this gram as a string from the input string

F

findNext() - Method in class com.yahoo.language.process.GramSplitter.GramSplitterIterator
 
findSegments(Token, List<String>) - Method in class com.yahoo.language.process.SegmenterImpl
 
findStems(Token, List<StemList>) - Method in class com.yahoo.language.process.StemmerImpl
 
fromEncoding(String) - Static method in enum com.yahoo.language.Language
Returns the language from an encoding, or Language.UNKNOWN if it cannot be determined.
fromLanguageTag(String) - Static method in enum com.yahoo.language.Language
Convenience method for calling fromLocale(LocaleFactory.fromLanguageTag(languageTag)).
fromLanguageTag(String) - Static method in class com.yahoo.language.LocaleFactory
Implements a simple parser for RFC5646 language tags.
fromLocale(Locale) - Static method in enum com.yahoo.language.Language
Returns the Language whose Language.languageCode() is equal to locale.getLanguage(), with the following additions:
fromLowerCasedEncoding(String) - Static method in enum com.yahoo.language.Language
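
The fromLanguageTag and fromLocale entries above describe a two-step lookup: parse an RFC 5646 language tag into a locale, then match on locale.getLanguage(). The following is a minimal sketch of the first step using only the JDK's Locale.forLanguageTag. It does not call LocaleFactory, whose exact parsing rules are not shown in this index, and the class name LanguageTagSketch is hypothetical.

```java
import java.util.Locale;

// Hypothetical helper, not part of com.yahoo.language.
final class LanguageTagSketch {

    /**
     * Returns the primary language subtag of the given tag,
     * or "" if no language subtag could be parsed.
     */
    static String languageOf(String languageTag) {
        // The JDK parser ignores ill-formed subtags instead of throwing.
        Locale locale = Locale.forLanguageTag(languageTag);
        return locale.getLanguage();
    }

    public static void main(String[] args) {
        System.out.println(languageOf("en-US"));      // prints "en"
        System.out.println(languageOf("zh-Hant-TW")); // prints "zh"
    }
}
```

A caller would then map the returned code to its own language enumeration, falling back to an "unknown" value when the string is empty.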
 

G

get(int) - Method in class com.yahoo.language.process.StemList
 
getCharacterClasses() - Method in interface com.yahoo.language.Linguistics
Returns a thread-unsafe character classes instance.
getComponent(int) - Method in interface com.yahoo.language.process.Token
Returns a component token of this
getCountry() - Method in class com.yahoo.language.detect.Hint
 
getDetector() - Method in interface com.yahoo.language.Linguistics
Returns a thread-unsafe detector.
getEncoding() - Method in class com.yahoo.language.detect.Detection
 
getEncodingName() - Method in class com.yahoo.language.detect.Detection
 
getGramSplitter() - Method in interface com.yahoo.language.Linguistics
Returns a thread-unsafe gram splitter.
getLanguage() - Method in class com.yahoo.language.detect.Detection
 
getLength() - Method in class com.yahoo.language.process.GramSplitter.Gram
 
getMarket() - Method in class com.yahoo.language.detect.Hint
 
getNormalizer() - Method in interface com.yahoo.language.Linguistics
Returns a thread-unsafe normalizer.
getNumComponents() - Method in interface com.yahoo.language.process.Token
Returns the number of components, if this token is a compound word (e.g.
getNumStems() - Method in interface com.yahoo.language.process.Token
Returns the number of stem forms available for this token.
getOffset() - Method in interface com.yahoo.language.process.Token
Returns the offset position of this token
getOrig() - Method in interface com.yahoo.language.process.Token
Returns the original form of this token
getReplacementTerm(String) - Method in interface com.yahoo.language.process.Tokenizer
Return a replacement for an input token string.
getScript() - Method in interface com.yahoo.language.process.Token
Returns the script of this token
getSegmenter() - Method in interface com.yahoo.language.Linguistics
Returns a thread-unsafe segmenter.
getStart() - Method in class com.yahoo.language.process.GramSplitter.Gram
 
getStem(int) - Method in interface com.yahoo.language.process.Token
Returns the stem at position i
getStemmer() - Method in interface com.yahoo.language.Linguistics
Returns a thread-unsafe stemmer or lemmatizer.
getTokenizer() - Method in interface com.yahoo.language.Linguistics
Returns a thread-unsafe tokenizer.
getTokenString() - Method in interface com.yahoo.language.process.Token
Returns token string in a form suitable for indexing: The most lowercased variant of the most processed token form available.
getTransformer() - Method in interface com.yahoo.language.Linguistics
Returns a thread-unsafe transformer.
getType() - Method in interface com.yahoo.language.process.Token
Returns the type of this token - word, space or punctuation etc.
getValue() - Method in enum com.yahoo.language.process.StemMode
Deprecated. Do not use.
getValue() - Method in enum com.yahoo.language.process.TokenType
Returns an int code for this type
getVersion(Linguistics.Component) - Method in interface com.yahoo.language.Linguistics
Returns the name and version of a processor component returned by this instance.
Gram(int, int) - Constructor for class com.yahoo.language.process.GramSplitter.Gram
 
GramSplitter - Class in com.yahoo.language.process
A class which splits consecutive word character sequences into overlapping character n-grams.
GramSplitter(CharacterClasses) - Constructor for class com.yahoo.language.process.GramSplitter
 
GramSplitter.Gram - Class in com.yahoo.language.process
An immutable start index and length pair
GramSplitter.GramSplitterIterator - Class in com.yahoo.language.process
 
GramSplitterIterator(String, int, CharacterClasses) - Constructor for class com.yahoo.language.process.GramSplitter.GramSplitterIterator
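
GramSplitter is described above as splitting consecutive word character sequences into overlapping character n-grams, with each gram held as a start index and length pair. The sketch below illustrates that idea with JDK classes only. It is not the GramSplitter implementation: the class name GramSketch is hypothetical, Character.isLetterOrDigit stands in for CharacterClasses, and emitting runs shorter than n as a single gram is a choice made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not part of com.yahoo.language.process.
final class GramSketch {

    /** Returns {start, length} pairs of overlapping n-grams for each word run in the input. */
    static List<int[]> split(String input, int n) {
        if (n < 1) throw new IllegalArgumentException("n must be at least 1");
        List<int[]> grams = new ArrayList<>();
        int i = 0;
        // Works on UTF-16 chars for brevity; surrogate pairs are not handled.
        while (i < input.length()) {
            if (!Character.isLetterOrDigit(input.charAt(i))) { // skip separators
                i++;
                continue;
            }
            int runStart = i;
            while (i < input.length() && Character.isLetterOrDigit(input.charAt(i))) i++;
            int runLength = i - runStart;
            if (runLength <= n) {
                grams.add(new int[] { runStart, runLength }); // short run: one gram (sketch choice)
            } else {
                for (int s = runStart; s + n <= i; s++) grams.add(new int[] { s, n });
            }
        }
        return grams;
    }

    /** Materializes each {start, length} pair as a substring of the input. */
    static List<String> toStrings(String input, int n) {
        List<String> result = new ArrayList<>();
        for (int[] gram : split(input, n)) result.add(input.substring(gram[0], gram[0] + gram[1]));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(toStrings("hello w", 3)); // prints [hel, ell, llo, w]
    }
}
```

Keeping grams as index pairs rather than strings avoids allocating substrings until a caller actually needs the text.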
 

H

hashCode() - Method in class com.yahoo.language.process.GramSplitter.Gram
 
hasNext() - Method in class com.yahoo.language.process.GramSplitter.GramSplitterIterator
 
Hint - Class in com.yahoo.language.detect
A hint that can be given to a Detector.
Hint(String, String) - Constructor for class com.yahoo.language.detect.Hint
 

I

i - Variable in class com.yahoo.language.process.GramSplitter.GramSplitterIterator
Current index
index - Static variable in enum com.yahoo.language.Language
 
indexOfNonWordChar(String) - Method in class com.yahoo.language.process.GramSplitter.GramSplitterIterator
 
input - Variable in class com.yahoo.language.process.GramSplitter.GramSplitterIterator
Text to split
isCjk() - Method in enum com.yahoo.language.Language
Returns whether this is a "cjk" language.
isDigit(int) - Method in class com.yahoo.language.process.CharacterClasses
Returns true for code points which should be considered digits - same as java.lang.Character.isDigit
isFirstAfterSeparator - Variable in class com.yahoo.language.process.GramSplitter.GramSplitterIterator
Whether the last thing that happened was being on a separator (including the start of the string)
isIndexable() - Method in interface com.yahoo.language.process.Token
Whether this token should be indexed
isIndexable() - Method in enum com.yahoo.language.process.TokenType
Marker for whether this type of token can be indexed for search.
isLatin(int) - Method in class com.yahoo.language.process.CharacterClasses
Returns true if this is a latin character
isLatinDigit(int) - Method in class com.yahoo.language.process.CharacterClasses
Returns true if this is a latin digit (other digits are not consistently parsed into numbers by Java)
isLetter(int) - Method in class com.yahoo.language.process.CharacterClasses
Returns true for code points which are letters in unicode 3 or 4, plus some additional characters which are useful to view as letters even though not defined as such in unicode.
isLetterOrDigit(int) - Method in class com.yahoo.language.process.CharacterClasses
Convenience, returns isLetter(c) || isDigit(c)
isLocal() - Method in class com.yahoo.language.detect.Detection
 
isSpecialToken() - Method in interface com.yahoo.language.process.Token
Returns whether this is an instance of a declared special token (e.g.

L

language - Variable in class com.yahoo.language.detect.Detection
 
Language - Enum in com.yahoo.language
 
Language(String) - Constructor for enum com.yahoo.language.Language
 
languageCode() - Method in enum com.yahoo.language.Language
 
length - Variable in class com.yahoo.language.process.GramSplitter.Gram
 
Linguistics - Interface in com.yahoo.language
Factory of linguistic processors.
Linguistics.Component - Enum in com.yahoo.language
 
LinguisticsCase - Class in com.yahoo.language
This class provides a case normalization operation to be used e.g.
LinguisticsCase() - Constructor for class com.yahoo.language.LinguisticsCase
 
local - Variable in class com.yahoo.language.detect.Detection
 
LocaleFactory - Class in com.yahoo.language
 
LocaleFactory() - Constructor for class com.yahoo.language.LocaleFactory
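
LinguisticsCase is listed above as providing case normalization for language-independent processing. One reason such a helper exists is that String.toLowerCase depends on a locale, so the same input can lowercase differently from one locale to another. The sketch below shows the effect using JDK calls only. The class name LowerCaseSketch is hypothetical, and the use of Locale.ROOT is an assumption: this index does not say which locale LinguisticsCase applies.

```java
import java.util.Locale;

// Hypothetical illustration, not part of com.yahoo.language.
final class LowerCaseSketch {

    /** Lowercases with a fixed locale so the result does not depend on the default locale. */
    static String toLowerCase(String in) {
        return in.toLowerCase(Locale.ROOT); // Locale.ROOT is an assumption of this sketch
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");
        // Turkish rules map 'I' to dotless 'ı' (U+0131), so this is not "title".
        System.out.println("TITLE".toLowerCase(turkish));
        System.out.println(toLowerCase("TITLE")); // prints "title"
    }
}
```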
 

M

market - Variable in class com.yahoo.language.detect.Hint
 

N

n - Variable in class com.yahoo.language.process.GramSplitter.GramSplitterIterator
Gram size
newCountryHint(String) - Static method in class com.yahoo.language.detect.Hint
 
newInstance(String, String) - Static method in class com.yahoo.language.detect.Hint
 
newMarketHint(String) - Static method in class com.yahoo.language.detect.Hint
 
next() - Method in class com.yahoo.language.process.GramSplitter.GramSplitterIterator
 
nextGram - Variable in class com.yahoo.language.process.GramSplitter.GramSplitterIterator
The next gram or null if not determined yet
normalize(String) - Method in interface com.yahoo.language.process.Normalizer
NFKC normalizes a String.
Normalizer - Interface in com.yahoo.language.process
This interface provides NFKC normalization of Strings through the underlying linguistics library.
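
As a rough illustration of what NFKC normalization does to a string, the sketch below uses the JDK's java.text.Normalizer, which is a different class from the com.yahoo.language.process.Normalizer interface listed above. The class name NfkcSketch is hypothetical, and how the underlying linguistics library performs normalization is not shown in this index.

```java
import java.text.Normalizer;

// Hypothetical illustration using the JDK normalizer only.
final class NfkcSketch {

    /** Applies Unicode NFKC: compatibility decomposition followed by canonical composition. */
    static String normalize(String input) {
        return Normalizer.normalize(input, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        System.out.println(normalize("\uFB01"));       // ligature "fi" (U+FB01) prints as "fi"
        System.out.println(normalize("\uFF21\uFF11")); // fullwidth "A1" prints as "A1"
    }
}
```

Normalizing before tokenization means visually equivalent forms, such as fullwidth and ASCII letters, compare equal afterwards.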

P

ProcessingException - Exception in com.yahoo.language.process
Exception class indicating that a fatal error occurred during linguistic processing.
ProcessingException(String) - Constructor for exception com.yahoo.language.process.ProcessingException
 
ProcessingException(String, Throwable) - Constructor for exception com.yahoo.language.process.ProcessingException
 

R

remove() - Method in class com.yahoo.language.process.GramSplitter.GramSplitterIterator
 
remove(int) - Method in class com.yahoo.language.process.StemList
 

S

segment(String, Language) - Method in interface com.yahoo.language.process.Segmenter
Splits the input string into tokens and returns a list of tokens in unprocessed form (i.e.
segment(String, Language) - Method in class com.yahoo.language.process.SegmenterImpl
 
Segmenter - Interface in com.yahoo.language.process
Interface providing segmentation, i.e.
SegmenterImpl - Class in com.yahoo.language.process
 
SegmenterImpl(Tokenizer) - Constructor for class com.yahoo.language.process.SegmenterImpl
 
set(int, String) - Method in class com.yahoo.language.process.StemList
 
SIMPLE - Static variable in interface com.yahoo.language.Linguistics
The same as new com.yahoo.language.simple.SimpleLinguistics().
size() - Method in class com.yahoo.language.process.StemList
 
split(String, int) - Method in class com.yahoo.language.process.GramSplitter
Splits the input into grams of size n and returns an iterator over grams represented as [start index,length] pairs into the input string.
start - Variable in class com.yahoo.language.process.GramSplitter.Gram
 
stem(String, StemMode, Language) - Method in interface com.yahoo.language.process.Stemmer
Stem input according to specified stemming mode.
stem(String, StemMode, Language) - Method in class com.yahoo.language.process.StemmerImpl
 
StemList - Class in com.yahoo.language.process
A list of strings which does not allow for duplicate elements.
StemList() - Constructor for class com.yahoo.language.process.StemList
 
StemList(String...) - Constructor for class com.yahoo.language.process.StemList
 
Stemmer - Interface in com.yahoo.language.process
Interface providing stemming of single words.
StemmerImpl - Class in com.yahoo.language.process
 
StemmerImpl(Tokenizer) - Constructor for class com.yahoo.language.process.StemmerImpl
 
StemMode - Enum in com.yahoo.language.process
An enum of the stemming modes which can be requested.
StemMode(int) - Constructor for enum com.yahoo.language.process.StemMode
 
stems - Variable in class com.yahoo.language.process.StemList
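
StemList is described above as a list of strings which does not allow duplicate elements. A minimal sketch of such a list follows, built on java.util.AbstractList. The class name UniqueStringList is hypothetical, and silently ignoring a duplicate on add is an assumption of this sketch rather than documented StemList behavior.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not the StemList implementation.
final class UniqueStringList extends AbstractList<String> {

    private final List<String> items = new ArrayList<>();

    @Override
    public String get(int index) { return items.get(index); }

    @Override
    public int size() { return items.size(); }

    /** Inserts the element at the given position unless an equal element is already present. */
    @Override
    public void add(int index, String element) {
        if (!items.contains(element)) items.add(index, element);
    }

    @Override
    public String remove(int index) { return items.remove(index); }

    public static void main(String[] args) {
        UniqueStringList stems = new UniqueStringList();
        stems.add("run");
        stems.add("running");
        stems.add("run"); // duplicate, ignored by this sketch
        System.out.println(stems); // prints [run, running]
    }
}
```

The linear contains check is adequate for the handful of stem forms a single token usually has; a larger collection would call for a LinkedHashSet instead.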
 

T

toExtractedList() - Method in class com.yahoo.language.process.GramSplitter.GramSplitterIterator
Convenience list which splits the remaining items in this iterator into a list of gram strings
Token - Interface in com.yahoo.language.process
A single token produced by the tokenizer.
tokenize(String, Language, StemMode, boolean) - Method in interface com.yahoo.language.process.Tokenizer
Returns the tokens produced from an input string under the rules of the given Language and additional options
tokenizer - Variable in class com.yahoo.language.process.SegmenterImpl
 
tokenizer - Variable in class com.yahoo.language.process.StemmerImpl
 
Tokenizer - Interface in com.yahoo.language.process
Language-sensitive tokenization of a text string.
TokenScript - Enum in com.yahoo.language.process
List of token scripts (e.g.
TokenScript() - Constructor for enum com.yahoo.language.process.TokenScript
 
TokenType - Enum in com.yahoo.language.process
An enumeration of token types.
TokenType(int) - Constructor for enum com.yahoo.language.process.TokenType
 
toLowerCase(String) - Static method in class com.yahoo.language.LinguisticsCase
The lower casing method to use in Vespa when doing language independent processing of natural language data.
Transformer - Interface in com.yahoo.language.process
Interface for providers of text transformations such as accent removal.
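
Accent removal, the transformation named in the Transformer description above, is commonly approximated by decomposing the text and dropping combining marks. The sketch below shows that approach with JDK classes only. It is not the Transformer implementation: the class name AccentDropSketch is hypothetical, and unlike accentDrop(String, Language) it takes no language argument.

```java
import java.text.Normalizer;

// Hypothetical illustration, not part of com.yahoo.language.process.
final class AccentDropSketch {

    /** Decomposes to NFD, then removes all combining marks (Unicode category M). */
    static String accentDrop(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        System.out.println(accentDrop("caf\u00E9"));            // prints "cafe"
        System.out.println(accentDrop("\u00C5ngstr\u00F6m"));   // prints "Angstrom"
    }
}
```

Letters without a canonical decomposition, such as "ø", pass through unchanged under this approach, which is one reason a language-aware transformer may behave differently.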

U

UNKNOWN - Static variable in class com.yahoo.language.LocaleFactory
 

V

value - Variable in enum com.yahoo.language.process.StemMode
 
value - Variable in enum com.yahoo.language.process.TokenType
 
valueOf(String) - Static method in enum com.yahoo.language.Language
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum com.yahoo.language.Linguistics.Component
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum com.yahoo.language.process.StemMode
Returns the enum constant of this type with the specified name.
valueOf(int) - Static method in enum com.yahoo.language.process.StemMode
Deprecated.
valueOf(String) - Static method in enum com.yahoo.language.process.TokenScript
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum com.yahoo.language.process.TokenType
Returns the enum constant of this type with the specified name.
valueOf(int) - Static method in enum com.yahoo.language.process.TokenType
Translates this from the int code representation returned from TokenType.getValue()
values() - Static method in enum com.yahoo.language.Language
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum com.yahoo.language.Linguistics.Component
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum com.yahoo.language.process.StemMode
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum com.yahoo.language.process.TokenScript
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum com.yahoo.language.process.TokenType
Returns an array containing the constants of this enum type, in the order they are declared.

Copyright © 2018. All rights reserved.