Index

A B C D F G I K L M N O P R S T W 
All Classes and Interfaces|All Packages|Constant Field Values

A

accept() - Method in class org.codelibs.analysis.en.ReloadableStopFilter
 
accept() - Method in class org.codelibs.analysis.ja.CharTypeFilter
 
accept() - Method in class org.codelibs.analysis.StopTokenFilter
 
accept(String, String) - Method in class org.codelibs.analysis.ja.StopTokenPrefixFilter
 
accept(String, String) - Method in class org.codelibs.analysis.ja.StopTokenSuffixFilter
 
accept(String, String) - Method in class org.codelibs.analysis.StopTokenFilter
Determines whether the given text should be accepted based on a comparison with a stop word.
add(char) - Method in class org.codelibs.analysis.en.FlexiblePorterStemmer
Add a character to the word being stemmed.
advance() - Method in class org.codelibs.analysis.ja.KanjiNumberFilter.NumberBuffer
Advances the position index by one.
ALPHANUM - Static variable in class org.codelibs.analysis.en.AlphaNumWordFilter
Token type constant for alphanumeric tokens
AlphaNumWordFilter - Class in org.codelibs.analysis.en
Token filter that concatenates adjacent alphanumeric and numeric tokens.
AlphaNumWordFilter(TokenStream) - Constructor for class org.codelibs.analysis.en.AlphaNumWordFilter
Creates a new AlphaNumWordFilter.

B

BufferedCharFilter - Class in org.codelibs.analysis
Abstract base class for character filters that buffer input before processing.
BufferedCharFilter(Reader) - Constructor for class org.codelibs.analysis.BufferedCharFilter
Creates a new BufferedCharFilter.
bufferedInput - Variable in class org.codelibs.analysis.BufferedCharFilter
The reader containing the processed buffered input

C

charAt(int) - Method in class org.codelibs.analysis.ja.KanjiNumberFilter.NumberBuffer
Returns the character at the specified index.
CharTypeFilter - Class in org.codelibs.analysis.ja
Token filter that accepts tokens based on character type criteria.
CharTypeFilter(TokenStream, boolean, boolean, boolean) - Constructor for class org.codelibs.analysis.ja.CharTypeFilter
Creates a new CharTypeFilter.
concatenateTerms(AttributeSource.State) - Method in class org.codelibs.analysis.ConcatenationFilter
Concatenates the current token with the previous token.
ConcatenationFilter - Class in org.codelibs.analysis
Abstract base class for token filters that concatenate adjacent tokens.
ConcatenationFilter(TokenStream) - Constructor for class org.codelibs.analysis.ConcatenationFilter
Creates a new ConcatenationFilter.
current - Variable in class org.codelibs.analysis.ConcatenationFilter
State for storing lookahead tokens
current - Variable in class org.codelibs.analysis.en.AlphaNumWordFilter
Current state of the token stream for lookahead processing

D

DEFAULT_MAX_TOKEN_LENGTH - Static variable in class org.codelibs.analysis.en.AlphaNumWordFilter
Default maximum token length (255 characters)

F

FlexiblePorterStemFilter - Class in org.codelibs.analysis.en
Token filter that applies the Porter stemming algorithm with configurable steps.
FlexiblePorterStemFilter(TokenStream, boolean, boolean, boolean, boolean, boolean, boolean) - Constructor for class org.codelibs.analysis.en.FlexiblePorterStemFilter
Creates a new FlexiblePorterStemFilter with configurable stemming steps.
FlexiblePorterStemmer - Class in org.codelibs.analysis.en
Stemmer implementing the Porter Stemming Algorithm. The Stemmer class transforms a word into its root form.
FlexiblePorterStemmer() - Constructor for class org.codelibs.analysis.en.FlexiblePorterStemmer
Creates a new FlexiblePorterStemmer with all steps enabled.
FlexiblePorterStemmer(boolean, boolean, boolean, boolean, boolean, boolean) - Constructor for class org.codelibs.analysis.en.FlexiblePorterStemmer
Creates a new FlexiblePorterStemmer with configurable steps.

G

get() - Method in interface org.codelibs.analysis.ja.PosConcatenationFilter.PartOfSpeechSupplier
Retrieves the part-of-speech tag for the current token.
getMaxTokenLength() - Method in class org.codelibs.analysis.en.AlphaNumWordFilter
Gets the maximum token length for concatenated tokens.
getResultBuffer() - Method in class org.codelibs.analysis.en.FlexiblePorterStemmer
Returns a reference to a character buffer containing the results of the stemming process.
getResultLength() - Method in class org.codelibs.analysis.en.FlexiblePorterStemmer
Returns the length of the word resulting from the stemming process.

I

ignoreCase - Variable in class org.codelibs.analysis.StopTokenFilter
Whether to ignore case when matching stop words
incrementToken() - Method in class org.codelibs.analysis.ConcatenationFilter
 
incrementToken() - Method in class org.codelibs.analysis.en.AlphaNumWordFilter
 
incrementToken() - Method in class org.codelibs.analysis.en.FlexiblePorterStemFilter
 
incrementToken() - Method in class org.codelibs.analysis.ja.KanjiNumberFilter
 
isArabicNumeral(char) - Method in class org.codelibs.analysis.ja.KanjiNumberFilter
Arabic numeral predicate.
isConcatenated() - Method in class org.codelibs.analysis.ConcatenationFilter
Determines if the next token should be concatenated with the current token.
isConcatenated() - Method in class org.codelibs.analysis.ja.NumberConcatenationFilter
 
isConcatenated() - Method in class org.codelibs.analysis.ja.PatternConcatenationFilter
 
isConcatenated() - Method in class org.codelibs.analysis.ja.PosConcatenationFilter
 
isKeyword() - Method in class org.codelibs.analysis.en.ReloadableKeywordMarkerFilter
 
isNumeral(char) - Method in class org.codelibs.analysis.ja.KanjiNumberFilter
Numeral predicate
isNumeral(String) - Method in class org.codelibs.analysis.ja.KanjiNumberFilter
Numeral predicate
isNumeralPunctuation(char) - Method in class org.codelibs.analysis.ja.KanjiNumberFilter
Numeral punctuation predicate
isNumeralPunctuation(String) - Method in class org.codelibs.analysis.ja.KanjiNumberFilter
Numeral punctuation predicate
isTarget() - Method in class org.codelibs.analysis.ConcatenationFilter
Determines if the current token should be processed for concatenation.
isTarget() - Method in class org.codelibs.analysis.ja.NumberConcatenationFilter
 
isTarget() - Method in class org.codelibs.analysis.ja.PatternConcatenationFilter
 
isTarget() - Method in class org.codelibs.analysis.ja.PosConcatenationFilter
 
IterationMarkCharFilter - Class in org.codelibs.analysis.ja
Character filter that expands Japanese iteration marks (odoriji).
IterationMarkCharFilter(Reader) - Constructor for class org.codelibs.analysis.ja.IterationMarkCharFilter
Creates a new IterationMarkCharFilter.
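To illustrate what expanding iteration marks means, here is a minimal, library-independent sketch. It is not the IterationMarkCharFilter implementation: the class name IterationMarkSketch and its methods are hypothetical, and it assumes the simplest rule, in which an unvoiced iteration mark (々, ゝ, ヽ) is replaced by the character emitted just before it. The voiced marks (ゞ, ヾ) and multi-character repetition are left out.

```java
final class IterationMarkSketch {
    // Marks handled by this sketch (an assumption, not the library's list):
    // U+3005 ideographic, U+309D hiragana, U+30FD katakana iteration mark.
    static boolean isIterationMark(char c) {
        return c == '\u3005' || c == '\u309D' || c == '\u30FD';
    }

    // Replaces each iteration mark with the previously emitted character.
    // A mark at the very start of the input has nothing to repeat and is kept.
    static String expand(CharSequence input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (isIterationMark(c) && out.length() > 0) {
                out.append(out.charAt(out.length() - 1));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

Under these assumptions, 時々 expands to 時時 and こゝろ to こころ.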

K

KanjiNumberFilter - Class in org.codelibs.analysis.ja
Normalizes Japanese numbers
KanjiNumberFilter(TokenStream) - Constructor for class org.codelibs.analysis.ja.KanjiNumberFilter
Creates a new KanjiNumberFilter.
KanjiNumberFilter.NumberBuffer - Class in org.codelibs.analysis.ja
Buffer that holds a Japanese number string and a position index used as a parsed-to marker
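The index lists separate parsing steps for medium units (tens, hundreds, thousands) and large units (ten thousands or larger). The sketch below shows one way such a two-level parse can work on plain strings. It is an illustration under stated assumptions, not the KanjiNumberFilter code: the class KanjiNumberSketch and its methods are hypothetical, it does not use NumberBuffer, and it ignores Arabic numerals, decimals, and numeral punctuation.

```java
final class KanjiNumberSketch {
    // Kanji digits zero through nine: U+3007, U+4E00, U+4E8C, U+4E09, U+56DB,
    // U+4E94, U+516D, U+4E03, U+516B, U+4E5D. The string index is the digit value.
    private static final String DIGITS =
            "\u3007\u4e00\u4e8c\u4e09\u56db\u4e94\u516d\u4e03\u516b\u4e5d";

    // Medium units: U+5341 (10), U+767E (100), U+5343 (1,000). Returns 0 if not a unit.
    static long mediumUnit(char c) {
        switch (c) {
            case '\u5341': return 10L;
            case '\u767e': return 100L;
            case '\u5343': return 1_000L;
            default: return 0L;
        }
    }

    // Large units: U+4E07 (10^4), U+5104 (10^8), U+5146 (10^12). Returns 0 if not a unit.
    static long largeUnit(char c) {
        switch (c) {
            case '\u4e07': return 10_000L;
            case '\u5104': return 100_000_000L;
            case '\u5146': return 1_000_000_000_000L;
            default: return 0L;
        }
    }

    static long parse(String s) {
        long total = 0; // sum of groups already closed by a large unit
        long group = 0; // value accumulated below the next large unit
        long digit = 0; // digits seen since the last unit
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            int d = DIGITS.indexOf(c);
            long medium = mediumUnit(c);
            long large = largeUnit(c);
            if (d >= 0) {
                digit = digit * 10 + d;
            } else if (medium > 0) {
                // A bare unit such as U+5341 alone counts as one times the unit.
                group += (digit == 0 ? 1 : digit) * medium;
                digit = 0;
            } else if (large > 0) {
                group += digit;
                total += (group == 0 ? 1 : group) * large;
                group = 0;
                digit = 0;
            } else {
                throw new IllegalArgumentException("Not a kanji numeral: " + c);
            }
        }
        return total + group + digit;
    }
}
```

With this sketch, 三千二百十五 parses to 3215 and 一億二千万 to 120000000.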

L

length() - Method in class org.codelibs.analysis.ja.KanjiNumberFilter.NumberBuffer
Returns the length of the buffer.

M

MAX_TOKEN_LENGTH_LIMIT - Static variable in class org.codelibs.analysis.en.AlphaNumWordFilter
Maximum allowed token length limit (1MB)
maxTokenLength - Variable in class org.codelibs.analysis.en.AlphaNumWordFilter
Maximum length for concatenated tokens

N

normalizedWords - Variable in class org.codelibs.analysis.StopTokenFilter
Array of stop words to match against (normalized to lowercase if ignoreCase is true)
normalizeNumber(String) - Method in class org.codelibs.analysis.ja.KanjiNumberFilter
Normalizes a Japanese number
NUM - Static variable in class org.codelibs.analysis.en.AlphaNumWordFilter
Token type constant for numeric tokens
NumberBuffer(String) - Constructor for class org.codelibs.analysis.ja.KanjiNumberFilter.NumberBuffer
Creates a new NumberBuffer.
NumberConcatenationFilter - Class in org.codelibs.analysis.ja
A token filter that concatenates tokens containing only numeric characters (digits).
NumberConcatenationFilter(TokenStream, CharArraySet) - Constructor for class org.codelibs.analysis.ja.NumberConcatenationFilter
Constructs a NumberConcatenationFilter with the specified input token stream and word set.

O

offsetAtt - Variable in class org.codelibs.analysis.ConcatenationFilter
The offset attribute for managing token offsets
org.codelibs.analysis - package org.codelibs.analysis
 
org.codelibs.analysis.en - package org.codelibs.analysis.en
 
org.codelibs.analysis.ja - package org.codelibs.analysis.ja
 

P

parseLargeKanjiNumeral(KanjiNumberFilter.NumberBuffer) - Method in class org.codelibs.analysis.ja.KanjiNumberFilter
Parse large kanji numerals (ten thousands or larger)
parseMediumKanjiNumeral(KanjiNumberFilter.NumberBuffer) - Method in class org.codelibs.analysis.ja.KanjiNumberFilter
Parse medium kanji numerals (tens, hundreds or thousands)
PatternConcatenationFilter - Class in org.codelibs.analysis.ja
A token filter that uses regular expression patterns to determine token concatenation behavior.
PatternConcatenationFilter(TokenStream, Pattern, Pattern) - Constructor for class org.codelibs.analysis.ja.PatternConcatenationFilter
Constructs a PatternConcatenationFilter with the specified input token stream and patterns.
PosConcatenationFilter - Class in org.codelibs.analysis.ja
A token filter that determines concatenation behavior based on part-of-speech (POS) tags.
PosConcatenationFilter(TokenStream, Set<String>, PosConcatenationFilter.PartOfSpeechSupplier) - Constructor for class org.codelibs.analysis.ja.PosConcatenationFilter
Constructs a PosConcatenationFilter with the specified input token stream, POS tags, and supplier.
PosConcatenationFilter.PartOfSpeechSupplier - Interface in org.codelibs.analysis.ja
Functional interface that supplies part-of-speech (POS) tag information for the current token.
position() - Method in class org.codelibs.analysis.ja.KanjiNumberFilter.NumberBuffer
Returns the current position index.
processInput(CharSequence) - Method in class org.codelibs.analysis.BufferedCharFilter
Processes the buffered input and returns the transformed character sequence.
processInput(CharSequence) - Method in class org.codelibs.analysis.ja.IterationMarkCharFilter
 
processInput(CharSequence) - Method in class org.codelibs.analysis.ja.ProlongedSoundMarkCharFilter
 
processToken() - Method in class org.codelibs.analysis.ConcatenationFilter
Processes the current token, potentially concatenating it with following tokens.
ProlongedSoundMarkCharFilter - Class in org.codelibs.analysis.ja
A character filter that normalizes various dash and hyphen characters to Japanese prolonged sound marks when they appear after Hiragana, Katakana, or Katakana phonetic extension characters.
ProlongedSoundMarkCharFilter(Reader) - Constructor for class org.codelibs.analysis.ja.ProlongedSoundMarkCharFilter
Constructs a ProlongedSoundMarkCharFilter with the default replacement character (U+30FC).
ProlongedSoundMarkCharFilter(Reader, char) - Constructor for class org.codelibs.analysis.ja.ProlongedSoundMarkCharFilter
Constructs a ProlongedSoundMarkCharFilter with a custom replacement character.
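The normalization described for ProlongedSoundMarkCharFilter can be sketched on plain strings as follows. This is a hedged illustration, not the library's code: the class ProlongedSoundMarkSketch is hypothetical, and the set of dash-like characters in DASHES is an assumption, since the index does not list which characters the filter treats as dashes or hyphens. Only the default replacement U+30FC and the three Unicode blocks come from the descriptions above.

```java
final class ProlongedSoundMarkSketch {
    // Default replacement named in the index: U+30FC (katakana-hiragana prolonged sound mark).
    static final char PROLONGED_SOUND_MARK = '\u30FC';

    // Assumed dash-like characters: hyphen-minus, U+2010 hyphen, U+2015 horizontal bar,
    // U+2212 minus sign, U+FF0D fullwidth hyphen-minus. The real filter may use another set.
    static final String DASHES = "-\u2010\u2015\u2212\uFF0D";

    // True for Hiragana, Katakana, and Katakana Phonetic Extensions characters.
    static boolean isKana(char c) {
        Character.UnicodeBlock block = Character.UnicodeBlock.of(c);
        return block == Character.UnicodeBlock.HIRAGANA
                || block == Character.UnicodeBlock.KATAKANA
                || block == Character.UnicodeBlock.KATAKANA_PHONETIC_EXTENSIONS;
    }

    // Replaces a dash-like character only when the preceding input character is kana.
    static String normalize(CharSequence input, char replacement) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            boolean afterKana = i > 0 && isKana(input.charAt(i - 1));
            out.append(afterKana && DASHES.indexOf(c) >= 0 ? replacement : c);
        }
        return out.toString();
    }
}
```

For example, normalize("コ-ヒ-", PROLONGED_SOUND_MARK) yields コーヒー, while a Latin string such as "a-b" is returned unchanged because the hyphen does not follow kana.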

R

read(char[], int, int) - Method in class org.codelibs.analysis.BufferedCharFilter
 
ReloadableKeywordMarkerFilter - Class in org.codelibs.analysis.en
A keyword marker filter that can dynamically reload its keyword set from a file.
ReloadableKeywordMarkerFilter(TokenStream, Path, long) - Constructor for class org.codelibs.analysis.en.ReloadableKeywordMarkerFilter
Constructs a ReloadableKeywordMarkerFilter with the specified input stream, keyword file path, and reload interval.
ReloadableStopFilter - Class in org.codelibs.analysis.en
A stop word filter that can dynamically reload its stop word set from a file.
ReloadableStopFilter(TokenStream, Path, boolean, long) - Constructor for class org.codelibs.analysis.en.ReloadableStopFilter
Constructs a ReloadableStopFilter with the specified input stream, stop word file path, case sensitivity, and reload interval.
reset() - Method in class org.codelibs.analysis.en.FlexiblePorterStemmer
Resets the stemmer so it can stem another word.
reset() - Method in class org.codelibs.analysis.en.ReloadableKeywordMarkerFilter
 
reset() - Method in class org.codelibs.analysis.en.ReloadableStopFilter
 
reset() - Method in class org.codelibs.analysis.ja.KanjiNumberFilter
 

S

setMaxTokenLength(int) - Method in class org.codelibs.analysis.en.AlphaNumWordFilter
Sets the maximum token length for concatenated tokens.
stem() - Method in class org.codelibs.analysis.en.FlexiblePorterStemmer
Stem the word placed into the Stemmer buffer through calls to add().
stem(char[]) - Method in class org.codelibs.analysis.en.FlexiblePorterStemmer
Stem a word contained in a char[].
stem(char[], int) - Method in class org.codelibs.analysis.en.FlexiblePorterStemmer
Stem a word contained in a leading portion of a char[] array.
stem(char[], int, int) - Method in class org.codelibs.analysis.en.FlexiblePorterStemmer
Stem a word contained in a portion of a char[] array.
stem(int) - Method in class org.codelibs.analysis.en.FlexiblePorterStemmer
Stem the word in the buffer starting at the given offset.
stem(String) - Method in class org.codelibs.analysis.en.FlexiblePorterStemmer
Stem a word provided as a String.
StopTokenFilter - Class in org.codelibs.analysis
Abstract base class for stop token filters that match tokens against a word list.
StopTokenFilter(TokenStream, String[], boolean) - Constructor for class org.codelibs.analysis.StopTokenFilter
Constructs a StopTokenFilter with the specified input stream, stop words, and case sensitivity.
StopTokenPrefixFilter - Class in org.codelibs.analysis.ja
A stop token filter that removes tokens beginning with any of the specified prefix words.
StopTokenPrefixFilter(TokenStream, String[], boolean) - Constructor for class org.codelibs.analysis.ja.StopTokenPrefixFilter
Constructs a StopTokenPrefixFilter with the specified input stream, prefix words, and case sensitivity.
StopTokenSuffixFilter - Class in org.codelibs.analysis.ja
A stop token filter that removes tokens ending with any of the specified suffix words.
StopTokenSuffixFilter(TokenStream, String[], boolean) - Constructor for class org.codelibs.analysis.ja.StopTokenSuffixFilter
Constructs a StopTokenSuffixFilter with the specified input stream, suffix words, and case sensitivity.
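The prefix-based removal described for StopTokenPrefixFilter can be pictured with a small sketch over a plain list of strings. It is an illustration only and does not use the Lucene TokenStream API: the class PrefixStopSketch and the method removeByPrefix are hypothetical. The lowercasing step mirrors the normalizedWords description (stop words normalized to lowercase when ignoreCase is true); whether the library lowercases tokens the same way is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class PrefixStopSketch {
    // Returns the tokens that do not begin with any of the given prefixes.
    static List<String> removeByPrefix(List<String> tokens, String[] prefixes, boolean ignoreCase) {
        // Normalize the prefixes once, up front.
        String[] normalized = new String[prefixes.length];
        for (int i = 0; i < prefixes.length; i++) {
            normalized[i] = ignoreCase ? prefixes[i].toLowerCase(Locale.ROOT) : prefixes[i];
        }
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            String text = ignoreCase ? token.toLowerCase(Locale.ROOT) : token;
            boolean stopped = false;
            for (String prefix : normalized) {
                if (text.startsWith(prefix)) {
                    stopped = true;
                    break;
                }
            }
            if (!stopped) {
                kept.add(token); // keep the original, un-normalized token text
            }
        }
        return kept;
    }
}
```

A suffix variant in the spirit of StopTokenSuffixFilter would swap startsWith for endsWith and leave the rest unchanged.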

T

termAtt - Variable in class org.codelibs.analysis.ConcatenationFilter
The term attribute for accessing and modifying token text
termAtt - Variable in class org.codelibs.analysis.StopTokenFilter
Character term attribute for accessing the current token's text
toString() - Method in class org.codelibs.analysis.en.FlexiblePorterStemmer
After a word has been stemmed, it can be retrieved by toString(), or a reference to the internal buffer can be retrieved by getResultBuffer() and getResultLength(), which is generally more efficient.

W

words - Variable in class org.codelibs.analysis.ja.NumberConcatenationFilter
Set of words used to determine concatenation behavior