Class Tokenizer
- All Implemented Interfaces:
Closeable, AutoCloseable
- Direct Known Subclasses:
CharTokenizer, ChineseTokenizer, CJKTokenizer, ClassicTokenizer, KeywordTokenizer, Lucene43EdgeNGramTokenizer, Lucene43NGramTokenizer, NGramTokenizer, PathHierarchyTokenizer, PatternTokenizer, ReversePathHierarchyTokenizer, StandardTokenizer, UAX29URLEmailTokenizer, WikipediaTokenizer
This is an abstract class; subclasses must override TokenStream.incrementToken().
NOTE: Subclasses overriding TokenStream.incrementToken() must call AttributeSource.clearAttributes() before setting attributes.
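The reason for the NOTE above can be illustrated with a small stand-alone model. This is only a sketch: ModelAttributes, ModelWordTokenizer and the "term"/"type" keys are hypothetical stand-ins invented for illustration and are not Lucene classes or attributes. It assumes a tokenizer that sets some attributes only for some tokens, and shows how a value from the previous token leaks into the next one when attributes are not cleared first.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ClearAttributesDemo {

    // Hypothetical stand-in for an attribute source; NOT Lucene's AttributeSource.
    static class ModelAttributes {
        final Map<String, String> values = new LinkedHashMap<>();

        // Models clearAttributes(): every attribute returns to its default (here: absent).
        void clearAttributes() {
            values.clear();
        }
    }

    // Hypothetical tokenizer model: splits on single spaces and tags numeric tokens.
    static class ModelWordTokenizer extends ModelAttributes {
        private final String[] words;
        private final boolean clearFirst;
        private int next = 0;

        ModelWordTokenizer(String text, boolean clearFirst) {
            this.words = text.split(" ");
            this.clearFirst = clearFirst;
        }

        // Models incrementToken(): returns false once the input is exhausted.
        boolean incrementToken() {
            if (next >= words.length) {
                return false;
            }
            if (clearFirst) {
                clearAttributes(); // the step the NOTE requires, before setting anything
            }
            String word = words[next++];
            values.put("term", word);
            if (word.matches("\\d+")) {
                values.put("type", "<NUM>"); // only set for numeric tokens
            }
            return true;
        }
    }

    // Collects "term/type" for each token; a missing type is shown as "-".
    static List<String> run(String text, boolean clearFirst) {
        ModelWordTokenizer t = new ModelWordTokenizer(text, clearFirst);
        List<String> out = new ArrayList<>();
        while (t.incrementToken()) {
            out.add(t.values.get("term") + "/" + t.values.getOrDefault("type", "-"));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run("42 apples", true));  // [42/<NUM>, apples/-]
        System.out.println(run("42 apples", false)); // [42/<NUM>, apples/<NUM>]
    }
}
```

In the second run the token "apples" still carries the numeric type of the token before it, which is the kind of stale state that clearing attributes first is meant to prevent.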
-
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource
AttributeSource.AttributeFactory, AttributeSource.State
-
Method Summary
Modifier and Type | Method | Description
void | close() | Releases resources associated with this stream.
void | reset() | This method is called by a consumer before it begins consumption using TokenStream.incrementToken().
final void | setReader(Reader input) | Expert: Set a new reader on the Tokenizer.
Methods inherited from class org.apache.lucene.analysis.TokenStream
end, incrementToken
Methods inherited from class org.apache.lucene.util.AttributeSource
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState, toString
-
Method Details
-
close
Releases resources associated with this stream.
If you override this method, always call super.close(), otherwise some internal state will not be correctly reset (e.g., Tokenizer will throw IllegalStateException on reuse).
NOTE: The default implementation closes the input Reader, so be sure to call super.close() when overriding this method.
- Specified by: close in interface AutoCloseable
- Specified by: close in interface Closeable
- Overrides: close in class TokenStream
- Throws: IOException
-
setReader
Expert: Set a new reader on the Tokenizer. Typically, an analyzer (in its tokenStream method) will use this to re-use a previously created tokenizer.
- Throws: IOException
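The reuse pattern described for setReader can be sketched with a stand-alone model. ModelTokenizer below is a hypothetical class written for illustration, not Lucene's Tokenizer, and its internals are an assumption: it supposes that a reader handed to setReader only becomes active once reset is called, and that close returns the instance to a state where a new reader may be set. The call order in consume (setReader, reset, incrementToken until false, end, close) follows the consumer sequence implied by the method descriptions on this page.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class ReuseWorkflowDemo {

    // Hypothetical model of a reusable tokenizer; NOT Lucene's Tokenizer.
    static class ModelTokenizer {
        private Reader pending; // set by setReader(), not yet active (assumed design)
        private Reader input;   // active reader; null outside reset()..close()
        private String term;    // stands in for a term attribute

        void setReader(Reader reader) {
            if (input != null) {
                throw new IllegalStateException("close() call missing");
            }
            pending = reader;
        }

        void reset() {
            if (pending == null) {
                throw new IllegalStateException("setReader() call missing");
            }
            input = pending;
            pending = null;
        }

        // Emits one whitespace-delimited token per call; false when input is exhausted.
        boolean incrementToken() throws IOException {
            if (input == null) {
                throw new IllegalStateException("reset() call missing");
            }
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = input.read()) != -1) {
                if (Character.isWhitespace(c)) {
                    if (sb.length() > 0) {
                        break; // end of the current token
                    }
                } else {
                    sb.append((char) c);
                }
            }
            term = sb.toString();
            return sb.length() > 0;
        }

        void end() {
            // Nothing to finalize in this model.
        }

        void close() throws IOException {
            if (input != null) {
                input.close();
            }
            input = null; // further use requires setReader() and reset() again
        }

        String term() {
            return term;
        }
    }

    // One full pass over `text`, reusing the tokenizer instance passed in.
    static List<String> consume(ModelTokenizer t, String text) {
        List<String> out = new ArrayList<>();
        try {
            t.setReader(new StringReader(text));
            t.reset();
            while (t.incrementToken()) {
                out.add(t.term());
            }
            t.end();
            t.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        ModelTokenizer t = new ModelTokenizer(); // created once
        System.out.println(consume(t, "hello world"));     // [hello, world]
        System.out.println(consume(t, "second document")); // [second, document]
    }
}
```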
-
reset
Description copied from class: TokenStream
This method is called by a consumer before it begins consumption using TokenStream.incrementToken().
Resets this stream to a clean state. Stateful implementations must implement this method so that they can be reused, just as if they had been created fresh.
If you override this method, always call super.reset(), otherwise some internal state will not be correctly reset (e.g., Tokenizer will throw IllegalStateException on further usage).
- Overrides: reset in class TokenStream
- Throws: IOException
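The requirement to call super.reset() can be illustrated with a minimal model of a stateful subclass. ModelStream and CountingStream are hypothetical classes made up for this sketch and are not part of Lucene. The base class is assumed to track readiness in a flag that only its own reset() sets, and the callSuper switch exists purely to show the failure case side by side.

```java
class ResetOverrideDemo {

    // Hypothetical base class modelling the documented contract; NOT Lucene's Tokenizer.
    static class ModelStream {
        private boolean ready = false;

        void reset() {
            ready = true;
        }

        void close() {
            ready = false;
        }

        // Guard for subclasses to call before producing a token.
        void checkReady() {
            if (!ready) {
                throw new IllegalStateException("reset() missing or super.reset() not called");
            }
        }
    }

    // Stateful subclass: emits at most `limit` tokens per pass.
    static class CountingStream extends ModelStream {
        private final int limit;
        private final boolean callSuper; // false simulates a forgotten super.reset()
        private int emitted = 0;

        CountingStream(int limit, boolean callSuper) {
            this.limit = limit;
            this.callSuper = callSuper;
        }

        @Override
        void reset() {
            if (callSuper) {
                super.reset(); // lets the base class reset its own internal state
            }
            emitted = 0; // subclass state back to "as if created fresh"
        }

        boolean incrementToken() {
            checkReady();
            if (emitted >= limit) {
                return false;
            }
            emitted++;
            return true;
        }
    }

    // Runs one full pass and returns the number of tokens produced.
    static int countTokens(CountingStream s) {
        s.reset();
        int n = 0;
        while (s.incrementToken()) {
            n++;
        }
        s.close();
        return n;
    }

    public static void main(String[] args) {
        CountingStream good = new CountingStream(3, true);
        System.out.println(countTokens(good)); // 3
        System.out.println(countTokens(good)); // 3 again, because reset() cleared the counter

        try {
            countTokens(new CountingStream(3, false));
        } catch (IllegalStateException e) {
            System.out.println("failed as expected: " + e.getMessage());
        }
    }
}
```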
-