Package ai.djl.modality.nlp.preprocess
Interface TextProcessor
- All Known Subinterfaces:
- Tokenizer
- All Known Implementing Classes:
- BertFullTokenizer, BertTokenizer, HyphenNormalizer, LambdaProcessor, LowerCaseConvertor, PunctuationSeparator, SimpleTokenizer, TextCleaner, TextTerminator, TextTruncator, UnicodeNormalizer, WordpieceTokenizer
public interface TextProcessor
TextProcessor applies a pre-processing step to input tokens for natural language
 applications. Multiple TextProcessor implementations can be applied to the same input,
 and the order in which they are applied can change the final output.
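The effect of ordering can be illustrated with a minimal sketch. It does not use the DJL classes: the nested Processor interface is a local stand-in that mirrors the preprocess(List<String> tokens) method so the example runs without the library, and LOWER, DROP_THE, and applyAll are hypothetical names introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class ProcessorOrderSketch {

    // Local stand-in mirroring the single method of TextProcessor,
    // so the sketch compiles without the DJL dependency.
    interface Processor {
        List<String> preprocess(List<String> tokens);
    }

    // Hypothetical processor: lower-cases every token.
    static final Processor LOWER = tokens -> {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            out.add(t.toLowerCase(Locale.ROOT));
        }
        return out;
    };

    // Hypothetical processor: drops tokens that are exactly "the" (case-sensitive).
    static final Processor DROP_THE = tokens -> {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            if (!t.equals("the")) {
                out.add(t);
            }
        }
        return out;
    };

    // Applies the processors left to right, feeding each output into the next.
    static List<String> applyAll(List<String> tokens, Processor... processors) {
        List<String> current = tokens;
        for (Processor p : processors) {
            current = p.preprocess(current);
        }
        return current;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("The", "cat", "saw", "the", "dog");
        System.out.println(applyAll(input, LOWER, DROP_THE)); // [cat, saw, dog]
        System.out.println(applyAll(input, DROP_THE, LOWER)); // [the, cat, saw, dog]
    }
}
```

Lower-casing first lets the filter catch the capitalized "The"; filtering first lets it slip through. The same two processors give different results depending only on their order.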
Method Summary
Modifier and Type    Method                             Description
List<String>         preprocess(List<String> tokens)    Applies the preprocessing defined to the given input tokens.
Method Details
preprocess
List<String> preprocess(List<String> tokens)
Applies the preprocessing defined to the given input tokens.
- Parameters:
- tokens - the tokens created after the input text is tokenized
- Returns:
- the preprocessed tokens
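As a rough illustration of the contract (a list of tokens in, a list of preprocessed tokens out), the sketch below shows what a custom implementation might look like. The nested Processor interface is again a local stand-in with the same method shape, used so the example runs without DJL on the classpath; against the real library the class would declare `implements TextProcessor` instead. BlankTokenFilter is a hypothetical name and is not one of the implementing classes listed above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BlankTokenFilterSketch {

    // Local stand-in with the same method shape as TextProcessor.
    interface Processor {
        List<String> preprocess(List<String> tokens);
    }

    // Hypothetical processor: trims each token and drops those that end up empty.
    static class BlankTokenFilter implements Processor {
        @Override
        public List<String> preprocess(List<String> tokens) {
            List<String> out = new ArrayList<>(tokens.size());
            for (String token : tokens) {
                String trimmed = token.trim();
                if (!trimmed.isEmpty()) {
                    out.add(trimmed);
                }
            }
            // A new list is returned; the caller's list is not modified.
            return out;
        }
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList(" hello ", "", "world", "   ");
        System.out.println(new BlankTokenFilter().preprocess(tokens)); // [hello, world]
    }
}
```

Returning a fresh list rather than mutating the argument is a design choice of this sketch, not something the interface documentation requires; it keeps the processor safe to use on immutable input lists.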
 
 