Package ai.djl.modality.nlp.preprocess
Interface TextProcessor
- All Known Subinterfaces:
Tokenizer
- All Known Implementing Classes:
BertFullTokenizer, BertTokenizer, HyphenNormalizer, LambdaProcessor, LowerCaseConvertor, PunctuationSeparator, SimpleTokenizer, TextCleaner, TextTerminator, TextTruncator, UnicodeNormalizer, WordpieceTokenizer
public interface TextProcessor
TextProcessor allows applying pre-processing to input tokens for natural language applications. Multiple implementations of TextProcessor can be applied on the same input. The order of application of different implementations of TextProcessor can make a difference in the final output.
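The note about ordering can be illustrated with a small, self-contained sketch. It does not use the DJL classes: `Processor`, `LOWER`, `DROP_THE`, and `applyInOrder` are hypothetical stand-ins that only mirror the shape of the `preprocess` method, so the example runs without the library on the classpath. The same two steps give different results depending on which runs first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class ProcessorOrderDemo {

    // Hypothetical stand-in with the same method shape as TextProcessor.
    interface Processor {
        List<String> preprocess(List<String> tokens);
    }

    // Lower-cases every token.
    static final Processor LOWER = tokens -> {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            out.add(t.toLowerCase(Locale.ROOT));
        }
        return out;
    };

    // Drops tokens that are exactly "the" (case-sensitive on purpose).
    static final Processor DROP_THE = tokens -> {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            if (!t.equals("the")) {
                out.add(t);
            }
        }
        return out;
    };

    // Applies the processors left to right, feeding each output to the next.
    static List<String> applyInOrder(List<String> tokens, Processor... processors) {
        List<String> current = tokens;
        for (Processor p : processors) {
            current = p.preprocess(current);
        }
        return current;
    }

    public static void main(String[] args) {
        List<String> input = List.of("The", "cat", "saw", "the", "Dog");
        System.out.println(applyInOrder(input, LOWER, DROP_THE)); // [cat, saw, dog]
        System.out.println(applyInOrder(input, DROP_THE, LOWER)); // [the, cat, saw, dog]
    }
}
```

Lower-casing first lets the filter catch the capitalized "The"; filtering first lets it slip through and get lower-cased afterwards.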
Method Summary
All Methods | Instance Methods | Abstract Methods
Modifier and Type: java.util.List<java.lang.String>
Method: preprocess(java.util.List<java.lang.String> tokens)
Description: Applies the preprocessing defined to the given input tokens.
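As a rough sketch of what an implementation of this single method might look like, the example below defines a processor that trims each token and drops the ones left empty. `Processor` and `BlankTokenFilter` are hypothetical names, not DJL classes; the local interface only copies the `preprocess` signature so the code runs on its own. Returning a new list instead of modifying the input is a choice made for this sketch, not something the summary above specifies.

```java
import java.util.List;
import java.util.stream.Collectors;

class BlankTokenFilterDemo {

    // Hypothetical stand-in mirroring the preprocess signature in the method summary.
    interface Processor {
        List<String> preprocess(List<String> tokens);
    }

    // Hypothetical processor: trims each token and drops those left empty.
    static class BlankTokenFilter implements Processor {
        @Override
        public List<String> preprocess(List<String> tokens) {
            return tokens.stream()
                    .map(String::trim)
                    .filter(t -> !t.isEmpty())
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) {
        List<String> tokens = List.of(" Hello", "", "world ", "   ");
        System.out.println(new BlankTokenFilter().preprocess(tokens)); // [Hello, world]
    }
}
```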