Package ai.djl.modality.nlp.preprocess
Class TextCleaner
java.lang.Object
    ai.djl.modality.nlp.preprocess.TextCleaner
All Implemented Interfaces:
TextProcessor
public class TextCleaner extends java.lang.Object implements TextProcessor
Removes or replaces characters that meet a supplied condition.
Constructor Summary
Constructor / Description
TextCleaner(java.util.function.Function<java.lang.Character,java.lang.Boolean> condition)
Removes each character that meets the supplied condition.
TextCleaner(java.util.function.Function<java.lang.Character,java.lang.Boolean> condition, char replace)
Replaces each character that meets the supplied condition.
Method Summary
Modifier and Type / Method / Description
java.util.List<java.lang.String>
preprocess(java.util.List<java.lang.String> tokens)
Applies the preprocessing defined to the given input tokens.
Constructor Detail
TextCleaner
public TextCleaner(java.util.function.Function<java.lang.Character,java.lang.Boolean> condition)
Removes each character that meets the supplied condition.
Parameters:
condition - a function that returns true when a character meets the condition
TextCleaner
public TextCleaner(java.util.function.Function<java.lang.Character,java.lang.Boolean> condition, char replace)
Replaces each character that meets the supplied condition.
Parameters:
condition - a function that returns true when a character meets the condition
replace - the character substituted for each character that meets the condition
Method Detail
preprocess
public java.util.List<java.lang.String> preprocess(java.util.List<java.lang.String> tokens)
Applies the preprocessing defined to the given input tokens.
Specified by:
preprocess in interface TextProcessor
Parameters:
tokens - the tokens created after the input text is tokenized
Returns:
the preprocessed tokens
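The remove and replace behaviors described above can be illustrated with a minimal, dependency-free sketch. TextCleanerSketch is a hypothetical stand-in written for this example, not DJL's own source; it mirrors only the two constructor signatures and the preprocess signature documented here. It assumes that each token is filtered character by character and that a token left empty after removal is kept in the list, which this page does not confirm.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

/** Hypothetical stand-in for illustration only; not the DJL TextCleaner class. */
public class TextCleanerSketch {

    private final Function<Character, Boolean> condition;
    // null means "remove matching characters" (an assumption of this sketch)
    private final Character replace;

    /** Removes each character for which the condition returns true. */
    public TextCleanerSketch(Function<Character, Boolean> condition) {
        this.condition = condition;
        this.replace = null;
    }

    /** Replaces each character for which the condition returns true. */
    public TextCleanerSketch(Function<Character, Boolean> condition, char replace) {
        this.condition = condition;
        this.replace = replace;
    }

    /** Applies the cleaning rule to every token and returns a new list. */
    public List<String> preprocess(List<String> tokens) {
        List<String> result = new ArrayList<>(tokens.size());
        for (String token : tokens) {
            StringBuilder sb = new StringBuilder(token.length());
            for (char c : token.toCharArray()) {
                if (condition.apply(c)) {
                    if (replace != null) {
                        sb.append(replace.charValue()); // substitute the character
                    }
                    // otherwise drop the character
                } else {
                    sb.append(c); // keep characters that do not match
                }
            }
            result.add(sb.toString());
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("he1lo", "w0rld!");
        // Remove digits
        System.out.println(new TextCleanerSketch(c -> Character.isDigit(c)).preprocess(tokens));
        // Replace digits with an underscore
        System.out.println(new TextCleanerSketch(c -> Character.isDigit(c), '_').preprocess(tokens));
    }
}
```

With the real class, the equivalent calls would presumably be new TextCleaner(condition).preprocess(tokens) and new TextCleaner(condition, replace).preprocess(tokens), using the signatures listed on this page.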