Class TextCleaner

java.lang.Object
ai.djl.modality.nlp.preprocess.TextCleaner
All Implemented Interfaces:
TextProcessor

public class TextCleaner extends Object implements TextProcessor
Removes or replaces the characters in each token that meet a given condition.
  • Constructor Details

    • TextCleaner

      public TextCleaner(Function<Character,Boolean> condition)
      Creates a TextCleaner that removes every character meeting the supplied condition.
      Parameters:
      condition - a function that returns true for each character to be removed
    • TextCleaner

      public TextCleaner(Function<Character,Boolean> condition, char replace)
      Creates a TextCleaner that replaces every character meeting the supplied condition.
      Parameters:
      condition - a function that returns true for each character to be replaced
      replace - the character substituted for each character that meets the condition
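
      Both constructors take the condition as a Function<Character,Boolean>. The sketch below shows a few ways such a condition might be written using only java.lang.Character. The class name CleanerConditions and its field names are hypothetical and are not part of DJL.

```java
import java.util.function.Function;

/** Hypothetical holder for example conditions; not part of DJL. */
public class CleanerConditions {

    // True for decimal digits such as '7'.
    public static final Function<Character, Boolean> IS_DIGIT =
            c -> Character.isDigit(c);

    // True for spaces, tabs, line breaks and other Java whitespace.
    public static final Function<Character, Boolean> IS_WHITESPACE =
            c -> Character.isWhitespace(c);

    // True for anything that is neither a letter nor a digit (e.g. punctuation).
    public static final Function<Character, Boolean> NOT_LETTER_OR_DIGIT =
            c -> !Character.isLetterOrDigit(c);
}
```

      Any of these could be passed as the condition argument of either constructor.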
  • Method Details

    • preprocess

      public List<String> preprocess(List<String> tokens)
      Applies the configured character removal or replacement to each of the given input tokens.
      Specified by:
      preprocess in interface TextProcessor
      Parameters:
      tokens - the tokens created after the input text is tokenized
      Returns:
      the preprocessed tokens
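
The behaviour described above can be illustrated with a small stand-alone sketch. TextCleanerSketch is a hypothetical class written for this example. It mirrors the documented constructors and preprocess signature but is not DJL's source, and details such as keeping tokens that become empty are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical stand-in mirroring the documented behaviour; not DJL's implementation. */
public final class TextCleanerSketch {

    private final Function<Character, Boolean> condition;
    private final boolean hasReplacement;
    private final char replacement;

    /** Removes every character that meets the condition. */
    public TextCleanerSketch(Function<Character, Boolean> condition) {
        this.condition = condition;
        this.hasReplacement = false;
        this.replacement = 0;
    }

    /** Replaces every character that meets the condition with the given character. */
    public TextCleanerSketch(Function<Character, Boolean> condition, char replacement) {
        this.condition = condition;
        this.hasReplacement = true;
        this.replacement = replacement;
    }

    /** Returns a new list with each token cleaned character by character. */
    public List<String> preprocess(List<String> tokens) {
        List<String> cleaned = new ArrayList<>(tokens.size());
        for (String token : tokens) {
            StringBuilder sb = new StringBuilder(token.length());
            for (char c : token.toCharArray()) {
                if (condition.apply(c)) {
                    if (hasReplacement) {
                        sb.append(replacement); // substitute the matching character
                    }
                    // otherwise drop the matching character
                } else {
                    sb.append(c); // keep characters that do not match
                }
            }
            // Assumption: a token that becomes empty is kept as an empty string.
            cleaned.add(sb.toString());
        }
        return cleaned;
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("it's", "3pm", "ok");
        Function<Character, Boolean> notLetter = c -> !Character.isLetter(c);

        System.out.println(new TextCleanerSketch(notLetter).preprocess(tokens));      // [its, pm, ok]
        System.out.println(new TextCleanerSketch(notLetter, '_').preprocess(tokens)); // [it_s, _pm, ok]
    }
}
```

The one-argument form shortens tokens, while the two-argument form preserves each token's length, which may matter when character offsets are used later.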