Class SimpleTokenizer

    • Constructor Summary

      Constructors
      Constructor                                  Description
      SimpleTokenizer()                            Creates an instance of SimpleTokenizer with the default delimiter.
      SimpleTokenizer(java.lang.String delimiter)  Creates an instance of SimpleTokenizer with the given delimiter.
    • Method Summary

      Modifier and Type                 Method                                                  Description
      java.lang.String                  buildSentence(java.util.List<java.lang.String> tokens)  Combines a list of tokens to form a sentence.
      java.util.List<java.lang.String>  tokenize(java.lang.String sentence)                     Breaks down the given sentence into a list of tokens that can be represented by embeddings.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • SimpleTokenizer

        public SimpleTokenizer(java.lang.String delimiter)
        Creates an instance of SimpleTokenizer with the given delimiter.
        Parameters:
        delimiter - the delimiter used to separate tokens
      • SimpleTokenizer

        public SimpleTokenizer()
        Creates an instance of SimpleTokenizer with the default delimiter.
    • Method Detail

      • tokenize

        public java.util.List<java.lang.String> tokenize(java.lang.String sentence)
        Breaks down the given sentence into a list of tokens that can be represented by embeddings.
        Specified by:
        tokenize in interface Tokenizer
        Parameters:
        sentence - the sentence to tokenize
        Returns:
        a List of tokens
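
The splitting step described for tokenize can be illustrated with a minimal, self-contained sketch. It does not use the library class: TokenizeSketch and its two-argument tokenize method are hypothetical stand-ins, and the assumption that the sentence is split on every literal occurrence of the delimiter is not confirmed by this page.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class TokenizeSketch {
    // Hypothetical stand-in for tokenize(String); not the library's code.
    static List<String> tokenize(String sentence, String delimiter) {
        // String.split takes a regular expression, so the delimiter is quoted
        // to make it match literally (an assumption of this sketch).
        return Arrays.asList(sentence.split(Pattern.quote(delimiter)));
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Hello world again", " ")); // prints [Hello, world, again]
        System.out.println(tokenize("a.b.c", "."));             // prints [a, b, c]
    }
}
```

Quoting the delimiter is a design choice of the sketch: without it, a delimiter such as "." would be read as a regular expression that matches every character. Whether SimpleTokenizer treats its delimiter literally or as a pattern is not stated here.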
      • buildSentence

        public java.lang.String buildSentence(java.util.List<java.lang.String> tokens)
        Combines a list of tokens to form a sentence.
        Specified by:
        buildSentence in interface Tokenizer
        Parameters:
        tokens - the List of tokens
        Returns:
        the sentence built from the given tokens
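
As a rough illustration of buildSentence, the sketch below joins tokens back into a single string. BuildSentenceSketch and its two-argument buildSentence method are hypothetical stand-ins rather than the library class, and joining the tokens with the same delimiter used for splitting is an assumption, not something this page states.

```java
import java.util.Arrays;
import java.util.List;

class BuildSentenceSketch {
    // Hypothetical stand-in for buildSentence(List<String>); not the library's code.
    static String buildSentence(List<String> tokens, String delimiter) {
        // Assumption: tokens are joined with the delimiter placed between them.
        return String.join(delimiter, tokens);
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("Hello", "world", "again");
        System.out.println(buildSentence(tokens, " ")); // prints Hello world again
        System.out.println(buildSentence(tokens, ",")); // prints Hello,world,again
    }
}
```

Under these assumptions, building a sentence from the output of tokenize with the same delimiter returns the original sentence, provided the sentence has no leading, trailing, or repeated delimiters.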