Interface TextEmbedding

  • All Known Implementing Classes:
    ModelZooTextEmbedding, SimpleTextEmbedding, TrainableTextEmbedding

    public interface TextEmbedding
    An interface for managing 1-D NDArray representations of multiple words.

    Unlike a WordEmbedding, a text embedding does not have to be applied to each word independently.

    A text embedding maps text to an NDArray that attempts to represent the key ideas in the words. Each value along the embedding dimension can represent a different aspect of meaning, such as young-old or object-living.

    Text embeddings can be used in models in two ways. First, they can be used purely to preprocess the model's input, a step that most models taking text as input require. In this case, the embedding is not trained. For this use case, use embedText(ai.djl.ndarray.NDManager, java.util.List<java.lang.String>).

    Second, the embedding can be trained with standard deep learning techniques so that it better fits the current dataset. This case requires two methods: first, call preprocessTextToEmbed(List) within your dataset; then make embedText(NDManager, long[]) the first step in your model.
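
    The two-step flow can be illustrated with a minimal, self-contained sketch. It does not use DJL: ToyTextEmbedding is a hypothetical class, plain float arrays stand in for NDArray, no NDManager is involved, and an unknown token raises IllegalArgumentException rather than EmbeddingException. It only shows the shape of the idea, in which preprocessing turns tokens into indices and embedding looks those indices up in a table.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical toy stand-in for a text embedding; plain arrays replace NDArray. */
final class ToyTextEmbedding {
    private final Map<String, Integer> vocabulary = new HashMap<>();
    private final float[][] table; // one embedding row per vocabulary entry

    ToyTextEmbedding(List<String> words, float[][] table) {
        for (int i = 0; i < words.size(); i++) {
            vocabulary.put(words.get(i), i);
        }
        this.table = table;
    }

    /** Step 1 (dataset side): map each token to the index of its embedding row. */
    long[] preprocessTextToEmbed(List<String> text) {
        long[] indices = new long[text.size()];
        for (int i = 0; i < indices.length; i++) {
            Integer index = vocabulary.get(text.get(i));
            if (index == null) {
                throw new IllegalArgumentException("Unknown token: " + text.get(i));
            }
            indices[i] = index;
        }
        return indices;
    }

    /** Step 2 (model side): look up one embedding row per index. */
    float[][] embedText(long[] indices) {
        float[][] embedded = new float[indices.length][];
        for (int i = 0; i < indices.length; i++) {
            embedded[i] = table[(int) indices[i]].clone();
        }
        return embedded;
    }

    /** One-shot path for the preprocessing-only use case. */
    float[][] embedText(List<String> text) {
        return embedText(preprocessTextToEmbed(text));
    }

    public static void main(String[] args) {
        ToyTextEmbedding embedding = new ToyTextEmbedding(
                List.of("hello", "world"),
                new float[][] {{1f, 0f}, {0f, 1f}});
        long[] indices = embedding.preprocessTextToEmbed(List.of("world", "hello"));
        System.out.println(Arrays.toString(indices));
        System.out.println(Arrays.deepToString(embedding.embedText(indices)));
    }
}
```

    In the trainable case, the index array is what the dataset would hand to the model, and the table lookup is the part whose values would be updated during training.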

    • Method Detail

      • preprocessTextToEmbed

        long[] preprocessTextToEmbed(java.util.List<java.lang.String> text)
        Preprocesses the text to embed into an array to pass into the model.

        Make sure to call embedText(NDManager, long[]) after this.

        Parameters:
        text - the text to embed
        Returns:
        the indices of the text, ready to embed
      • embedText

        default NDArray embedText(NDManager manager,
                                  java.util.List<java.lang.String> text)
                           throws EmbeddingException
        Embeds a text.
        Parameters:
        manager - the manager for the embedding array
        text - the text to embed
        Returns:
        the embedded text
        Throws:
        EmbeddingException - if there is an error while trying to embed
      • unembedText

        java.util.List<java.lang.String> unembedText(NDArray textEmbedding)
                                              throws EmbeddingException
        Returns the closest matching text for a given embedding.
        Parameters:
        textEmbedding - the text embedding to find the matching text for
        Returns:
        the text that most closely matches the given embedding
        Throws:
        EmbeddingException - if the input cannot be unembedded
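
        One way to read "closest matching text" is as a nearest-neighbor search over the embedding table. The sketch below assumes that reading and is not taken from any implementing class: ToyUnembedder is a hypothetical helper, float arrays stand in for NDArray, squared Euclidean distance is an arbitrary choice of similarity, and IllegalArgumentException plays the role of EmbeddingException. Actual implementations may use a different matching strategy.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: recovers tokens by nearest-neighbor search over a table. */
final class ToyUnembedder {
    private ToyUnembedder() {}

    /** Returns, for each embedded row, the word whose table row is closest to it. */
    static List<String> unembedText(List<String> words, float[][] table, float[][] embedded) {
        if (table.length == 0 || table.length != words.size()) {
            throw new IllegalArgumentException("Table must have one row per word");
        }
        List<String> result = new ArrayList<>();
        for (float[] vector : embedded) {
            int best = -1;
            double bestDistance = Double.POSITIVE_INFINITY;
            for (int i = 0; i < table.length; i++) {
                if (table[i].length != vector.length) {
                    throw new IllegalArgumentException("Dimension mismatch");
                }
                // Squared Euclidean distance between the table row and the input row.
                double distance = 0;
                for (int d = 0; d < vector.length; d++) {
                    double diff = table[i][d] - vector[d];
                    distance += diff * diff;
                }
                if (distance < bestDistance) {
                    bestDistance = distance;
                    best = i;
                }
            }
            result.add(words.get(best));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> words = List.of("hello", "world");
        float[][] table = {{1f, 0f}, {0f, 1f}};
        float[][] embedded = {{0.1f, 0.9f}, {0.8f, 0.2f}};
        System.out.println(unembedText(words, table, embedded));
    }
}
```

        Rejecting inputs whose dimension does not match the table mirrors the documented failure case, where an embedding that cannot be unembedded results in an exception.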