public interface WordEmbedding
Manages NDArray representations of words.
A word embedding maps a word to an NDArray that attempts to represent the key ideas in
that word. Each value along the embedding dimension can represent a different piece of
meaning, such as young-old or object-living.
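To make the idea of "dimensions of meaning" concrete, here is a minimal sketch of a hand-written embedding table. It is not part of this API: it uses plain float[] in place of NDArray so that it runs without DJL on the classpath, and the class name ToyEmbeddingTable, the three words, and their values are all hypothetical. Real embeddings are learned, and their dimensions rarely map to such clean concepts.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Toy word-to-vector table; float[] stands in for NDArray (assumption for illustration). */
final class ToyEmbeddingTable {
    // Hypothetical axes: index 0 runs from young (-1) to old (+1),
    // index 1 runs from object (-1) to living (+1).
    private final Map<String, float[]> vectors = new HashMap<>();

    ToyEmbeddingTable() {
        vectors.put("puppy", new float[] {-0.9f, 0.9f});       // young, living
        vectors.put("grandmother", new float[] {0.9f, 0.9f});  // old, living
        vectors.put("fossil", new float[] {0.9f, -0.9f});      // old, object
    }

    /** Mirrors the role of vocabularyContains: is there a vector for this word? */
    boolean vocabularyContains(String word) {
        return vectors.containsKey(word);
    }

    /** Returns a copy of the vector for a word, or throws if the word is unknown. */
    float[] embedWord(String word) {
        float[] vector = vectors.get(word);
        if (vector == null) {
            throw new IllegalArgumentException("No embedding for: " + word);
        }
        return vector.clone();
    }

    public static void main(String[] args) {
        ToyEmbeddingTable table = new ToyEmbeddingTable();
        System.out.println(Arrays.toString(table.embedWord("puppy")));
    }
}
```

In this toy table, "puppy" and "grandmother" agree on the second axis and differ on the first, which is the kind of structure an embedding is meant to expose to a model.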
These word embeddings can be used in models in two different ways. First, they can be used
purely to preprocess the input to the model, a step that most models taking text as input
require. In this case, the embedding is not trained. For this use case, use
embedWord(ai.djl.ndarray.NDManager, java.lang.String).
In the second option, the embedding can be trained using standard deep learning techniques
to better handle the current dataset. This case requires two methods. First, call
preprocessWordToEmbed(NDManager, String) within your dataset. Then, make the first step in
your model a call to embedWord(NDArray).
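The two-step flow can be sketched as follows. This is an illustration under stated assumptions, not this library's implementation: it uses an int index and float[] rows in place of NDArray so that it runs without DJL, the class name TwoStepEmbeddingSketch and the applyGradient helper are hypothetical, and it assumes the one-step overload simply composes the two steps.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of the preprocess-then-embed split; int and float[] stand in for NDArray. */
final class TwoStepEmbeddingSketch {
    private final Map<String, Integer> indexOf = new HashMap<>();
    private final float[][] weights; // one trainable row per vocabulary word

    TwoStepEmbeddingSketch(String[] vocabulary, int dimensions) {
        for (int i = 0; i < vocabulary.length; i++) {
            indexOf.put(vocabulary[i], i);
        }
        weights = new float[vocabulary.length][dimensions]; // starts at all zeros
    }

    /** Step 1 (dataset side): turn the word into something the model can consume. */
    int preprocessWordToEmbed(String word) {
        Integer index = indexOf.get(word);
        if (index == null) {
            throw new IllegalArgumentException("Word not in vocabulary: " + word);
        }
        return index;
    }

    /** Step 2 (model side): look up the current, possibly trained, row. */
    float[] embedWord(int wordIndex) {
        return weights[wordIndex].clone();
    }

    /** One-step convenience for the untrained use case (assumed to compose both steps). */
    float[] embedWord(String word) {
        return embedWord(preprocessWordToEmbed(word));
    }

    /** Hypothetical stand-in for a training update: row -= learningRate * gradient. */
    void applyGradient(int wordIndex, float[] gradient, float learningRate) {
        for (int d = 0; d < gradient.length; d++) {
            weights[wordIndex][d] -= learningRate * gradient[d];
        }
    }
}
```

Splitting the lookup this way keeps string handling in the dataset, while the lookup that carries trainable weights stays inside the model where an optimizer can update it.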
Modifier and Type | Method and Description |
---|---|
NDArray | embedWord(NDArray word) Embeds the word after it has been preprocessed using preprocessWordToEmbed(NDManager, String). |
default NDArray | embedWord(NDManager manager, java.lang.String word) Embeds a word. |
NDArray | preprocessWordToEmbed(NDManager manager, java.lang.String word) Preprocesses the word to embed into an NDArray to pass into the model. |
java.lang.String | unembedWord(NDArray wordEmbedding) Returns the closest matching word for a given embedding. |
boolean | vocabularyContains(java.lang.String word) Returns whether an embedding exists for a word. |

boolean vocabularyContains(java.lang.String word)
Returns whether an embedding exists for a word.
Parameters:
word - the word to check

NDArray preprocessWordToEmbed(NDManager manager, java.lang.String word)
Preprocesses the word to embed into an NDArray to pass into the model.
Make sure to call embedWord(NDArray) after this.
Parameters:
manager - the manager for the new array
word - the word to embed

default NDArray embedWord(NDManager manager, java.lang.String word) throws EmbeddingException
Embeds a word.
Parameters:
manager - the manager for the embedding array
word - the word to embed
Throws:
EmbeddingException - if there is an error while trying to embed

NDArray embedWord(NDArray word) throws EmbeddingException
Embeds the word after it has been preprocessed using preprocessWordToEmbed(NDManager, String).
Parameters:
word - the word to embed
Throws:
EmbeddingException - if there is an error while trying to embed

java.lang.String unembedWord(NDArray wordEmbedding)
Returns the closest matching word for a given embedding.
Parameters:
wordEmbedding - the word embedding to find the matching string word for
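
One way to read "closest matching word" is a nearest-neighbor search over the vocabulary. The sketch below illustrates that reading only. The documentation above does not say which distance an implementation uses, so squared Euclidean distance is an assumption here. The class name NearestWordSketch and the put helper are also hypothetical, and float[] stands in for NDArray so the code runs without DJL.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Nearest-neighbor lookup from a vector back to a word; float[] stands in for NDArray. */
final class NearestWordSketch {
    private final Map<String, float[]> vectors = new LinkedHashMap<>();

    /** Hypothetical helper that registers a word and its vector. */
    void put(String word, float[] vector) {
        vectors.put(word, vector.clone());
    }

    /** Returns the word whose vector is closest to the query, or null if the table is empty. */
    String unembedWord(float[] wordEmbedding) {
        String best = null;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, float[]> entry : vectors.entrySet()) {
            double distance = squaredDistance(entry.getValue(), wordEmbedding);
            if (distance < bestDistance) {
                bestDistance = distance;
                best = entry.getKey();
            }
        }
        return best;
    }

    // Squared Euclidean distance (assumed metric); ordering matches plain Euclidean distance.
    private static double squaredDistance(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Dimension mismatch");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }
}
```

A linear scan like this costs time proportional to the vocabulary size for each query, which is acceptable for small vocabularies but is usually replaced by an indexed search for large ones.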