public interface WordEmbedding
NDArray representations of words.
A word embedding maps words to an NDArray that attempts to represent the key ideas in the words. Each value along the embedding dimension can represent a different piece of meaning, such as young-old or object-living.
These word embeddings can be used in models in two different ways. First, they can be used purely to preprocess the model's input, a step that most models taking text as input require. In this case the embedding is not trained. For this use case, use embedWord(NDManager, String).
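The preprocessing-only use case can be pictured as a lookup in a fixed table. The sketch below is a plain-Java stand-in, not the library's implementation: the class FrozenEmbeddingSketch is hypothetical, float[] replaces NDArray, no NDManager is involved, and the two-dimensional vectors are made-up values chosen only for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in (not a library class): a frozen word-to-vector table. */
final class FrozenEmbeddingSketch {
    // Assumption: float[] stands in for NDArray; the values are invented.
    private final Map<String, float[]> table = new HashMap<>();

    FrozenEmbeddingSketch() {
        table.put("puppy", new float[] {0.9f, 0.8f});
        table.put("rock", new float[] {0.1f, -0.7f});
    }

    /** Mirrors vocabularyContains(String): true if the word has a vector. */
    boolean vocabularyContains(String word) {
        return table.containsKey(word);
    }

    /** Mirrors the String-based embedWord: a direct lookup, nothing is trained. */
    float[] embedWord(String word) {
        float[] vector = table.get(word);
        if (vector == null) {
            throw new IllegalArgumentException("No embedding for: " + word);
        }
        return vector.clone(); // copy, so callers cannot modify the frozen table
    }

    public static void main(String[] args) {
        FrozenEmbeddingSketch embedding = new FrozenEmbeddingSketch();
        for (String word : new String[] {"puppy", "rock", "cloud"}) {
            if (embedding.vocabularyContains(word)) {
                System.out.println(word + " -> " + Arrays.toString(embedding.embedWord(word)));
            } else {
                System.out.println(word + " is out of vocabulary");
            }
        }
    }
}
```

Checking vocabularyContains before embedding is one way to handle unknown words. The real interface signals embedding failures with EmbeddingException instead.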
Second, the embedding can be trained with standard deep learning techniques to better fit the current dataset. This case needs two methods. First, call preprocessWordToEmbed(String) within your dataset. Then, as the first step in your model, call embedWord(NDManager, long).
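The two-step flow for a trainable embedding can be sketched as follows. This is a plain-Java illustration under stated assumptions, not the library's code: TrainableEmbeddingSketch and its update method are hypothetical, float[] replaces NDArray, the NDManager argument is omitted, and the "training" is a single hand-written gradient step.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in (not a library class): index lookup plus a trainable table. */
final class TrainableEmbeddingSketch {
    private final Map<String, Long> indexOf = new HashMap<>();
    private final List<float[]> rows = new ArrayList<>();

    TrainableEmbeddingSketch(String[] vocabulary, int dimension) {
        for (String word : vocabulary) {
            indexOf.put(word, (long) rows.size());
            rows.add(new float[dimension]); // rows start at zero in this sketch
        }
    }

    /** Dataset side: mirrors preprocessWordToEmbed(String), word to index. */
    long preprocessWordToEmbed(String word) {
        Long index = indexOf.get(word);
        if (index == null) {
            throw new IllegalArgumentException("Unknown word: " + word);
        }
        return index;
    }

    /** Model side: mirrors the index-based embedWord, index to current vector. */
    float[] embedWord(long index) {
        return rows.get((int) index).clone();
    }

    /** Assumed toy training step: row = row - learningRate * gradient. */
    void update(long index, float[] gradient, float learningRate) {
        float[] row = rows.get((int) index);
        for (int i = 0; i < row.length; i++) {
            row[i] -= learningRate * gradient[i];
        }
    }

    public static void main(String[] args) {
        TrainableEmbeddingSketch embedding =
                new TrainableEmbeddingSketch(new String[] {"young", "old"}, 2);
        long index = embedding.preprocessWordToEmbed("old"); // done in the dataset
        embedding.update(index, new float[] {1f, -2f}, 0.5f); // done during training
        float[] vector = embedding.embedWord(index);           // first step of the model
        System.out.println(index + " -> " + vector[0] + ", " + vector[1]);
    }
}
```

Splitting the work this way keeps string handling in the dataset, so the model only sees numeric indices and the table it looks them up in can change during training.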
Modifier and Type | Method and Description |
---|---|
NDArray | embedWord(NDArray index): Embeds the word after it has been preprocessed using preprocessWordToEmbed(String). |
default NDArray | embedWord(NDManager manager, long index): Embeds the word after it has been preprocessed using preprocessWordToEmbed(String). |
default NDArray | embedWord(NDManager manager, java.lang.String word): Embeds a word. |
long | preprocessWordToEmbed(java.lang.String word): Pre-processes the word to embed into an array to pass into the model. |
java.lang.String | unembedWord(NDArray word): Returns the closest matching word for the given embedding. |
boolean | vocabularyContains(java.lang.String word): Returns whether an embedding exists for a word. |
boolean vocabularyContains(java.lang.String word)
Parameters:
word - the word to check

long preprocessWordToEmbed(java.lang.String word)
Make sure to call embedWord(NDManager, long) after this.
Parameters:
word - the word to embed

default NDArray embedWord(NDManager manager, java.lang.String word) throws EmbeddingException
Parameters:
manager - the manager for the embedding array
word - the word to embed
Throws:
EmbeddingException - if there is an error while trying to embed

default NDArray embedWord(NDManager manager, long index) throws EmbeddingException
Embeds the word after it has been preprocessed using preprocessWordToEmbed(String).
Parameters:
manager - the manager for the embedding array
index - the index of the word to embed
Throws:
EmbeddingException - if there is an error while trying to embed

NDArray embedWord(NDArray index) throws EmbeddingException
Embeds the word after it has been preprocessed using preprocessWordToEmbed(String).
Parameters:
index - the index of the word to embed
Throws:
EmbeddingException - if there is an error while trying to embed

java.lang.String unembedWord(NDArray word)
Parameters:
word - the word embedding to find the matching string word for
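
One way to picture what "closest matching word" could mean for unembedWord is a nearest-neighbor search over the vocabulary. The sketch below assumes squared Euclidean distance, which the documentation above does not specify. UnembedSketch is a hypothetical plain-Java class, float[] replaces NDArray, and the vectors are made-up values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in (not a library class): nearest-neighbor unembedding. */
final class UnembedSketch {
    // Assumption: float[] stands in for NDArray; the values are invented.
    private final Map<String, float[]> table = new LinkedHashMap<>();

    UnembedSketch() {
        table.put("puppy", new float[] {0.9f, 0.8f});
        table.put("rock", new float[] {0.1f, -0.7f});
    }

    /** Returns the vocabulary word whose vector is nearest to the given embedding. */
    String unembedWord(float[] embedding) {
        String closest = null;
        double smallest = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, float[]> entry : table.entrySet()) {
            float[] candidate = entry.getValue();
            if (candidate.length != embedding.length) {
                throw new IllegalArgumentException("Dimension mismatch");
            }
            double distance = 0.0; // squared Euclidean distance (assumed metric)
            for (int i = 0; i < candidate.length; i++) {
                double diff = candidate[i] - embedding[i];
                distance += diff * diff;
            }
            if (distance < smallest) {
                smallest = distance;
                closest = entry.getKey();
            }
        }
        return closest;
    }

    public static void main(String[] args) {
        UnembedSketch embedding = new UnembedSketch();
        System.out.println(embedding.unembedWord(new float[] {0.8f, 0.7f}));
        System.out.println(embedding.unembedWord(new float[] {0.0f, -0.5f}));
    }
}
```

A real implementation may use a different metric, such as cosine similarity, or an exact-match lookup. This sketch only illustrates the idea of mapping a vector back to a word.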