Interface WordEmbedding
- All Known Implementing Classes:
TrainableWordEmbedding
Manages NDArray representations of words.
A word embedding maps words to an NDArray that attempts to represent the key ideas in
the words. Each value along the embedding dimension can represent a different piece of
meaning, such as young-old or object-living.
These word embeddings can be used in two different ways in models. First, they can be used
purely as a preprocessing step for the model, which is a requirement for most models that take
text as input. In this case the embedding itself is not trained. For this use case, use
embedWord(NDManager, String).
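As a rough illustration of this first use case, the sketch below looks words up in a fixed table. It is a simplified stand-in and not the library's implementation: FixedEmbeddingSketch is a hypothetical class, float[] takes the place of NDArray, the NDManager argument is omitted, and the vector values are made up, so that the example runs with the JDK alone.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a pretrained, non-trainable word embedding.
// float[] replaces NDArray and no NDManager is used (both are simplifications).
class FixedEmbeddingSketch {
    private final Map<String, float[]> vectors = new HashMap<>();

    FixedEmbeddingSketch() {
        // Made-up 2-D vectors, for illustration only.
        vectors.put("cat", new float[] {0.9f, 0.1f});
        vectors.put("dog", new float[] {0.8f, 0.2f});
    }

    // Mirrors the shape of embedWord(NDManager, String), minus the manager.
    // The documented method declares EmbeddingException; an unchecked
    // exception is used here only to keep the sketch short.
    float[] embedWord(String word) {
        float[] vector = vectors.get(word);
        if (vector == null) {
            throw new IllegalArgumentException("No embedding for word: " + word);
        }
        return vector.clone(); // defensive copy: the table itself is never modified
    }

    public static void main(String[] args) {
        FixedEmbeddingSketch embedding = new FixedEmbeddingSketch();
        System.out.println(Arrays.toString(embedding.embedWord("cat")));
    }
}
```

Because the table is never updated, the same word always yields the same vector, which is what makes this usable as a pure preprocessing step.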
In the second option, the embedding is trained with standard deep learning techniques
so that it better fits the current dataset. This case needs two methods. First, call
preprocessWordToEmbed(String) within your dataset. Then, the first step in your model should be
to call embedWord(NDManager, long).
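The two-step flow for the trainable case can be sketched as follows. This is an assumption-laden illustration, not the library's code: TrainableEmbeddingSketch is a hypothetical class, float[] stands in for NDArray, the NDManager argument is omitted, and mapping unknown words to index 0 is a choice made only for this example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a trainable embedding: one table row per word index.
// float[] replaces NDArray and no NDManager is used (both are simplifications).
class TrainableEmbeddingSketch {
    private final List<String> vocabulary = Arrays.asList("<unk>", "cat", "dog");

    // One row per vocabulary entry; training would update these values.
    private final float[][] table = {
        {0.0f, 0.0f},
        {0.9f, 0.1f},
        {0.8f, 0.2f}
    };

    // Step 1 (dataset side): mirrors preprocessWordToEmbed(String), word -> index.
    long preprocessWordToEmbed(String word) {
        int index = vocabulary.indexOf(word);
        return index >= 0 ? index : 0; // assumption: unknown words map to index 0
    }

    // Step 2 (model side): mirrors embedWord(NDManager, long), index -> vector.
    float[] embedWord(long index) {
        return table[(int) index];
    }

    public static void main(String[] args) {
        TrainableEmbeddingSketch embedding = new TrainableEmbeddingSketch();
        long index = embedding.preprocessWordToEmbed("dog"); // done in the dataset
        float[] vector = embedding.embedWord(index);         // first step of the model
        System.out.println(index + " -> " + Arrays.toString(vector));
    }
}
```

Splitting the work this way keeps string handling in the dataset, so the model only sees numeric indices and the lookup table can be updated during training.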
-
Method Summary
Modifier and Type | Method | Description
NDArray | embedWord(NDArray index) | Embeds the word after it has been preprocessed using preprocessWordToEmbed(String).
default NDArray | embedWord(NDManager manager, long index) | Embeds the word after it has been preprocessed using preprocessWordToEmbed(String).
default NDArray | embedWord(NDManager manager, String word) | Embeds a word.
long | preprocessWordToEmbed(String word) | Pre-processes the word to embed into an index to pass into the model.
String | unembedWord(NDArray word) | Returns the closest matching word for the given embedding.
boolean | vocabularyContains(String word) | Returns whether an embedding exists for a word.
-
Method Details
-
vocabularyContains
boolean vocabularyContains(String word)
Returns whether an embedding exists for a word.
- Parameters:
word - the word to check
- Returns:
true if an embedding exists
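One possible use of this check is to filter tokens before embedding them. The sketch below is illustrative only: VocabularyCheckSketch, its knownWords helper, and the two-word vocabulary are hypothetical and not part of the documented interface.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical class that mimics only the vocabularyContains(String) method.
class VocabularyCheckSketch {
    // Made-up vocabulary, for illustration only.
    private final Set<String> vocabulary = new HashSet<>(Arrays.asList("cat", "dog"));

    // Mirrors the shape of vocabularyContains(String).
    boolean vocabularyContains(String word) {
        return vocabulary.contains(word);
    }

    // Hypothetical helper: keeps only the tokens that have an embedding,
    // preserving their original order.
    List<String> knownWords(List<String> tokens) {
        List<String> known = new ArrayList<>();
        for (String token : tokens) {
            if (vocabularyContains(token)) {
                known.add(token);
            }
        }
        return known;
    }

    public static void main(String[] args) {
        VocabularyCheckSketch embedding = new VocabularyCheckSketch();
        System.out.println(embedding.knownWords(Arrays.asList("cat", "bird", "dog")));
    }
}
```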
-
preprocessWordToEmbed
long preprocessWordToEmbed(String word)
Pre-processes the word to embed into an index to pass into the model. Make sure to call
embedWord(NDManager, long) after this.
- Parameters:
word - the word to embed
- Returns:
the index of the word, ready to embed
-
embedWord
default NDArray embedWord(NDManager manager, String word) throws EmbeddingException
Embeds a word.
- Parameters:
manager - the manager for the embedding array
word - the word to embed
- Returns:
the embedded word
- Throws:
EmbeddingException - if there is an error while trying to embed
-
embedWord
default NDArray embedWord(NDManager manager, long index) throws EmbeddingException
Embeds the word after it has been preprocessed using preprocessWordToEmbed(String).
- Parameters:
manager - the manager for the embedding array
index - the index of the word to embed
- Returns:
the embedded word
- Throws:
EmbeddingException - if there is an error while trying to embed
-
embedWord
NDArray embedWord(NDArray index) throws EmbeddingException
Embeds the word after it has been preprocessed using preprocessWordToEmbed(String).
- Parameters:
index - the index of the word to embed
- Returns:
the embedded word
- Throws:
EmbeddingException - if there is an error while trying to embed
-
unembedWord
String unembedWord(NDArray word)
Returns the closest matching word for the given embedding.
- Parameters:
word - the word embedding to find the matching string word for
- Returns:
a word similar to the passed-in embedding
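The documentation does not say how "closest" is measured. As one plausible reading, the sketch below does a nearest-neighbor search using squared Euclidean distance. The distance metric, the UnembedSketch class, the made-up vectors, and the use of float[] in place of NDArray are all assumptions for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in that mimics only unembedWord; float[] replaces NDArray.
class UnembedSketch {
    private final Map<String, float[]> vectors = new LinkedHashMap<>();

    UnembedSketch() {
        // Made-up 2-D vectors, for illustration only.
        vectors.put("cat", new float[] {0.9f, 0.1f});
        vectors.put("car", new float[] {0.1f, 0.9f});
    }

    // Mirrors the shape of unembedWord(NDArray): embedding -> closest word.
    // Assumption: "closest" means smallest squared Euclidean distance.
    String unembedWord(float[] embedding) {
        String best = null;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, float[]> entry : vectors.entrySet()) {
            float[] candidate = entry.getValue();
            double distance = 0.0;
            for (int i = 0; i < candidate.length; i++) {
                double diff = candidate[i] - embedding[i];
                distance += diff * diff;
            }
            if (distance < bestDistance) {
                bestDistance = distance;
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        UnembedSketch embedding = new UnembedSketch();
        System.out.println(embedding.unembedWord(new float[] {0.8f, 0.3f}));
    }
}
```

A linear scan like this is fine for a small vocabulary; other metrics such as cosine similarity would be equally reasonable under the same description.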
-