Interface WordEmbedding
All Known Implementing Classes:
TrainableWordEmbedding
public interface WordEmbedding
A class to manage 1-D NDArray representations of words.

A word embedding maps each word to an NDArray that attempts to represent the key ideas in the word. Each value along the embedding dimension can represent a different piece of meaning, such as young-old or object-living.

Word embeddings can be used in models in two different ways. First, they can be used purely to preprocess the input to the model, a step that most models taking text as input require. In this case the embedding is not trained. For this use case, use embedWord(NDManager, String).

Second, the embedding can be trained with standard deep learning techniques so that it better fits the current dataset. This case needs two methods: first, call preprocessWordToEmbed(String) within your dataset; then, as the first step in your model, call embedWord(NDManager, long).
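The two usage paths described above can be sketched with a toy stand-in. This is not the DJL implementation: the class name ToyWordEmbedding is hypothetical, float[] takes the place of NDArray, the NDManager argument is omitted, and throwing on an unknown word is an assumed policy.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical toy stand-in: float[] replaces NDArray and no NDManager is used.
final class ToyWordEmbedding {
    private final Map<String, Long> indexByWord = new LinkedHashMap<>();
    private final float[][] table; // one row per vocabulary word

    ToyWordEmbedding(String[] words, float[][] vectors) {
        if (words.length != vectors.length) {
            throw new IllegalArgumentException("words and vectors must have the same length");
        }
        for (int i = 0; i < words.length; i++) {
            indexByWord.put(words[i], (long) i);
        }
        this.table = vectors;
    }

    boolean vocabularyContains(String word) {
        return indexByWord.containsKey(word);
    }

    // Trainable path, step 1: called inside the dataset, word -> index.
    long preprocessWordToEmbed(String word) {
        Long index = indexByWord.get(word);
        if (index == null) {
            // Assumed policy for this sketch; the source does not specify it.
            throw new IllegalArgumentException("Unknown word: " + word);
        }
        return index;
    }

    // Trainable path, step 2: first step of the model, index -> vector.
    float[] embedWord(long index) {
        return table[Math.toIntExact(index)].clone();
    }

    // Preprocessing-only path: word -> vector in a single call.
    float[] embedWord(String word) {
        return embedWord(preprocessWordToEmbed(word));
    }

    public static void main(String[] args) {
        ToyWordEmbedding embedding = new ToyWordEmbedding(
                new String[] {"cat", "dog"},
                new float[][] {{0.9f, 0.1f}, {0.8f, 0.3f}});
        float[] direct = embedding.embedWord("dog");        // first way
        long index = embedding.preprocessWordToEmbed("dog"); // second way, in the dataset
        float[] viaIndex = embedding.embedWord(index);       // second way, in the model
        System.out.println(index + " " + Arrays.equals(direct, viaIndex)); // prints "1 true"
    }
}
```

In the toy version both paths return the same vector. The difference in the trainable case is where the lookup happens: the dataset only produces indices, so the lookup table stays inside the model where it can be updated during training.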
-
Method Summary
Modifier and Type | Method | Description
NDArray | embedWord(NDArray index) | Embeds the word after it has been preprocessed using preprocessWordToEmbed(String).
default NDArray | embedWord(NDManager manager, long index) | Embeds the word after it has been preprocessed using preprocessWordToEmbed(String).
default NDArray | embedWord(NDManager manager, java.lang.String word) | Embeds a word.
long | preprocessWordToEmbed(java.lang.String word) | Pre-processes the word to embed into an index to pass into the model.
java.lang.String | unembedWord(NDArray word) | Returns the closest matching word for the given embedding.
boolean | vocabularyContains(java.lang.String word) | Returns whether an embedding exists for a word.
-
Method Detail
-
vocabularyContains
boolean vocabularyContains(java.lang.String word)
Returns whether an embedding exists for a word.
Parameters:
word - the word to check
Returns:
true if an embedding exists
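One possible use of this check is to drop out-of-vocabulary tokens before embedding. The sketch below is an assumption about usage, not something the interface prescribes: the class VocabularyFilter and the method knownWords are hypothetical names, and a Predicate<String> stands in for the vocabularyContains method.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical helper: keeps only the tokens that the vocabulary check accepts.
final class VocabularyFilter {
    static List<String> knownWords(Predicate<String> vocabularyContains, List<String> tokens) {
        List<String> known = new ArrayList<>();
        for (String token : tokens) {
            if (vocabularyContains.test(token)) {
                known.add(token);
            }
        }
        return known;
    }

    public static void main(String[] args) {
        // A plain Set plays the role of the embedding's vocabulary here.
        Set<String> vocabulary = Set.of("cat", "dog");
        List<String> tokens = List.of("cat", "bird", "dog");
        System.out.println(knownWords(vocabulary::contains, tokens)); // prints "[cat, dog]"
    }
}
```

Skipping unknown words is only one policy. Mapping them to a dedicated unknown token is another common choice, and which one fits depends on the model.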
-
preprocessWordToEmbed
long preprocessWordToEmbed(java.lang.String word)
Pre-processes the word to embed into an index to pass into the model. Make sure to call embedWord(NDManager, long) after this.
Parameters:
word - the word to embed
Returns:
the index of the word, ready to embed
-
embedWord
default NDArray embedWord(NDManager manager, java.lang.String word) throws EmbeddingException
Embeds a word.
Parameters:
manager - the manager for the embedding array
word - the word to embed
Returns:
the embedded word
Throws:
EmbeddingException - if there is an error while trying to embed
-
embedWord
default NDArray embedWord(NDManager manager, long index) throws EmbeddingException
Embeds the word after it has been preprocessed using preprocessWordToEmbed(String).
Parameters:
manager - the manager for the embedding array
index - the index of the word to embed
Returns:
the embedded word
Throws:
EmbeddingException - if there is an error while trying to embed
-
embedWord
NDArray embedWord(NDArray index) throws EmbeddingException
Embeds the word after it has been preprocessed using preprocessWordToEmbed(String).
Parameters:
index - the index of the word to embed
Returns:
the embedded word
Throws:
EmbeddingException - if there is an error while trying to embed
-
unembedWord
java.lang.String unembedWord(NDArray word)
Returns the closest matching word for the given embedding.
Parameters:
word - the word embedding to find the matching string word for
Returns:
a word similar to the passed-in embedding
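A "closest matching word" lookup can be sketched as a nearest-neighbor search over the vocabulary. The version below is illustrative only: the class NearestWordLookup is hypothetical, float[] replaces NDArray, and cosine similarity is an assumed metric, since the source does not say how closeness is measured.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of un-embedding by nearest neighbor under cosine similarity.
final class NearestWordLookup {
    // Cosine similarity of two vectors of equal length; assumes neither is all zeros.
    static double cosine(float[] a, float[] b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the vocabulary word whose vector is most similar to the query,
    // or null if the vocabulary is empty.
    static String unembed(Map<String, float[]> vocabulary, float[] query) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, float[]> entry : vocabulary.entrySet()) {
            double score = cosine(entry.getValue(), query);
            if (score > bestScore) {
                bestScore = score;
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, float[]> vocabulary = new LinkedHashMap<>();
        vocabulary.put("cat", new float[] {1.0f, 0.0f});
        vocabulary.put("dog", new float[] {0.0f, 1.0f});
        System.out.println(unembed(vocabulary, new float[] {0.9f, 0.2f})); // prints "cat"
    }
}
```

A linear scan like this costs one similarity computation per vocabulary word, which is fine for small vocabularies but would likely need an index structure for large ones.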