Interface InvertedIndex<T extends SequenceElement>
-
- All Superinterfaces:
Serializable
public interface InvertedIndex<T extends SequenceElement> extends Serializable
-
-
Method Summary
Modifier and Type | Method | Description
void | addLabelForDoc(int doc, String label) | Adds a label to the given document
void | addLabelForDoc(int doc, T word) | Adds a label, given as a sequence element, to the given document
void | addLabelsForDoc(int doc, Collection<String> label) | Adds labels to the given document
void | addLabelsForDoc(int doc, List<T> word) | Adds labels, given as sequence elements, to the given document
void | addWordsToDoc(int doc, List<T> words) | Adds words to the given document
void | addWordsToDoc(int doc, List<T> words, String label) | Adds words to the given document, together with a label
void | addWordsToDoc(int doc, List<T> words, Collection<String> label) | Adds words to the given document, together with its labels
void | addWordsToDoc(int doc, List<T> words, T label) | Adds words to the given document, together with a label given as a sequence element
void | addWordsToDocVocabWord(int doc, List<T> words, Collection<T> label) | Adds words to the given document, together with labels given as sequence elements
void | addWordToDoc(int doc, T word) | Adds a word to a document
int[] | allDocs() | Returns a list of all documents
Iterator<List<List<T>>> | batchIter(int batchSize) | Iterates over batches
int | batchSize() | For word vectors, the batch size to train on
void | cleanup() | Cleans up any resources used
Iterator<List<T>> | docs() | Iterates over documents
List<T> | document(int index) | Returns a list of words for a document
int[] | documents(T vocabWord) | Returns the list of documents a vocab word is in
org.nd4j.common.primitives.Pair<List<T>,String> | documentWithLabel(int index) | Returns a list of words for a document and the associated label
org.nd4j.common.primitives.Pair<List<T>,Collection<String>> | documentWithLabels(int index) | Returns a list of words for a document and the associated labels
void | eachDoc(org.nd4j.shade.guava.base.Function<List<T>,Void> func, Executor exec) | Iterates over each document
void | eachDocWithLabel(org.nd4j.shade.guava.base.Function<org.nd4j.common.primitives.Pair<List<T>,String>,Void> func, Executor exec) | Iterates over each document with its label
void | eachDocWithLabels(org.nd4j.shade.guava.base.Function<org.nd4j.common.primitives.Pair<List<T>,Collection<String>>,Void> func, Executor exec) | Iterates over each document with its labels
void | finish() | Finishes saving data
Iterator<List<T>> | miniBatches() | Iterates over mini batches
int | numDocuments() | Returns the number of documents
double | sample() | Sampling for creating mini batches
long | totalWords() | Total number of words in the index
void | unlock() | Unlocks the index
-
-
-
Method Detail
-
batchIter
Iterator<List<List<T>>> batchIter(int batchSize)
Iterates over batches.
Parameters:
batchSize - the batch size
Returns:
an iterator over batches, each batch being a list of documents
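The return type suggests that each element of the iterator is a batch holding up to batchSize documents, and each document is a list of elements. The following is a minimal sketch of that idea over a plain in-memory list. BatchIterSketch is a hypothetical name, String stands in for T, and the slicing behaviour (a shorter final batch) is an assumption, not the behaviour of any particular implementation of this interface.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for illustration; not an implementation of InvertedIndex.
class BatchIterSketch {

    // Returns an iterator whose elements are batches of at most batchSize documents.
    static <T> Iterator<List<List<T>>> batchIter(List<List<T>> docs, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        return new Iterator<List<List<T>>>() {
            private int next = 0; // index of the first document in the next batch

            @Override
            public boolean hasNext() {
                return next < docs.size();
            }

            @Override
            public List<List<T>> next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                int end = Math.min(next + batchSize, docs.size());
                // Copy the slice so the batch is independent of the backing list.
                List<List<T>> batch = new ArrayList<>(docs.subList(next, end));
                next = end;
                return batch;
            }
        };
    }

    public static void main(String[] args) {
        List<List<String>> docs = Arrays.asList(
                Arrays.asList("the", "cat"),
                Arrays.asList("a", "dog"),
                Arrays.asList("one", "bird"));
        Iterator<List<List<String>>> it = batchIter(docs, 2);
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```

With three documents and a batch size of 2, the sketch yields one batch of two documents followed by one batch of one.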
-
unlock
void unlock()
Unlock the index
-
cleanup
void cleanup()
Cleanup any resources used
-
sample
double sample()
Sampling for creating mini batches.
Returns:
the sampling for mini batches
-
miniBatches
Iterator<List<T>> miniBatches()
Iterates over mini batches.
Returns:
the mini batches created by this vectorizer
-
document
List<T> document(int index)
Returns a list of words for a document.
Parameters:
index - the index of the document
Returns:
the words in the document
-
documentWithLabel
org.nd4j.common.primitives.Pair<List<T>,String> documentWithLabel(int index)
Returns a list of words for a document and the associated label.
Parameters:
index - the index of the document
Returns:
a pair of the words in the document and the document's label
-
documentWithLabels
org.nd4j.common.primitives.Pair<List<T>,Collection<String>> documentWithLabels(int index)
Returns a list of words for a document and the associated labels.
Parameters:
index - the index of the document
Returns:
a pair of the words in the document and the document's labels
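As a rough illustration of how a document's words and labels can be stored and read back as a pair, here is a small in-memory sketch. LabelledDocsSketch is a hypothetical class, String stands in for T, and java.util.Map.Entry stands in for org.nd4j.common.primitives.Pair so that the example needs only the standard library. Returning empty collections for an unknown index is an assumption; the interface does not specify this case.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration; not an implementation of InvertedIndex.
class LabelledDocsSketch {
    private final Map<Integer, List<String>> words = new HashMap<>();
    private final Map<Integer, List<String>> labels = new HashMap<>();

    // Appends words and labels to the document with the given index.
    void addWordsToDoc(int doc, List<String> docWords, Collection<String> docLabels) {
        words.computeIfAbsent(doc, k -> new ArrayList<>()).addAll(docWords);
        labels.computeIfAbsent(doc, k -> new ArrayList<>()).addAll(docLabels);
    }

    // Map.Entry plays the role of Pair here: key = words, value = labels.
    Map.Entry<List<String>, Collection<String>> documentWithLabels(int index) {
        List<String> w = words.getOrDefault(index, Collections.emptyList());
        Collection<String> l = labels.getOrDefault(index, Collections.emptyList());
        return new AbstractMap.SimpleEntry<>(w, l);
    }

    public static void main(String[] args) {
        LabelledDocsSketch index = new LabelledDocsSketch();
        index.addWordsToDoc(0, Arrays.asList("good", "film"), Arrays.asList("positive", "review"));
        Map.Entry<List<String>, Collection<String>> doc = index.documentWithLabels(0);
        System.out.println(doc.getKey() + " labelled " + doc.getValue());
    }
}
```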
-
documents
int[] documents(T vocabWord)
Returns the list of documents a vocab word is in.
Parameters:
vocabWord - the vocab word to get documents for
Returns:
the documents for a vocab word
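This lookup is the "inverted" direction of the index: from a word to the documents that contain it. The following is a minimal sketch that keeps a forward map (document to words) alongside an inverted map (word to document indexes). InvertedLookupSketch is a hypothetical class with String standing in for T. Returning the indexes in ascending order and returning an empty array for an unknown word are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical stand-in for illustration; not an implementation of InvertedIndex.
class InvertedLookupSketch {
    // Forward index: document index -> words in that document.
    private final Map<Integer, List<String>> docToWords = new HashMap<>();
    // Inverted index: word -> sorted indexes of the documents containing it.
    private final Map<String, TreeSet<Integer>> wordToDocs = new HashMap<>();

    // Records the word in both directions.
    void addWordToDoc(int doc, String word) {
        docToWords.computeIfAbsent(doc, k -> new ArrayList<>()).add(word);
        wordToDocs.computeIfAbsent(word, k -> new TreeSet<>()).add(doc);
    }

    // Forward lookup: the words of one document.
    List<String> document(int index) {
        return docToWords.getOrDefault(index, Collections.emptyList());
    }

    // Inverted lookup: the documents containing one word, without duplicates.
    int[] documents(String vocabWord) {
        TreeSet<Integer> docs = wordToDocs.get(vocabWord);
        if (docs == null) {
            return new int[0];
        }
        return docs.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        InvertedLookupSketch index = new InvertedLookupSketch();
        index.addWordToDoc(0, "cat");
        index.addWordToDoc(1, "dog");
        index.addWordToDoc(2, "cat");
        System.out.println(Arrays.toString(index.documents("cat")));
    }
}
```

Keeping both maps costs extra memory but makes document(int) and documents(T) each a single map lookup.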
-
numDocuments
int numDocuments()
Returns the number of documents.
Returns:
the number of documents in the index
-
allDocs
int[] allDocs()
Returns a list of all documents.
Returns:
the list of all documents
-
addWordToDoc
void addWordToDoc(int doc, T word)
Adds a word to a document.
Parameters:
doc - the document to add to
word - the word to add
-
addWordsToDoc
void addWordsToDoc(int doc, List<T> words)
Adds words to the given document.
Parameters:
doc - the document to add to
words - the words to add
-
addLabelForDoc
void addLabelForDoc(int doc, T word)
Adds a label, given as a sequence element, to a document.
Parameters:
doc - the document to add to
word - the label to add
-
addLabelForDoc
void addLabelForDoc(int doc, String label)
Adds a label to the given document.
Parameters:
doc - the document to add to
label - the label to add
-
addWordsToDoc
void addWordsToDoc(int doc, List<T> words, String label)
Adds words to the given document.
Parameters:
doc - the document to add to
words - the words to add
label - the label for the document
-
addWordsToDoc
void addWordsToDoc(int doc, List<T> words, T label)
Adds words to the given document.
Parameters:
doc - the document to add to
words - the words to add
label - the label for the document
-
addLabelsForDoc
void addLabelsForDoc(int doc, List<T> word)
Adds labels, given as sequence elements, to a document.
Parameters:
doc - the document to add to
word - the labels to add
-
addLabelsForDoc
void addLabelsForDoc(int doc, Collection<String> label)
Adds labels to the given document.
Parameters:
doc - the document to add to
label - the labels to add
-
addWordsToDoc
void addWordsToDoc(int doc, List<T> words, Collection<String> label)
Adds words to the given document.
Parameters:
doc - the document to add to
words - the words to add
label - the labels for the document
-
addWordsToDocVocabWord
void addWordsToDocVocabWord(int doc, List<T> words, Collection<T> label)
Adds words to the given document.
Parameters:
doc - the document to add to
words - the words to add
label - the labels for the document, given as sequence elements
-
finish
void finish()
Finishes saving data
-
totalWords
long totalWords()
Total number of words in the index.
Returns:
the total number of words in the index
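To show how the counting methods (numDocuments, allDocs, totalWords) can relate to the add methods, here is a small in-memory sketch. IndexStatsSketch is a hypothetical class with String standing in for T. It assumes that totalWords counts every word occurrence added, including repeats, and that allDocs returns document indexes in ascending order. The interface itself does not state either point.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for illustration; not an implementation of InvertedIndex.
class IndexStatsSketch {
    // TreeMap keeps document indexes sorted for allDocs().
    private final Map<Integer, List<String>> docs = new TreeMap<>();
    // Running count of word occurrences (assumption: repeats are counted).
    private long totalWords = 0;

    void addWordsToDoc(int doc, List<String> words) {
        docs.computeIfAbsent(doc, k -> new ArrayList<>()).addAll(words);
        totalWords += words.size();
    }

    int numDocuments() {
        return docs.size();
    }

    int[] allDocs() {
        return docs.keySet().stream().mapToInt(Integer::intValue).toArray();
    }

    long totalWords() {
        return totalWords;
    }

    public static void main(String[] args) {
        IndexStatsSketch index = new IndexStatsSketch();
        index.addWordsToDoc(0, Arrays.asList("the", "cat", "sat"));
        index.addWordsToDoc(5, Arrays.asList("a", "dog"));
        System.out.println(index.numDocuments() + " documents, "
                + index.totalWords() + " words, ids " + Arrays.toString(index.allDocs()));
    }
}
```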
-
batchSize
int batchSize()
For word vectors, this is the batch size to train on.
Returns:
the batch size to train on
-
eachDocWithLabels
void eachDocWithLabels(org.nd4j.shade.guava.base.Function<org.nd4j.common.primitives.Pair<List<T>,Collection<String>>,Void> func, Executor exec)
Iterates over each document with its labels.
Parameters:
func - the function to apply
exec - the executor used to run the function
-
eachDocWithLabel
void eachDocWithLabel(org.nd4j.shade.guava.base.Function<org.nd4j.common.primitives.Pair<List<T>,String>,Void> func, Executor exec)
Iterates over each document with its label.
Parameters:
func - the function to apply
exec - the executor used to run the function
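The signature pairs a function with an Executor, which suggests that the function is applied to each labelled document on the supplied executor. Below is a minimal sketch of that pattern using only the standard library. EachDocSketch and its labelled helper are hypothetical, java.util.function.Function stands in for the shaded Guava Function, and java.util.Map.Entry stands in for Pair. Submitting one task per document is an assumption about how an implementation might schedule the work.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical stand-in for illustration; not an implementation of InvertedIndex.
class EachDocSketch {

    // Builds a (words, label) pair; Map.Entry plays the role of Pair.
    static Map.Entry<List<String>, String> labelled(String label, String... words) {
        List<String> wordList = Arrays.asList(words);
        return new AbstractMap.SimpleEntry<>(wordList, label);
    }

    // Submits one task per labelled document to the executor.
    static void eachDocWithLabel(List<Map.Entry<List<String>, String>> docs,
                                 Function<Map.Entry<List<String>, String>, Void> func,
                                 Executor exec) {
        for (Map.Entry<List<String>, String> doc : docs) {
            exec.execute(() -> func.apply(doc));
        }
    }

    public static void main(String[] args) throws InterruptedException {
        List<Map.Entry<List<String>, String>> docs = new ArrayList<>();
        docs.add(labelled("positive", "good", "film"));
        docs.add(labelled("negative", "dull"));

        AtomicInteger wordCount = new AtomicInteger(); // thread-safe accumulator
        ExecutorService pool = Executors.newFixedThreadPool(2);
        eachDocWithLabel(docs, doc -> {
            wordCount.addAndGet(doc.getKey().size());
            return null; // Void functions must return null
        }, pool);

        // Wait for all submitted tasks before reading the result.
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println("words seen: " + wordCount.get());
    }
}
```

Because the tasks may run concurrently, the function should only touch thread-safe state, which is why the sketch accumulates into an AtomicInteger.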
-
-