Class DynamicNGramDictionary
- java.lang.Object
- org.predict4all.nlp.ngram.dictionary.AbstractNGramDictionary<DynamicNGramTrieNode>
- org.predict4all.nlp.ngram.dictionary.TrainingNGramDictionary
- org.predict4all.nlp.ngram.dictionary.DynamicNGramDictionary
- All Implemented Interfaces:
AutoCloseable
public class DynamicNGramDictionary extends TrainingNGramDictionary
Represents a TrainingNGramDictionary that can also be reopened and trained again.
This type of dictionary is useful for a dynamic user model: the dynamic user dictionary is loaded and trained during each session, then saved for use in later sessions.
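The load, train, save cycle described above can be sketched with a small stand-in. This is not the predict4all API: `ToySessionCounts`, its tab-separated file format, and the methods `increment`, `count`, and `save` are hypothetical names used only for illustration. The real class exposes `load(File)` plus the inherited `putAndIncrementBy` and `saveDictionary`, whose exact signatures are not shown on this page.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for a dynamic user dictionary (not part of predict4all). */
final class ToySessionCounts {
    private final Map<String, Integer> counts = new TreeMap<>();

    /** Loads counts saved by a previous session; a missing or empty file gives an empty model. */
    static ToySessionCounts load(Path file) throws IOException {
        ToySessionCounts model = new ToySessionCounts();
        if (Files.exists(file)) {
            for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
                int tab = line.lastIndexOf('\t');
                if (tab > 0) {
                    model.counts.put(line.substring(0, tab),
                            Integer.parseInt(line.substring(tab + 1)));
                }
            }
        }
        return model;
    }

    /** Training step: adds {@code by} to the count of the given ngram. */
    void increment(String ngram, int by) {
        counts.merge(ngram, by, Integer::sum);
    }

    int count(String ngram) {
        return counts.getOrDefault(ngram, 0);
    }

    /** Saves the counts so that the next session can continue training from them. */
    void save(Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        counts.forEach((ngram, count) -> lines.add(ngram + "\t" + count));
        Files.write(file, lines, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("user-ngrams", ".tsv");

        // Session 1: load (empty), train, save.
        ToySessionCounts first = load(file);
        first.increment("good morning", 1);
        first.save(file);

        // Session 2: load what session 1 saved, train again, save.
        ToySessionCounts second = load(file);
        second.increment("good morning", 1);
        second.save(file);

        System.out.println(load(file).count("good morning")); // prints 2
        Files.delete(file);
    }
}
```

The point of the sketch is only that counts accumulate across sessions because each session starts from the file the previous one saved.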
Field Summary
-
Fields inherited from class org.predict4all.nlp.ngram.dictionary.TrainingNGramDictionary
NGRAM_COUNT_FORMAT
-
Fields inherited from class org.predict4all.nlp.ngram.dictionary.AbstractNGramDictionary
DICTIONARY_INFORMATION_BYTE_COUNT, maxOrder, rootNode
Constructor Summary
Constructor Description
DynamicNGramDictionary(int maxOrderP)
-
Method Summary
Modifier and Type Method Description
double[]
computeD(TrainingConfiguration configuration)
Compute the optimal value for d (absolute discounting parameter).
Usually d is computed with the formula:
D = C1 / (C1 + 2 * C2)
where C1 = number of ngrams with count == 1, and C2 = number of ngrams with count == 2.
protected void
executeWriteLevelOnRoot(FileChannel fileChannel, int n)
Call the correct node method to save a trie level to file.
protected long
getRootBlockSize()
TIntHashSet
getWordUsed()
static DynamicNGramDictionary
load(File dictionaryFile)
Create and open an existing dynamic ngram dictionary.
protected void
openDictionary(File dictionaryFile)
Open a dictionary from a file.
To use the dictionary, the same WordDictionary used to save it should be used.
Methods inherited from class org.predict4all.nlp.ngram.dictionary.TrainingNGramDictionary
checkChildrenLoading, close, countNGrams, create, getNodeForPrefix, pruneNGramsCount, pruneNGramsOrderCount, pruneNGramsWeightedDifference, putAndIncrementBy, putAndIncrementBy, saveDictionary, updateProbabilities, updateProbabilities
-
Methods inherited from class org.predict4all.nlp.ngram.dictionary.AbstractNGramDictionary
compact, getMaxOrder, getNextWord, getProbability, getRawProbability, getRoot, listNextWords, readDictionaryInformation, writeDictionaryInfo
Method Detail
-
openDictionary
protected void openDictionary(File dictionaryFile) throws IOException
Description copied from class: AbstractNGramDictionary
Open a dictionary from a file.
To use the dictionary, the same WordDictionary used to save it should be used.
- Overrides:
openDictionary
in class TrainingNGramDictionary
- Parameters:
dictionaryFile
- the file containing a dictionary.
- Throws:
IOException
- if dictionary can't be opened
-
load
public static DynamicNGramDictionary load(File dictionaryFile) throws IOException
Create and open an existing dynamic ngram dictionary.
- Parameters:
dictionaryFile
- file containing the dynamic ngram dictionary
- Returns:
- the loaded dynamic dictionary
- Throws:
IOException
- if dictionary can't be loaded
-
getRootBlockSize
protected long getRootBlockSize()
- Overrides:
getRootBlockSize
in class TrainingNGramDictionary
- Returns:
- the byte count needed to save the root block (used to shift data in the file so that the root can be saved in first position)
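Why a root block size is needed can be shown with a toy file layout. The sketch below is an assumption-laden illustration, not the dictionary's actual format: `RootFirstLayoutSketch`, the 12-byte root block (a child count and an offset), and the integer child values are all hypothetical. It only shows the general idea of writing the other data shifted by the root block size, then writing the root at position 0.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical layout: [root block][children...]. Not the predict4all file format. */
final class RootFirstLayoutSketch {
    /** Toy root block: child count (int) followed by the offset of the first child (long). */
    static final long ROOT_BLOCK_SIZE = Integer.BYTES + Long.BYTES;

    static void write(Path file, int[] childValues) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            // 1. Write the children shifted by the root block size, leaving room at the start.
            long childrenStart = ROOT_BLOCK_SIZE;
            ByteBuffer children = ByteBuffer.allocate(childValues.length * Integer.BYTES);
            for (int value : childValues) {
                children.putInt(value);
            }
            children.flip();
            long position = childrenStart;
            while (children.hasRemaining()) {
                position += ch.write(children, position);
            }

            // 2. Now that the children's location is known, write the root at position 0.
            ByteBuffer root = ByteBuffer.allocate((int) ROOT_BLOCK_SIZE);
            root.putInt(childValues.length).putLong(childrenStart);
            root.flip();
            long rootPosition = 0;
            while (root.hasRemaining()) {
                rootPosition += ch.write(root, rootPosition);
            }
        }
    }

    static int[] read(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // The root is always at position 0, so a reader can start there.
            ByteBuffer root = ByteBuffer.allocate((int) ROOT_BLOCK_SIZE);
            readFully(ch, root, 0);
            root.flip();
            int childCount = root.getInt();
            long childrenStart = root.getLong();

            ByteBuffer children = ByteBuffer.allocate(childCount * Integer.BYTES);
            readFully(ch, children, childrenStart);
            children.flip();
            int[] values = new int[childCount];
            for (int i = 0; i < childCount; i++) {
                values[i] = children.getInt();
            }
            return values;
        }
    }

    private static void readFully(FileChannel ch, ByteBuffer buffer, long position)
            throws IOException {
        while (buffer.hasRemaining()) {
            int read = ch.read(buffer, position);
            if (read < 0) {
                throw new IOException("unexpected end of file");
            }
            position += read;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("root-first", ".bin");
        write(file, new int[] {7, 8, 9});
        System.out.println(read(file).length); // prints 3
        Files.delete(file);
    }
}
```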
-
executeWriteLevelOnRoot
protected void executeWriteLevelOnRoot(FileChannel fileChannel, int n) throws IOException
Description copied from class: TrainingNGramDictionary
Call the correct node method to save a trie level to file.
- Overrides:
executeWriteLevelOnRoot
in class TrainingNGramDictionary
- Parameters:
fileChannel
- the file channel where the trie is saved
n
- the level to save
- Throws:
IOException
- if writing fails
-
computeD
public double[] computeD(TrainingConfiguration configuration)
Description copied from class: AbstractNGramDictionary
Compute the optimal value for d (absolute discounting parameter).
Usually d is computed with the formula:
D = C1 / (C1 + 2 * C2)
where C1 = number of ngrams with count == 1, and C2 = number of ngrams with count == 2. These values are computed for each order (index 0 = unigram, index 1 = bigram, etc.).
- Overrides:
computeD
in class TrainingNGramDictionary
- Parameters:
configuration
- configuration to use to compute D (can set min/max values and a D value)
- Returns:
- computed d values for this dictionary, one per order
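As a worked illustration of the formula above, here is a minimal, self-contained sketch that computes one D per order from raw ngram counts. It is not the library's implementation: `AbsoluteDiscountSketch`, the `int[][]` input, and the explicit `min`/`max` parameters are assumptions standing in for the trie and for `TrainingConfiguration`. Falling back to `min` when an order has no ngram with count 1 or 2 is also an assumption.

```java
import java.util.Arrays;

/** Illustrative only: not part of predict4all. */
final class AbsoluteDiscountSketch {

    /**
     * @param countsByOrder countsByOrder[i] holds the count of every ngram of order i + 1
     * @param min lower bound applied to each D (assumed to come from the configuration)
     * @param max upper bound applied to each D (assumed to come from the configuration)
     * @return one discount per order, D = C1 / (C1 + 2 * C2), clamped to [min, max]
     */
    static double[] computeD(int[][] countsByOrder, double min, double max) {
        double[] d = new double[countsByOrder.length];
        for (int order = 0; order < countsByOrder.length; order++) {
            long c1 = 0; // number of ngrams seen exactly once
            long c2 = 0; // number of ngrams seen exactly twice
            for (int count : countsByOrder[order]) {
                if (count == 1) {
                    c1++;
                } else if (count == 2) {
                    c2++;
                }
            }
            // Assumption: with no count-1 or count-2 ngrams the formula is undefined, so use min.
            double raw = (c1 + 2 * c2 == 0) ? min : (double) c1 / (c1 + 2.0 * c2);
            d[order] = Math.max(min, Math.min(max, raw));
        }
        return d;
    }

    public static void main(String[] args) {
        int[][] counts = {
            {1, 1, 2, 5, 1}, // unigrams: C1 = 3, C2 = 1, so D = 3 / 5
            {1, 2, 2, 3}     // bigrams:  C1 = 1, C2 = 2, so D = 1 / 5
        };
        System.out.println(Arrays.toString(computeD(counts, 0.1, 0.9))); // prints [0.6, 0.2]
    }
}
```

Index 0 of the result is the unigram discount and index 1 the bigram discount, matching the ordering described for the returned array.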
-
getWordUsed
public TIntHashSet getWordUsed()