Overview (predict4all-core 1.2.0 API)

Packages
Package	Description
org.predict4all.nlp
org.predict4all.nlp.exception
org.predict4all.nlp.io	Contains custom `InputStream` and `OutputStream` implementations to save and load Predict4All-specific items (`Token` and `Word`). Note that n-grams are saved without these streams, as they are designed to be loaded on demand through a `FileChannel`. Both the token and word streams extend `DataOutputStream` or `DataInputStream`: this was done for performance, as this approach is much more efficient than other serialization methods.
org.predict4all.nlp.language	Represents every language-specific item. A base `AbstractLanguageModel` allows simpler `LanguageModel` implementations. Sub-packages such as "french" may contain language-specific code.
org.predict4all.nlp.language.french
org.predict4all.nlp.language.french.matcher
org.predict4all.nlp.ngram	Package containing everything about the n-gram model used in Predict4All. Contains the n-gram training algorithm in `NGramDictionaryGenerator`. Also contains `AbstractNGramTrieNode`: a trie structure that can be implemented in two ways, dynamic or static. This trie structure makes a huge number of n-grams available for probability computation without loading them all into memory.
org.predict4all.nlp.ngram.debug
org.predict4all.nlp.ngram.dictionary
org.predict4all.nlp.ngram.trie
org.predict4all.nlp.ngram.trie.map
org.predict4all.nlp.parser	Package mainly focused on classes that convert raw input text (as a `String`) into `Token` instances that can be used by Predict4All. This package is used by both the training algorithms and the predictor: this keeps parsing consistent between training data and user input. Both word and token stream
org.predict4all.nlp.parser.matcher
org.predict4all.nlp.parser.token
org.predict4all.nlp.prediction	Main Predict4All entry point: this package contains "the glue" between all prediction components. The core Predict4All features are located in `WordPredictor`.
org.predict4all.nlp.prediction.model
org.predict4all.nlp.semantic	Semantic-related prediction model (work in progress); not used by the current `WordPredictor`.
org.predict4all.nlp.trainer	Represents the whole data training process, managed by the main `DataTrainer`. Training is done in several steps: `Tokenizer`, `TokenConverter`, `WordDictionaryGenerator`, `NGramDictionaryGenerator`. Note that the `DataTrainer` uses `TrainingCorpus` and `AbstractTrainingDocument`: this abstraction level makes it possible to train the model on the same corpus without going through every training step again, which is really useful when developing new training algorithms.
org.predict4all.nlp.trainer.configuration
org.predict4all.nlp.trainer.corpus
org.predict4all.nlp.trainer.step
org.predict4all.nlp.utils	Contains some simple data structures, lambda interfaces, and classic static "utils" classes.
org.predict4all.nlp.utils.progressindicator
org.predict4all.nlp.words	Contains classes related to `Word` and `WordDictionary`. Mainly used to identify each `Word` as a unique instance identified by an int ID. This package mainly focuses on managing vocabulary.
org.predict4all.nlp.words.correction	Contains all classes and algorithms related to word correction. The main component is `WordCorrectionGenerator`, as it contains most of the correction logic. `CorrectionRule` is also important, as it is the main entry point to configure word correction.
org.predict4all.nlp.words.model
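
The `org.predict4all.nlp.io` description mentions streams that extend `DataOutputStream` and `DataInputStream`. The following is a minimal sketch of that general technique using only the JDK. It is not the library's actual code: `SketchToken`, `SketchTokenOutputStream`, `SketchTokenInputStream`, `TokenStreamSketch`, and the on-disk layout (an int type code followed by a UTF string) are all hypothetical names and assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a token: some text plus an int type code.
// This is NOT the real Predict4All Token class.
final class SketchToken {
    final String text;
    final int type;

    SketchToken(String text, int type) {
        this.text = text;
        this.type = type;
    }
}

// Hypothetical writer: extending DataOutputStream gives direct access
// to compact binary primitives such as writeInt and writeUTF.
class SketchTokenOutputStream extends DataOutputStream {
    SketchTokenOutputStream(OutputStream out) {
        super(out);
    }

    void writeToken(SketchToken token) throws IOException {
        writeInt(token.type);   // assumed layout: type code first
        writeUTF(token.text);   // then the text, length-prefixed
    }
}

// Hypothetical reader: must read fields in the same order they were written.
class SketchTokenInputStream extends DataInputStream {
    SketchTokenInputStream(InputStream in) {
        super(in);
    }

    SketchToken readToken() throws IOException {
        int type = readInt();
        String text = readUTF();
        return new SketchToken(text, type);
    }
}

public class TokenStreamSketch {
    // Writes a token to an in-memory buffer, then reads it back.
    static SketchToken roundTrip(SketchToken token) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (SketchTokenOutputStream out = new SketchTokenOutputStream(buffer)) {
                out.writeToken(token);
            }
            try (SketchTokenInputStream in = new SketchTokenInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return in.readToken();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        SketchToken copy = roundTrip(new SketchToken("bonjour", 1));
        System.out.println(copy.text + " / " + copy.type);
    }
}
```

Writing fields by hand in a fixed order avoids the class metadata that generic object serialization adds, which is one plausible reason for the design choice described above. The real stream classes may store different fields in a different layout.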