Class FrenchLanguageModel
- java.lang.Object
  - org.predict4all.nlp.language.AbstractLanguageModel
    - org.predict4all.nlp.language.french.FrenchLanguageModel
- All Implemented Interfaces: LanguageModel
public class FrenchLanguageModel extends AbstractLanguageModel
Field Summary
- static TokenMatcher[] MATCHERS_NGRAM_FR
- static TokenMatcher[] MATCHERS_SEMANTIC_ANALYSIS_FR
-
Constructor Summary
- FrenchLanguageModel()
-
Method Summary
- int getAverageVocabularySize(): Average total vocabulary size (different existing words)
- int getAverageWordLength()
- BaseWordDictionary getBaseWordDictionary(TrainingConfiguration configuration)
- java.lang.String getId()
- StopWordDictionary getStopWordDictionary(TrainingConfiguration configuration)
- TokenMatcher[] getTokenMatchersForNGram()
- TokenMatcher[] getTokenMatchersForSemanticAnalysis()
- java.util.Set<java.lang.String> getValidOneCharWords()
Field Detail
-
MATCHERS_SEMANTIC_ANALYSIS_FR
public static final TokenMatcher[] MATCHERS_SEMANTIC_ANALYSIS_FR
-
MATCHERS_NGRAM_FR
public static final TokenMatcher[] MATCHERS_NGRAM_FR
-
-
Method Detail
-
getId
public java.lang.String getId()
- Returns:
- identifier for this language model (e.g. ISO code)
-
getAverageWordLength
public int getAverageWordLength()
- Returns:
- the average word length for this language (may be rounded up)
-
getAverageVocabularySize
public int getAverageVocabularySize()
Description copied from interface: LanguageModel
Average total vocabulary size (different existing words)
- Returns:
- the average vocabulary size for this language.
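The scalar accessors above (getId, getAverageWordLength, getAverageVocabularySize) are the kind of values a caller might use to size data structures before processing text. The following is a minimal sketch of that idea, not code from the library: `SimpleLanguageModel` is a hypothetical stand-in interface that mirrors only these three documented signatures, and the values in `PlaceholderModel` are made up for illustration. They are not the values `FrenchLanguageModel` returns.

```java
import java.util.HashMap;
import java.util.Map;

public class LanguageModelSizingSketch {

    /** Hypothetical stand-in that mirrors only the three accessors documented above. */
    interface SimpleLanguageModel {
        String getId();
        int getAverageWordLength();
        int getAverageVocabularySize();
    }

    /** Placeholder implementation; every value here is illustrative, not taken from the library. */
    static final class PlaceholderModel implements SimpleLanguageModel {
        @Override public String getId() { return "fr"; }
        @Override public int getAverageWordLength() { return 5; }
        @Override public int getAverageVocabularySize() { return 100_000; }
    }

    /** Initial HashMap capacity so the expected vocabulary fits without rehashing at load factor 0.75. */
    static int initialCapacityFor(SimpleLanguageModel model) {
        return (int) Math.ceil(model.getAverageVocabularySize() / 0.75);
    }

    /** Rough estimate of the characters needed to store one copy of every vocabulary word. */
    static long estimatedVocabularyChars(SimpleLanguageModel model) {
        return (long) model.getAverageWordLength() * model.getAverageVocabularySize();
    }

    public static void main(String[] args) {
        SimpleLanguageModel model = new PlaceholderModel();
        // Pre-size the word-to-id map from the model's vocabulary estimate.
        Map<String, Integer> wordIds = new HashMap<>(initialCapacityFor(model));
        wordIds.put("bonjour", 0);
        System.out.println(model.getId() + ": capacity=" + initialCapacityFor(model)
                + ", chars=" + estimatedVocabularyChars(model));
    }
}
```

With the real library, the same helper logic would presumably take a `LanguageModel` instead of the stand-in, since all three accessors are declared there.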
-
getTokenMatchersForSemanticAnalysis
public TokenMatcher[] getTokenMatchersForSemanticAnalysis()
- Specified by: getTokenMatchersForSemanticAnalysis in interface LanguageModel
- Overrides: getTokenMatchersForSemanticAnalysis in class AbstractLanguageModel
-
getTokenMatchersForNGram
public TokenMatcher[] getTokenMatchersForNGram()
- Specified by: getTokenMatchersForNGram in interface LanguageModel
- Overrides: getTokenMatchersForNGram in class AbstractLanguageModel
-
getValidOneCharWords
public java.util.Set<java.lang.String> getValidOneCharWords()
- Specified by: getValidOneCharWords in interface LanguageModel
- Overrides: getValidOneCharWords in class AbstractLanguageModel
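This page does not describe how getValidOneCharWords() is used. Its name and return type suggest a set of single-character strings that count as real words. As a sketch under that assumption, the hypothetical helper below (`keepPlausibleWords` is not part of the library) drops single-character tokens that are not in such a set. The set contents in `main` are illustrative, not the values the method returns.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class OneCharWordFilterSketch {

    /** Keeps multi-character tokens, and single-character tokens only if they are in validOneCharWords. */
    static List<String> keepPlausibleWords(List<String> tokens, Set<String> validOneCharWords) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            // Count code points rather than chars so a supplementary character counts as one.
            int length = token.codePointCount(0, token.length());
            if (length > 1 || validOneCharWords.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Illustrative set only; the actual contents of getValidOneCharWords() are not listed on this page.
        Set<String> validOneCharWords = Set.of("a", "y");
        List<String> tokens = List.of("il", "y", "a", "x", "chat");
        System.out.println(keepPlausibleWords(tokens, validOneCharWords));
    }
}
```

With the real library, the second argument would presumably be `new FrenchLanguageModel().getValidOneCharWords()`, which has the same `Set<String>` type.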
-
getStopWordDictionary
public StopWordDictionary getStopWordDictionary(TrainingConfiguration configuration)
-
getBaseWordDictionary
public BaseWordDictionary getBaseWordDictionary(TrainingConfiguration configuration)
-
-