Class RakeParams
- java.lang.Object
  - io.github.crew102.rapidrake.model.RakeParams
public class RakeParams extends Object
A parameter object for RAKE settings.
Constructor Summary
Constructor and Description
- RakeParams(String[] stopWords, String[] stopPOS, int wordMinChar, boolean stem, String phraseDelims)
  Constructor.
- RakeParams(String[] stopWords, String[] stopPOS, int wordMinChar, boolean stem, String phraseDelims, opennlp.tools.stemmer.snowball.SnowballStemmer.ALGORITHM stemmerLang)
  Constructor.
Method Summary
Modifier and Type, Method
- String getPhraseDelims()
- opennlp.tools.stemmer.snowball.SnowballStemmer.ALGORITHM getStemmerLang()
- List<String> getStopPOS()
- List<String> getStopWords()
- int getWordMinChar()
- boolean shouldStem()
Constructor Detail
RakeParams
public RakeParams(String[] stopWords, String[] stopPOS, int wordMinChar, boolean stem, String phraseDelims, opennlp.tools.stemmer.snowball.SnowballStemmer.ALGORITHM stemmerLang)
Constructor.
Parameters:
stopWords - an array of stopwords, which will be treated like phrase delimiters when RAKE is identifying candidate keywords
stopPOS - an array of part-of-speech (POS) tags that should be considered stopwords. Words that are tagged with any of the parts-of-speech listed in stopPOS will be treated like delimiters. See Part-Of-Speech Tagging with R for a list of acceptable POS tags and their meanings.
wordMinChar - the minimum number of characters that a token/word must have. Words below this threshold are treated like phrase delimiters.
stem - an indicator for whether you want to stem the tokens in each keyword
phraseDelims - a character set containing the punctuation characters used to identify phrases
stemmerLang - the stemming language/algorithm that should be used
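The delimiter rules described for stopWords, wordMinChar and phraseDelims can be illustrated with a small standalone sketch. It is not the library's implementation: the class PhraseSplitSketch and the method candidatePhrases are hypothetical names introduced here, whitespace tokenization and lowercasing are assumptions, and POS filtering and stemming are left out because they need OpenNLP models.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; not part of rapidrake.
class PhraseSplitSketch {

    // Splits text into candidate phrases. A phrase ends at any character in
    // phraseDelims, at any stopword, and at any word shorter than wordMinChar.
    static List<String> candidatePhrases(String text, String[] stopWords,
                                         int wordMinChar, String phraseDelims) {
        Set<String> stops = new HashSet<>(Arrays.asList(stopWords));
        List<String> phrases = new ArrayList<>();
        List<String> current = new ArrayList<>();
        StringBuilder word = new StringBuilder();

        // A trailing space acts as a sentinel so the last word is processed.
        for (char c : (text + " ").toCharArray()) {
            boolean isDelim = phraseDelims.indexOf(c) >= 0;
            if (!isDelim && !Character.isWhitespace(c)) {
                word.append(c);
                continue;
            }
            // ASSUMPTION: words are compared in lowercase.
            String w = word.toString().toLowerCase(Locale.ROOT);
            word.setLength(0);
            boolean breaksPhrase = isDelim;
            if (!w.isEmpty()) {
                if (stops.contains(w) || w.length() < wordMinChar) {
                    breaksPhrase = true; // the word itself acts as a delimiter
                } else {
                    current.add(w);
                }
            }
            if (breaksPhrase) {
                flush(current, phrases);
            }
        }
        flush(current, phrases); // emit whatever is left at end of input
        return phrases;
    }

    // Joins the collected words into one phrase and resets the buffer.
    private static void flush(List<String> current, List<String> phrases) {
        if (!current.isEmpty()) {
            phrases.add(String.join(" ", current));
            current.clear();
        }
    }

    public static void main(String[] args) {
        String[] stopWords = {"of", "over", "the"};
        String text = "Compatibility of systems of linear constraints, "
                + "over the set of natural numbers.";
        System.out.println(candidatePhrases(text, stopWords, 3, ".,;"));
    }
}
```

Raising wordMinChar in this sketch breaks phrases at more words, which matches the description of short words being treated like phrase delimiters.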
RakeParams
public RakeParams(String[] stopWords, String[] stopPOS, int wordMinChar, boolean stem, String phraseDelims)
Constructor. This version of RakeParams exists to maintain backward compatibility of the overall API of the package. The stemmerLang parameter had to be added to an overloaded version of this constructor because of https://github.com/crew102/rapidrake-java/issues/4, hence the funky API.
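Adding a parameter through an overload, as described above, is a common way to keep existing callers compiling. The sketch below shows the pattern with a hypothetical stand-in class (ParamsSketch) and a stand-in enum rather than the real RakeParams and SnowballStemmer.ALGORITHM. The ENGLISH default is an assumption made for illustration; this page does not say which stemmer the shorter constructor uses.

```java
// Hypothetical stand-in for RakeParams, reduced to three fields.
class ParamsSketch {

    // Hypothetical stand-in for SnowballStemmer.ALGORITHM.
    enum Algorithm { ENGLISH, GERMAN }

    private final int wordMinChar;
    private final boolean stem;
    private final Algorithm stemmerLang;

    // Older signature, kept so that existing callers still compile.
    // ASSUMPTION: it falls back to ENGLISH; the real default is not documented here.
    ParamsSketch(int wordMinChar, boolean stem) {
        this(wordMinChar, stem, Algorithm.ENGLISH);
    }

    // Newer signature that carries the extra stemmerLang parameter.
    ParamsSketch(int wordMinChar, boolean stem, Algorithm stemmerLang) {
        this.wordMinChar = wordMinChar;
        this.stem = stem;
        this.stemmerLang = stemmerLang;
    }

    int getWordMinChar() { return wordMinChar; }
    boolean shouldStem() { return stem; }
    Algorithm getStemmerLang() { return stemmerLang; }

    public static void main(String[] args) {
        ParamsSketch oldStyle = new ParamsSketch(3, true);
        ParamsSketch newStyle = new ParamsSketch(3, true, Algorithm.GERMAN);
        System.out.println(oldStyle.getStemmerLang());
        System.out.println(newStyle.getStemmerLang());
    }
}
```

Callers written against the shorter signature keep working unchanged, while new callers can choose the stemming language explicitly.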
Method Detail
getWordMinChar
public int getWordMinChar()
shouldStem
public boolean shouldStem()
getPhraseDelims
public String getPhraseDelims()
getStemmerLang
public opennlp.tools.stemmer.snowball.SnowballStemmer.ALGORITHM getStemmerLang()