Index (Smile NLP 1.0.3 API)

A B C D E F G H I L M N O P R S T U V W

A

Abbreviations - Interface in smile.nlp.dictionary: A dictionary interface for abbreviations.
add(String, String, String) - Method in class smile.nlp.SimpleCorpus: Add a document to the corpus.
addAnchor(String) - Method in interface smile.nlp.AnchorText: Add a link label to the anchor text.
addAnchor(String) - Method in class smile.nlp.SimpleText
addChild(K[], V, int) - Method in class smile.nlp.Trie.Node
AnchorText - Interface in smile.nlp: The anchor text is the visible, clickable text in a hyperlink.
AprioriPhraseExtractor - Class in smile.nlp.collocation: An Apriori-like algorithm to extract n-gram phrases.
AprioriPhraseExtractor() - Constructor for class smile.nlp.collocation.AprioriPhraseExtractor

B

Bigram - Class in smile.nlp: Bigrams or digrams are groups of two words, and are very commonly used as the basis for simple statistical analysis of text.
Bigram(String, String) - Constructor for class smile.nlp.Bigram: Constructor.
BigramCollocation - Class in smile.nlp.collocation: Collocations are expressions of multiple words which commonly co-occur.
BigramCollocation(String, String, int, double) - Constructor for class smile.nlp.collocation.BigramCollocation: Constructor.
BigramCollocationFinder - Class in smile.nlp.collocation: Tools to identify collocations (words that often appear consecutively) within corpora.
BigramCollocationFinder(int) - Constructor for class smile.nlp.collocation.BigramCollocationFinder: Constructor.
BM25 - Class in smile.nlp.relevance: The BM25 weighting scheme, often called Okapi weighting, after the system in which it was first implemented, was developed as a way of building a probabilistic model sensitive to term frequency and document length while not introducing too many additional parameters into the model.
BM25() - Constructor for class smile.nlp.relevance.BM25: Default constructor with k1 = 1.2, b = 0.75, delta = 1.0.
BM25(double, double, double) - Constructor for class smile.nlp.relevance.BM25: Constructor.
BreakIteratorSentenceSplitter - Class in smile.nlp.tokenizer: A sentence splitter based on the java.text.BreakIterator, which supports multiple natural languages (selected by locale setting).
BreakIteratorSentenceSplitter() - Constructor for class smile.nlp.tokenizer.BreakIteratorSentenceSplitter: Constructor for the default locale.
BreakIteratorSentenceSplitter(Locale) - Constructor for class smile.nlp.tokenizer.BreakIteratorSentenceSplitter: Constructor for the given locale.
BreakIteratorTokenizer - Class in smile.nlp.tokenizer: A word tokenizer based on the java.text.BreakIterator, which supports multiple natural languages (selected by locale setting).
BreakIteratorTokenizer() - Constructor for class smile.nlp.tokenizer.BreakIteratorTokenizer: Constructor for the default locale.
BreakIteratorTokenizer(Locale) - Constructor for class smile.nlp.tokenizer.BreakIteratorTokenizer: Constructor for the given locale.
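
The BM25 entries above describe a weighting that is sensitive to term frequency and document length. As a rough illustration only, the sketch below computes a textbook Okapi BM25 term score with the k1 = 1.2 and b = 0.75 defaults listed in the index. The class name, method name, parameter order, and the non-negative idf variant are assumptions made for this example, and the delta parameter is left out because the index does not say how it enters the formula. This is not the library's implementation.

```java
// Hypothetical sketch of a textbook BM25 term score; not the smile.nlp.relevance.BM25 code.
class Bm25Sketch {
    static final double K1 = 1.2;  // term-frequency saturation (default listed in the index)
    static final double B = 0.75;  // strength of document-length normalization

    /**
     * @param tf        frequency of the term in the document
     * @param docLen    length of the document in words
     * @param avgDocLen average document length in the corpus
     * @param n         number of documents in the corpus
     * @param df        number of documents containing the term
     */
    static double score(double tf, double docLen, double avgDocLen, long n, long df) {
        if (tf <= 0) {
            return 0.0;
        }
        // Assumed idf variant: log(1 + ...) keeps the weight non-negative.
        double idf = Math.log(1.0 + (n - df + 0.5) / (df + 0.5));
        // Saturating term-frequency component, penalized for longer-than-average documents.
        double norm = tf * (K1 + 1) / (tf + K1 * (1 - B + B * docLen / avgDocLen));
        return idf * norm;
    }

    public static void main(String[] args) {
        System.out.println(score(3, 120, 100, 1000, 25));
    }
}
```

With these assumptions the score grows with term frequency but flattens out, and a longer document scores lower than a shorter one with the same term count.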

C

compareTo(BigramCollocation) - Method in class smile.nlp.collocation.BigramCollocation
compareTo(NGram) - Method in class smile.nlp.NGram
compareTo(Relevance) - Method in class smile.nlp.relevance.Relevance
contains(String) - Method in interface smile.nlp.dictionary.Dictionary: Returns true if this dictionary contains the specified word.
contains(String) - Method in enum smile.nlp.dictionary.EnglishDictionary
contains(String) - Method in class smile.nlp.dictionary.EnglishPunctuations
contains(String) - Method in enum smile.nlp.dictionary.EnglishStopWords
contains(String) - Method in class smile.nlp.dictionary.SimpleDictionary
CooccurrenceKeywordExtractor - Class in smile.nlp.keyword: Keyword extraction from a single document using word co-occurrence statistical information.
CooccurrenceKeywordExtractor() - Constructor for class smile.nlp.keyword.CooccurrenceKeywordExtractor
Corpus - Interface in smile.nlp: A corpus is a collection of documents.

D

Dictionary - Interface in smile.nlp.dictionary: A dictionary is a set of words in some natural language.
doc() - Method in class smile.nlp.relevance.Relevance: Returns the document to rank.

E

EnglishDictionary - Enum in smile.nlp.dictionary: A concise dictionary of common terms in English.
EnglishPOSLexicon - Class in smile.nlp.pos: An English lexicon with part-of-speech tags.
EnglishPOSLexicon() - Constructor for class smile.nlp.pos.EnglishPOSLexicon
EnglishPunctuations - Class in smile.nlp.dictionary: Punctuation marks in English.
EnglishStopWords - Enum in smile.nlp.dictionary: Several sets of English stop words.
equals(Object) - Method in class smile.nlp.Bigram
equals(Object) - Method in class smile.nlp.collocation.BigramCollocation
equals(Object) - Method in class smile.nlp.NGram
equals(Object) - Method in class smile.nlp.SimpleText
extract(Collection<String[]>, int, int) - Method in class smile.nlp.collocation.AprioriPhraseExtractor
extract(String) - Method in class smile.nlp.keyword.CooccurrenceKeywordExtractor: Returns the top 10 keywords.
extract(String, int) - Method in class smile.nlp.keyword.CooccurrenceKeywordExtractor: Returns a given number of top keywords.

F

find(Corpus, int) - Method in class smile.nlp.collocation.BigramCollocationFinder: Finds top k bigram collocations in the given corpus.
find(Corpus, double) - Method in class smile.nlp.collocation.BigramCollocationFinder: Finds bigram collocations in the given corpus whose p-value is less than the given threshold.
freq - Variable in class smile.nlp.NGram: Frequency of n-gram in the corpus.
frequency() - Method in class smile.nlp.collocation.BigramCollocation: Returns the frequency of bigram in the corpus.
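
The find entries above select bigram collocations by score or p-value, and BigramCollocation.score() is described as a chi-square statistic. The sketch below shows one common way to get such a statistic, Pearson's chi-square on a 2x2 contingency table of bigram counts. The class name, method name, and count parameters are hypothetical, and the library may compute its statistic differently.

```java
// Hypothetical sketch: Pearson's chi-square for a bigram (w1, w2) from raw counts.
class ChiSquareSketch {
    /**
     * @param c12 number of bigrams equal to (w1, w2)
     * @param c1  number of bigrams whose first word is w1
     * @param c2  number of bigrams whose second word is w2
     * @param n   total number of bigrams in the corpus
     */
    static double chiSquare(long c12, long c1, long c2, long n) {
        double o11 = c12;                 // w1 followed by w2
        double o12 = c1 - c12;            // w1 followed by another word
        double o21 = c2 - c12;            // another word followed by w2
        double o22 = n - c1 - c2 + c12;   // neither w1 first nor w2 second
        double denom = (o11 + o12) * (o11 + o21) * (o12 + o22) * (o21 + o22);
        if (denom == 0) {
            return 0.0;                   // degenerate table: no evidence either way
        }
        double diff = o11 * o22 - o12 * o21;
        return n * diff * diff / denom;
    }

    public static void main(String[] args) {
        System.out.println(chiSquare(8, 4675, 15828, 14307676L));
    }
}
```

A value near zero means the two words co-occur about as often as chance predicts; larger values indicate stronger association.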

G

get(String) - Static method in class smile.nlp.pos.EnglishPOSLexicon: Returns part-of-speech tags for given word, or null if the word does not exist in the dictionary.
get(K[]) - Method in class smile.nlp.Trie: Returns the associated value of a given key.
get(K) - Method in class smile.nlp.Trie: Returns the node of a given key.
getAbbreviation(String) - Method in interface smile.nlp.dictionary.Abbreviations: Returns the abbreviation for a word.
getAnchor() - Method in interface smile.nlp.AnchorText: Returns the anchor text if any.
getAnchor() - Method in class smile.nlp.SimpleText: Returns the anchor text if any.
getAverageDocumentSize() - Method in interface smile.nlp.Corpus: Returns the average size of documents in the corpus.
getAverageDocumentSize() - Method in class smile.nlp.SimpleCorpus
getBigramFrequency(Bigram) - Method in interface smile.nlp.Corpus: Returns the total frequency of the bigram in the corpus.
getBigramFrequency(Bigram) - Method in class smile.nlp.SimpleCorpus
getBigrams() - Method in interface smile.nlp.Corpus: Returns an iterator over the bigrams in the corpus.
getBigrams() - Method in class smile.nlp.SimpleCorpus
getBody() - Method in class smile.nlp.Text: Returns the body of text.
getChild(K[], int) - Method in class smile.nlp.Trie.Node
getChild(K) - Method in class smile.nlp.Trie.Node
getDefault() - Static method in class smile.nlp.pos.HMMPOSTagger: Returns the default English POS tagger.
getFull(String) - Method in interface smile.nlp.dictionary.Abbreviations: Returns the full word for a given abbreviation.
getID() - Method in class smile.nlp.Text: Returns the id of document in the corpus.
getInstance() - Static method in class smile.nlp.dictionary.EnglishPunctuations: Returns the singleton instance.
getInstance() - Static method in class smile.nlp.tokenizer.PennTreebankTokenizer: Returns the singleton instance.
getInstance() - Static method in class smile.nlp.tokenizer.SimpleParagraphSplitter: Returns the singleton instance.
getInstance() - Static method in class smile.nlp.tokenizer.SimpleSentenceSplitter: Returns the singleton instance.
getKey() - Method in class smile.nlp.Trie.Node
getNumBigrams() - Method in interface smile.nlp.Corpus: Returns the number of bigrams in the corpus.
getNumBigrams() - Method in class smile.nlp.SimpleCorpus
getNumDocuments() - Method in interface smile.nlp.Corpus: Returns the number of documents in the corpus.
getNumDocuments() - Method in class smile.nlp.SimpleCorpus
getNumTerms() - Method in interface smile.nlp.Corpus: Returns the number of unique terms in the corpus.
getNumTerms() - Method in class smile.nlp.SimpleCorpus
getTermFrequency(String) - Method in interface smile.nlp.Corpus: Returns the total frequency of the term in the corpus.
getTermFrequency(String) - Method in class smile.nlp.SimpleCorpus
getTerms() - Method in interface smile.nlp.Corpus: Returns an iterator over the terms in the corpus.
getTerms() - Method in class smile.nlp.SimpleCorpus
getTitle() - Method in class smile.nlp.Text: Returns the title of text.
getValue(String) - Static method in enum smile.nlp.pos.PennTreebankPOS: Returns an enum value from a string.
getValue() - Method in class smile.nlp.Trie.Node

H

hashCode() - Method in class smile.nlp.Bigram
hashCode() - Method in class smile.nlp.collocation.BigramCollocation
hashCode() - Method in class smile.nlp.NGram
hashCode() - Method in class smile.nlp.SimpleText
HMMPOSTagger - Class in smile.nlp.pos: Part-of-speech tagging with hidden Markov model.
HMMPOSTagger() - Constructor for class smile.nlp.pos.HMMPOSTagger: Constructor.

I

iterator() - Method in interface smile.nlp.dictionary.Dictionary: Returns an iterator over the elements in this dictionary.
iterator() - Method in enum smile.nlp.dictionary.EnglishDictionary
iterator() - Method in class smile.nlp.dictionary.EnglishPunctuations
iterator() - Method in enum smile.nlp.dictionary.EnglishStopWords
iterator() - Method in class smile.nlp.dictionary.SimpleDictionary

L

LancasterStemmer - Class in smile.nlp.stemmer: The Paice/Husk Lancaster stemming algorithm.
LancasterStemmer() - Constructor for class smile.nlp.stemmer.LancasterStemmer: Constructor.
LancasterStemmer(boolean) - Constructor for class smile.nlp.stemmer.LancasterStemmer: Constructor.
learn(String[][], PennTreebankPOS[][]) - Static method in class smile.nlp.pos.HMMPOSTagger: Learns an HMM POS tagger by maximum likelihood estimation.
load(String, List<String[]>, List<PennTreebankPOS[]>) - Static method in class smile.nlp.pos.HMMPOSTagger: Loads training data from a corpus.

M

main(String[]) - Static method in class smile.nlp.pos.HMMPOSTagger: Train the default model on WSJ and BROWN datasets.
maxtf() - Method in class smile.nlp.SimpleText
maxtf() - Method in interface smile.nlp.TextTerms: Returns the maximum term frequency over all terms in the document.

N

NGram - Class in smile.nlp: An n-gram is a contiguous sequence of n words from a given sequence of text.
NGram(String[]) - Constructor for class smile.nlp.NGram: Constructor.
NGram(String[], int) - Constructor for class smile.nlp.NGram: Constructor.
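
The NGram entry above defines an n-gram as a contiguous sequence of n words. A minimal sketch of that definition follows; the class and method names are hypothetical and only the JDK is used, so it says nothing about how the NGram class itself is built or counted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: enumerate all contiguous n-word windows of a token array.
class NGramSketch {
    static List<String[]> ngrams(String[] words, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<String[]> out = new ArrayList<>();
        // The window [i, i + n) slides one word at a time and must fit inside the array.
        for (int i = 0; i + n <= words.length; i++) {
            out.add(Arrays.copyOfRange(words, i, i + n));
        }
        return out;
    }

    public static void main(String[] args) {
        String[] words = {"natural", "language", "processing", "tools"};
        for (String[] gram : ngrams(words, 2)) {
            System.out.println(String.join(" ", gram));
        }
    }
}
```

A sequence of m words yields m - n + 1 n-grams, or none when n exceeds m.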

O

open - Variable in enum smile.nlp.pos.PennTreebankPOS: True if the POS is an open class.

P

ParagraphSplitter - Interface in smile.nlp.tokenizer: A paragraph splitter segments text into paragraphs.
PennTreebankPOS - Enum in smile.nlp.pos: The Penn Treebank Tag set.
PennTreebankTokenizer - Class in smile.nlp.tokenizer: A word tokenizer that tokenizes English sentences using the conventions used by the Penn Treebank.
PorterStemmer - Class in smile.nlp.stemmer: Porter's stemming algorithm.
PorterStemmer() - Constructor for class smile.nlp.stemmer.PorterStemmer: Constructor.
POSTagger - Interface in smile.nlp.pos: Part-of-speech tagging (POS tagging) is the process of marking up the words in a sentence as corresponding to a particular part of speech.
Punctuations - Interface in smile.nlp.dictionary: Punctuation marks are symbols that indicate the structure and organization of written language, as well as intonation and pauses to be observed when reading aloud.
put(K[], V) - Method in class smile.nlp.Trie: Add a key with associated value to the trie.

R

rank(Corpus, TextTerms, String, int, int) - Method in class smile.nlp.relevance.BM25
rank(Corpus, TextTerms, String[], int[], int) - Method in class smile.nlp.relevance.BM25
rank(Corpus, TextTerms, String, int, int) - Method in interface smile.nlp.relevance.RelevanceRanker: Returns a relevance score between a term and a document based on a corpus.
rank(Corpus, TextTerms, String[], int[], int) - Method in interface smile.nlp.relevance.RelevanceRanker: Returns a relevance score between a set of terms and a document based on a corpus.
rank(int, int, long, long) - Method in class smile.nlp.relevance.TFIDF: Returns a relevance score between a term and a document based on a corpus.
rank(Corpus, TextTerms, String, int, int) - Method in class smile.nlp.relevance.TFIDF
rank(Corpus, TextTerms, String[], int[], int) - Method in class smile.nlp.relevance.TFIDF
Relevance - Class in smile.nlp.relevance: In the context of information retrieval, relevance denotes how well a retrieved set of documents meets the information need of the user.
Relevance(Text, double) - Constructor for class smile.nlp.relevance.Relevance: Constructor.
RelevanceRanker - Interface in smile.nlp.relevance: An interface to provide relevance ranking algorithm.
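
The rank entries above return a relevance score between a term and a document. For orientation, here is a minimal sketch of one common tf-idf weighting: term frequency normalized by the document's maximum term frequency, multiplied by the log inverse document frequency. The class name, method name, and parameter meanings are assumptions for this example; the TFIDF class may use a different normalization or smoothing.

```java
// Hypothetical sketch of a plain tf-idf weight; not the smile.nlp.relevance.TFIDF code.
class TfIdfSketch {
    /**
     * @param tf    frequency of the term in the document
     * @param maxtf maximum frequency of any term in the document
     * @param n     number of documents in the corpus
     * @param df    number of documents containing the term
     */
    static double tfidf(int tf, int maxtf, long n, long df) {
        if (tf <= 0 || maxtf <= 0 || df <= 0) {
            return 0.0;
        }
        double normalizedTf = (double) tf / maxtf;  // in (0, 1]
        double idf = Math.log((double) n / df);     // 0 when every document has the term
        return normalizedTf * idf;
    }

    public static void main(String[] args) {
        System.out.println(tfidf(3, 5, 1000, 10));
    }
}
```

A term that appears in every document gets weight zero under this variant, which is why smoothed idf forms are often preferred in practice.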

S

score() - Method in class smile.nlp.collocation.BigramCollocation: Returns the chi-square statistical score of the collocation.
score(int, int, double, int, int, double, int, int, double, long, long) - Method in class smile.nlp.relevance.BM25: Returns a relevance score between a term and a document based on a corpus.
score(double, long, long) - Method in class smile.nlp.relevance.BM25: Returns a relevance score between a term and a document based on a corpus.
score(double, int, double, long, long) - Method in class smile.nlp.relevance.BM25: Returns a relevance score between a term and a document based on a corpus.
score() - Method in class smile.nlp.relevance.Relevance: Returns the relevance score.
search(String) - Method in interface smile.nlp.Corpus: Returns an iterator over the set of documents containing the given term.
search(RelevanceRanker, String) - Method in interface smile.nlp.Corpus: Returns an iterator over the set of documents containing the given term in descending order of relevance.
search(RelevanceRanker, String[]) - Method in interface smile.nlp.Corpus: Returns an iterator over the set of documents containing (at least one of) the given terms in descending order of relevance.
search(String) - Method in class smile.nlp.SimpleCorpus
search(RelevanceRanker, String) - Method in class smile.nlp.SimpleCorpus
search(RelevanceRanker, String[]) - Method in class smile.nlp.SimpleCorpus
SentenceSplitter - Interface in smile.nlp.tokenizer: A sentence splitter segments text into sentences (a string of words satisfying the grammatical rules of a language).
setAnchor(String) - Method in interface smile.nlp.AnchorText: Sets the anchor text.
setAnchor(String) - Method in class smile.nlp.SimpleText: Sets the anchor text.
setBody(String) - Method in class smile.nlp.Text
setID(String) - Method in class smile.nlp.Text
setTitle(String) - Method in class smile.nlp.Text
SimpleCorpus - Class in smile.nlp: A simple implementation of corpus in main memory for small datasets.
SimpleCorpus() - Constructor for class smile.nlp.SimpleCorpus: Constructor.
SimpleCorpus(SentenceSplitter, Tokenizer, StopWords, Punctuations) - Constructor for class smile.nlp.SimpleCorpus: Constructor.
SimpleDictionary - Class in smile.nlp.dictionary: A simple implementation of dictionary interface.
SimpleDictionary(String) - Constructor for class smile.nlp.dictionary.SimpleDictionary: Constructor.
SimpleParagraphSplitter - Class in smile.nlp.tokenizer: This is a simple paragraph splitter.
SimpleSentenceSplitter - Class in smile.nlp.tokenizer: This is a simple sentence splitter for English.
SimpleText - Class in smile.nlp: A list-of-words representation of documents.
SimpleText(String, String, String, String[]) - Constructor for class smile.nlp.SimpleText: Constructor.
SimpleTokenizer - Class in smile.nlp.tokenizer: A word tokenizer that tokenizes English sentences with some differences from TreebankWordTokenizer, notably on handling not-contractions.
SimpleTokenizer() - Constructor for class smile.nlp.tokenizer.SimpleTokenizer: Constructor.
SimpleTokenizer(boolean) - Constructor for class smile.nlp.tokenizer.SimpleTokenizer: Constructor.
size() - Method in interface smile.nlp.Corpus: Returns the number of words in the corpus.
size() - Method in interface smile.nlp.dictionary.Dictionary: Returns the number of elements in this dictionary.
size() - Method in enum smile.nlp.dictionary.EnglishDictionary
size() - Method in class smile.nlp.dictionary.EnglishPunctuations
size() - Method in enum smile.nlp.dictionary.EnglishStopWords
size() - Method in class smile.nlp.dictionary.SimpleDictionary
size() - Method in class smile.nlp.SimpleCorpus
size() - Method in class smile.nlp.SimpleText
size() - Method in interface smile.nlp.TextTerms: Returns the number of words.
size() - Method in class smile.nlp.Trie: Returns the number of entries.
smile.nlp - package smile.nlp: Natural language processing.
smile.nlp.collocation - package smile.nlp.collocation: Collocation finding algorithms.
smile.nlp.dictionary - package smile.nlp.dictionary: Common dictionaries such as stop words, punctuation, common English words, etc.
smile.nlp.keyword - package smile.nlp.keyword
smile.nlp.pos - package smile.nlp.pos: Part-of-speech taggers.
smile.nlp.relevance - package smile.nlp.relevance: Term-document relevance ranking algorithms.
smile.nlp.stemmer - package smile.nlp.stemmer: English word stemmer algorithms.
smile.nlp.tokenizer - package smile.nlp.tokenizer: Sentence splitter and word tokenizer.
split(String) - Method in class smile.nlp.tokenizer.BreakIteratorSentenceSplitter
split(String) - Method in class smile.nlp.tokenizer.BreakIteratorTokenizer
split(String) - Method in interface smile.nlp.tokenizer.ParagraphSplitter: Split text into paragraphs.
split(String) - Method in class smile.nlp.tokenizer.PennTreebankTokenizer
split(String) - Method in interface smile.nlp.tokenizer.SentenceSplitter: Split text into sentences.
split(String) - Method in class smile.nlp.tokenizer.SimpleParagraphSplitter
split(String) - Method in class smile.nlp.tokenizer.SimpleSentenceSplitter
split(String) - Method in class smile.nlp.tokenizer.SimpleTokenizer
split(String) - Method in interface smile.nlp.tokenizer.Tokenizer: Divide the given string into a list of substrings.
stem(String) - Method in class smile.nlp.stemmer.LancasterStemmer
stem(String) - Method in class smile.nlp.stemmer.PorterStemmer
stem(String) - Method in interface smile.nlp.stemmer.Stemmer: Transforms a word into its root form.
Stemmer - Interface in smile.nlp.stemmer: A Stemmer transforms a word into its root form.
StopWords - Interface in smile.nlp.dictionary: A set of stop words in some language.
stripPluralParticiple(String) - Method in class smile.nlp.stemmer.PorterStemmer: Remove plurals and participles.
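
Several split entries above belong to classes described as being based on java.text.BreakIterator. The sketch below uses that JDK class directly to show the general idea of locale-aware sentence and word segmentation. The class and method names are hypothetical, and the trimming and whitespace filtering are choices made for this example rather than documented library behavior.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch: segmentation with the JDK's BreakIterator only.
class BreakIteratorSketch {
    /** Splits text into trimmed sentences for the given locale. */
    static List<String> sentences(String text, Locale locale) {
        return segments(BreakIterator.getSentenceInstance(locale), text);
    }

    /** Splits text into word and punctuation tokens, dropping whitespace. */
    static List<String> words(String text, Locale locale) {
        return segments(BreakIterator.getWordInstance(locale), text);
    }

    private static List<String> segments(BreakIterator it, String text) {
        it.setText(text);
        List<String> out = new ArrayList<>();
        int start = it.first();
        // Each (start, end) pair of consecutive boundaries delimits one segment.
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String segment = text.substring(start, end).trim();
            if (!segment.isEmpty()) {
                out.add(segment);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "Smile is a library. It has NLP tools.";
        System.out.println(sentences(text, Locale.US));
        System.out.println(words("Hello, world", Locale.US));
    }
}
```

Passing a different Locale selects that language's boundary rules, which is the property the index entries highlight.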

T

tag(String[]) - Method in class smile.nlp.pos.HMMPOSTagger
tag(String[]) - Method in interface smile.nlp.pos.POSTagger: Tags the sentence in the form of a sequence of words.
Text - Class in smile.nlp: A minimal interface of text in the corpus.
Text(String, String, String) - Constructor for class smile.nlp.Text
TextTerms - Interface in smile.nlp
tf(String) - Method in class smile.nlp.SimpleText
tf(String) - Method in interface smile.nlp.TextTerms: Returns the term frequency.
TFIDF - Class in smile.nlp.relevance: The tf-idf weight (term frequency-inverse document frequency) is a weight often used in information retrieval and text mining.
TFIDF() - Constructor for class smile.nlp.relevance.TFIDF: Constructor.
TFIDF(double) - Constructor for class smile.nlp.relevance.TFIDF: Constructor.
Tokenizer - Interface in smile.nlp.tokenizer: A token is a string of characters, categorized according to the rules as a symbol.
toString() - Method in class smile.nlp.Bigram
toString() - Method in class smile.nlp.collocation.BigramCollocation
toString() - Method in class smile.nlp.NGram
toString() - Method in class smile.nlp.SimpleText
Trie<K,V> - Class in smile.nlp: A trie, also called digital tree or prefix tree, is an ordered tree data structure that is used to store a dynamic set or associative array where the keys are usually strings.
Trie() - Constructor for class smile.nlp.Trie: Constructor.
Trie(int) - Constructor for class smile.nlp.Trie: Constructor.
Trie.Node - Class in smile.nlp
Trie.Node(K) - Constructor for class smile.nlp.Trie.Node
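
The Trie entry above describes a prefix tree keyed by sequences, with put(K[], V) and get(K[]) listed elsewhere in this index. A minimal sketch of that data structure follows, assuming hash-map children and non-null values. The class name and internals are hypothetical and are not taken from the library's Trie or Trie.Node.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch of a prefix tree over key sequences; not smile.nlp.Trie.
class TrieSketch<K, V> {
    private static final class Node<K, V> {
        final Map<K, Node<K, V>> children = new HashMap<>();
        V value;  // null means no entry ends at this node
    }

    private final Node<K, V> root = new Node<>();
    private int size;

    /** Associates the key sequence with a non-null value, creating nodes as needed. */
    void put(K[] key, V value) {
        Objects.requireNonNull(value, "value");
        Node<K, V> node = root;
        for (K k : key) {
            node = node.children.computeIfAbsent(k, x -> new Node<>());
        }
        if (node.value == null) {
            size++;  // count only new entries, not overwrites
        }
        node.value = value;
    }

    /** Returns the value stored for the exact key sequence, or null if absent. */
    V get(K[] key) {
        Node<K, V> node = root;
        for (K k : key) {
            node = node.children.get(k);
            if (node == null) {
                return null;
            }
        }
        return node.value;
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        TrieSketch<String, Integer> trie = new TrieSketch<>();
        trie.put(new String[]{"new", "york"}, 1);
        trie.put(new String[]{"new", "york", "city"}, 2);
        System.out.println(trie.get(new String[]{"new", "york"}));
        System.out.println(trie.size());
    }
}
```

Keys that share a prefix share the nodes along that prefix, so a lookup costs one map access per key element regardless of how many entries are stored.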

U

unique() - Method in class smile.nlp.SimpleText
unique() - Method in interface smile.nlp.TextTerms: Returns the iterator of unique words.

V

valueOf(String) - Static method in enum smile.nlp.dictionary.EnglishDictionary: Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum smile.nlp.dictionary.EnglishStopWords: Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum smile.nlp.pos.PennTreebankPOS: Returns the enum constant of this type with the specified name.
values() - Static method in enum smile.nlp.dictionary.EnglishDictionary: Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum smile.nlp.dictionary.EnglishStopWords: Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum smile.nlp.pos.PennTreebankPOS: Returns an array containing the constants of this enum type, in the order they are declared.

W

w1 - Variable in class smile.nlp.Bigram: Immutable first word of bigram.
w1() - Method in class smile.nlp.collocation.BigramCollocation: Returns the first word of bigram.
w2 - Variable in class smile.nlp.Bigram: Immutable second word of bigram.
w2() - Method in class smile.nlp.collocation.BigramCollocation: Returns the second word of bigram.
walkin(File, List<File>) - Static method in class smile.nlp.pos.HMMPOSTagger: Recursive function to descend into the directory tree and find all the files that end with ".POS".
words - Variable in class smile.nlp.NGram: Immutable word sequences.
words() - Method in class smile.nlp.SimpleText
words() - Method in interface smile.nlp.TextTerms: Returns the iterator of the words of the document.
