pragmatic

Type Members

class PragmaticScorer extends Serializable

Scorer is a rule-based implementation inspired by http://fjavieralba.com/basic-sentiment-analysis-with-python.html. Its strategy is to tag words from a dictionary within their sentence context, and then to use that context to identify amplifiers.
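The tag-then-amplify strategy can be sketched in a few lines of plain Scala. This is only an illustration under stated assumptions, not the actual PragmaticScorer: the object name ToyScorer, the tag names "increment" and "revert", the doubling factor, and the rule that an amplifier only affects the word immediately after it are all hypothetical choices made for this example.

```scala
// Illustrative sketch only; NOT the Spark NLP PragmaticScorer implementation.
object ToyScorer {
  // Hypothetical dictionary mapping a word to its tag.
  val dictionary: Map[String, String] = Map(
    "nice" -> "positive",
    "bad"  -> "negative",
    "very" -> "increment", // assumed amplifier tag
    "not"  -> "revert"     // assumed inverter tag
  )

  def score(tokens: Seq[String]): Double = {
    // Step 1: tag every token that appears in the dictionary.
    val tags = tokens.map(t => dictionary.get(t.toLowerCase)).toVector

    // Step 2: score sentiment words, adjusted by the tag of the preceding token.
    tags.zipWithIndex.map { case (tag, i) =>
      val base = tag match {
        case Some("positive") => 1.0
        case Some("negative") => -1.0
        case _                => 0.0
      }
      val previous = if (i > 0) tags(i - 1) else None
      previous match {
        case Some("increment") => base * 2.0
        case Some("revert")    => -base
        case _                 => base
      }
    }.sum
  }
}
```

With this toy dictionary, `ToyScorer.score(Seq("very", "nice"))` doubles the positive keyword, and `ToyScorer.score(Seq("not", "bad"))` flips the negative keyword to a positive contribution.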

class SentimentDetector extends AnnotatorApproach[SentimentDetectorModel]

Trains a rule-based sentiment detector, which calculates a score based on predefined keywords.

A dictionary of predefined sentiment keywords must be provided with setDictionary, where each line contains a word and its class (either positive or negative), separated by a delimiter. The dictionary can be set either as a delimited text file or directly as an ExternalResource.
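To make the expected dictionary layout concrete, here is a minimal, library-independent sketch of how such delimited lines could be read into a word-to-class map. The object SentimentDictionary and its parse method are hypothetical names introduced for illustration; Spark NLP does its own loading when setDictionary is called.

```scala
// Illustrative sketch only; the names here are not part of Spark NLP.
object SentimentDictionary {
  // Parses lines such as "cool,positive" into Map("cool" -> "positive").
  def parse(lines: Seq[String], delimiter: String): Map[String, String] =
    lines
      .map(_.trim)
      .filter(_.nonEmpty) // ignore blank lines
      .map { line =>
        // Quote the delimiter so characters like "|" are not treated as regex.
        val parts = line.split(java.util.regex.Pattern.quote(delimiter), 2)
        require(parts.length == 2, s"Line '$line' does not contain delimiter '$delimiter'")
        parts(0).trim -> parts(1).trim
      }
      .toMap
}
```

A line without the delimiter fails fast with an IllegalArgumentException, which is one reasonable design choice for catching a delimiter mismatch between the file and the configuration.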

By default, the sentiment score is mapped to a label: "positive" if the score is >= 0, otherwise "negative". To retrieve the raw sentiment scores instead, enableScore needs to be set to true.
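The labelling rule described above can be written as a small pure function. This is a sketch of the documented behaviour only; the object name LabelSketch and the method render are hypothetical and do not exist in Spark NLP.

```scala
// Illustrative sketch of the documented rule; not Spark NLP code.
object LabelSketch {
  // Returns the raw score as text when enableScore is true,
  // otherwise the label derived from the sign of the score.
  def render(score: Double, enableScore: Boolean): String =
    if (enableScore) score.toString
    else if (score >= 0) "positive"
    else "negative"
}
```

Note that, under this rule, a score of exactly 0 falls on the "positive" side, so with the default labels a neutral score is not distinguishable from a positive one.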

For extended examples of usage, see the Spark NLP Workshop and the SentimentTestSpec.

Example

In this example, the dictionary default-sentiment-dict.txt has the form

...
cool,positive
superb,positive
bad,negative
uninspired,negative
...

where each sentiment keyword is delimited by ",".

import spark.implicits._
import com.johnsnowlabs.nlp.DocumentAssembler
import com.johnsnowlabs.nlp.annotator.Tokenizer
import com.johnsnowlabs.nlp.annotators.Lemmatizer
import com.johnsnowlabs.nlp.annotators.sda.pragmatic.SentimentDetector
import com.johnsnowlabs.nlp.util.io.ReadAs
import org.apache.spark.ml.Pipeline

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols("document")
  .setOutputCol("token")

val lemmatizer = new Lemmatizer()
  .setInputCols("token")
  .setOutputCol("lemma")
  .setDictionary("src/test/resources/lemma-corpus-small/lemmas_small.txt", "->", "\t")

val sentimentDetector = new SentimentDetector()
  .setInputCols("lemma", "document")
  .setOutputCol("sentimentScore")
  .setDictionary("src/test/resources/sentiment-corpus/default-sentiment-dict.txt", ",", ReadAs.TEXT)

val pipeline = new Pipeline().setStages(Array(
  documentAssembler,
  tokenizer,
  lemmatizer,
  sentimentDetector
))

val data = Seq(
  "The staff of the restaurant is nice",
  "I recommend others to avoid because it is too expensive"
).toDF("text")
val result = pipeline.fit(data).transform(data)

result.selectExpr("sentimentScore.result").show(false)
+----------+  //  +------+ for enableScore set to true
|result    |  //  |result|
+----------+  //  +------+
|[positive]|  //  |[1.0] |
|[negative]|  //  |[-2.0]|
+----------+  //  +------+

See also: ViveknSentimentApproach for an alternative approach to sentiment extraction

class SentimentDetectorModel extends AnnotatorModel[SentimentDetectorModel] with HasSimpleAnnotate[SentimentDetectorModel]

Rule-based sentiment detector, which calculates a score based on predefined keywords.
This is the instantiated model of the SentimentDetector. For training your own model, please see the documentation of that class.

A dictionary of predefined sentiment keywords must be provided with setDictionary, where each line contains a word and its class (either positive or negative), separated by a delimiter. The dictionary can be set either as a delimited text file or directly as an ExternalResource.

By default, the sentiment score is mapped to a label: "positive" if the score is >= 0, otherwise "negative". To retrieve the raw sentiment scores instead, enableScore needs to be set to true.

For extended examples of usage, see the Spark NLP Workshop and the SentimentTestSpec.

See also: ViveknSentimentApproach for an alternative approach to sentiment extraction

Value Members

object SentimentDetector extends DefaultParamsReadable[SentimentDetector] with Serializable

This is the companion object of SentimentDetector. Please refer to that class for the documentation.

object SentimentDetectorModel extends ParamsAndFeaturesReadable[SentimentDetectorModel] with Serializable
