Package

com.johnsnowlabs.nlp.annotators.parser

typdep

package typdep

Type Members

  1. class ConllData extends AnyRef

  2. class DependencyArcList extends AnyRef

  3. class DependencyInstance extends Serializable

  4. class DependencyPipe extends Serializable

  5. class LocalFeatureData extends AnyRef

  6. class LowRankTensor extends AnyRef

  7. class Options extends Serializable

  8. class Parameters extends Serializable

  9. class PredictionParameters extends AnyRef

  10. trait ReadablePretrainedTypedDependency extends ParamsAndFeaturesReadable[TypedDependencyParserModel] with HasPretrained[TypedDependencyParserModel]

  11. class TrainDependencies extends Serializable

  12. case class TrainFile(path: String, conllFormat: String) extends Product with Serializable

  13. class TypedDependencyParser extends Serializable

  14. class TypedDependencyParserApproach extends AnnotatorApproach[TypedDependencyParserModel]

    Labeled parser that finds the grammatical relation between two words in a sentence. It is trained on a dataset in either CoNLL 2009 or CoNLL-U format.

    For instantiated/pretrained models, see TypedDependencyParserModel.

    Dependency parsers provide information about the relationships between words. For example, dependency parsing can tell you what the subjects and objects of a verb are, as well as which words modify (describe) the subject. This can help you find precise answers to specific questions.

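    The parser requires the dependent tokens beforehand, e.g. from DependencyParser. The required training data can be set in one of two ways (only one can be chosen for a particular model): a dataset in CoNLL 2009 format, set with setConll2009, or a dataset in CoNLL-U format, set with setConllU.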

    Apart from that, no additional training data is needed.
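
    To make the training input more concrete, here is a minimal sketch of how one token line of a CoNLL-U file breaks down. It relies only on the published CoNLL-U layout (ten tab-separated columns, with HEAD in column 7 and DEPREL in column 8). The names ConllUToken and ConllUSketch and the sample sentence are hypothetical and written for this illustration; this is not how the annotator itself reads the file.

```scala
// Hypothetical holder for the columns that matter for typed dependency parsing.
final case class ConllUToken(id: Int, form: String, upos: String, head: Int, deprel: String)

object ConllUSketch {
  // True for plain word indices such as "3"; false for "1-2", "1.1" or "_".
  private def isIndex(s: String): Boolean = s.nonEmpty && s.forall(_.isDigit)

  // Parses one token line. Returns None for comments, blank lines,
  // multiword-token ranges and empty nodes.
  def parseLine(line: String): Option[ConllUToken] = {
    val cols = line.split("\t")
    if (line.startsWith("#") || cols.length != 10) None
    else if (!isIndex(cols(0)) || !isIndex(cols(6))) None
    else Some(ConllUToken(cols(0).toInt, cols(1), cols(3), cols(6).toInt, cols(7)))
  }

  // Made-up three-token sentence, for illustration only.
  val sample: Seq[String] = Seq(
    "# text = They are disappointed",
    "1\tThey\tthey\tPRON\tPRP\t_\t3\tnsubj\t_\t_",
    "2\tare\tbe\tAUX\tVBP\t_\t3\tcop\t_\t_",
    "3\tdisappointed\tdisappointed\tADJ\tJJ\t_\t0\troot\t_\t_"
  )

  // Only the three token lines survive; the comment line is skipped.
  val tokens: Seq[ConllUToken] = sample.flatMap(parseLine)
}
```

    The deprel value of each token is the relation label the typed parser learns to predict, and head is the index of the governing token (0 for the root).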

    See TypedDependencyParserApproachTestSpec for further reference on this API.

    Example

    import spark.implicits._
    import com.johnsnowlabs.nlp.base.DocumentAssembler
    import com.johnsnowlabs.nlp.annotators.sbd.pragmatic.SentenceDetector
    import com.johnsnowlabs.nlp.annotators.Tokenizer
    import com.johnsnowlabs.nlp.annotators.pos.perceptron.PerceptronModel
    import com.johnsnowlabs.nlp.annotators.parser.dep.DependencyParserModel
    import com.johnsnowlabs.nlp.annotators.parser.typdep.TypedDependencyParserApproach
    import org.apache.spark.ml.Pipeline
    
    val documentAssembler = new DocumentAssembler()
      .setInputCol("text")
      .setOutputCol("document")
    
    val sentence = new SentenceDetector()
      .setInputCols("document")
      .setOutputCol("sentence")
    
    val tokenizer = new Tokenizer()
      .setInputCols("sentence")
      .setOutputCol("token")
    
    val posTagger = PerceptronModel.pretrained()
      .setInputCols("sentence", "token")
      .setOutputCol("pos")
    
    val dependencyParser = DependencyParserModel.pretrained()
      .setInputCols("sentence", "pos", "token")
      .setOutputCol("dependency")
    
    val typedDependencyParser = new TypedDependencyParserApproach()
      .setInputCols("dependency", "pos", "token")
      .setOutputCol("dependency_type")
      .setConllU("src/test/resources/parser/labeled/train_small.conllu.txt")
      .setNumberOfIterations(1)
    
    val pipeline = new Pipeline().setStages(Array(
      documentAssembler,
      sentence,
      tokenizer,
      posTagger,
      dependencyParser,
      typedDependencyParser
    ))
    
    // No additional training data is needed; the typed dependency parser is trained from the CoNLL-U file only.
    val emptyDataSet = Seq.empty[String].toDF("text")
    val pipelineModel = pipeline.fit(emptyDataSet)
  15. class TypedDependencyParserModel extends AnnotatorModel[TypedDependencyParserModel] with HasSimpleAnnotate[TypedDependencyParserModel]

    Labeled parser that finds the grammatical relation between two words in a sentence. It is trained on a dataset in either CoNLL 2009 or CoNLL-U format.

    Dependency parsers provide information about the relationships between words. For example, dependency parsing can tell you what the subjects and objects of a verb are, as well as which words modify (describe) the subject. This can help you find precise answers to specific questions.

    The parser requires the dependent tokens beforehand, e.g. from DependencyParser.

    Pretrained models can be loaded with the pretrained method of the companion object:

    val typedDependencyParser = TypedDependencyParserModel.pretrained()
      .setInputCols("dependency", "pos", "token")
      .setOutputCol("dependency_type")

    If no name is provided, the default model is "dependency_typed_conllu". For available pretrained models, please see the Models Hub.
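
    When the model should be pinned explicitly rather than left to the default, pretrained can also be called with a name and a language. The sketch below assumes the two-argument pretrained(name, lang) overload and assumes "en" as the language tag of this model; check the Models Hub entry for the exact values.

```scala
import com.johnsnowlabs.nlp.annotators.parser.typdep.TypedDependencyParserModel

// Same model as the default, with name and language spelled out.
// "en" is an assumption about the language tag of this model.
val typedDependencyParser = TypedDependencyParserModel
  .pretrained("dependency_typed_conllu", "en")
  .setInputCols("dependency", "pos", "token")
  .setOutputCol("dependency_type")
```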

    For extended examples of usage, see the Spark NLP Workshop and the TypedDependencyModelTestSpec.

    Example

    import spark.implicits._
    import com.johnsnowlabs.nlp.base.DocumentAssembler
    import com.johnsnowlabs.nlp.annotators.Tokenizer
    import com.johnsnowlabs.nlp.annotators.sbd.pragmatic.SentenceDetector
    import com.johnsnowlabs.nlp.annotators.pos.perceptron.PerceptronModel
    import com.johnsnowlabs.nlp.annotators.parser.dep.DependencyParserModel
    import com.johnsnowlabs.nlp.annotators.parser.typdep.TypedDependencyParserModel
    import org.apache.spark.ml.Pipeline
    
    val documentAssembler = new DocumentAssembler()
      .setInputCol("text")
      .setOutputCol("document")
    
    val sentence = new SentenceDetector()
      .setInputCols("document")
      .setOutputCol("sentence")
    
    val tokenizer = new Tokenizer()
      .setInputCols("sentence")
      .setOutputCol("token")
    
    val posTagger = PerceptronModel.pretrained()
      .setInputCols("sentence", "token")
      .setOutputCol("pos")
    
    val dependencyParser = DependencyParserModel.pretrained()
      .setInputCols("sentence", "pos", "token")
      .setOutputCol("dependency")
    
    val typedDependencyParser = TypedDependencyParserModel.pretrained()
      .setInputCols("dependency", "pos", "token")
      .setOutputCol("dependency_type")
    
    val pipeline = new Pipeline().setStages(Array(
      documentAssembler,
      sentence,
      tokenizer,
      posTagger,
      dependencyParser,
      typedDependencyParser
    ))
    
    val data = Seq(
      "Unions representing workers at Turner Newall say they are 'disappointed' after talks with stricken parent " +
        "firm Federal Mogul."
    ).toDF("text")
    val result = pipeline.fit(data).transform(data)
    
    result.selectExpr("explode(arrays_zip(token.result, dependency.result, dependency_type.result)) as cols")
      .selectExpr("cols['0'] as token", "cols['1'] as dependency", "cols['2'] as dependency_type")
      .show(8, truncate = false)
    +------------+------------+---------------+
    |token       |dependency  |dependency_type|
    +------------+------------+---------------+
    |Unions      |ROOT        |root           |
    |representing|workers     |amod           |
    |workers     |Unions      |flat           |
    |at          |Turner      |case           |
    |Turner      |workers     |flat           |
    |Newall      |say         |nsubj          |
    |say         |Unions      |parataxis      |
    |they        |disappointed|nsubj          |
    +------------+------------+---------------+
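
    Each output row reads as "token attaches to dependency with relation dependency_type". As a minimal sketch of how such rows could be queried once collected, the plain Scala below copies the eight rows shown above into a hypothetical TypedArc case class; the names TypedArc and TypedArcSketch are not part of the library. Heads are matched by their surface string here, which would be ambiguous if a word occurred twice in the sentence.

```scala
// Hypothetical holder for one output row: token, its head, and the relation label.
final case class TypedArc(token: String, head: String, relation: String)

object TypedArcSketch {
  // Rows copied from the example output above.
  val arcs: Seq[TypedArc] = Seq(
    TypedArc("Unions", "ROOT", "root"),
    TypedArc("representing", "workers", "amod"),
    TypedArc("workers", "Unions", "flat"),
    TypedArc("at", "Turner", "case"),
    TypedArc("Turner", "workers", "flat"),
    TypedArc("Newall", "say", "nsubj"),
    TypedArc("say", "Unions", "parataxis"),
    TypedArc("they", "disappointed", "nsubj")
  )

  // Tokens attached to `head` with the given relation label.
  def dependents(head: String, relation: String): Seq[String] =
    arcs.collect { case TypedArc(tok, h, r) if h == head && r == relation => tok }

  // Subject tokens grouped by the head they attach to.
  val subjectsByHead: Map[String, Seq[String]] =
    arcs.filter(_.relation == "nsubj")
      .groupBy(_.head)
      .map { case (h, as) => h -> as.map(_.token) }
}
```

    For instance, TypedArcSketch.dependents("say", "nsubj") returns the tokens the parser marked as the subject of "say" in the rows above.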

Value Members

  1. object TypedDependencyParserApproach extends DefaultParamsReadable[TypedDependencyParserApproach] with Serializable

    This is the companion object of TypedDependencyParserApproach. Please refer to that class for the documentation.

  2. object TypedDependencyParserModel extends ReadablePretrainedTypedDependency with Serializable

    This is the companion object of TypedDependencyParserModel. Please refer to that class for the documentation.

  3. package feature

  4. package io

  5. package util
