Package

com.johnsnowlabs.nlp.annotators

ner

package ner

Type Members

  1. case class NamedEntity(start: Int, end: Int, entity: String, text: String, sentenceId: String, confidence: Option[Float]) extends Product with Serializable

  2. trait NerApproach[T <: NerApproach[_]] extends Params

  3. class NerConverter extends AnnotatorModel[NerConverter] with HasSimpleAnnotate[NerConverter]

    Converts an IOB or IOB2 representation of NER to a user-friendly one by associating the tokens of each recognized entity with their label. The result has the CHUNK annotation type.

    NER chunks can then be filtered by setting a whitelist with setWhiteList. Chunks with no associated entity (tagged "O") are filtered out.

    See also Inside–outside–beginning (tagging) for more information.

    Example

    This example continues the one given for NerDLModel. See that class for how to extract the entities.

    The output of the NerDLModel follows the Annotator schema and can be converted like so:

    result.selectExpr("explode(ner)").show(false)
    +----------------------------------------------------+
    |col                                                 |
    +----------------------------------------------------+
    |[named_entity, 0, 2, B-ORG, [word -> U.N], []]      |
    |[named_entity, 3, 3, O, [word -> .], []]            |
    |[named_entity, 5, 12, O, [word -> official], []]    |
    |[named_entity, 14, 18, B-PER, [word -> Ekeus], []]  |
    |[named_entity, 20, 24, O, [word -> heads], []]      |
    |[named_entity, 26, 28, O, [word -> for], []]        |
    |[named_entity, 30, 36, B-LOC, [word -> Baghdad], []]|
    |[named_entity, 37, 37, O, [word -> .], []]          |
    +----------------------------------------------------+

    After the converter is used:

    val converter = new NerConverter()
      .setInputCols("sentence", "token", "ner")
      .setOutputCol("entities")
      .setPreservePosition(false)
    
    converter.transform(result).selectExpr("explode(entities)").show(false)
    +------------------------------------------------------------------------+
    |col                                                                     |
    +------------------------------------------------------------------------+
    |[chunk, 0, 2, U.N, [entity -> ORG, sentence -> 0, chunk -> 0], []]      |
    |[chunk, 14, 18, Ekeus, [entity -> PER, sentence -> 0, chunk -> 1], []]  |
    |[chunk, 30, 36, Baghdad, [entity -> LOC, sentence -> 0, chunk -> 2], []]|
    +------------------------------------------------------------------------+
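
    The grouping shown above can be illustrated with a minimal, self-contained sketch. It is not the Spark NLP implementation: TaggedToken, Chunk, IobChunkSketch and toChunks are hypothetical names introduced here, and joining the words of a multi-token chunk with a single space is an assumption. It only shows the idea of merging consecutive B-/I- tokens, dropping "O" tokens, and optionally keeping whitelisted labels.

    ```scala
    // Hypothetical stand-ins for illustration only; these are NOT Spark NLP classes.
    final case class TaggedToken(begin: Int, end: Int, word: String, tag: String)
    final case class Chunk(begin: Int, end: Int, text: String, entity: String)

    object IobChunkSketch {

      /** Merges consecutive B-/I- tagged tokens into chunks and drops "O" tokens.
        * An empty whiteList keeps every entity label.
        */
      def toChunks(tokens: Seq[TaggedToken], whiteList: Set[String] = Set.empty): Seq[Chunk] = {
        val chunks = scala.collection.mutable.ArrayBuffer.empty[Chunk]
        var current: Option[Chunk] = None

        // Close the chunk that is currently open, if any.
        def flush(): Unit = { current.foreach(c => chunks += c); current = None }

        tokens.foreach { t =>
          if (t.tag == "O") flush()
          else {
            val label = t.tag.drop(2) // strip the "B-" or "I-" prefix
            current match {
              case Some(c) if t.tag.startsWith("I-") && c.entity == label =>
                // Same entity continues: extend the open chunk (assumed space-joined).
                current = Some(c.copy(end = t.end, text = c.text + " " + t.word))
              case _ =>
                // A B- tag, or an I- tag with a new label, starts a new chunk.
                flush()
                current = Some(Chunk(t.begin, t.end, t.word, label))
            }
          }
        }
        flush()
        val all = chunks.toList
        if (whiteList.isEmpty) all else all.filter(c => whiteList.contains(c.entity))
      }

      def main(args: Array[String]): Unit = {
        // Tokens and tags taken from the NerDLModel output shown above.
        val tokens = Seq(
          TaggedToken(0, 2, "U.N", "B-ORG"),
          TaggedToken(3, 3, ".", "O"),
          TaggedToken(5, 12, "official", "O"),
          TaggedToken(14, 18, "Ekeus", "B-PER"),
          TaggedToken(20, 24, "heads", "O"),
          TaggedToken(26, 28, "for", "O"),
          TaggedToken(30, 36, "Baghdad", "B-LOC"),
          TaggedToken(37, 37, ".", "O")
        )
        toChunks(tokens).foreach(println)
      }
    }
    ```

    Run on the tokens from the example, the sketch yields three chunks (ORG, PER, LOC) with the same offsets as the converter output above. Passing whiteList = Set("PER") would keep only the Ekeus chunk.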
  4. class NerOverwriter extends AnnotatorModel[NerOverwriter] with HasSimpleAnnotate[NerOverwriter]

Value Members

  1. object NerConverter extends ParamsAndFeaturesReadable[NerConverter] with Serializable

  2. object NerOverwriter extends DefaultParamsReadable[NerOverwriter] with Serializable

    This is the companion object of NerOverwriter. Please refer to that class for the documentation.

  3. object NerTagsEncoding

    Works with different NER tag representations. Supports IOB and IOB2; see https://en.wikipedia.org/wiki/Inside%E2%80%93outside%E2%80%93beginning_(tagging)
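
    To make the difference between the two schemes concrete, here is a small sketch that rewrites an IOB tag sequence as IOB2. It assumes that "IOB" refers to the original scheme described on the linked page, where B- is used only to separate two adjacent entities of the same type, whereas in IOB2 every entity starts with B-. IobToIob2Sketch and toIob2 are hypothetical names and are not part of NerTagsEncoding.

    ```scala
    // Illustrative sketch only; not the NerTagsEncoding implementation.
    object IobToIob2Sketch {

      // Entity label of a tag, e.g. "I-PER" -> "PER"; "O" stays "O".
      private def label(tag: String): String = if (tag == "O") "O" else tag.drop(2)

      /** Rewrites IOB tags as IOB2: an I- tag that opens an entity becomes B-. */
      def toIob2(tags: Seq[String]): Seq[String] = {
        // Tag that precedes each position; the sequence start behaves like "O".
        val previous = "O" +: tags.dropRight(1)
        tags.zip(previous).map { case (tag, prev) =>
          if (tag.startsWith("I-") && label(prev) != label(tag)) "B-" + label(tag)
          else tag
        }
      }
    }
    ```

    For example, the IOB sequence I-PER, I-PER, O, I-LOC, B-LOC becomes B-PER, I-PER, O, B-LOC, B-LOC under this sketch.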

  4. object Verbose extends Enumeration

  5. package crf

  6. package dl
