org.clulab.processors.clu

CluProcessor

class CluProcessor extends Processor with Configured

Processor that uses only tools released under the Apache License. Currently supports: tokenization (in-house), lemmatization (Morpha, copied into our repo to minimize dependencies), and POS tagging, NER, chunking, and dependency parsing using our MTL architecture (dep parsing coming soon).

Linear Supertypes
Configured, Processor, AnyRef, Any

Instance Constructors

  1. new CluProcessor(config: Config = ConfigFactory.load("cluprocessor"))

Type Members

  1. case class EmbeddingsAttachment(embeddings: ConstEmbeddingParameters) extends IntermediateDocumentAttachment with Product with Serializable
  2. class PredicateAttachment extends IntermediateDocumentAttachment

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. def annotate(doc: Document): Document

    Annotate the given document, returning an annotated document. The default implementation is an NLP pipeline of side-effecting calls.

    Definition Classes
    CluProcessor → Processor
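The phrase "pipeline of side-effecting calls" can be illustrated with a small self-contained sketch. Everything here is hypothetical: `Doc` is a stand-in for Document, the two stages are toy rules rather than the library's models, and the stage order is only an assumption for illustration.

```scala
object PipelineSketch {
  // Hypothetical stand-in for Document: stages fill in the optional fields.
  final class Doc(val words: Array[String]) {
    var lemmas: Option[Array[String]] = None
    var tags: Option[Array[String]] = None
  }

  // Toy stage: lower-cases each word and stores the result on the document.
  def lemmatize(doc: Doc): Unit =
    doc.lemmas = Some(doc.words.map(_.toLowerCase))

  // Toy stage: marks capitalized words "NNP" and everything else "X".
  def tagPartsOfSpeech(doc: Doc): Unit =
    doc.tags = Some(doc.words.map(w => if (w.headOption.exists(_.isUpper)) "NNP" else "X"))

  // Runs the stages in order; each mutates `doc`, and the same instance is returned.
  def annotate(doc: Doc): Doc = {
    tagPartsOfSpeech(doc)
    lemmatize(doc)
    doc
  }
}
```

Because each stage returns Unit and writes into the document, a caller of a pipeline like this reads results from the document's fields after `annotate` returns.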
  5. def annotate(text: String, keepText: Boolean = false): Document

    Annotate the given text string; keepText specifies whether to retain the text in the resultant Document.

    Definition Classes
    Processor
  6. def annotateFromSentences(sentences: Iterable[String], keepText: Boolean = false): Document

    Annotate the given sentences; keepText specifies whether to retain the text in the resultant Document.

    Definition Classes
    Processor
  7. def annotateFromTokens(sentences: Iterable[Iterable[String]], keepText: Boolean = false): Document

    Annotate the given tokens; keepText specifies whether to retain the text in the resultant Document.

    Definition Classes
    Processor
  8. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  9. def basicSanityCheck(doc: Document): Unit
  10. def cheapLemmatize(doc: Document): Unit

    Generates cheap lemmas by lower-casing each word, for languages where a lemmatizer is not available.
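As a rough illustration of what a "cheap" lemma means here, the sketch below lower-cases each word of a sentence. The object and method names are hypothetical; the real method works on a Document in place and returns Unit.

```scala
object CheapLemmaSketch {
  // Lower-cases every word; no morphological analysis is attempted.
  def cheapLemmas(words: Seq[String]): Seq[String] = words.map(_.toLowerCase)
}
```

Note that inflected forms are left alone: "Cats" becomes "cats", not "cat".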

  11. def chunking(doc: Document): Unit

    Shallow parsing; modifies the document in place

    Definition Classes
    CluProcessor → Processor
  12. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  13. val config: Config
  14. def contains(argPath: String): Boolean

    Definition Classes
    Configured
  15. def discourse(doc: Document): Unit

    Discourse parsing; modifies the document in place

    Definition Classes
    CluProcessor → Processor
  16. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  17. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  18. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  19. def getArgBoolean(argPath: String, defaultValue: Option[Boolean]): Boolean

    Definition Classes
    Configured
  20. def getArgFloat(argPath: String, defaultValue: Option[Float]): Float

    Definition Classes
    Configured
  21. def getArgInt(argPath: String, defaultValue: Option[Int]): Int

    Definition Classes
    Configured
  22. def getArgString(argPath: String, defaultValue: Option[String]): String

    Definition Classes
    Configured
  23. def getArgStrings(argPath: String, defaultValue: Option[Seq[String]]): Seq[String]

    Definition Classes
    Configured
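The getArg* signatures pair a configuration path with an optional default. This page does not document the lookup rules, so the sketch below only shows one plausible reading, under loudly stated assumptions: a plain Map stands in for Config, the configured value wins over the default, and a missing value with no default is an error.

```scala
object ConfiguredSketch {
  // Hypothetical lookup: configured value first, then the default, else an error.
  def getArgInt(conf: Map[String, String], argPath: String, defaultValue: Option[Int]): Int =
    conf.get(argPath).map(_.toInt)
      .orElse(defaultValue)
      .getOrElse(throw new RuntimeException(s"parameter $argPath must be defined"))
}
```

Under these assumptions, passing `None` as the default marks a parameter as required, while `Some(value)` makes it optional.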
  24. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  25. def getConf: Config

    Definition Classes
    CluProcessor → Configured
  26. def getPredicateIndexes(preds: IndexedSeq[String]): IndexedSeq[Int]

    Gets the indexes of all predicates in this sentence
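A minimal sketch of the idea, assuming `preds` holds one label per token and that predicates carry a known label. The label value is not given on this page, so it is passed in as a hypothetical `predicateLabel` parameter that the real method does not have.

```scala
object PredicateIndexSketch {
  // Returns the positions whose label equals the (assumed) predicate label.
  def predicateIndexes(preds: IndexedSeq[String], predicateLabel: String): IndexedSeq[Int] =
    preds.indices.filter(i => preds(i) == predicateLabel)
}
```

The result lists token positions in ascending order and is empty when no token carries the predicate label.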

  27. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  28. val internStrings: Boolean
  29. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  30. def lemmatize(doc: Document): Unit

    Lemmatization; modifies the document in place

    Definition Classes
    CluProcessor → Processor
  31. lazy val lemmatizer: Lemmatizer
  32. lazy val localTokenizer: Tokenizer

    Attributes
    protected
  33. def mkConstEmbeddings(doc: Document): Unit
  34. def mkDocument(text: String, keepText: Boolean = false): Document

    Constructs a document of tokens from free text; includes sentence splitting and tokenization

    Definition Classes
    CluProcessor → Processor
  35. def mkDocumentFromSentences(sentences: Iterable[String], keepText: Boolean = false, charactersBetweenSentences: Int = 1): Document

    Constructs a document of tokens from an array of untokenized sentences

    Definition Classes
    CluProcessor → Processor
  36. def mkDocumentFromTokens(sentences: Iterable[Iterable[String]], keepText: Boolean = false, charactersBetweenSentences: Int = 1, charactersBetweenTokens: Int = 1): Document

    Constructs a document of tokens from an array of tokenized sentences

    Definition Classes
    CluProcessor → Processor
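The `charactersBetweenTokens` and `charactersBetweenSentences` parameters suggest that character offsets are reconstructed for pre-tokenized input. The sketch below shows one way such offsets could be computed. It is an assumption about the parameters' meaning, not the library's code, and `tokenOffsets` is a hypothetical helper.

```scala
object OffsetSketch {
  // Returns, per sentence, the (start, end) character offsets of each token.
  // End offsets are exclusive.
  def tokenOffsets(
      sentences: Seq[Seq[String]],
      charactersBetweenSentences: Int = 1,
      charactersBetweenTokens: Int = 1): Seq[Seq[(Int, Int)]] = {
    var offset = 0
    sentences.toVector.map { sentence =>
      val spans = sentence.toVector.map { word =>
        val start = offset
        offset += word.length
        val span = (start, offset)
        offset += charactersBetweenTokens // gap that follows every token
        span
      }
      // After the last token, swap the token gap for the sentence gap.
      if (sentence.nonEmpty) offset += charactersBetweenSentences - charactersBetweenTokens
      spans
    }
  }
}
```

With the defaults, the two sentences `Seq("John", "sleeps")` and `Seq("Mary", "runs")` get the offsets they would have in the text "John sleeps Mary runs".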
  37. lazy val mtlDepsHead: Metal
  38. lazy val mtlDepsLabel: Metal
  39. lazy val mtlNer: Metal
  40. lazy val mtlPosChunkSrlp: Metal
  41. lazy val mtlSrla: Metal
  42. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  43. def nerSentence(words: Array[String], lemmas: Option[Array[String]], tags: Array[String], startCharOffsets: Array[Int], endCharOffsets: Array[Int], docDateOpt: Option[String], embeddings: ConstEmbeddingParameters): (IndexedSeq[String], Option[IndexedSeq[String]])

    Produces NE labels for one sentence

  44. final def notify(): Unit

    Definition Classes
    AnyRef
  45. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  46. def parse(doc: Document): Unit

    Syntactic parsing; modifies the document in place

    Definition Classes
    CluProcessor → Processor
  47. def parseSentence(words: IndexedSeq[String], posTags: IndexedSeq[String], nerLabels: IndexedSeq[String], embeddings: ConstEmbeddingParameters): DirectedGraph[String]

    Dependency parsing

  48. def recognizeNamedEntities(doc: Document): Unit

    NER; modifies the document in place

    Definition Classes
    CluProcessor → Processor
  49. def relationExtraction(doc: Document): Unit

    Relation extraction; modifies the document in place.

    Definition Classes
    CluProcessor → Processor
  50. def resolveCoreference(doc: Document): Unit

    Coreference resolution; modifies the document in place

    Definition Classes
    CluProcessor → Processor
  51. def srl(doc: Document): Unit

    Semantic role labeling

    Definition Classes
    CluProcessor → Processor
  52. def srlSentence(words: IndexedSeq[String], posTags: IndexedSeq[String], nerLabels: IndexedSeq[String], predicateIndexes: IndexedSeq[Int], embeddings: ConstEmbeddingParameters): DirectedGraph[String]

    Produces semantic role frames for one sentence

  53. def srlSentence(sent: Sentence, predicateIndexes: IndexedSeq[Int], embeddings: ConstEmbeddingParameters): DirectedGraph[String]
  54. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  55. def tagPartsOfSpeech(doc: Document): Unit

    Part of speech tagging + chunking + SRL (predicates), jointly

    Definition Classes
    CluProcessor → Processor
  56. def tagSentence(words: IndexedSeq[String], embeddings: ConstEmbeddingParameters): (IndexedSeq[String], IndexedSeq[String], IndexedSeq[String])

    Produces POS tags, chunks, and semantic role predicates for one sentence

  57. def toString(): String

    Definition Classes
    AnyRef → Any
  58. lazy val tokenizer: Tokenizer
  59. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  60. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  61. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
