edu.arizona.sista.processors.bionlp
Shallow parsing; modifies the document in place
Discourse parsing; modifies the document in place
SRL; modifies the document in place
Lemmatization; modifies the document in place
Constructs a document of tokens from free text; includes sentence splitting and tokenization
Constructs a document of tokens from an array of untokenized sentences
Constructs a document of tokens from an array of tokenized sentences
Syntactic parsing; modifies the document in place
Hook to allow postprocessing of CoreNLP POS tagging *in place*, overwriting the original POS tags. This is useful for domain-specific corrections.
The CoreNLP annotation
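A minimal sketch of what an in-place tag-correction hook can look like. It does not use the CoreNLP Annotation object; `TaggedSentence`, `PosHookSketch`, `postprocessTags`, and the contents of the override table are all hypothetical names chosen for illustration.

```scala
// Hypothetical stand-in for a tagged sentence; not the library's own class.
final class TaggedSentence(val words: Array[String], val tags: Array[String])

object PosHookSketch {
  // Assumed correction table: lowercased word -> tag it should receive.
  val overrides: Map[String, String] =
    Map("il-2" -> "NN", "upregulates" -> "VBZ")

  // Overwrites tags in place; words without an override keep their tag.
  def postprocessTags(s: TaggedSentence): Unit = {
    var i = 0
    while (i < s.words.length) {
      overrides.get(s.words(i).toLowerCase).foreach(t => s.tags(i) = t)
      i += 1
    }
  }
}
```

Because the hook returns nothing and mutates the tag array, callers see the corrected tags through the same sentence object they passed in.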
Implements the bio-specific post-processing steps from McClosky et al. (2011)
Input CoreNLP sentence
The modified tokens
Removes Figure and Table references that appear within parentheses
The original input text
The preprocessed text
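One way such a preprocessing step could be written is with a single regular expression. This is a sketch under assumptions: the pattern below (a parenthesized span beginning with "Fig.", "Figure", or "Table") and the names `FigureRefSketch` and `removeFigureRefs` are illustrative, and the actual rules used by the processor may differ.

```scala
import scala.util.matching.Regex

object FigureRefSketch {
  // Assumed pattern: optional leading whitespace, then a parenthesized span
  // that starts with Fig., Figure, or Table (case-insensitive) and contains
  // no nested parentheses.
  private val ref: Regex =
    """(?i)\s*\((?:fig\.?|figure|table)\s[^()]*\)""".r

  // Returns a copy of the text with every matching span deleted.
  def removeFigureRefs(text: String): String = ref.replaceAllIn(text, "")
}
```

Note that deleting the span shifts the character offsets of everything after it; a variant that must preserve offsets could replace the span with an equal number of spaces instead.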
NER; modifies the document in place
Coreference resolution; modifies the document in place
Runs the bio-specific NER and returns an array of BIO (begin-inside-outside) labels for the sentence
Our own sentence, containing words, lemmas, and POS tags
An array of BIO labels
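To illustrate how a BIO label array is typically consumed, the sketch below groups labels into entity spans. It assumes the common `B-TYPE` / `I-TYPE` / `O` label convention; `BioSketch`, `Span`, `decode`, and the label types used in the usage note are hypothetical and not taken from the processor.

```scala
import scala.collection.mutable.ArrayBuffer

object BioSketch {
  // A labeled mention covering tokens [start, end).
  final case class Span(label: String, start: Int, end: Int)

  def decode(labels: Array[String]): Seq[Span] = {
    val spans = ArrayBuffer.empty[Span]
    var start = -1     // index where the open span began, or -1 if none
    var current = ""   // entity type of the open span

    // Emits the open span, if any, ending just before `end`.
    def close(end: Int): Unit = {
      if (start >= 0) spans += Span(current, start, end)
      start = -1
    }

    for ((l, i) <- labels.zipWithIndex) {
      val continues = l.startsWith("I-") && start >= 0 && l.drop(2) == current
      if (!continues) {
        close(i)
        // A B- label opens a span; a stray I- label is treated the same way.
        if (l.startsWith("B-") || l.startsWith("I-")) {
          start = i
          current = l.drop(2)
        }
      }
    }
    close(labels.length)
    spans.toSeq
  }
}
```

For example, `BioSketch.decode(Array("B-GENE", "I-GENE", "O", "B-CELL"))` yields one `GENE` span over tokens 0 and 1 and one `CELL` span over token 3.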
Part-of-speech tagging. This modifies the document in place, which is not too elegant, but there are two reasons for it: (a) some annotators (e.g., Stanford's CoreNLP) require some state (i.e., their Annotation object) to be passed between operations; (b) it is more efficient during annotate(), where all the possible operations are chained.
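The in-place design can be sketched with a toy pipeline in which each step writes its result into a shared document and a later step reads what an earlier one stored. Everything here is an assumption made for illustration: `Doc`, `InPlaceSketch`, the toy tagging and lemmatization rules, and the method names other than annotate() are hypothetical and do not reproduce the processor's code.

```scala
// Hypothetical minimal document; annotation fields start empty.
final class Doc(val words: Array[String]) {
  var tags: Option[Array[String]] = None
  var lemmas: Option[Array[String]] = None
}

object InPlaceSketch {
  // Toy tagger: words ending in "s" are plural nouns, everything else singular.
  def tagPartsOfSpeech(doc: Doc): Unit =
    doc.tags = Some(doc.words.map(w => if (w.endsWith("s")) "NNS" else "NN"))

  // Toy lemmatizer: relies on the tags stored by the previous step.
  def lemmatize(doc: Doc): Unit = {
    val tags = doc.tags.getOrElse(sys.error("run tagPartsOfSpeech first"))
    doc.lemmas = Some(doc.words.zip(tags).map {
      case (w, "NNS") => w.dropRight(1).toLowerCase
      case (w, _)     => w.toLowerCase
    })
  }

  // Chains the steps over one shared document, with no copying in between.
  def annotate(doc: Doc): Doc = {
    tagPartsOfSpeech(doc)
    lemmatize(doc)
    doc
  }
}
```

The trade-off is the one the description names: steps share state cheaply, but they must run in an order that satisfies their dependencies.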
A processor for biomedical texts, based on CoreNLP, but with different tokenization and NER. User: mihais. Date: 10/27/14.