org.opensextant.extraction (Xponents Core API)

Extraction Fundamentals

Extraction fundamentals include TextEntity, a span in free text, and TextMatch, a TextEntity generated by an extractor, matcher, or rule. A span is defined by a character start offset and an end offset. A TextEntity provides basic span logic and arithmetic: testing whether one span is before, after, within, or overlapping another, and so on.
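
The span comparisons described above can be sketched in a few lines. This is a minimal illustration, not the TextEntity implementation: the class name Span and its method names are hypothetical, and it assumes half-open offsets, where start is inclusive and end is exclusive. The actual offset convention and method signatures should be checked against the TextEntity Javadoc.

```java
// Hypothetical illustration only: Span is NOT the Xponents TextEntity class.
// Assumption: offsets are half-open, i.e. the span covers [start, end).
final class Span {
    final int start; // offset of the first character in the span
    final int end;   // offset just past the last character in the span

    Span(int start, int end) {
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("invalid span: " + start + ", " + end);
        }
        this.start = start;
        this.end = end;
    }

    /** Number of characters covered by this span. */
    int length() {
        return end - start;
    }

    /** True if this span ends at or before the point where the other begins. */
    boolean isBefore(Span other) {
        return end <= other.start;
    }

    /** True if this span begins at or after the point where the other ends. */
    boolean isAfter(Span other) {
        return start >= other.end;
    }

    /** True if this span lies entirely inside the other (equal spans count). */
    boolean isWithin(Span other) {
        return other.start <= start && end <= other.end;
    }

    /** True if the two spans share at least one character position. */
    boolean overlaps(Span other) {
        return start < other.end && other.start < end;
    }
}
```

For example, in the text "Boston, MA" a match on "Boston" would be new Span(0, 6) and a match on "MA" would be new Span(8, 10). Under the assumptions above, the first is before the second, and both are within new Span(0, 10), the span of the full phrase.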

Beyond that, the extraction helpers here provide specific Solr tagger support, match filtering, match navigation, and match metrics.

Interface Summary
Interface	Description
Extractor	For now, this interface is closer to an AbstractExtractor, where a clean interface might be output = Extractor.extract(input). This interface specifies more

Class Summary
Class	Description
ExtractionMetrics	This is a holder for tracking various common measures: No.
ExtractionResult
MatcherUtils
MatchFilter	The Class MatchFilter.
TextEntity	A very simple struct to hold data useful for post-processing entities once found.
TextMatch	A variation on TextEntity that also records pattern metadata.

Exception Summary
Exception	Description
ExtractionException	An exception to be thrown when place name matching goes awry.
NormalizationException
