Class XTemporal
java.lang.Object
  org.opensextant.extractors.flexpat.AbstractFlexPat
    org.opensextant.extractors.xtemporal.XTemporal
All Implemented Interfaces:
Extractor
public class XTemporal extends AbstractFlexPat
Date/time pattern extractor: detects, parses, and normalizes dates. Found dates/times are returned as DateMatch (TextMatch) objects.
Author:
ubaldino
-
Field Summary
Fields (modifier and type, field name, then description):
static java.lang.String DEFAULT_XTEMP_CFG
    The Constant DEFAULT_XTEMP_CFG.
static int JAVA_0_DATE_YEAR
    The Constant JAVA_0_DATE_YEAR.
static long ONE_YEAR_MS
    The Constant ONE_YEAR_MS.
static java.util.Date TODAY
    Application constants -- note the notion of TODAY is relative to the caller's notion of TODAY.
static long TODAY_EPOCH
    The today epoch.
-
Fields inherited from class org.opensextant.extractors.flexpat.AbstractFlexPat
debug, log, match_width, patterns, patterns_file
-
-
Method Summary
Methods (modifier and type, method signature, then description):
void cleanup()
    Extractor interface: extractors are responsible for cleaning up after themselves.
protected RegexPatternManager createPatternManager(java.io.InputStream strm, java.lang.String name)
    Create a pattern manager given the input stream and the file name.
java.util.List<TextMatch> extract(java.lang.String input_buf)
    Support the standard Extractor interface.
java.util.List<TextMatch> extract(TextInput input)
    Support the standard Extractor interface.
TextMatchResult extract_dates(java.lang.String text, java.lang.String text_id)
    A direct call to extract dates, which is useful for diagnostics and development/testing.
java.lang.String getName()
    Extractor interface: getName.
boolean isDistantPast(long epoch)
    Checks whether the epoch is in the distant past.
boolean isDistantPast(java.util.Date dt)
    Checks whether the date is in the distant past.
boolean isDistantPastYMD(java.util.Date dt)
    Checks whether a date is too far in the past to likely be a date of the format YYYY-MM-DD.
boolean isFuture(long epoch)
    Given the set MAX_DATE_CUTOFF_YEAR, determine if the date epoch is earlier than this.
boolean isFuture(java.util.Date dt)
    Checks whether the date is in the future.
void match_DateTime(boolean flag)
    Enable date/time patterns.
void match_DayMonYear(boolean flag)
    Enable day-month-year patterns.
void match_MonDayYear(boolean flag)
    Enable month-day-year patterns.
void setDistantPastYear(int y)
    Application thresholds, chosen by the user.
void setToday(java.util.Date d)
    Optionally reset your context...
-
Methods inherited from class org.opensextant.extractors.flexpat.AbstractFlexPat
configure, configure, configure, configure, disableAll, enableAll, getPatternManager, markComplete, set_match_id, setMatchWidth, updateProgress
-
Field Detail
-
DEFAULT_XTEMP_CFG
public static final java.lang.String DEFAULT_XTEMP_CFG
The Constant DEFAULT_XTEMP_CFG.
See Also:
Constant Field Values
-
TODAY
public static java.util.Date TODAY
Application constant. The notion of TODAY is relative to the caller's notion of today: if you are processing data from the past but know what TODAY was for that data, then dates found on either side of it are treated as relative PAST and relative FUTURE.
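The idea of a caller-defined TODAY can be illustrated with a small, self-contained sketch. This is not the XTemporal source: the class RelativeToday and the rule "strictly later than the reference point means relative FUTURE" are assumptions made for illustration only, and the sketch uses nothing beyond java.util.Date.

```java
import java.util.Date;

/**
 * Hypothetical stand-in for a caller-defined "today".
 * Not the XTemporal implementation; the comparison rule is an assumption.
 */
public class RelativeToday {
    private long todayEpoch;

    public RelativeToday(Date today) {
        this.todayEpoch = today.getTime();
    }

    /** Move the reference point, in the spirit of setToday(Date). */
    public void setToday(Date d) {
        this.todayEpoch = d.getTime();
    }

    /** Assumed rule: a date strictly after the reference point is relative FUTURE. */
    public boolean isFuture(Date dt) {
        return dt.getTime() > todayEpoch;
    }

    public static void main(String[] args) {
        Date found = new Date(946684800000L);                            // 2000-01-01T00:00:00Z
        RelativeToday ctx = new RelativeToday(new Date(1577836800000L)); // 2020-01-01T00:00:00Z
        System.out.println(ctx.isFuture(found)); // the found date is relative PAST

        ctx.setToday(new Date(631152000000L));                           // 1990-01-01T00:00:00Z
        System.out.println(ctx.isFuture(found)); // the same date is now relative FUTURE
    }
}
```

The same found date flips from PAST to FUTURE only because the reference point moved, which is why a caller processing archival text would likely want to set TODAY to match the data.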
-
TODAY_EPOCH
public static long TODAY_EPOCH
The today epoch.
-
JAVA_0_DATE_YEAR
public static final int JAVA_0_DATE_YEAR
The Constant JAVA_0_DATE_YEAR.
See Also:
Constant Field Values
-
ONE_YEAR_MS
public static final long ONE_YEAR_MS
The Constant ONE_YEAR_MS.
See Also:
Constant Field Values
-
-
Method Detail
-
getName
public java.lang.String getName()
Extractor interface: getName.
Returns:
extractor name
-
cleanup
public void cleanup()
Extractor interface: extractors are responsible for cleaning up after themselves.
-
createPatternManager
protected RegexPatternManager createPatternManager(java.io.InputStream strm, java.lang.String name) throws java.io.IOException
Description copied from class: AbstractFlexPat
Create a pattern manager given the input stream and the file name.
Specified by:
createPatternManager in class AbstractFlexPat
Parameters:
strm - stream of patterns config file
name - app name
Returns:
the regex pattern manager
Throws:
java.io.IOException - Signals that an I/O exception has occurred.
-
extract
public java.util.List<TextMatch> extract(TextInput input)
Support the standard Extractor interface. This provides access to the most common extraction.
Parameters:
input - text
Returns:
list of TextMatch
-
extract
public java.util.List<TextMatch> extract(java.lang.String input_buf)
Support the standard Extractor interface. This provides access to the most common extraction.
Parameters:
input_buf - text
Returns:
list of TextMatch
-
extract_dates
public TextMatchResult extract_dates(java.lang.String text, java.lang.String text_id)
A direct call to extract dates, which is useful for diagnostics and development/testing.
Parameters:
text - text
text_id - text ID
Returns:
TextMatchResult, a wrapper around a list of TextMatch
-
match_DateTime
public void match_DateTime(boolean flag)
Enable date/time patterns.
Parameters:
flag - true if enabling date/time matching
-
match_MonDayYear
public void match_MonDayYear(boolean flag)
Enable month-day-year patterns.
Parameters:
flag - true if enabling MonthDayYear family
-
match_DayMonYear
public void match_DayMonYear(boolean flag)
Enable day-month-year patterns.
Parameters:
flag - true if enabling DayMonthYear family
-
setToday
public void setToday(java.util.Date d)
Optionally reset your context... what is TODAY with respect to your data?
Parameters:
d - date
-
setDistantPastYear
public void setDistantPastYear(int y)
Application thresholds, chosen by the user.
Parameters:
y - 4-digit year
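One plausible way a 4-digit year threshold could translate into an epoch cutoff is sketched below. This is an illustration under stated assumptions, not the library's code: the class DistantPastThreshold, the constants EPOCH_YEAR and YEAR_MS, and the 365-day year approximation are all hypothetical; the actual values of JAVA_0_DATE_YEAR and ONE_YEAR_MS are listed under Constant Field Values.

```java
/**
 * Hypothetical sketch of a user-chosen "distant past" threshold.
 * Constant values and the comparison rule are assumptions, not XTemporal's.
 */
public class DistantPastThreshold {
    // Assumed reference year of the Java epoch.
    static final int EPOCH_YEAR = 1970;
    // Assumed approximation of one year in milliseconds (ignores leap days).
    static final long YEAR_MS = 365L * 24 * 60 * 60 * 1000;

    private long distantPastEpoch;

    /** Convert a 4-digit year into an approximate epoch cutoff in milliseconds. */
    public void setDistantPastYear(int year) {
        distantPastEpoch = (long) (year - EPOCH_YEAR) * YEAR_MS;
    }

    /** Assumed rule: anything earlier than the cutoff counts as distant past. */
    public boolean isDistantPast(long epoch) {
        return epoch < distantPastEpoch;
    }

    public static void main(String[] args) {
        DistantPastThreshold t = new DistantPastThreshold();
        t.setDistantPastYear(1950);
        System.out.println(t.isDistantPast(-2208988800000L)); // 1900-01-01T00:00:00Z
        System.out.println(t.isDistantPast(946684800000L));   // 2000-01-01T00:00:00Z
    }
}
```

Because the cutoff is approximate, dates within a few days of the threshold year boundary may fall on either side; for a coarse plausibility filter that imprecision is usually acceptable.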
-
isFuture
public boolean isFuture(long epoch)
Given the set MAX_DATE_CUTOFF_YEAR, determine if the date epoch is earlier than this.
Parameters:
epoch - epoch since 1970-01-01
Returns:
true if the epoch is in the future
-
isFuture
public boolean isFuture(java.util.Date dt)
Checks whether the date is in the future.
Parameters:
dt - the date
Returns:
true if the date is in the future
-
isDistantPast
public boolean isDistantPast(long epoch)
Checks whether the epoch is in the distant past.
Parameters:
epoch - epoch
Returns:
true if past DISTANT_PAST_THRESHOLD
-
isDistantPast
public boolean isDistantPast(java.util.Date dt)
Checks whether the date is in the distant past.
Parameters:
dt - date
Returns:
true if the date is in the distant past
-
isDistantPastYMD
public boolean isDistantPastYMD(java.util.Date dt)
Checks whether a date is too far in the past to likely be a date of the format YYYY-MM-DD.
Parameters:
dt - date
Returns:
true if date is distant
-
-