Package com.cobber.fta.plugins.address
Class AddressEN
- Object
- LogicalType
- LogicalTypeCode
- LogicalTypeInfinite
- com.cobber.fta.plugins.address.AddressEN
- All Implemented Interfaces: LTRandom, Comparable<LogicalType>
public class AddressEN extends LogicalTypeInfinite
Plugin to detect an Address line (English-language only).
-
-
Field Summary
-
Fields inherited from class com.cobber.fta.LogicalType
analysisConfig, defn, locale, localeInfo, pluginLocaleEntry, priority, threshold
-
-
Constructor Summary
Constructor: AddressEN(PluginDefinition plugin)
Description: Construct a plugin to detect an Address based on the Plugin Definition.
-
Method Summary
- PluginAnalysis analyzeSet(AnalyzerContext context, long matchCount, long realSamples, String currentRegExp, Facts facts, FiniteMap cardinality, FiniteMap outliers, TokenStreams tokenStreams, AnalysisConfig analysisConfig)
  Given the data to date, as embodied by the arguments, return an analysis.
- FTAType getBaseType()
  The underlying type we are qualifying.
- double getConfidence(long matchCount, long realSamples, AnalyzerContext context)
  Confidence in the type classification.
- String getRegExp()
  The Regular Expression that most closely matches (see LogicalType.isRegExpComplete()) this Semantic Type.
- boolean initialize(AnalysisConfig analysisConfig)
  Called to perform any initialization.
- boolean isCandidate(String trimmed, StringBuilder compressed, int[] charCounts, int[] lastIndex)
  A fast check to see if the supplied String might be an instance of this Semantic type.
- boolean isValid(String input, boolean detectMode, long count)
  Is the supplied String an instance of this Semantic type?
- String nextRandom()
  nextRandom will generate a random (secure) valid example of this Semantic Type.
Methods inherited from class com.cobber.fta.LogicalTypeInfinite
isClosed, isRegExpComplete
-
Methods inherited from class com.cobber.fta.LogicalTypeCode
getRandom, seed
-
Methods inherited from class com.cobber.fta.LogicalType
acceptsBaseType, compareTo, getDescription, getHeaderConfidence, getPluginDefinition, getPriority, getSemanticType, getSignature, getThreshold, isLocaleSensitive, isValid, setThreshold
-
Constructor Detail
-
AddressEN
public AddressEN(PluginDefinition plugin)
Construct a plugin to detect an Address based on the Plugin Definition.
- Parameters:
plugin - The definition of this plugin.
-
-
Method Detail
-
nextRandom
public String nextRandom()
Description copied from interface: LTRandom
nextRandom will generate a random (secure) valid example of this Semantic Type.
- Returns: A new valid example of the Semantic Type.
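As a rough illustration of what "a random (secure) valid example" could look like for an address line, the sketch below draws from java.security.SecureRandom. It is not the AddressEN implementation: the class name, the word lists, and the output format are all hypothetical and chosen only to keep the example small.

```java
import java.security.SecureRandom;

// Hypothetical sketch: not part of the FTA library.
final class RandomAddressSketch {
    // Assumed, deliberately tiny word lists used only for this illustration.
    private static final String[] STREETS = {"Main", "Oak", "Elm", "Park"};
    private static final String[] TYPES = {"Street", "Road", "Avenue", "Lane"};

    private final SecureRandom random = new SecureRandom();

    /** Build "<number> <street name> <street type>" from securely generated random picks. */
    String nextRandom() {
        int number = 1 + random.nextInt(9999); // house number in the range 1..9999
        String street = STREETS[random.nextInt(STREETS.length)];
        String type = TYPES[random.nextInt(TYPES.length)];
        return number + " " + street + " " + type;
    }

    public static void main(String[] args) {
        // Output differs on every run because the source is SecureRandom.
        System.out.println(new RandomAddressSketch().nextRandom());
    }
}
```

Every string produced this way has the same three-part shape, so a generator like this is useful for producing test data that a matching validity check should accept.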
-
initialize
public boolean initialize(AnalysisConfig analysisConfig) throws FTAPluginException
Description copied from class: LogicalType
Called to perform any initialization.
- Overrides: initialize in class LogicalTypeCode
- Parameters:
analysisConfig - The Analysis configuration used for this analysis.
- Returns: True if initialization was successful.
- Throws:
FTAPluginException - Thrown when the plugin is incorrectly configured.
-
getRegExp
public String getRegExp()
Description copied from class: LogicalType
The Regular Expression that most closely matches (see LogicalType.isRegExpComplete()) this Semantic Type. Note: All valid instances will match this RE, but not every string that matches this RE is necessarily valid.
- Specified by: getRegExp in class LogicalType
- Returns: The Java Regular Expression that most closely matches this Semantic Type.
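The note about the Regular Expression being necessary but not sufficient can be made concrete with a small sketch. The pattern and the stricter validity check below are hypothetical stand-ins, not the ones AddressEN uses; they only show how a broad pattern can accept strings that a semantic check rejects.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical sketch: not part of the FTA library.
final class RegExpSupersetSketch {
    // Assumed broad pattern: any non-empty string (DOTALL so line breaks also match).
    static final Pattern BROAD = Pattern.compile(".+", Pattern.DOTALL);

    // Assumed, deliberately tiny list of street markers.
    static final Set<String> MARKERS =
            new HashSet<>(Arrays.asList("STREET", "ST", "ROAD", "RD", "AVENUE", "AVE"));

    static boolean matchesRegExp(String input) {
        return BROAD.matcher(input).matches();
    }

    /** Stricter hypothetical check: starts with a digit and contains a street marker word. */
    static boolean looksLikeAddress(String input) {
        String trimmed = input.trim();
        if (trimmed.isEmpty() || !Character.isDigit(trimmed.charAt(0))) {
            return false;
        }
        for (String word : trimmed.toUpperCase(Locale.ROOT).split("[^A-Z]+")) {
            if (MARKERS.contains(word)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"12 Main Street", "hello world"}) {
            System.out.println(s + ": regexp=" + matchesRegExp(s) + ", valid=" + looksLikeAddress(s));
        }
    }
}
```

In this sketch every string accepted by looksLikeAddress is non-empty and therefore matches the pattern, while "hello world" matches the pattern without being accepted.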
-
getBaseType
public FTAType getBaseType()
Description copied from class: LogicalType
The underlying type we are qualifying.
- Overrides: getBaseType in class LogicalType
- Returns: The underlying type, e.g. STRING, LONG, etc.
-
isValid
public boolean isValid(String input, boolean detectMode, long count)
Description copied from class: LogicalType
Is the supplied String an instance of this Semantic type?
- Specified by: isValid in class LogicalType
- Parameters:
input - String to check (trimmed for Numeric base Types, un-trimmed for String base Type).
detectMode - If true then we are in the process of detection, otherwise it is a simple validity check.
count - The number of instances of this sample.
- Returns: true if and only if the supplied String is an instance of this Semantic type.
-
isCandidate
public boolean isCandidate(String trimmed, StringBuilder compressed, int[] charCounts, int[] lastIndex)
Description copied from class: LogicalTypeInfinite
A fast check to see if the supplied String might be an instance of this Semantic type.
- Specified by: isCandidate in class LogicalTypeInfinite
- Parameters:
trimmed - String to check.
compressed - A compressed representation of the input string (e.g. \d{5} for 20351).
charCounts - An array of occurrence counts for characters in the input (ASCII-only).
lastIndex - An array of the last index at which each character is located (ASCII-only).
- Returns: true if and only if the supplied String is a possible instance of this Semantic type.
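To make the three derived arguments more tangible, the sketch below builds a compressed form, a per-character count array, and a last-index array for a sample string. It is not how the library computes them: the class name, the use of \p{IsAlphabetic} for letter runs, the array size of 128, and the -1 marker for absent characters are all assumptions made for illustration. Only the \d{5} example for 20351 comes from the description above.

```java
import java.util.Arrays;

// Hypothetical sketch: illustrates the shape of the isCandidate() arguments only.
final class CandidateInputsSketch {
    /** Collapse digit runs to \d{n} and letter runs to \p{IsAlphabetic}{n}; copy other characters as-is. */
    static String compress(String input) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            char ch = input.charAt(i);
            boolean digit = Character.isDigit(ch);
            boolean alpha = Character.isAlphabetic(ch);
            if (!digit && !alpha) {
                out.append(ch);
                i++;
                continue;
            }
            int start = i;
            // Consume the whole run of the same character class.
            while (i < input.length()
                    && (digit ? Character.isDigit(input.charAt(i))
                              : Character.isAlphabetic(input.charAt(i)))) {
                i++;
            }
            out.append(digit ? "\\d" : "\\p{IsAlphabetic}")
               .append('{').append(i - start).append('}');
        }
        return out.toString();
    }

    /** Occurrence count per ASCII character; non-ASCII characters are ignored. */
    static int[] charCounts(String input) {
        int[] counts = new int[128];
        for (int i = 0; i < input.length(); i++) {
            char ch = input.charAt(i);
            if (ch < 128) {
                counts[ch]++;
            }
        }
        return counts;
    }

    /** Last index per ASCII character, or -1 (an assumed marker) if the character is absent. */
    static int[] lastIndex(String input) {
        int[] last = new int[128];
        Arrays.fill(last, -1);
        for (int i = 0; i < input.length(); i++) {
            char ch = input.charAt(i);
            if (ch < 128) {
                last[ch] = i;
            }
        }
        return last;
    }

    public static void main(String[] args) {
        String sample = "12 Main St";
        System.out.println(compress(sample));
        System.out.println("spaces: " + charCounts(sample)[' ']);
        System.out.println("last space at: " + lastIndex(sample)[' ']);
    }
}
```

Precomputed summaries of this kind let a fast check reject obvious non-candidates, for example a string with no spaces or no digits, without scanning the input again.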
-
analyzeSet
public PluginAnalysis analyzeSet(AnalyzerContext context, long matchCount, long realSamples, String currentRegExp, Facts facts, FiniteMap cardinality, FiniteMap outliers, TokenStreams tokenStreams, AnalysisConfig analysisConfig)
Description copied from class: LogicalType
Given the data to date, as embodied by the arguments, return an analysis. If we think this is an instance of this Semantic type then valid will be true; if not, valid will be false and a new Pattern will be returned.
- Specified by: analyzeSet in class LogicalType
- Parameters:
context - The context used to interpret the Data Stream (for example, stream name, date resolution mode, etc.).
matchCount - Number of samples that match so far (as determined by isValid()).
realSamples - Number of real (i.e. non-blank and non-null) samples that we have processed so far.
currentRegExp - The current Regular Expression that we matched against.
facts - Facts (min, max, sum) for the analysis to date (optional, i.e. may be null).
cardinality - Cardinality set, up to the maximum maintained.
outliers - Outlier set, up to the maximum maintained.
tokenStreams - Shapes observed.
analysisConfig - The Configuration of the current analysis.
- Returns: Null if we think this is an instance of this Semantic type (backout pattern otherwise).
-
getConfidence
public double getConfidence(long matchCount, long realSamples, AnalyzerContext context)
Description copied from class: LogicalType
Confidence in the type classification. Typically this will be the number of matches divided by the number of real samples.
- Overrides: getConfidence in class LogicalType
- Parameters:
matchCount - Number of matches (as determined by isValid()).
realSamples - Number of samples observed; does not include either nulls or blanks.
context - Context we are operating under (includes data stream name(s)).
- Returns: Confidence as a percentage.
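The "matches divided by real samples" rule mentioned above can be sketched in isolation as follows. This is not the AddressEN override, which may also take the context (for example the stream name) into account; the class and method names are hypothetical, and the sketch returns the raw ratio between 0 and 1 without any further scaling.

```java
// Hypothetical sketch: not part of the FTA library.
final class ConfidenceSketch {
    /**
     * Ratio of matching samples to real (non-null, non-blank) samples.
     * Returns 0.0 when there are no real samples, to avoid dividing by zero.
     */
    static double confidence(long matchCount, long realSamples) {
        if (realSamples <= 0) {
            return 0.0;
        }
        return (double) matchCount / realSamples;
    }

    public static void main(String[] args) {
        // 1 of 2 real samples passed the validity check.
        System.out.println(confidence(1, 2));
    }
}
```

The guard for an empty sample set is a design choice of this sketch; the source does not say how the library treats that case.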
-
-