Object
- TextAnalyzer

```
public class TextAnalyzer
extends Object
```
Analyze Text data to determine type information and other key metrics associated with a text stream. A key objective of the analysis is that it should be sufficiently fast to be in-line (i.e. as the data is input from some source it should be possible to stream the data through this class without undue performance degradation).
Typical usage is:
```
TextAnalyzer analysis = new TextAnalyzer("Age");

analysis.train("12");
analysis.train("62");
analysis.train("21");
analysis.train("37");
...

TextAnalysisResult result = analysis.getResult();
```

Nested Class Summary

Nested Classes
Modifier and Type Class Description

static class TextAnalyzer.Feature
Enumeration that defines all on/off features for parsers.

Field Summary

Fields
Modifier and Type Field Description

protected static int REFLECTION_SAMPLES

Constructor Summary

Constructors
Constructor	Description
`TextAnalyzer()`	Construct an anonymous Text Analyzer for a data stream.
`TextAnalyzer(AnalyzerContext context)`	Construct a Text Analyzer using the supplied context.
`TextAnalyzer(String name)`	Construct a Text Analyzer for the named data stream.
`TextAnalyzer(String name, com.cobber.fta.dates.DateTimeParser.DateResolutionMode resolutionMode)`	Construct a Text Analyzer for the named data stream with the supplied DateResolutionMode.

Method Summary

All Methods Static Methods Instance Methods Concrete Methods
Modifier and Type	Method	Description
`void`	`configure(TextAnalyzer.Feature feature, boolean state)`	Method for changing state of an on/off feature for this TextAnalyzer.
`static TextAnalyzer`	`deserialize(String serialized)`	Create a new TextAnalyzer from a serialized representation - used in concert with `serialize()` and `merge(TextAnalyzer, TextAnalyzer)` to merge TextAnalyzers run on separate shards into a single TextAnalyzer and hence a single TextAnalysisResult.
`protected static int`	`distanceLevenshtein(String source, Set<String> universe)`	Calculate the Levenshtein distance of the source string from the 'closest' string from the provided universe.
`boolean`	`equals(Object obj)`
`boolean`	`equals(Object obj, double epsilon)`
`AnalysisConfig`	`getConfig()`	Get the configuration associated with this TextAnalyzer.
`AnalyzerContext`	`getContext()`	Get the context supplied to the TextAnalyzer.
`int`	`getDetectWindow()`	Get the size of the Detect Window (i.e. the number of samples collected before attempting to determine the type).
`protected Facts`	`getFacts()`
`int`	`getHistogramBins()`	Gets the number of bins to use for the underlying approximation used to hold the Histogram once maxCardinality is exceeded.
`int`	`getMaxCardinality()`	Get the maximum cardinality that will be tracked.
`int`	`getMaxInputLength()`	Gets the current maximum input length for sampling.
`int`	`getMaxInvalids()`	Get the maximum number of invalid entries that will be tracked.
`int`	`getMaxOutliers()`	Get the maximum number of outliers that will be tracked.
`Plugins`	`getPlugins()`
`int`	`getPluginThreshold()`	Get the current detection Threshold for Semantic Type plugins.
`double`	`getQuantileRelativeAccuracy()`	Gets the relative-error guarantee for quantiles.
`int`	`getReflectionSampleSize()`	Get the number of Samples required before we will 'reflect' on the analysis and potentially change determination.
`protected String`	`getRegExp(KnownTypes.ID id)`
`TextAnalysisResult`	`getResult()`	Determine the result of the training complete to date.
`String`	`getStreamName()`	Get the name of the Data Stream.
`int`	`getThreshold()`	Get the current detection Threshold.
`String`	`getTraceFilePath()`	Return the full path to the trace file, or null if no tracing configured.
`List<String>`	`getTrainingSet()`	Access the training set - this will typically be the first `AnalysisConfig.DETECT_WINDOW_DEFAULT` records.
`boolean`	`isEnabled(TextAnalyzer.Feature feature)`	Method for checking whether given TextAnalyzer feature is enabled.
`protected boolean`	`isNullEquivalent(String input)`
`static TextAnalyzer`	`merge(TextAnalyzer first, TextAnalyzer second)`	Create a new TextAnalyzer which is the result of merging two separate TextAnalyzers.
`protected TextAnalysisResult`	`reAnalyze(Map<String,Long> details)`
`void`	`registerDefaultPlugins(AnalysisConfig analysisConfig)`	Register the default set of plugins for Semantic Type detection.
`String`	`serialize()`	Serialize a TextAnalyzer - commonly used in concert with `deserialize(String)` and `merge(TextAnalyzer, TextAnalyzer)` to merge TextAnalyzers run on separate shards into a single TextAnalyzer and hence a single TextAnalysisResult.
`protected void`	`setConfig(AnalysisConfig analysisConfig)`	Set the configuration associated with this TextAnalyzer.
`protected void`	`setContext(AnalyzerContext context)`	Set the context supplied to the TextAnalyzer.
`void`	`setDebug(int debug)`	Internal Only.
`int`	`setDetectWindow(int detectWindow)`	Set the size of the Detect Window (that is, number of samples) to collect before attempting to determine the type.
`void`	`setDistinctCount(long distinctCount)`	Set the Distinct Count - commonly used where there is an external source that has visibility into the entire data set and 'knows' the distinct count of the set as a whole.
`protected void`	`setExternalFacts(Facts.ExternalFacts externalFacts)`
`int`	`setHistogramBins(int histogramBins)`	Sets the number of bins to use for the underlying approximation used to hold the Histogram once maxCardinality is exceeded.
`void`	`setKeyConfidence(double keyConfidence)`	Set the Key Confidence - typically used where there is an external source that indicated definitively that this is a key.
`void`	`setLocale(Locale locale)`	Override the default Locale.
`int`	`setMaxCardinality(int newCardinality)`	Set the maximum cardinality that will be tracked.
`int`	`setMaxInputLength(int maxInputLength)`	Sets the maximum input length for sampling.
`int`	`setMaxInvalids(int newMaxInvalids)`	Set the maximum number of invalid entries that will be tracked.
`int`	`setMaxOutliers(int newMaxOutliers)`	Set the maximum number of outliers that will be tracked.
`void`	`setPluginThreshold(int threshold)`	The percentage when we declare success 0 - 100 for Semantic Type plugins.
`double`	`setQuantileRelativeAccuracy(double quantileRelativeAccuracy)`	Sets the relative-error guarantee for quantiles.
`void`	`setThreshold(int threshold)`	The percentage when we declare success 0 - 100.
`void`	`setTotalBlankCount(long totalBlankCount)`	Set the count of all blank elements in the entire data stream.
`void`	`setTotalCount(long totalCount)`	Set the total number of elements in the Data Stream.
`void`	`setTotalMaxLength(int totalMaxLength)`	Set the maximum length for Numeric, Boolean and String across the entire data stream.
`void`	`setTotalMaxValue(String totalMaxValue)`	Set the maximum value for Numeric, Boolean and String across the entire data stream.
`void`	`setTotalMean(Double totalMean)`	Set the mean for Numeric types (Long, Double) across the entire data stream.
`void`	`setTotalMinLength(int totalMinLength)`	Set the minimum length for Numeric, Boolean and String across the entire data stream.
`void`	`setTotalMinValue(String totalMinValue)`	Set the minimum value for Numeric, Boolean and String types across the entire data stream.
`void`	`setTotalNullCount(long totalNullCount)`	Set the count of all null elements in the entire data stream.
`void`	`setTotalStandardDeviation(Double totalStandardDeviation)`	Set the Standard Deviation for Numeric types (Long, Double) across the entire data stream (if known).
`void`	`setTrace(String traceOptions)`	Set tracing options.
`void`	`setUniqueness(double uniqueness)`	Set the Uniqueness - typically used where there is an external source that has visibility into the entire data set and 'knows' the uniqueness of the set as a whole.
`boolean`	`train(String rawInput)`	Train is the streaming entry point used to supply input to the Text Analyzer.
`void`	`trainBulk(Map<String,Long> observed)`	TrainBulk is the core bulk entry point used to supply input to the Text Analyzer.

Methods inherited from class java.lang.Object
clone, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - REFLECTION_SAMPLES
```
protected static final int REFLECTION_SAMPLES
```
    See Also:
    
    Constant Field Values
- Constructor Detail
  - TextAnalyzer
```
public TextAnalyzer(AnalyzerContext context)
```
    Construct a Text Analyzer using the supplied context.
    
    Parameters:
    
    context - The context used to interpret the stream.
  - TextAnalyzer
```
public TextAnalyzer(String name)
```
    Construct a Text Analyzer for the named data stream.
    Note: The DateResolutionMode mode will be 'None'.
    
    Parameters:
    
    name - The name of the data stream (e.g. the column of the CSV file)
  - TextAnalyzer
```
public TextAnalyzer()
```
    Construct an anonymous Text Analyzer for a data stream.
    Note: The DateResolutionMode mode will be 'None'.
  - TextAnalyzer
```
public TextAnalyzer(String name,
                    com.cobber.fta.dates.DateTimeParser.DateResolutionMode resolutionMode)
```
    Construct a Text Analyzer for the named data stream with the supplied DateResolutionMode.
    
    Parameters:
    
    name - The name of the data stream (e.g. the column of the CSV file)
    
    resolutionMode - Determines what to do when the Date field is ambiguous (i.e. we cannot determine which of the fields is the day or the month). If resolutionMode is DayFirst, then assume the day is first; if it is MonthFirst, then assume the month is first; if it is Auto, then choose either DayFirst or MonthFirst based on the locale; if it is None, then the pattern returned will contain '?' to represent any ambiguity present.
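    To illustrate the ambiguity that resolutionMode addresses, the sketch below parses the same input as day-first and as month-first using only java.time. It is not FTA's resolution logic; the class and method names are hypothetical and exist only for this illustration.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of FTA: shows why "01/02/2003" is ambiguous.
final class AmbiguousDateSketch {
    // Interpret the first field as the day of the month.
    static LocalDate parseDayFirst(String input) {
        return LocalDate.parse(input, DateTimeFormatter.ofPattern("dd/MM/yyyy"));
    }

    // Interpret the first field as the month.
    static LocalDate parseMonthFirst(String input) {
        return LocalDate.parse(input, DateTimeFormatter.ofPattern("MM/dd/yyyy"));
    }

    public static void main(String[] args) {
        String input = "01/02/2003";
        System.out.println(parseDayFirst(input));   // prints 2003-02-01
        System.out.println(parseMonthFirst(input)); // prints 2003-01-02
    }
}
```

    Both interpretations are valid for this input, which is the case where the choice of DayFirst, MonthFirst, Auto or None matters.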
- Method Detail
  - configure
```
public void configure(TextAnalyzer.Feature feature,
                      boolean state)
```
    Method for changing state of an on/off feature for this TextAnalyzer.
    
    Parameters:
    
    feature - The feature to be set.
    
    state - The new state of the feature.
  - isEnabled
```
public boolean isEnabled(TextAnalyzer.Feature feature)
```
    Method for checking whether given TextAnalyzer feature is enabled.
    
    Parameters:
    
    feature - The feature to be tested.
    
    Returns:
    
    Whether the identified feature is enabled.
  - getStreamName
```
public String getStreamName()
```
    Get the name of the Data Stream.
    
    Returns:
    
    The name of the Data Stream.
  - getContext
```
public AnalyzerContext getContext()
```
    Get the context supplied to the TextAnalyzer.
    
    Returns:
    
    The AnalyzerContext of the TextAnalyzer.
  - setContext
```
protected void setContext(AnalyzerContext context)
```
    Set the context supplied to the TextAnalyzer.
    
    Parameters:
    
    context - The Context for this analysis.
  - getConfig
```
public AnalysisConfig getConfig()
```
    Get the configuration associated with this TextAnalyzer.
    
    Returns:
    
    The AnalysisConfig of the TextAnalyzer.
  - setConfig
```
protected void setConfig(AnalysisConfig analysisConfig)
```
    Set the configuration associated with this TextAnalyzer. Note: Internal only.
    
    Parameters:
    
    analysisConfig - The replacement AnalysisConfig
  - setDebug
```
public void setDebug(int debug)
```
    Internal Only. Enable internal debugging.
    
    Parameters:
    
    debug - The debug level.
  - setTrace
```
public void setTrace(String traceOptions)
```
    Set tracing options. General form of options is <attribute1>=<value1>,<attribute2>=<value2> ... Supported attributes are:
    
    enabled=true/false,
    stream=<name of stream> (defaults to all)
    directory=<directory for trace file> (defaults to java.io.tmpdir)
    samples=<# samples to trace> (defaults to 1000)
    Parameters:
    
    traceOptions - The trace options.
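    As a minimal sketch, the options string can be assembled from attribute/value pairs as shown below. The helper class is hypothetical (it is not part of FTA), and the stream name and sample count are illustrative values only.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not part of FTA: builds "<attribute>=<value>,..." strings.
final class TraceOptionsSketch {
    // Join each entry as attribute=value, separated by commas.
    static String build(Map<String, String> options) {
        return options.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the attributes in insertion order.
        Map<String, String> options = new LinkedHashMap<>();
        options.put("enabled", "true");
        options.put("stream", "Age");
        options.put("samples", "500");
        // The resulting string is in the form accepted by setTrace(String).
        System.out.println(build(options)); // prints enabled=true,stream=Age,samples=500
    }
}
```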
  - setThreshold
```
public void setThreshold(int threshold)
```
    The percentage when we declare success 0 - 100. Typically this should not be adjusted. To run in Strict mode, set this to 100.
    
    Parameters:
    
    threshold - The new threshold for detection.
  - getThreshold
```
public int getThreshold()
```
    Get the current detection Threshold.
    
    Returns:
    
    The current threshold.
  - setPluginThreshold
```
public void setPluginThreshold(int threshold)
```
    The percentage when we declare success 0 - 100 for Semantic Type plugins. Typically this should not be adjusted. To run in Strict mode, set this to 100.
    
    Parameters:
    
    threshold - The new threshold used for detection.
  - getPluginThreshold
```
public int getPluginThreshold()
```
    Get the current detection Threshold for Semantic Type plugins. If not set, this will return -1, which means that each plugin is using its own default threshold.
    
    Returns:
    
    The current threshold.
  - setLocale
```
public void setLocale(Locale locale)
```
    Override the default Locale.
    Note: There is no support for Locales that do not use the Gregorian Calendar.
    
    Parameters:
    
    locale - The new Locale used to determine separators in numbers, date processing, default plugins, etc.
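    The following sketch shows how number separators differ by Locale, using only the JDK's DecimalFormatSymbols. It illustrates why the Locale matters for numeric input and does not show how FTA itself consults the Locale; the class name is hypothetical.

```java
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical helper, not part of FTA: looks up the separators for a Locale.
final class LocaleSeparatorsSketch {
    static char decimalSeparator(Locale locale) {
        return DecimalFormatSymbols.getInstance(locale).getDecimalSeparator();
    }

    static char groupingSeparator(Locale locale) {
        return DecimalFormatSymbols.getInstance(locale).getGroupingSeparator();
    }

    public static void main(String[] args) {
        // "1,234.5" in the US is written "1.234,5" in Germany.
        System.out.println(decimalSeparator(Locale.US));       // prints .
        System.out.println(groupingSeparator(Locale.US));      // prints ,
        System.out.println(decimalSeparator(Locale.GERMANY));  // prints ,
        System.out.println(groupingSeparator(Locale.GERMANY)); // prints .
    }
}
```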
  - setDetectWindow
```
public int setDetectWindow(int detectWindow)
```
    Set the size of the Detect Window (that is, number of samples) to collect before attempting to determine the type. Default is AnalysisConfig.DETECT_WINDOW_DEFAULT.
    Note: It is not possible to change the Sample Size once training has started.
    
    Parameters:
    
    detectWindow - The number of samples to collect
    
    Returns:
    
    The previous value of this parameter.
  - getDetectWindow
```
public int getDetectWindow()
```
    Get the size of the Detect Window (i.e. the number of samples collected before attempting to determine the type).
    
    Returns:
    
    The current size of the Detect Window.
  - getReflectionSampleSize
```
public int getReflectionSampleSize()
```
    Get the number of Samples required before we will 'reflect' on the analysis and potentially change determination.
    
    Returns:
    
    The current size of the reflection window.
  - setMaxCardinality
```
public int setMaxCardinality(int newCardinality)
```
    Set the maximum cardinality that will be tracked. Default is AnalysisConfig.MAX_CARDINALITY_DEFAULT.
    Note:
    
    The Cardinality must be larger than the Cardinality of the largest Finite Semantic type (if Semantic Type detection is enabled - see configure(Feature, boolean)).
    
    It is not possible to change the cardinality once training has started.
    Parameters:
    
    newCardinality - The maximum Cardinality that will be tracked (0 implies no tracking)
    
    Returns:
    
    The previous value of this parameter.
  - getMaxCardinality
```
public int getMaxCardinality()
```
    Get the maximum cardinality that will be tracked. See setMaxCardinality() method.
    
    Returns:
    
    The maximum cardinality.
  - setMaxOutliers
```
public int setMaxOutliers(int newMaxOutliers)
```
    Set the maximum number of outliers that will be tracked. Default is AnalysisConfig.MAX_OUTLIERS_DEFAULT.
    Note: It is not possible to change the outlier count once training has started.
    
    Parameters:
    
    newMaxOutliers - The maximum number of outliers that will be tracked (0 implies no tracking)
    
    Returns:
    
    The previous value of this parameter.
  - getMaxOutliers
```
public int getMaxOutliers()
```
    Get the maximum number of outliers that will be tracked. See setMaxOutliers() method.
    
    Returns:
    
    The maximum number of outliers to track.
  - setMaxInvalids
```
public int setMaxInvalids(int newMaxInvalids)
```
    Set the maximum number of invalid entries that will be tracked. Default is AnalysisConfig.MAX_INVALID_DEFAULT.
    Note: It is not possible to change the invalid count once training has started.
    
    Parameters:
    
    newMaxInvalids - The maximum number of invalid entries that will be tracked (0 implies no tracking)
    
    Returns:
    
    The previous value of this parameter.
  - getMaxInvalids
```
public int getMaxInvalids()
```
    Get the maximum number of invalid entries that will be tracked. See setMaxInvalids() method.
    
    Returns:
    
    The maximum number of invalid entries to track.
  - setKeyConfidence
```
public void setKeyConfidence(double keyConfidence)
```
    Set the Key Confidence - typically used where there is an external source that indicated definitively that this is a key.
    
    Parameters:
    
    keyConfidence - The new keyConfidence
  - setUniqueness
```
public void setUniqueness(double uniqueness)
```
    Set the Uniqueness - typically used where there is an external source that has visibility into the entire data set and 'knows' the uniqueness of the set as a whole.
    
    Parameters:
    
    uniqueness - The new Uniqueness
  - setDistinctCount
```
public void setDistinctCount(long distinctCount)
```
    Set the Distinct Count - commonly used where there is an external source that has visibility into the entire data set and 'knows' the distinct count of the set as a whole. If determined by FTA it will typically indicate that the distinct count is less than the maximum cardinality being tracked.
    
    Parameters:
    
    distinctCount - The new Distinct Count
  - setTotalCount
```
public void setTotalCount(long totalCount)
```
    Set the total number of elements in the Data Stream. Only used when there is an external source that has visibility into the entire data stream.
    
    Parameters:
    
    totalCount - The total number of elements, as opposed to the number sampled.
  - setTotalNullCount
```
public void setTotalNullCount(long totalNullCount)
```
    Set the count of all null elements in the entire data stream. Only used when there is an external source that has visibility into the entire data stream.
    
    Parameters:
    
    totalNullCount - The total number of null elements, as opposed to the number of nulls in the sample set.
  - setTotalBlankCount
```
public void setTotalBlankCount(long totalBlankCount)
```
    Set the count of all blank elements in the entire data stream. Only used when there is an external source that has visibility into the entire data stream.
    
    Parameters:
    
    totalBlankCount - The total number of blank elements, as opposed to the number of blanks in the sample set.
  - setTotalMean
```
public void setTotalMean(Double totalMean)
```
    Set the mean for Numeric types (Long, Double) across the entire data stream. Only used when there is an external source that has visibility into the entire data stream.
    
    Parameters:
    
    totalMean - The mean of all elements in the data stream, as opposed to the mean of the sampled set.
  - setTotalStandardDeviation
```
public void setTotalStandardDeviation(Double totalStandardDeviation)
```
    Set the Standard Deviation for Numeric types (Long, Double) across the entire data stream (if known). Only used when there is an external source that has visibility into the entire data stream.
    
    Parameters:
    
    totalStandardDeviation - The Standard Deviation of all elements in the data stream, as opposed to the Standard Deviation of the sampled set.
  - setTotalMinValue
```
public void setTotalMinValue(String totalMinValue)
```
    Set the minimum value for Numeric, Boolean and String types across the entire data stream. Only used when there is an external source that has visibility into the entire data stream.
    
    Parameters:
    
    totalMinValue - The minimum value of all elements in the data stream, as opposed to the minimum of the sampled set.
  - setTotalMaxValue
```
public void setTotalMaxValue(String totalMaxValue)
```
    Set the maximum value for Numeric, Boolean and String across the entire data stream. Only used when there is an external source that has visibility into the entire data stream.
    
    Parameters:
    
    totalMaxValue - The maximum value of all elements in the data stream, as opposed to the maximum of the sampled set.
  - setTotalMinLength
```
public void setTotalMinLength(int totalMinLength)
```
    Set the minimum length for Numeric, Boolean and String across the entire data stream. Only used when there is an external source that has visibility into the entire data stream. Note: For String and Boolean types this length includes any whitespace.
    
    Parameters:
    
    totalMinLength - The minimum length of all elements in the data stream, as opposed to the minimum length of the sampled set.
  - setTotalMaxLength
```
public void setTotalMaxLength(int totalMaxLength)
```
    Set the maximum length for Numeric, Boolean and String across the entire data stream. Only used when there is an external source that has visibility into the entire data stream. Note: For String and Boolean types this length includes any whitespace.
    
    Parameters:
    
    totalMaxLength - The maximum length of all elements in the data stream, as opposed to the maximum length of the sampled set.
  - setMaxInputLength
```
public int setMaxInputLength(int maxInputLength)
```
    Sets the maximum input length for sampling. Default is AnalysisConfig.MAX_INPUT_LENGTH_DEFAULT.
    
    Parameters:
    
    maxInputLength - The maximum length of samples, any samples longer than this will be truncated to this length.
    
    Returns:
    
    The previous value of this parameter.
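    The truncation described for maxInputLength can be pictured with the sketch below. It is an illustration of the stated behavior only, not FTA's code; the helper name is hypothetical, and lengths are counted in UTF-16 code units as with String.length().

```java
// Hypothetical helper, not part of FTA: truncates a sample to a maximum length.
final class TruncateSketch {
    // Return the input unchanged if it fits, otherwise its first maxInputLength chars.
    static String truncate(String input, int maxInputLength) {
        return input.length() > maxInputLength
                ? input.substring(0, maxInputLength)
                : input;
    }

    public static void main(String[] args) {
        System.out.println(truncate("Hello, World", 5)); // prints Hello
        System.out.println(truncate("Hi", 5));           // prints Hi
    }
}
```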
  - getQuantileRelativeAccuracy
```
public double getQuantileRelativeAccuracy()
```
    Gets the relative-error guarantee for quantiles.
    
    Returns:
    
    The relative-error guarantee for quantiles (relevant only if cardinality > maxCardinality).
  - setQuantileRelativeAccuracy
```
public double setQuantileRelativeAccuracy(double quantileRelativeAccuracy)
```
    Sets the relative-error guarantee for quantiles. Default is AnalysisConfig.QUANTILE_RELATIVE_ACCURACY_DEFAULT.
    
    Parameters:
    
    quantileRelativeAccuracy - The relative-error guarantee desired for quantile determination, note smaller values require more memory!
    
    Returns:
    
    The previous value of this parameter.
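    A relative-error guarantee is commonly read as: an estimated quantile value differs from the true value by at most the given fraction of the true value. The sketch below encodes that reading as a check. This interpretation is an assumption rather than something this page states, and the class and method names are hypothetical.

```java
// Hypothetical helper, not part of FTA: checks an estimate against a relative-error bound.
final class RelativeAccuracySketch {
    // True if |estimate - actual| <= relativeAccuracy * |actual|.
    static boolean withinRelativeError(double estimate, double actual, double relativeAccuracy) {
        return Math.abs(estimate - actual) <= relativeAccuracy * Math.abs(actual);
    }

    public static void main(String[] args) {
        // With a 1% bound, 100.5 is an acceptable estimate of 100.0 but 102.0 is not.
        System.out.println(withinRelativeError(100.5, 100.0, 0.01)); // prints true
        System.out.println(withinRelativeError(102.0, 100.0, 0.01)); // prints false
    }
}
```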
  - getHistogramBins
```
public int getHistogramBins()
```
    Gets the number of bins to use for the underlying approximation used to hold the Histogram once maxCardinality is exceeded.
    
    Returns:
    
    The number of underlying bins used for the approximation (relevant only if cardinality > maxCardinality).
  - setHistogramBins
```
public int setHistogramBins(int histogramBins)
```
    Sets the number of bins to use for the underlying approximation used to hold the Histogram once maxCardinality is exceeded. Default is AnalysisConfig.HISTOGRAM_BINS_DEFAULT.
    
    Parameters:
    
    histogramBins - the number of bins to use for the underlying approximation, note larger values require more memory!
    
    Returns:
    
    The previous value of this parameter.
  - getTraceFilePath
```
public String getTraceFilePath()
```
    Return the full path to the trace file, or null if no tracing configured. Note: This will only be valid (i.e. non-null) after the first invocation of train() or trainBulk().
    
    Returns:
    
    The Path to the trace file.
  - getMaxInputLength
```
public int getMaxInputLength()
```
    Gets the current maximum input length for sampling.
    
    Returns:
    
    The current maximum length before an input sample is truncated.
  - getRegExp
```
protected String getRegExp(KnownTypes.ID id)
```
  - getPlugins
```
public Plugins getPlugins()
```
  - registerDefaultPlugins
```
public void registerDefaultPlugins(AnalysisConfig analysisConfig)
```
    Register the default set of plugins for Semantic Type detection.
    
    Parameters:
    
    analysisConfig - The Analysis configuration used for this analysis. Note: The Locale (on the configuration) will impact both the set of plugins registered as well as the behavior of the individual plugins
  - trainBulk
```
public void trainBulk(Map<String,Long> observed)
               throws FTAPluginException,
                      FTAUnsupportedLocaleException
```
    TrainBulk is the core bulk entry point used to supply input to the Text Analyzer. This routine is commonly used to support training using the results aggregated from a database query.
    
    Parameters:
    
    observed - A Map containing the observed items and the corresponding count
    
    Throws:
    
    FTAPluginException - Thrown when a registered plugin has detected an issue
    
    FTAUnsupportedLocaleException - Thrown when a requested locale is not supported
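    When the counts do not come from a database query, a Map of the shape expected by the observed parameter can be built from raw values as sketched below. The helper class is hypothetical (not part of FTA) and assumes the input list contains no null elements, since Collectors.groupingBy rejects null keys.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical helper, not part of FTA: aggregates raw values into value -> count.
final class ObservedCountsSketch {
    // Count how many times each distinct value occurs (values must be non-null).
    static Map<String, Long> countOccurrences(List<String> values) {
        return values.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        Map<String, Long> observed =
                countOccurrences(Arrays.asList("12", "62", "12", "37", "12"));
        System.out.println(observed.get("12")); // prints 3
        // The map could then be supplied to trainBulk(observed) on a TextAnalyzer.
    }
}
```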
  - train
```
public boolean train(String rawInput)
              throws FTAPluginException,
                     FTAUnsupportedLocaleException
```
    Train is the streaming entry point used to supply input to the Text Analyzer.
    
    Parameters:
    
    rawInput - The raw input as a String
    
    Returns:
    
    A boolean indicating if the resultant type is currently known.
    
    Throws:
    
    FTAPluginException - Thrown when a registered plugin has detected an issue
    
    FTAUnsupportedLocaleException - Thrown when a requested locale is not supported
  - isNullEquivalent
```
protected boolean isNullEquivalent(String input)
```
  - distanceLevenshtein
```
protected static int distanceLevenshtein(String source,
                                         Set<String> universe)
```
    Calculate the Levenshtein distance of the source string from the 'closest' string from the provided universe.
    
    Parameters:
    
    source - The source string to test.
    
    universe - The universe of strings to test for distance
    
    Returns:
    
    The Levenshtein distance from the best match.
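    The idea of a distance to the 'closest' string can be sketched as the minimum Levenshtein distance over the universe. The code below is a textbook dynamic-programming version written for illustration, not the implementation used by this class; the class and method names are hypothetical, and returning Integer.MAX_VALUE for an empty universe is a choice made for this sketch.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not FTA's implementation: minimum edit distance to a set of strings.
final class ClosestDistanceSketch {
    // Classic two-row Levenshtein distance (insert, delete, substitute each cost 1).
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Smallest distance from source to any member of universe
    // (Integer.MAX_VALUE if the universe is empty).
    static int distanceToClosest(String source, Set<String> universe) {
        int best = Integer.MAX_VALUE;
        for (String candidate : universe) {
            best = Math.min(best, levenshtein(source, candidate));
        }
        return best;
    }

    public static void main(String[] args) {
        Set<String> universe = new HashSet<>(Arrays.asList("sitting", "mitten"));
        System.out.println(distanceToClosest("kitten", universe)); // prints 1
    }
}
```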
  - reAnalyze
```
protected TextAnalysisResult reAnalyze(Map<String,Long> details)
                                throws FTAPluginException,
                                       FTAUnsupportedLocaleException
```
    Throws:
    
    FTAPluginException
    
    FTAUnsupportedLocaleException
  - getResult
```
public TextAnalysisResult getResult()
                             throws FTAPluginException,
                                    FTAUnsupportedLocaleException
```
    Determine the result of the training complete to date. Typically invoked after all training is complete, but may be invoked at any stage.
    
    Returns:
    
    A TextAnalysisResult with the analysis of any training completed.
    
    Throws:
    
    FTAPluginException - Thrown when a registered plugin has detected an issue
    
    FTAUnsupportedLocaleException - Thrown when a requested locale is not supported
  - getTrainingSet
```
public List<String> getTrainingSet()
```
    Access the training set - this will typically be the first AnalysisConfig.DETECT_WINDOW_DEFAULT records.
    
    Returns:
    
    A List of the raw input strings.
  - serialize
```
public String serialize()
                 throws FTAPluginException,
                        FTAUnsupportedLocaleException
```
    Serialize a TextAnalyzer - commonly used in concert with deserialize(String) and merge(TextAnalyzer, TextAnalyzer) to merge TextAnalyzers run on separate shards into a single TextAnalyzer and hence a single TextAnalysisResult.
    
    Returns:
    
    A Serialized version of this TextAnalyzer which can be hydrated via deserialize().
    
    Throws:
    
    FTAPluginException - Thrown when a registered plugin has detected an issue
    
    FTAUnsupportedLocaleException - Thrown when a requested locale is not supported
  - deserialize
```
public static TextAnalyzer deserialize(String serialized)
                                throws FTAMergeException,
                                       FTAPluginException,
                                       FTAUnsupportedLocaleException
```
    Create a new TextAnalyzer from a serialized representation - used in concert with serialize() and merge(TextAnalyzer, TextAnalyzer) to merge TextAnalyzers run on separate shards into a single TextAnalyzer and hence a single TextAnalysisResult.
    
    Parameters:
    
    serialized - The serialized form of a TextAnalyzer.
    
    Returns:
    
    A new TextAnalyzer which can be merged with another TextAnalyzer to produce a single result.
    
    Throws:
    
    FTAMergeException - When we fail to de-serialize the provided String.
    
    FTAUnsupportedLocaleException - Thrown when a requested locale is not supported
    
    FTAPluginException - Thrown when a registered plugin has detected an issue
  - merge
```
public static TextAnalyzer merge(TextAnalyzer first,
                                 TextAnalyzer second)
                          throws FTAMergeException,
                                 FTAPluginException,
                                 FTAUnsupportedLocaleException
```
    Create a new TextAnalyzer which is the result of merging two separate TextAnalyzers. This is typically used to merge TextAnalyzers run on separate shards into a single TextAnalyzer and hence a single TextAnalysisResult. See also serialize() and deserialize(String).
    
    Parameters:
    
    first - The first TextAnalyzer
    
    second - The second TextAnalyzer
    
    Returns:
    
    A new TextAnalyzer which is a merge of the two arguments.
    
    Throws:
    
    FTAMergeException - If the AnalysisConfigs of the two TextAnalyzers are not identical
    
    FTAUnsupportedLocaleException - Thrown when a requested locale is not supported
    
    FTAPluginException - Thrown when a registered plugin has detected an issue
  - getFacts
```
protected Facts getFacts()
```
  - setExternalFacts
```
protected void setExternalFacts(Facts.ExternalFacts externalFacts)
```
  - equals
```
public boolean equals(Object obj)
```
    Overrides:
    
    equals in class Object
  - equals
```
public boolean equals(Object obj,
                      double epsilon)
```
