Package com.cobber.fta
Class Facts
java.lang.Object
    com.cobber.fta.Facts
public class Facts extends Object
A set of facts for the Analysis in question.
-
-
Field Summary
Fields
Modifier and Type | Field | Description
long | blankCount | The number of blanks seen in the sample set.
Set<String> | bottomK | The bottom 10 values.
Map<String,Long> | cardinality
double | confidence | The percentage confidence in the analysis.
double | currentM2
char | decimalSeparator | Get the Decimal Separator used to interpret Doubles.
Long | distinctCount | The number of distinct valid values seen in the sample set.
long | groupingSeparators
protected Double | keyConfidence | The percentage confidence (0-1.0) that the observed stream is a Key field.
boolean | leadingWhiteSpace | Do any elements have leading White Space?
protected long | leadingZeroCount | The number of leading zeros seen in sample set.
protected char | localeDecimalSeparator
long | matchCount | The number of samples that match the patternInfo.
PatternInfo | matchPatternInfo | The PatternInfo associated with this matchCount.
String | maxBoolean
double | maxDouble
java.time.LocalDate | maxLocalDate
java.time.LocalDateTime | maxLocalDateTime
java.time.LocalTime | maxLocalTime
long | maxLong
java.time.OffsetDateTime | maxOffsetDateTime
String | maxOutlierString
int | maxRawLength | The maximum length (not trimmed) - Only relevant for Numeric, Boolean and String.
int | maxRawNonBlankLength
String | maxString
int | maxTrimmedLength
int | maxTrimmedLengthNumeric
int | maxTrimmedOutlierLength
java.time.ZonedDateTime | maxZonedDateTime
Double | mean | The mean of the observed values (Numeric types only).
String | minBoolean
double | minDouble
java.time.LocalDate | minLocalDate
java.time.LocalDateTime | minLocalDateTime
java.time.LocalTime | minLocalTime
long | minLong
long | minLongNonZero
java.time.OffsetDateTime | minOffsetDateTime
String | minOutlierString
int | minRawLength | The minimum length (not trimmed) - Only relevant for Numeric, Boolean and String.
int | minRawNonBlankLength
String | minString
int | minTrimmedLength
int | minTrimmedLengthNumeric
int | minTrimmedOutlierLength
java.time.ZonedDateTime | minZonedDateTime
boolean | monotonicDecreasing
boolean | monotonicIncreasing
boolean | multiline | Are any elements multi-line?
long | nullCount | The number of nulls seen in the sample set.
Map<String,Long> | outliers
long | sampleCount | The total number of samples seen.
String | streamFormat
TopBottomK<Double,Double> | tbDouble
TopBottomK<java.time.LocalDate,java.time.chrono.ChronoLocalDate> | tbLocalDate
TopBottomK<java.time.LocalDateTime,java.time.chrono.ChronoLocalDateTime<?>> | tbLocalDateTime
TopBottomK<java.time.LocalTime,java.time.LocalTime> | tbLocalTime
TopBottomK<Long,Long> | tbLong
TopBottomK<java.time.OffsetDateTime,java.time.OffsetDateTime> | tbOffsetDateTime
TopBottomK<String,String> | tbString
TopBottomK<java.time.ZonedDateTime,java.time.chrono.ChronoZonedDateTime<?>> | tbZonedDateTime
Set<String> | topK | The top 10 values.
long | totalCount | The total number of samples in the stream (typically -1 to indicate unknown).
boolean | trailingWhiteSpace | Do any elements have trailing White Space?
protected Double | uniqueness | What is the uniqueness percentage of this column.
Double | variance | The variance of the observed values (Numeric types only).
-
Constructor Summary
Constructors
Constructor | Description
Facts()
-
Method Summary
Modifier and Type | Method | Description
Facts | calculateFacts()
boolean | equals(Object obj, double epsilon)
Locale | getLocale()
Object | getMax()
String | getMaxValue()
Object | getMin()
String | getMinValue()
protected Object | getValue(String input)
void | hydrate()
void | setCollectStatistics(boolean collectStatistics)
void | setLocale(Locale locale)
void | setMaxValue(String maxValue)
void | setMinValue(String minValue)
-
-
-
Field Detail
-
minRawLength
public int minRawLength
The minimum length (not trimmed) - Only relevant for Numeric, Boolean and String. Note: For String and Boolean types this length includes any whitespace.
-
maxRawLength
public int maxRawLength
The maximum length (not trimmed) - Only relevant for Numeric, Boolean and String. Note: For String and Boolean types this length includes any whitespace.
-
multiline
public boolean multiline
Are any elements multi-line?
-
leadingWhiteSpace
public boolean leadingWhiteSpace
Do any elements have leading White Space?
-
trailingWhiteSpace
public boolean trailingWhiteSpace
Do any elements have trailing White Space?
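Example (illustrative only): the five fields above (minRawLength, maxRawLength, multiline, leadingWhiteSpace, trailingWhiteSpace) all describe the shape of the raw, untrimmed input. The sketch below shows one way comparable values could be derived from a list of samples. The class ShapeSketch and its summarize method are hypothetical and not part of com.cobber.fta. The use of Character.isWhitespace and of a newline or carriage-return check are assumptions, and the library may define whitespace and multi-line differently.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration; not part of com.cobber.fta. */
final class ShapeSketch {
    int minRawLength = Integer.MAX_VALUE;  // stays MAX_VALUE if no sample is seen
    int maxRawLength = Integer.MIN_VALUE;  // stays MIN_VALUE if no sample is seen
    boolean multiline;
    boolean leadingWhiteSpace;
    boolean trailingWhiteSpace;

    static ShapeSketch summarize(List<String> samples) {
        ShapeSketch s = new ShapeSketch();
        for (String raw : samples) {
            if (raw == null || raw.isEmpty()) {
                continue; // assumption: nulls and empty strings are ignored here
            }
            int length = raw.length(); // raw length, whitespace included
            s.minRawLength = Math.min(s.minRawLength, length);
            s.maxRawLength = Math.max(s.maxRawLength, length);
            if (raw.indexOf('\n') != -1 || raw.indexOf('\r') != -1) {
                s.multiline = true;
            }
            if (Character.isWhitespace(raw.charAt(0))) {
                s.leadingWhiteSpace = true;
            }
            if (Character.isWhitespace(raw.charAt(length - 1))) {
                s.trailingWhiteSpace = true;
            }
        }
        return s;
    }

    public static void main(String[] args) {
        ShapeSketch s = summarize(Arrays.asList("  abc", "de ", "f"));
        // prints "1 5 false true true"
        System.out.println(s.minRawLength + " " + s.maxRawLength + " "
                + s.multiline + " " + s.leadingWhiteSpace + " " + s.trailingWhiteSpace);
    }
}
```

Note that "  abc" contributes a raw length of 5, not 3, which matches the note that raw lengths include whitespace.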
-
keyConfidence
protected Double keyConfidence
The percentage confidence (0-1.0) that the observed stream is a Key field.
-
leadingZeroCount
protected long leadingZeroCount
The number of leading zeros seen in the sample set. Only relevant for type Long.
-
decimalSeparator
public char decimalSeparator
The Decimal Separator used to interpret Doubles. Only relevant for type Double.
-
uniqueness
protected Double uniqueness
What is the uniqueness percentage of this column?
-
localeDecimalSeparator
protected char localeDecimalSeparator
-
minBoolean
public String minBoolean
-
maxBoolean
public String maxBoolean
-
minLong
public long minLong
-
maxLong
public long maxLong
-
tbLong
public final TopBottomK<Long,Long> tbLong
-
minDouble
public double minDouble
-
maxDouble
public double maxDouble
-
tbDouble
public final TopBottomK<Double,Double> tbDouble
-
minString
public String minString
-
maxString
public String maxString
-
tbString
public final TopBottomK<String,String> tbString
-
minLocalDate
public java.time.LocalDate minLocalDate
-
maxLocalDate
public java.time.LocalDate maxLocalDate
-
tbLocalDate
public final TopBottomK<java.time.LocalDate,java.time.chrono.ChronoLocalDate> tbLocalDate
-
minLocalTime
public java.time.LocalTime minLocalTime
-
maxLocalTime
public java.time.LocalTime maxLocalTime
-
tbLocalTime
public final TopBottomK<java.time.LocalTime,java.time.LocalTime> tbLocalTime
-
minLocalDateTime
public java.time.LocalDateTime minLocalDateTime
-
maxLocalDateTime
public java.time.LocalDateTime maxLocalDateTime
-
tbLocalDateTime
public final TopBottomK<java.time.LocalDateTime,java.time.chrono.ChronoLocalDateTime<?>> tbLocalDateTime
-
minOffsetDateTime
public java.time.OffsetDateTime minOffsetDateTime
-
maxOffsetDateTime
public java.time.OffsetDateTime maxOffsetDateTime
-
tbOffsetDateTime
public final TopBottomK<java.time.OffsetDateTime,java.time.OffsetDateTime> tbOffsetDateTime
-
minZonedDateTime
public java.time.ZonedDateTime minZonedDateTime
-
maxZonedDateTime
public java.time.ZonedDateTime maxZonedDateTime
-
tbZonedDateTime
public final TopBottomK<java.time.ZonedDateTime,java.time.chrono.ChronoZonedDateTime<?>> tbZonedDateTime
-
minLongNonZero
public long minLongNonZero
-
monotonicIncreasing
public boolean monotonicIncreasing
-
monotonicDecreasing
public boolean monotonicDecreasing
-
minOutlierString
public String minOutlierString
-
maxOutlierString
public String maxOutlierString
-
minRawNonBlankLength
public int minRawNonBlankLength
-
maxRawNonBlankLength
public int maxRawNonBlankLength
-
minTrimmedLength
public int minTrimmedLength
-
maxTrimmedLength
public int maxTrimmedLength
-
minTrimmedLengthNumeric
public int minTrimmedLengthNumeric
-
maxTrimmedLengthNumeric
public int maxTrimmedLengthNumeric
-
minTrimmedOutlierLength
public int minTrimmedOutlierLength
-
maxTrimmedOutlierLength
public int maxTrimmedOutlierLength
-
groupingSeparators
public long groupingSeparators
-
cardinality
public Map<String,Long> cardinality
-
outliers
public Map<String,Long> outliers
-
currentM2
public double currentM2
-
sampleCount
public long sampleCount
The total number of samples seen.
-
matchCount
public long matchCount
The number of samples that match the patternInfo.
-
totalCount
public long totalCount
The total number of samples in the stream (typically -1 to indicate unknown).
-
nullCount
public long nullCount
The number of nulls seen in the sample set.
-
blankCount
public long blankCount
The number of blanks seen in the sample set.
-
distinctCount
public Long distinctCount
The number of distinct valid values seen in the sample set.
-
confidence
public double confidence
The percentage confidence in the analysis. Typically the matchCount divided by the realSamples (facts.sampleCount - (facts.nullCount + facts.blankCount)).
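Example (illustrative only): the description above amounts to a single division. The sketch below spells it out with a hypothetical ConfidenceSketch class that is not part of com.cobber.fta. Returning 0.0 when there are no real samples is an assumption made here to avoid dividing by zero; the description does not say how the library handles that case.

```java
/** Hypothetical helper for illustration; not part of com.cobber.fta. */
final class ConfidenceSketch {
    static double confidence(long sampleCount, long nullCount, long blankCount, long matchCount) {
        // realSamples = sampleCount - (nullCount + blankCount), as in the field description
        long realSamples = sampleCount - (nullCount + blankCount);
        if (realSamples <= 0) {
            return 0.0; // assumption: nothing to match against
        }
        return (double) matchCount / realSamples;
    }

    public static void main(String[] args) {
        // 100 samples, 10 nulls, 10 blanks, 60 matches: 60 / 80
        System.out.println(confidence(100, 10, 10, 60)); // prints 0.75
    }
}
```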
-
matchPatternInfo
public PatternInfo matchPatternInfo
The PatternInfo associated with this matchCount.
-
mean
public Double mean
The mean of the observed values (Numeric types only).
-
variance
public Double variance
The variance of the observed values (Numeric types only).
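Example (illustrative only): this page does not say how mean and variance are computed. The field name currentM2 suggests a running sum of squared deviations, as used in Welford's online algorithm, but that is an assumption. The sketch below shows Welford's algorithm with a hypothetical RunningStatsSketch class that is not part of com.cobber.fta. Whether the library reports a population or a sample variance is likewise not stated here, so both are shown.

```java
/** Hypothetical helper for illustration; not part of com.cobber.fta. */
final class RunningStatsSketch {
    private long count;
    private double mean;
    private double m2; // running sum of squared deviations from the current mean

    void accept(double value) {
        count++;
        double delta = value - mean;
        mean += delta / count;
        m2 += delta * (value - mean); // uses the updated mean
    }

    double mean() {
        return mean;
    }

    double populationVariance() {
        return count > 0 ? m2 / count : Double.NaN;
    }

    double sampleVariance() {
        return count > 1 ? m2 / (count - 1) : Double.NaN;
    }

    public static void main(String[] args) {
        RunningStatsSketch stats = new RunningStatsSketch();
        for (double v : new double[] {2, 4, 4, 4, 5, 5, 7, 9}) {
            stats.accept(v);
        }
        // Mean is 5 and population variance is 4, up to floating-point rounding.
        System.out.println(stats.mean() + " " + stats.populationVariance());
    }
}
```

The single-pass update means the values do not need to be stored, which suits a streaming analysis.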
-
topK
public Set<String> topK
The top 10 values.
-
bottomK
public Set<String> bottomK
The bottom 10 values.
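Example (illustrative only): one common way to track the top and bottom K distinct values of a stream is a pair of bounded sorted sets. The sketch below uses a hypothetical TopBottomSketch class. It is not the library's TopBottomK class, whose API is not described on this page. The ordering is supplied by the caller, which is an assumption about how values would be compared.

```java
import java.util.Comparator;
import java.util.TreeSet;

/** Hypothetical helper for illustration; not the library's TopBottomK. */
final class TopBottomSketch<T> {
    private final int k;
    private final TreeSet<T> top;
    private final TreeSet<T> bottom;

    TopBottomSketch(int k, Comparator<? super T> order) {
        this.k = k;
        this.top = new TreeSet<>(order);
        this.bottom = new TreeSet<>(order);
    }

    void observe(T value) {
        top.add(value);
        if (top.size() > k) {
            top.pollFirst(); // drop the smallest, keeping the k largest
        }
        bottom.add(value);
        if (bottom.size() > k) {
            bottom.pollLast(); // drop the largest, keeping the k smallest
        }
    }

    TreeSet<T> topK() {
        return top;
    }

    TreeSet<T> bottomK() {
        return bottom;
    }

    public static void main(String[] args) {
        TopBottomSketch<Integer> tb =
                new TopBottomSketch<Integer>(10, Comparator.<Integer>naturalOrder());
        for (int i = 1; i <= 25; i++) {
            tb.observe(i);
        }
        // prints "16 10": the top 10 are 16..25, the bottom 10 are 1..10
        System.out.println(tb.topK().first() + " " + tb.bottomK().last());
    }
}
```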
-
streamFormat
public String streamFormat
-
-
Method Detail
-
setLocale
public void setLocale(Locale locale)
-
getLocale
public Locale getLocale()
-
setCollectStatistics
public void setCollectStatistics(boolean collectStatistics)
-
getMin
public Object getMin()
-
getMax
public Object getMax()
-
getMinValue
public String getMinValue()
-
setMinValue
public void setMinValue(String minValue)
-
getMaxValue
public String getMaxValue()
-
setMaxValue
public void setMaxValue(String maxValue)
-
calculateFacts
public Facts calculateFacts()
-
getValue
protected Object getValue(String input)
-
hydrate
public void hydrate()
-
equals
public boolean equals(Object obj, double epsilon)
-
-