Package com.cobber.fta.core
Class RegExpGenerator
java.lang.Object
  com.cobber.fta.core.RegExpGenerator
public class RegExpGenerator extends Object
Analyze a set of strings and return a suitable Regular Expression. The result is unlikely to be an optimal Regular Expression. Typical usage is:
    RegExpGenerator generator = new RegExpGenerator();
    generator.train("janv.");
    generator.train("oct");
    generator.train("dec.");
    ...
    String result = generator.getResult();
Constructor Summary
Constructor
RegExpGenerator()
RegExpGenerator(int maxSetSize, Locale locale)
-
Method Summary
Modifier and Type, Method, Description
String getResult()
    Given the set of Strings trained (see train(String)), return a Regular Expression which will accept any of the training set.
Set<String> getValues()
    Get the set of Strings (in upper case) used to train the Generator.
boolean isDigit()
boolean isOther()
static boolean isSpecial(char ch)
    Does the supplied character have a special meaning in Regular Expressions? Note: '-' is not declared a special character, so this check should not be used for characters inside a Character Class.
static String merge(String firstRE, String secondRE)
static String slosh(char ch)
static String slosh(String input)
    Return an escaped String (similar to Pattern.quote, but escaping only when required).
static String toAutomatonRE(String regExp, boolean onlyASCII)
    Map a set of "well-known" regular expressions to Unicode Character Classes that the Automaton package supports.
void train(String input)
    This method should be called for each string in the set.
Method Detail
-
isSpecial
public static boolean isSpecial(char ch)
Does the supplied character have a special meaning in Regular Expressions? Note: '-' is not declared a special character, so this check should not be used for characters inside a Character Class.
Parameters:
ch - The character to test.
Returns:
True if the character is reserved.
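The note about '-' matters because a hyphen is only meaningful inside a Character Class. The following minimal sketch uses just java.util.regex (not RegExpGenerator itself) to show the difference; the class name HyphenInCharacterClass and the helper accepts are hypothetical names chosen for illustration.

```java
import java.util.regex.Pattern;

class HyphenInCharacterClass {
    // True if the whole candidate is matched by the expression.
    static boolean accepts(String regExp, String candidate) {
        return Pattern.matches(regExp, candidate);
    }

    public static void main(String[] args) {
        // Unescaped, '-' between two characters forms the range a..c.
        System.out.println(accepts("[a-c]", "b"));    // true
        // Escaped, '-' is a literal member of the class, so 'b' is no longer accepted.
        System.out.println(accepts("[a\\-c]", "b"));  // false
        System.out.println(accepts("[a\\-c]", "-"));  // true
    }
}
```

A caller that builds a Character Class from arbitrary characters would therefore need to handle '-' separately, since isSpecial is documented as not reporting it.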
-
slosh
public static String slosh(char ch)
-
merge
public static String merge(String firstRE, String secondRE)
-
slosh
public static String slosh(String input)
Return an escaped String (similar to Pattern.quote, but escaping only when required).
Parameters:
input - The String to be protected.
Returns:
An escaped String.
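To illustrate the contrast with Pattern.quote, here is a rough sketch of conditional escaping. It is not the library's implementation: the class ConditionalEscapeSketch, the method escape, and the RESERVED list are all assumptions made for this example, and the real slosh may treat a different set of characters as special.

```java
import java.util.regex.Pattern;

class ConditionalEscapeSketch {
    // Hypothetical list of reserved characters; the library's own list may differ.
    private static final String RESERVED = ".^$*+?()[]{}|\\";

    // Prefix only the reserved characters with a backslash, leaving others untouched.
    static String escape(String input) {
        StringBuilder sb = new StringBuilder();
        for (char ch : input.toCharArray()) {
            if (RESERVED.indexOf(ch) != -1) {
                sb.append('\\');
            }
            sb.append(ch);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Pattern.quote wraps its input unconditionally.
        System.out.println(Pattern.quote("oct"));  // \Qoct\E
        // Conditional escaping leaves plain text alone and touches only '.'.
        System.out.println(escape("oct"));         // oct
        System.out.println(escape("dec."));        // dec\.
    }
}
```

The conditional form keeps the output readable when most of the input is ordinary text, which suits a generator whose result is meant to be inspected.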
-
isOther
public boolean isOther()
-
isDigit
public boolean isDigit()
-
train
public void train(String input)
This method should be called for each string in the set.
Parameters:
input - The String to be used as part of the set.
-
getResult
public String getResult()
Given the set of Strings trained (see train(String)), return a Regular Expression which will accept any of the training set.
Returns:
A regular expression matching the training set.
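The contract stated here (the result accepts every trained string) can be checked with plain java.util.regex. The sketch below is only an illustration: the class AcceptsTrainingSet and the method acceptsAll are hypothetical, and the candidate expression is hand-written as a stand-in, since the actual output of getResult() is not shown in this documentation.

```java
import java.util.List;
import java.util.regex.Pattern;

class AcceptsTrainingSet {
    // True if every training string is fully matched by the expression.
    static boolean acceptsAll(String regExp, List<String> trainingSet) {
        Pattern pattern = Pattern.compile(regExp);
        return trainingSet.stream().allMatch(s -> pattern.matcher(s).matches());
    }

    public static void main(String[] args) {
        List<String> trainingSet = List.of("janv.", "oct", "dec.");
        // Hand-written stand-in; the real getResult() output may look different.
        String candidate = "(?i)(janv\\.|oct|dec\\.)";
        System.out.println(acceptsAll(candidate, trainingSet));  // true
    }
}
```

In real use, the string returned by generator.getResult() would take the place of candidate.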
-
getValues
public Set<String> getValues()
Get the set of Strings (in upper case) used to train the Generator.
Returns:
The set of Strings (in upper case).
-
toAutomatonRE
public static String toAutomatonRE(String regExp, boolean onlyASCII)
Map a set of "well-known" regular expressions to Unicode Character Classes that the Automaton package supports.
Parameters:
regExp - A String Java Regular Expression.
onlyASCII - If true then generate simple ASCII-only regular expressions, otherwise utilize Unicode Character Classes.
Returns:
The Automaton-friendly RegExp.
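The effect of a flag like onlyASCII can be illustrated with java.util.regex alone. The sketch below is an assumption-laden example, not the library's mapping: the class DigitClassSketch, the method mapDigits, and the choice of rewriting \d to [0-9] or \p{Nd} are all hypothetical, and the replacement is a naive literal substitution.

```java
import java.util.regex.Pattern;

class DigitClassSketch {
    // Hypothetical mapping for illustration only: rewrite \d as an explicit class.
    static String mapDigits(String regExp, boolean onlyASCII) {
        return regExp.replace("\\d", onlyASCII ? "[0-9]" : "\\p{Nd}");
    }

    public static void main(String[] args) {
        // ARABIC-INDIC DIGIT THREE, a decimal digit outside the ASCII range.
        String arabicIndicThree = "\u0663";

        System.out.println(mapDigits("\\d{4}", true));   // [0-9]{4}
        System.out.println(mapDigits("\\d{4}", false));  // \p{Nd}{4}

        // The ASCII-only class rejects the non-ASCII digit; the Unicode class accepts it.
        System.out.println(Pattern.matches(mapDigits("\\d", true), arabicIndicThree));   // false
        System.out.println(Pattern.matches(mapDigits("\\d", false), arabicIndicThree));  // true
    }
}
```

Which form is appropriate depends on whether the data being matched can contain non-ASCII digits or letters.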