impl

Type Members

class Dfa extends StrictLogging
case class GenericDfa[A](initial: A, transitions: Map[A, Map[SglChar, A]], accepting: Set[A]) extends StrictLogging with Product with Serializable
case class Nfa(initial: State, transitions: Map[State, Map[Char, Set[State]]], accepting: Set[State]) extends Product with Serializable
class RegexParser extends JavaTokenParsers
class State extends AnyRef

Value Members

object Condition extends Enumeration
object Dfa extends StrictLogging
object Direction extends Enumeration
object LookaroundExpander extends StrictLogging

A meta regular expression is the intersection or difference of two other (meta or simple) regular expressions. Lookaround constructions are transformed into equivalent meta regular expressions for processing:
A(?=B)C is transformed into AC ∩ AB.*
A(?!B)C is transformed into AC - AB.*
If there is more than one lookaround, the transformation is applied recursively.
The transformation is only valid if A has a known (fixed) length, so that both operands split the input at the same position.
Only top-level lookarounds that are part of a juxtaposition (concatenation) are permitted, i.e. they are not allowed inside parentheses, nested inside another lookaround, or as members of a disjunction. Examples:
Allowed:
A(?!B)C
(?!B)C
Not allowed:
(?!B)|B      part of a disjunction
(?!(?!B))    lookaround inside lookaround
(A(?!B))B    lookaround inside parentheses
A+(?!B)C     lookaround with variable-length prefix
NOTE: Only lookahead is currently implemented.
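The lookahead rewrite described above can be illustrated with a minimal sketch. Everything in it is hypothetical and is not the library's LookaroundExpander or MetaTrees API: the `Meta` types, `expand` and `accepts` are invented for illustration, the three pieces A, B and C are assumed to be already split out as plain regex strings, and membership is checked with `java.util.regex` (via `String.matches`) instead of automata.

```scala
object LookaheadSketch {
  // Hypothetical stand-ins for a meta regular expression tree.
  sealed trait Meta
  final case class Simple(regex: String) extends Meta
  final case class Intersection(left: Meta, right: Meta) extends Meta
  final case class Difference(left: Meta, right: Meta) extends Meta

  // Wrap a piece in a non-capturing group so concatenation keeps its precedence.
  private def group(r: String): String = "(?:" + r + ")"

  /** Rewrite A(?=B)C as AC ∩ AB.* and A(?!B)C as AC - AB.* */
  def expand(a: String, b: String, c: String, negative: Boolean): Meta = {
    val ac    = Simple(group(a) + group(c))        // AC
    val abAny = Simple(group(a) + group(b) + ".*") // AB.*
    if (negative) Difference(ac, abAny) else Intersection(ac, abAny)
  }

  /** Whole-string membership test for a meta expression. */
  def accepts(m: Meta, input: String): Boolean = m match {
    case Simple(r)          => input.matches(r)
    case Intersection(l, r) => accepts(l, input) && accepts(r, input)
    case Difference(l, r)   => accepts(l, input) && !accepts(r, input)
  }
}
```

With A = `a`, B = `b` and C = `[bc]d`, the positive form accepts `abd` and rejects `acd`, while the negative form does the opposite. This agrees with matching `a(?=b)[bc]d` and `a(?!b)[bc]d` directly.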
object MetaTrees
object Nfa extends Serializable

Take a regex AST and produce an NFA. Except where noted, the Thompson-McNaughton-Yamada algorithm is used. Reference: http://stackoverflow.com/questions/11819185/steps-to-creating-an-nfa-from-a-regular-expression
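As a rough illustration of a Thompson-style construction, the sketch below builds an NFA with epsilon transitions from a tiny AST and simulates it. The `Re` AST, the `Frag` fragment type and the integer state ids are assumptions made for this example; they do not reflect the library's RegexTree, State or Nfa types.

```scala
object ThompsonSketch {
  // Hypothetical minimal regex AST.
  sealed trait Re
  final case class Lit(c: Char) extends Re
  final case class Cat(l: Re, r: Re) extends Re
  final case class Alt(l: Re, r: Re) extends Re
  final case class Star(r: Re) extends Re

  // An edge labelled None is an epsilon transition.
  type Edge = (Int, Option[Char], Int)
  // An NFA fragment with one start state, one end state and the next free state id.
  final case class Frag(start: Int, end: Int, edges: List[Edge], next: Int)

  def build(re: Re, next: Int = 0): Frag = re match {
    case Lit(c) =>
      Frag(next, next + 1, List((next, Some(c), next + 1)), next + 2)
    case Cat(l, r) =>
      val a = build(l, next)
      val b = build(r, a.next)
      val eps: List[Edge] = List((a.end, None, b.start))
      Frag(a.start, b.end, eps ::: a.edges ::: b.edges, b.next)
    case Alt(l, r) =>
      val a = build(l, next)
      val b = build(r, a.next)
      val s = b.next
      val e = b.next + 1
      val eps: List[Edge] =
        List((s, None, a.start), (s, None, b.start), (a.end, None, e), (b.end, None, e))
      Frag(s, e, eps ::: a.edges ::: b.edges, e + 1)
    case Star(inner) =>
      val a = build(inner, next)
      val s = a.next
      val e = a.next + 1
      val eps: List[Edge] =
        List((s, None, a.start), (s, None, e), (a.end, None, a.start), (a.end, None, e))
      Frag(s, e, eps ::: a.edges, e + 1)
  }

  // Epsilon closure: keep following epsilon edges until no new state is added.
  def closure(states: Set[Int], edges: List[Edge]): Set[Int] = {
    val step = states ++ edges.collect { case (from, None, to) if states(from) => to }
    if (step == states) states else closure(step, edges)
  }

  // Simulate the NFA on the whole input.
  def accepts(re: Re, input: String): Boolean = {
    val f = build(re)
    val last = input.foldLeft(closure(Set(f.start), f.edges)) { (cur, ch) =>
      val moved = f.edges.collect { case (from, Some(c), to) if c == ch && cur(from) => to }
      closure(moved.toSet, f.edges)
    }
    last(f.end)
  }
}
```

For example, `Cat(Star(Alt(Lit('a'), Lit('b'))), Lit('c'))` models `(a|b)*c`; `accepts` returns true for `abac` and `c`, and false for `ab`.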
object NormTree
object Normalizer

Regular expressions can have character classes and wildcards. In order to produce an NFA, they must be expanded into disjunctions. For wildcards and negated character classes, the complete alphabet must also be known to produce the expansion.
Example transformations with alphabet: abcdefgh
[abc]     -> a|b|c
[^abc]    -> d|e|f|g|h
def[^abc] -> def(d|e|f|g|h)
.         -> a|b|c|d|e|f|g|h
abc.      -> abc(a|b|c|d|e|f|g|h)
As the alphabet can be huge (such as Unicode), something must be done to reduce the number of disjunctions:
[abc]     -> a|b|c
[^abc]    -> <other_char>
def[^abc] -> def(d|e|f|<other_char>)
.         -> <other_char>
abc.      -> abc(a|b|c|<other_char>)
Here <other_char> is a special metacharacter that matches any character of the alphabet that is not present in the regex. Note that with this technique the whole alphabet does not need to be known explicitly.
Care must be taken when the regex is meant to be used in an operation with another regex (such as intersection or difference). In this case, <other_char> must match only the characters present in neither regex. Example:
Regex space: [abc] and [^cd]
Characters present in any regex: abcd
[abc] -> a|b|c
[^cd] -> a|b|<other_char>
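The <other_char> expansion can be sketched as follows. This is an illustrative assumption, not the library's Normalizer: the `Sym` and `Atom` types and the `expand` function are hypothetical, and `known` stands for the set of characters that appear literally anywhere in the regex space.

```scala
object NormalizerSketch {
  // A symbol of the reduced alphabet: a concrete character or <other_char>.
  sealed trait Sym
  final case class Ch(c: Char) extends Sym
  case object Other extends Sym // stands for <other_char>

  // Hypothetical atoms that need expansion.
  sealed trait Atom
  final case class CharClass(chars: Set[Char]) extends Atom    // [abc]
  final case class NegCharClass(chars: Set[Char]) extends Atom // [^abc]
  case object Wildcard extends Atom                            // .

  private def ch(c: Char): Sym = Ch(c)

  /** Expand an atom into the set of alternatives of its disjunction. */
  def expand(atom: Atom, known: Set[Char]): Set[Sym] = atom match {
    case CharClass(cs)    => cs.map(ch)
    case NegCharClass(cs) => known.diff(cs).map(ch) + Other
    case Wildcard         => known.map(ch) + Other
  }
}
```

With the regex space `[abc]` and `[^cd]`, `known` is `abcd`, so `expand(NegCharClass(Set('c', 'd')), "abcd".toSet)` yields the alternatives `a`, `b` and `<other_char>`, matching the last example above.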
object Operations extends StrictLogging
object Optimizer
object RegexParser extends StrictLogging
object RegexTree
object State
object Util extends StrictLogging
