package dregex.impl

Type Members

  1. class Dfa extends StrictLogging

  2. case class GenericDfa[A](initial: A, transitions: Map[A, Map[SglChar, A]], accepting: Set[A]) extends StrictLogging with Product with Serializable

  3. case class Nfa(initial: State, transitions: Map[State, Map[Char, Set[State]]], accepting: Set[State]) extends Product with Serializable

  4. class RegexParser extends JavaTokenParsers

  5. class State extends AnyRef

Value Members

  1. object Condition extends Enumeration

  2. object Dfa extends StrictLogging

  3. object Direction extends Enumeration

  4. object LookaroundExpander extends StrictLogging

    A meta regular expression is the intersection or subtraction of two other (meta or simple) regular expressions. Lookaround constructions are transformed into equivalent meta regular expressions for processing.

    A(?=B)C is transformed into AC ∩ AB.*
    A(?!B)C is transformed into AC - AB.*

    In the case of more than one lookaround, the transformation is applied recursively.

    This works only if A has a known, fixed length.

    Only top-level lookarounds that are part of a juxtaposition are permitted, i.e. they are not allowed inside parentheses, nested, or as members of a disjunction. Examples:

    Allowed:
      A(?!B)C
      (?!B)C

    Not allowed:
      (?!B)|B       part of a disjunction
      (?!(?!B))     lookaround inside lookaround
      (A(?!B))B     lookaround inside parentheses
      A+(?!B)C      lookaround with variable-length prefix

    NOTE: Only lookahead is currently implemented
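
    The two lookahead identities above can be checked on sample strings with the JDK regex engine, which supports lookahead natively. This is a minimal sketch, not dregex's implementation: the object name, the helper names and the sub-expressions A, B and C are all hypothetical, and A is chosen with a fixed length of two characters.

```scala
import java.util.regex.Pattern

object LookaheadSketch {
  // Hypothetical sub-expressions (assumptions for illustration only).
  val a = "[0-9]{2}" // A: fixed length, as the transformation requires
  val b = "[a-c]"    // B: the lookahead condition
  val c = "[a-z]+"   // C: what actually follows A

  // Whole-string match; DOTALL lets ".*" cover any remaining input.
  private def full(regex: String, s: String): Boolean =
    Pattern.compile(regex, Pattern.DOTALL).matcher(s).matches()

  // Reference behaviour, using the JDK's own lookahead support.
  def positiveNative(s: String): Boolean = full(a + "(?=" + b + ")" + c, s)
  def negativeNative(s: String): Boolean = full(a + "(?!" + b + ")" + c, s)

  // A(?=B)C  ->  AC ∩ AB.*  (the string must match both expressions)
  def positiveExpanded(s: String): Boolean =
    full(a + c, s) && full(a + b + ".*", s)

  // A(?!B)C  ->  AC - AB.*  (the string must match AC but not AB.*)
  def negativeExpanded(s: String): Boolean =
    full(a + c, s) && !full(a + b + ".*", s)
}
```

    For any input, positiveNative and positiveExpanded are expected to agree, and likewise for the negative pair. The intersection and subtraction are evaluated here as boolean combinations of two separate matches, whereas the description above refers to operations on the expressions themselves.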

  5. object MetaTrees

  6. object Nfa extends Serializable

    Take a regex AST and produce an NFA. Except where noted, the Thompson-McNaughton-Yamada algorithm is used. Reference: http://stackoverflow.com/questions/11819185/steps-to-creating-an-nfa-from-a-regular-expression
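
    As an illustration of the Thompson-style construction, the sketch below builds an epsilon-NFA from a miniature AST and simulates it. All names (ThompsonSketch, Node, Edge, Fragment, build, accepts) are hypothetical and the integer state representation is an assumption; none of this reproduces the Nfa, State or RegexTree definitions of this package.

```scala
object ThompsonSketch {
  // Hypothetical miniature AST: literal, concatenation, alternation, star.
  sealed trait Node
  final case class Lit(c: Char) extends Node
  final case class Cat(l: Node, r: Node) extends Node
  final case class Alt(l: Node, r: Node) extends Node
  final case class Star(n: Node) extends Node

  // An edge labelled None is an epsilon transition.
  final case class Edge(from: Int, label: Option[Char], to: Int)
  // Every fragment has exactly one start state and one end state.
  final case class Fragment(start: Int, end: Int, edges: List[Edge])

  def build(root: Node): Fragment = {
    var next = 0
    def fresh(): Int = { next += 1; next - 1 }
    def go(n: Node): Fragment = n match {
      case Lit(c) =>
        val (s, e) = (fresh(), fresh())
        Fragment(s, e, List(Edge(s, Some(c), e)))
      case Cat(l, r) =>
        val (a, b) = (go(l), go(r))
        // Link the end of the left fragment to the start of the right one.
        Fragment(a.start, b.end, Edge(a.end, None, b.start) :: a.edges ::: b.edges)
      case Alt(l, r) =>
        val (s, a, b, e) = (fresh(), go(l), go(r), fresh())
        val glue = List(Edge(s, None, a.start), Edge(s, None, b.start),
                        Edge(a.end, None, e), Edge(b.end, None, e))
        Fragment(s, e, glue ::: a.edges ::: b.edges)
      case Star(inner) =>
        val (s, a, e) = (fresh(), go(inner), fresh())
        // Either skip the inner fragment, or traverse it one or more times.
        val glue = List(Edge(s, None, a.start), Edge(s, None, e),
                        Edge(a.end, None, a.start), Edge(a.end, None, e))
        Fragment(s, e, glue ::: a.edges)
    }
    go(root)
  }

  // All states reachable from `states` through epsilon edges only.
  private def closure(f: Fragment, states: Set[Int]): Set[Int] = {
    val more = states ++ f.edges.collect {
      case Edge(from, None, to) if states(from) => to
    }
    if (more == states) states else closure(f, more)
  }

  // Simulate the NFA by tracking the set of currently active states.
  def accepts(f: Fragment, input: String): Boolean = {
    val last = input.foldLeft(closure(f, Set(f.start))) { (cur, ch) =>
      val moved = f.edges.collect {
        case Edge(from, Some(c), to) if c == ch && cur(from) => to
      }
      closure(f, moved.toSet)
    }
    last(f.end)
  }
}
```

    For example, the expression (a|b)*c corresponds to Cat(Star(Alt(Lit('a'), Lit('b'))), Lit('c')) in this toy AST. The edge-list representation keeps the sketch short; a map from state to outgoing transitions, as in the Nfa case class listed above, would make lookups cheaper.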

  7. object NormTree

  8. object Normalizer

    Regular expressions can have character classes and wildcards. In order to produce an NFA, they should be expanded to disjunctions. In the case of wildcards or negated character classes, the complete alphabet must also be known to produce the expansion:

    Example transformations with alphabet: abcdefgh

    [abc]      -> a|b|c
    [^abc]     -> d|e|f|g|h
    def[^abc]  -> def(d|e|f|g|h)
    .          -> a|b|c|d|e|f|g|h
    abc.       -> abc(a|b|c|d|e|f|g|h)

    Because the alphabet can be huge (Unicode, for example), the number of disjunctions must be reduced:

    [abc]      -> a|b|c
    [^abc]     -> <other_char>
    def[^abc]  -> def(d|e|f|<other_char>)
    .          -> <other_char>
    abc.       -> abc(a|b|c|<other_char>)

    Where <other_char> is a special metacharacter that matches any of the characters of the alphabet not present in the regex. Note that with this technique knowing the whole alphabet explicitly is not needed.

    Care must be taken when the regex is meant to be used for an operation with another regex (such as intersection or difference). In this case, <other_char> must match only the characters present in neither regex. Example:

    Regex space: [abc] and [^cd]
    Characters present in any regex: abcd

    [abc]  -> a|b|c
    [^cd]  -> a|b|<other_char>
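
    The expansion with <other_char> can be sketched as follows. This is an illustration under assumptions, not the Normalizer's code: OtherCharSketch, CharClass and expand are hypothetical names, a character class is modelled as a plain set of characters with a negation flag, and a wildcard is modelled as a negated empty class.

```scala
object OtherCharSketch {
  // Hypothetical model of a character class; not this package's tree type.
  final case class CharClass(chars: Set[Char], negated: Boolean = false)

  val OtherChar = "<other_char>"

  // `space` holds every class of every regex taking part in the operation,
  // so that <other_char> covers only characters present in none of them.
  def expand(cls: CharClass, space: Seq[CharClass]): String = {
    val known: Set[Char] = space.flatMap(_.chars).toSet
    val alternatives: Seq[String] =
      if (cls.negated)
        // Known characters outside the class, plus everything unknown.
        (known -- cls.chars).toSeq.sorted.map(_.toString) :+ OtherChar
      else
        cls.chars.toSeq.sorted.map(_.toString)
    alternatives.mkString("|")
  }
}
```

    With the regex space [abc] and [^cd] from the example above, expanding CharClass(Set('c', 'd'), negated = true) against both classes yields a|b|<other_char>, while [abc] still expands to a|b|c.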

  9. object Operations extends StrictLogging

  10. object Optimizer

  11. object RegexParser extends StrictLogging

  12. object RegexTree

  13. object State

  14. object Util extends StrictLogging
