colossus.parsing

Combinators

object Combinators

Streaming Parser Combinators

Overview

A Parser[T] is an object that consumes a stream of bytes to produce a result of type T.

A Combinator is a "higher-order" parser that takes one or more parsers and produces a new parser.
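To make the idea of a higher-order parser concrete, here is a minimal, non-streaming sketch. It is not the library's implementation: the type alias `P` and the helpers `seq` and `map` are hypothetical names chosen for illustration, standing in for the roles that `~` and `>>` play in the examples on this page.

```scala
object CombinatorSketch {
  // Hypothetical model: a parser maps input to (result, remaining input), or None if it needs more.
  type P[A] = Vector[Byte] => Option[(A, Vector[Byte])]

  // Take exactly n bytes, if that many are available.
  def bytes(n: Int): P[Vector[Byte]] =
    in => if (in.length >= n) Some(in.splitAt(n)) else None

  // Run two parsers in sequence and pair their results.
  def seq[A, B](pa: P[A], pb: P[B]): P[(A, B)] =
    in => pa(in).flatMap { case (a, rest) =>
      pb(rest).map { case (b, rest2) => ((a, b), rest2) }
    }

  // Transform the result of a parser without changing what it consumes.
  def map[A, B](pa: P[A])(f: A => B): P[B] =
    in => pa(in).map { case (a, rest) => (f(a), rest) }
}
```

Unlike the stream parsers described here, this model is immutable and returns None without remembering partial input; it only shows how small parsers compose into larger ones.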

The stream parsers are built for speed, and that comes with tradeoffs: they are mutable and not thread safe, and they are designed for network protocols, which tend to have very deterministic grammars.

The Parser Rules:

1. A parser must greedily consume the data stream until it produces a result.
2. When a parser consumes the last byte necessary to produce a result, it must stop consuming the stream and return the new result while resetting its state.
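The two rules can be illustrated with a small, self-contained sketch. The `Buf` and `FixedBytes` classes below are hypothetical stand-ins written only for this illustration; they are not the library's DataBuffer or its bytes parser.

```scala
object ParserRulesSketch {
  // Hypothetical stand-in for a data buffer: a byte array plus a read cursor.
  final class Buf(data: Array[Byte]) {
    private var pos = 0
    def hasNext: Boolean = pos < data.length
    def next(): Byte = { val b = data(pos); pos += 1; b }
    def remaining: Int = data.length - pos
  }

  // Hypothetical fixed-length parser that keeps partial state between calls.
  final class FixedBytes(n: Int) {
    private val acc = new Array[Byte](n)
    private var filled = 0

    def parse(buf: Buf): Option[Vector[Byte]] = {
      // Rule 1: consume greedily while input is available and no result is ready.
      while (filled < n && buf.hasNext) {
        acc(filled) = buf.next()
        filled += 1
      }
      if (filled == n) {
        // Rule 2: stop at the last required byte, reset state, return the result.
        filled = 0
        Some(acc.toVector)
      } else None
    }
  }
}
```

Because the parser stops exactly at the last required byte, any bytes left in the buffer belong to the next result, and because it keeps its partial state, a result may be completed by a later buffer.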

Examples

Use any parser by itself:

val parser = bytes(4)
val data = DataBuffer(ByteString("aaaabbbbccc"))
parser.parse(data) // Some(ByteString(97, 97, 97, 97))
parser.parse(data).map {bytes => bytes.utf8String} // Some("bbbb")
parser.parse(data) // None

Combine two parsers

val parser = bytes(3) ~ bytes(2) >> {case a ~ b => a.utf8String + ":" + b.utf8String}
parser.parse(DataBuffer(ByteString("abc"))) // None
parser.parse(DataBuffer(ByteString("defgh"))) // Some("abc:de")
Linear Supertypes
AnyRef, Any

Type Members

  1. trait Parser[+T] extends AnyRef

  2. case class ~[+A, +B](a: A, b: B) extends Product with Serializable

Value Members

  1. final def !=(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  2. final def !=(arg0: Any): Boolean

    Definition Classes
    Any
  3. final def ##(): Int

    Definition Classes
    AnyRef → Any
  4. final def ==(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  5. final def ==(arg0: Any): Boolean

    Definition Classes
    Any
  6. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  7. val byte: Parser[Byte]

    Parse a single byte.

  8. def bytes(num: Int): Parser[ByteString]

  9. def bytes(num: Parser[Long]): Parser[ByteString]

    Read a fixed number of bytes, prefixed by a length.

  10. def bytesUntil(terminus: ByteString): Parser[ByteString]

    Keep reading bytes until the terminus is encountered. This accounts for a possible partial terminus in the data. The terminus is NOT included in the returned value.
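    A partial terminus arises when the terminus is split across two chunks of input. The sketch below shows one simple way to handle that case by keeping the accumulated bytes between calls. It is an illustration under assumptions, not the library's implementation: `UntilTerminus` and `feed` are hypothetical names, and input arrives as plain byte arrays.

```scala
object BytesUntilSketch {
  // Hypothetical incremental scanner that remembers bytes seen in earlier chunks.
  final class UntilTerminus(terminus: Array[Byte]) {
    require(terminus.nonEmpty, "terminus must not be empty")
    private val term = terminus.toVector
    private val acc = scala.collection.mutable.ArrayBuffer.empty[Byte]

    // Feed one chunk. Returns the payload (terminus excluded) and the number of
    // bytes consumed from this chunk, or None if the terminus is still incomplete.
    def feed(chunk: Array[Byte]): Option[(Vector[Byte], Int)] = {
      var i = 0
      var result: Option[(Vector[Byte], Int)] = None
      while (result.isEmpty && i < chunk.length) {
        acc += chunk(i)
        i += 1
        if (acc.endsWith(term)) {
          result = Some((acc.take(acc.length - term.length).toVector, i))
          acc.clear() // reset state for the next result
        }
      }
      result
    }
  }
}
```

    Checking the tail of the accumulator after every byte is simple rather than fast; a production parser would more likely track how many terminus bytes have matched so far.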

  11. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  12. def const[T](t: T): Parser[T]

    Creates a parser that will always return the same value without consuming any data. Useful when flatMapping parsers.

  13. def delimitedString(delimiter: Byte, terminus: Byte): Parser[Vector[String]]

    Parse a series of ASCII strings separated by a single-byte delimiter and terminated by a byte.

  14. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  15. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  16. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  17. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  18. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  19. def intUntil(terminus: Byte, base: Int = 10): Parser[Long]

    Parses the ASCII representation of an integer, reading until the terminus is encountered.
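    As a rough illustration of what parsing an ASCII integer in a given base involves, the following sketch folds digit bytes into a Long until the terminus is read. It is a hypothetical helper, not the library's code: it reads from an Iterator[Byte], does not handle a sign, and does not keep partial state between calls the way a stream parser would.

```scala
object IntUntilSketch {
  // Returns None if the input ends before the terminus; throws on a non-digit byte.
  def intUntil(input: Iterator[Byte], terminus: Byte, base: Int = 10): Option[Long] = {
    var value = 0L
    var result: Option[Long] = None
    while (result.isEmpty && input.hasNext) {
      val b = input.next()
      if (b == terminus) {
        result = Some(value) // the terminus is consumed but not part of the number
      } else {
        val d = Character.digit(b.toChar, base) // -1 when b is not a digit in this base
        if (d < 0) throw new IllegalArgumentException(s"invalid digit byte: $b")
        value = value * base + d
      }
    }
    result
  }
}
```

    With this sketch, the input `123:rest` with terminus `:` gives Some(123) and leaves `rest` unread in the iterator.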

  20. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  21. def literal(lit: ByteString): Parser[ByteString]

  22. def maxSize[T](size: DataSize, parser: Parser[T]): Parser[T]

    Creates a parser that wraps another parser and will throw an exception if more than size data is required to parse a single object. See the ParserSizeTracker for more details.

  23. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  24. final def notify(): Unit

    Definition Classes
    AnyRef
  25. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  26. def repeat[T](times: Long, parser: Parser[T]): Parser[Vector[T]]

    Repeat a pattern a fixed number of times

    times

    the number of times to parse the pattern

    parser

    the parser for the pattern

    returns

    the parsed sequence

  27. def repeat[T](times: Parser[Long], parser: Parser[T]): Parser[Vector[T]]

    Parse a pattern multiple times based on a numeric prefix

    This is useful for any situation where the repeated pattern is prefixed by the number of repetitions, for example num:[obj1][obj2][obj3]. In situations where the pattern doesn't immediately follow the number, you'll have to do it yourself, something like

    intUntil(':') ~ otherParser |> {case num ~ other => repeat(num, patternParser)}

    times

    parser for the number of times to repeat the pattern

    parser

    the parser that will parse a single instance of the pattern

    returns

    the parsed sequence

  28. def repeatUntil[T](parser: Parser[T], terminus: Byte): Parser[Seq[T]]

    Repeatedly parse a pattern until a terminal byte is reached

    Before calling parser this will examine the next byte. If the byte matches the terminus, it will return the built sequence. Otherwise it will pass control to parser (including the examined byte) until the parser returns a result.

    Notice that the terminal byte is consumed, so if we have

    val parser = repeatUntil(bytes(2), ':')
    parser.parse(DataBuffer(ByteString("aabbcc:ddee")))

    the bytes remaining in the buffer after parsing are just ddee.

    parser

    the parser to repeat

    terminus

    the byte that signals to stop repeating

    returns

    the parsed sequence

  29. def stringUntil(terminus: Byte, toLower: Boolean = false, minSize: Option[Int] = None, allowWhiteSpace: Boolean = true, ltrim: Boolean = false): Parser[String]

    Parse a string until a designated byte is encountered

    Limited filtering is currently supported, all of which happens during the reading.

    terminus

    reading will stop when this byte is encountered

    toLower

    if true any characters in the range A-Z will be lowercased before insertion

    minSize

    specify a minimum size

    allowWhiteSpace

    if false, throw a ParseException if any whitespace is encountered before the terminus. If the terminus is a whitespace character, it will not be counted

    ltrim

    trim leading whitespace

  30. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  31. def toString(): String

    Definition Classes
    AnyRef → Any
  32. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  33. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  34. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
