Packages

  • package root

    This is the documentation for Parsley.

    Package structure

    The parsley package contains the Parsley class, as well as the Result, Success, and Failure types. In addition to these, it also contains the following packages and "modules" (a module is defined as being an object which mocks a package):

    • parsley.Parsley contains the bulk of the core "function-style" combinators.
    • parsley.combinator contains many helpful combinators that simplify some common parser patterns.
    • parsley.character contains the combinators needed to read characters and strings, as well as combinators to match specific sub-sets of characters.
    • parsley.debug contains debugging combinators, helpful for identifying faults in parsers.
    • parsley.expr contains the following sub modules:
      • parsley.expr.chain contains combinators used in expression parsing
      • parsley.expr.precedence is a builder for expression parsers built on a precedence table.
      • parsley.expr.infix contains combinators used in expression parsing, but with more permissive types than their equivalents in chain.
      • parsley.expr.mixed contains combinators that can be used for expression parsing, but where different fixities may be mixed on the same level: this is rare in practice.
    • parsley.syntax contains several implicits to add syntactic sugar to the combinators. These are sub-categorised into the following sub modules:
      • parsley.syntax.character contains implicits to allow you to use character and string literals as parsers.
      • parsley.syntax.lift enables postfix application of the lift combinator onto a function (or value).
      • parsley.syntax.zipped enables both a reversed form of lift, where the function appears on the right and is applied to a tuple (useful when type inference has failed), and a .zipped method for building tuples out of several combinators.
      • parsley.syntax.extension contains syntactic sugar combinators exposed as implicit classes.
    • parsley.errors contains modules to deal with error messages, their refinement and generation.
    • parsley.lift contains functions which lift functions that work on regular types to those which now combine the results of parsers returning those same types. These are ubiquitous.
    • parsley.ap contains functions which allow for the application of a parser returning a function to several parsers returning each of the argument types.
    • parsley.registers contains combinators that interact with the context-sensitive functionality in the form of registers.
    • parsley.token contains the Lexer class that provides a host of helpful lexing combinators when provided with the description of a language.
    • parsley.position contains parsers for extracting position information.
    • parsley.generic contains some basic implementations of the Parser Bridge pattern (see Design Patterns for Parser Combinators in Scala, or the parsley wiki): these can be used before more specialised generic bridge traits can be constructed.
    Definition Classes
    root
  • package parsley
    Definition Classes
    root
  • package errors

    This package contains various functionality relating to the generation and formatting of error messages.

    In particular, it includes a collection of combinators for improving error messages within the parser, including labelling and providing additional information. It also contains combinators that can be used to validate data produced by a parser, to ensure it conforms to expected invariants, producing good quality error messages if this is not the case. Finally, this package contains ways of changing the formatting of error messages: this can be done either by changing how the default String-based errors are formatted, or by injecting Parsley's errors into a custom error object.

    Definition Classes
    parsley
  • package tokenextractors

    This package contains implementations of token extractors that can be mixed into ErrorBuilder to decide how to extract unexpected tokens from the residual input left over from a parse error.

    These are common strategies, and something here is likely to be what is needed. They are all careful to handle unprintable characters and whitespace in a sensible way, and account for unicode codepoints that are wider than a single 16-bit character.

    Definition Classes
    errors
    Since

    4.0.0

trait ErrorBuilder[+Err] extends AnyRef

This typeclass specifies how to format an error from a parser as a specified type.

An instance of this trait is required when calling parse (or similar). By default, Parsley defines its own instance for ErrorBuilder[String] found in the ErrorBuilder companion object.

To implement this trait, a number of methods must be defined, as well as the representation types for a variety of different components; the relation between the various methods is closely linked to the types that they both produce and consume. To only change the basics of formatting without having to define the entire instance, inherit from DefaultErrorBuilder: this will mean, however, that the representation types cannot be overridden.

How an Error is Structured

There are two kinds of error messages that are generated by Parsley: Specialised and Vanilla. These are produced by different combinators and can be merged with other errors of the same type if both errors appear at the same offset. However, Specialised errors will take precedence over Vanilla errors if they appear at the same offset. The most common form of error is the Vanilla variant, which is generated by most combinators, except for some in errors.combinator.
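
The merging rule can be pictured with a small stand-alone model. This is only a sketch of the rule as stated above: the names `MergeSketch`, `Vanilla`, `Specialised` and `mergeSameOffset` are hypothetical and are not part of Parsley, and the case of errors at different offsets is not described here, so the sketch leaves it out.

```scala
// Hypothetical stand-alone model: none of these names come from Parsley.
object MergeSketch {
  sealed trait Err { def offset: Int }
  final case class Vanilla(offset: Int, expected: Set[String]) extends Err
  final case class Specialised(offset: Int, messages: List[String]) extends Err

  // Only covers errors at the same offset, which is the case the text describes.
  def mergeSameOffset(a: Err, b: Err): Err = {
    require(a.offset == b.offset, "sketch only covers errors at the same offset")
    (a, b) match {
      // Errors of the same kind are merged together.
      case (Vanilla(o, e1), Vanilla(_, e2))         => Vanilla(o, e1 ++ e2)
      case (Specialised(o, m1), Specialised(_, m2)) => Specialised(o, m1 ++ m2)
      // A Specialised error takes precedence over a Vanilla one.
      case (s: Specialised, _: Vanilla)             => s
      case (_: Vanilla, s: Specialised)             => s
    }
  }
}
```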

Both types of error share some common structure, namely:

  • The error preamble, which has the file and the position.
  • The content lines, the specifics of which differ between the two types of error.
  • The context lines, which have the surrounding lines of input for contextualisation.

Vanilla Errors

There are three kinds of content line found in a Vanilla error:

  1. Unexpected info: this contains information about the kind of token that caused the error.
  2. Expected info: this contains the information about what kinds of token could have avoided the error.
  3. Reasons: these are the bespoke reasons that an error has occurred (as generated by explain).

There can be at most one unexpected line, at most one expected line, and zero or more reasons. Both the unexpected and expected info are built from error items, each of which is one of: the end of input, a named token, or raw input taken from the parser definition. These can all be formatted separately.

The overall structure of a Vanilla error is given in the following diagram:

┌───────────────────────────────────────────────────────────────────────┐
│   Vanilla Error                                                       │
│                          ┌────────────────┐◄──────── position         │
│                  source  │                │                           │
│                     │    │   line      col│                           │
│                     ▼    │     │         ││                           │
│                  ┌─────┐ │     ▼         ▼│   end of input            │
│               In foo.txt (line 1, column 5):       │                  │
│                 ┌─────────────────────┐            │                  │
│unexpected ─────►│                     │            │  ┌───── expected │
│                 │          ┌──────────┐ ◄──────────┘  │               │
│                 unexpected end of input               ▼               │
│                 ┌──────────────────────────────────────┐              │
│                 expected "(", "negate", digit, or letter              │
│                          │    └──────┘  └───┘     └────┘ ◄────── named│
│                          │       ▲        └──────────┘ │              │
│                          │       │                     │              │
│                          │      raw                    │              │
│                          └─────────────────┬───────────┘              │
│                 '-' is a binary operator   │                          │
│                 └──────────────────────┘   │                          │
│                ┌──────┐        ▲           │                          │
│                │>3+4- │        │           expected items             │
│                │     ^│        │                                      │
│                └──────┘        └───────────────── reason              │
│                   ▲                                                   │
│                   │                                                   │
│                   line info                                           │
└───────────────────────────────────────────────────────────────────────┘
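
As a rough illustration of the content lines in the diagram, the following stand-alone sketch models the three kinds of error item and assembles the unexpected line, the expected line and the reasons in order. All names here (`VanillaSketch`, `show`, `contentLines`) are hypothetical, and the exact wording produced by Parsley's own builders may differ.

```scala
// Hypothetical stand-alone model: none of these names come from Parsley.
object VanillaSketch {
  sealed trait Item
  case object EndOfInput extends Item
  final case class Named(name: String) extends Item
  final case class Raw(text: String) extends Item

  // Each kind of item can be formatted separately.
  def show(item: Item): String = item match {
    case EndOfInput  => "end of input"
    case Named(name) => name
    case Raw(text)   => "\"" + text + "\""
  }

  // At most one unexpected line, at most one expected line, then any reasons.
  def contentLines(unexpected: Option[Item],
                   expected: List[Item],
                   reasons: List[String]): List[String] = {
    val unexpectedLine = unexpected.map(item => "unexpected " + show(item))
    val expectedLine =
      if (expected.isEmpty) None
      else Some("expected " + expected.map(show).mkString(", "))
    unexpectedLine.toList ++ expectedLine.toList ++ reasons
  }
}
```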

Specialised Errors

There is only one kind of content found in a Specialised error: a message. These are completely free-form, and are generated by the fail combinator, as well as its derived combinators. There can be one or more messages in a Specialised error.

The overall structure of a Specialised error is given in the following diagram:

┌───────────────────────────────────────────────────────────────────────┐
│   Specialised Error                                                   │
│                          ┌────────────────┐◄──────── position         │
│                  source  │                │                           │
│                     │    │   line       col                           │
│                     ▼    │     │         │                            │
│                  ┌─────┐ │     ▼         ▼                            │
│               In foo.txt (line 1, column 5):                          │
│                                                                       │
│           ┌───► something went wrong                                  │
│           │                                                           │
│ message ──┼───► it looks like a binary operator has no argument       │
│           │                                                           │
│           └───► '-' is a binary operator                              │
│                ┌──────┐                                               │
│                │>3+4- │                                               │
│                │     ^│                                               │
│                └──────┘                                               │
│                   ▲                                                   │
│                   │                                                   │
│                   line info                                           │
└───────────────────────────────────────────────────────────────────────┘
Err

The final result type of the error message

Source
ErrorBuilder.scala
Since

3.0.0

Type Members

  1. abstract type EndOfInput <: Item

    Represents the end of the input.

    Since

    3.0.0

  2. abstract type ErrorInfoLines

    The representation type of the main body within the error message.

    Since

    3.0.0

  3. abstract type ExpectedItems

    The representation of all the different possible tokens that could have prevented an error.

    Since

    3.0.0

  4. abstract type ExpectedLine

    The representation of the information regarding the solving tokens.

    Since

    3.0.0

  5. abstract type Item

    The base type of Raw, Named and EndOfInput that represents the individual items within the error.

    Since

    3.0.0

  6. abstract type LineInfo

    The representation of the line of input where the error occurred.

    Since

    3.0.0

  7. abstract type Message

    The representation of a reason or a message generated by the parser.

    Since

    3.0.0

  8. abstract type Messages

    The representation of the combined reasons or failure messages from the parser.

    Since

    3.0.0

  9. abstract type Named <: Item

    This represents "named" tokens, which have been provided with a label.

    Since

    3.0.0

  10. abstract type Position

    The representation type of position information within the generated message.

    Since

    3.0.0

  11. abstract type Raw <: Item

    This represents "raw" tokens, which are those without labels: they come directly from the input, or from the characters that the parser is trying to read.

    Since

    3.0.0

  12. abstract type Source

    The representation of the file information.

    Since

    3.0.0

  13. abstract type UnexpectedLine

    The representation of the information regarding the problematic token.

    Since

    3.0.0

Abstract Value Members

  1. abstract def combineExpectedItems(alts: Set[Item]): ExpectedItems

    Details how to combine the various expected items into a single representation.

    alts

    The possible items that fix the error

    Since

    3.0.0
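
One plausible way to combine items, mirroring the `"(", "negate", digit, or letter` style shown in the Vanilla diagram, is sketched below. It is a stand-alone illustration that works on already-formatted strings rather than on `Item`; the name `ExpectedItemsSketch.combine` is hypothetical, and sorting is an assumption made here because a `Set` has no guaranteed order.

```scala
// Hypothetical stand-alone helper: not part of Parsley.
object ExpectedItemsSketch {
  // Joins already-formatted items as `a`, `a or b`, or `a, b, or c`.
  // Returns None when there is nothing to report.
  def combine(alts: Set[String]): Option[String] = alts.toList.sorted match {
    case Nil            => None
    case List(one)      => Some(one)
    case List(one, two) => Some(s"$one or $two")
    case many           => Some(many.init.mkString(", ") + ", or " + many.last)
  }
}
```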

  2. abstract def combineMessages(alts: Seq[Message]): Messages

    Details how to combine any reasons or messages generated within a single error. Reasons are used by vanilla messages and messages are used by specialised messages.

    alts

    the messages to combine (see the message or reason methods).

    Since

    3.0.0

  3. abstract val endOfInput: EndOfInput

    Value that represents the end of the input in the error message.

    Since

    3.0.0

  4. abstract def expected(alts: ExpectedItems): ExpectedLine

    Describes how to handle the information about the tokens that could have avoided the error.

    alts

    the tokens that could have prevented the error (see the combineExpectedItems method).

    Since

    3.0.0

  5. abstract def format(pos: Position, source: Source, lines: ErrorInfoLines): Err

    This is the top level function, which finally compiles all the formatted sub-parts into a finished value of type Err.

    pos

    this is the representation of the position of the error in the input (see the pos method).

    source

    this is the representation of the filename, if it exists (see the source method).

    lines

    this is the main body of the error message (see vanillaError or specialisedError methods).

    returns

    the final assembled error message.

    Since

    3.0.0

  6. abstract def lineInfo(line: String, linesBefore: Seq[String], linesAfter: Seq[String], errorPointsAt: Int, errorWidth: Int): LineInfo

    Describes how to format the information about the line that the error occurred on.

    line

    the full line of input that produced this error message.

    linesBefore

    the lines of input just before the one that produced this message (up to numLinesBefore).

    linesAfter

    the lines of input just after the one that produced this message (up to numLinesAfter).

    errorPointsAt

    the offset into the line that the error points at.

    Since

    3.1.0
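
The `>3+4-` and caret lines in the diagrams could be produced along the following lines. This is a stand-alone sketch, not Parsley's implementation: `LineInfoSketch.render` is a hypothetical name, and it assumes that `errorPointsAt` is a zero-based offset and that `errorWidth` is the number of characters the caret should span, neither of which is spelled out above.

```scala
// Hypothetical stand-alone helper: not part of Parsley.
object LineInfoSketch {
  // Prefixes every input line with '>' and underlines the error with carets.
  // Assumes errorPointsAt is zero-based and errorWidth is the caret width.
  def render(line: String,
             linesBefore: Seq[String],
             linesAfter: Seq[String],
             errorPointsAt: Int,
             errorWidth: Int): Seq[String] = {
    // One extra space accounts for the '>' prefix on the input lines.
    val caretLine = " " * (errorPointsAt + 1) + "^" * math.max(errorWidth, 1)
    linesBefore.map(">" + _) ++ Seq(">" + line, caretLine) ++ linesAfter.map(">" + _)
  }
}
```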

  7. abstract def message(msg: String): Message

    Describes how to represent the messages produced by the fail combinator (or any that are implemented using it).

    msg

    the message produced by the parser.

    Since

    3.0.0

  8. abstract def named(item: String): Named

    Formats a named item generated by a label.

    item

    the name given to the label.

    Since

    3.0.0

  9. abstract val numLinesAfter: Int

    The number of lines of input to request after an error occurred.

    Since

    3.1.0

  10. abstract val numLinesBefore: Int

    The number of lines of input to request before an error occurred.

    Since

    3.1.0

  11. abstract def pos(line: Int, col: Int): Position

    Formats a position into the representation type given by Position.

    line

    the line the error occurred at.

    col

    the column the error occurred at.

    returns

    a representation of the position.

    Since

    3.0.0

  12. abstract def raw(item: String): Raw

    Formats a raw item generated by either the input string or an input-reading combinator without a label.

    item

    the raw, unprocessed input.

    Since

    3.0.0

  13. abstract def reason(reason: String): Message

    Describes how to represent the reasons behind a parser failure. These reasons originate from the explain combinator.

    reason

    the reason produced by the parser.

    Since

    3.0.0

  14. abstract def source(sourceName: Option[String]): Source

    Formats the name of the file, if it exists, into the type given by Source.

    sourceName

    the source name of the file, if any.

    Since

    3.0.0

  15. abstract def specialisedError(msgs: Messages, line: LineInfo): ErrorInfoLines

    Specialised errors are triggered by fail and any combinators that are implemented in terms of fail. These errors take precedence over the vanilla errors, and contain less, more specialised, information.

    msgs

    information detailing the error (see the combineMessages method).

    line

    representation of the line of input that this error occurred on (see the lineInfo method).

    Since

    3.0.0

  16. abstract def unexpected(item: Option[Item]): UnexpectedLine

    Describes how to handle the (potentially missing) information about what token(s) caused the error.

    item

    The Item that caused this error

    Since

    3.0.0

  17. abstract def unexpectedToken(cs: Iterable[Char], amountOfInputParserWanted: Int, lexicalError: Boolean): Token

    Extracts an unexpected token from the remaining input.

    When a parser fails, by default an error reports an unexpected token of a specific width. This works well for some parsers, but often it is nice to have the illusion of a dedicated lexing pass: instead of reporting the next few characters as unexpected, an unexpected token can be reported instead. This can take many forms, for instance trimming the token to the next whitespace, only taking one character, or even trying to lex a token out of the stream.

    This method can be easily implemented by mixing in an appropriate token extractor from parsley.errors.tokenextractors into this builder.

    cs

    the remaining input at point of failure (this is guaranteed to be non-empty)

    amountOfInputParserWanted

    the input the parser tried to read when it failed (this is not guaranteed to be smaller than the length of cs, but is guaranteed to be greater than 0)

    lexicalError

    was this error generated as part of "lexing", or in a wider parser (see markAsToken)

    returns

    a token extracted from cs that will be used as part of the unexpected message.

    Since

    4.0.0
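
Two of the strategies mentioned above, taking a single character and trimming to the next whitespace, might look roughly like this. The sketch is stand-alone and returns plain strings rather than a `Token`; the names `TokenSketch`, `singleChar` and `trimToWhitespace` are hypothetical. It also ignores codepoints wider than one 16-bit character and unprintable characters, which the extractors in parsley.errors.tokenextractors are said to handle.

```scala
// Hypothetical stand-alone helpers: not part of Parsley.
object TokenSketch {
  // Reports only the first character of the remaining input.
  def singleChar(cs: Iterable[Char]): String = cs.take(1).mkString

  // Reports the remaining input up to the next whitespace character.
  // Falls back to one character when the input starts with whitespace,
  // so that the reported token is never empty for non-empty input.
  def trimToWhitespace(cs: Iterable[Char]): String = {
    val upToSpace = cs.takeWhile(c => !c.isWhitespace).mkString
    if (upToSpace.nonEmpty) upToSpace else singleChar(cs)
  }
}
```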

  18. abstract def vanillaError(unexpected: UnexpectedLine, expected: ExpectedLine, reasons: Messages, line: LineInfo): ErrorInfoLines

    Vanilla errors are those produced such that they have information about both expected and unexpected tokens. These are usually the default, and are not produced by fail (or any derivative) combinators.

    unexpected

    information about which token(s) caused the error (see the unexpected method).

    expected

    information about which token(s) would have avoided the error (see the expected method).

    reasons

    additional information about why the error occurred (see the combineMessages method).

    line

    representation of the line of input that this error occurred on (see the lineInfo method).

    Since

    3.0.0

Top-Level Formatting

These methods help assemble the final products of the error messages. The format method will return the desired Err type, whereas specialisedError and vanillaError both assemble an ErrorInfoLines that the format method can consume.

Error Preamble

These methods control the formatting of the preamble of an error message, which is the position and source info. These are then consumed by format itself.
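
The way the preamble feeds into the top-level formatting can be sketched with a stand-alone, String-only model. The object `FormatSketch` is hypothetical and does not extend ErrorBuilder; its three functions merely echo the shapes of pos, source and format, and the layout is assumed from the `In foo.txt (line 1, column 5):` line in the diagrams.

```scala
// Hypothetical stand-alone model with every representation type fixed to String.
object FormatSketch {
  // Position preamble, e.g. "(line 1, column 5)".
  def pos(line: Int, col: Int): String = s"(line $line, column $col)"

  // Source preamble: empty when no file name is available.
  def source(sourceName: Option[String]): String =
    sourceName.fold("")(name => s"In $name ")

  // Top-level assembly: preamble first, then the indented body lines.
  def format(pos: String, source: String, lines: Seq[String]): String =
    (s"$source$pos:" +: lines.map("  " + _)).mkString("\n")
}
```

For instance, `FormatSketch.format(FormatSketch.pos(1, 5), FormatSketch.source(Some("foo.txt")), Seq("unexpected end of input"))` yields the preamble line followed by one indented content line.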

Contextual Input Lines

These methods control how many lines of input surrounding the error are requested, and direct how these should be put together to form a LineInfo.

Shared Components

These methods control any components or structure shared by both types of messages. In particular, the representation of reasons and messages is shared, as well as how they are combined together to form a unified block of content lines.

Specialised-Specific Components

These methods control the Specialised-specific components, namely the formatting of a bespoke error message.

Vanilla-Specific Components

These methods control the Vanilla-specific error components, namely how expected error items should be combined, how to format the unexpected line, and how to format reasons generated from explain.

Error Items

These methods control how error items within Vanilla errors are formatted. These are either the end of input, a named label generated by the label combinator, or a raw piece of input intrinsically associated with a combinator.
