class Lexer extends AnyRef
This class provides a large selection of functionality concerned with lexing.

This class provides lexing functionality to parsley; however, it is guaranteed that everything in this class is implementable purely using parsley's pre-existing functionality. These are regular parsers, but constructed in such a way that they create a clear and logical separation from the rest of the parser.
The class is broken up into several internal "modules" that group together similar kinds of functionality. Importantly, the lexeme and nonlexeme objects separate the underlying token implementations based on whether or not they consume whitespace. Functionality is broadly duplicated across both of these modules: lexeme should be used by the wider parser, to ensure whitespace is handled uniformly; and nonlexeme should be used to define further composite tokens or in special circumstances where whitespace should not be consumed.
It is possible that some of the implementations of parsers found within this class may have been hand-optimised for performance: care will have been taken to ensure these implementations precisely match the semantics of the originals.
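As a sketch of how this fits together (assuming parsley 4.x import paths and the default `LexicalDesc.plain` description; the value names here are our own):

```scala
import parsley.Parsley
import parsley.token.Lexer
import parsley.token.descriptions.LexicalDesc

// a lexer built from the default (plain) lexical description
val lexer = new Lexer(LexicalDesc.plain)

// whitespace-aware tokens drawn from the lexeme module
val ident: Parsley[String] = lexer.lexeme.names.identifier
val nat: Parsley[BigInt]   = lexer.lexeme.numeric.natural.decimal
```

A real lexer would be built from a `LexicalDesc` tailored to the language being parsed, rather than the plain default.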
- Annotations
- @deprecatedInheritance()
- Source
- Lexer.scala
Instance Constructors
- new Lexer(desc: LexicalDesc)
Builds a new lexer with a given description for the lexical structure of the language.
- desc
the configuration for the lexer, specifying the lexical rules of the grammar/language being parsed.
- Since
4.0.0
- new Lexer(desc: LexicalDesc, errConfig: ErrorConfig)
Builds a new lexer with a given description for the lexical structure of the language, along with a configuration for the error messages generated by its tokens.
Value Members
- def fully[A](p: Parsley[A]): Parsley[A]
This combinator ensures a parser fully parses all available input, and consumes whitespace at the start.
This combinator should be used once as the outermost combinator in a parser. It is the only combinator that should consume leading whitespace, and this must be the first thing a parser does. It will ensure that, after the parser is complete, the end of the input stream has been reached.
- Since
4.0.0
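For instance (a sketch, assuming a lexer built from `LexicalDesc.plain` as elsewhere on this page):

```scala
import parsley.token.Lexer
import parsley.token.descriptions.LexicalDesc

val lexer = new Lexer(LexicalDesc.plain)

// fully consumes any leading whitespace, runs the given parser, and then
// requires that the end of the input has been reached
val parser = lexer.fully(lexer.lexeme.names.identifier)

parser.parse("  hello  ")
```

Because `fully` handles the leading whitespace and the lexeme handles the trailing whitespace, no other part of the parser needs to mention whitespace at all.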
- object lexeme extends Lexeme
This object is concerned with lexemes: these are tokens that are treated as "words", such that whitespace will be consumed after each has been parsed.
Ideally, a wider parser should not be concerned with handling whitespace, as it is responsible for dealing with a stream of tokens. With parser combinators, however, it is usually the case that there is no separate distinction between the parsing phase and the lexing phase. That said, it is good practice to establish a logical separation between the two worlds. As such, this object contains parsers that parse tokens, and these are whitespace-aware. This means that whitespace will be consumed after any of these parsers are parsed. It is not, however, required that whitespace be present.
- Since
4.0.0
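A sketch under the same assumptions as above: because every token drawn from lexeme consumes the whitespace that follows it, a wider parser can sequence tokens directly, with whitespace handled uniformly and invisibly.

```scala
import parsley.Parsley
import parsley.token.Lexer
import parsley.token.descriptions.LexicalDesc

val lexer = new Lexer(LexicalDesc.plain)

val ident: Parsley[String] = lexer.lexeme.names.identifier
// symbol("=") parses the literal "=" and then skips trailing whitespace
val binding: Parsley[(String, BigInt)] =
  ident.zip(lexer.lexeme.symbol("=") *> lexer.lexeme.numeric.natural.decimal)
```

This would accept, say, `x = 42` and `x=42` alike, without the grammar mentioning whitespace anywhere.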
- object nonlexeme
This object is concerned with non-lexemes: these are tokens that do not give any special treatment to whitespace.
Whilst the functionality in lexeme is strongly recommended for wider use in a parser, the functionality here may be useful for more specialised use-cases. In particular, these may form the building blocks for more complex tokens (where whitespace is not allowed between them, say), in which case these compound tokens can be turned into lexemes manually. For example, the lexer does not have configuration for trailing specifiers on numeric literals (like 1024L in Scala, say): the desired numeric literal parser could be extended with this functionality before whitespace is consumed, by using the variant found in this object.

Alternatively, these tokens can be used for lexical extraction, which can be performed by the ErrorBuilder typeclass: this can be used to try and extract tokens from the input stream when an error happens, to provide a more informative error. In this case, it is desirable to not consume whitespace after the token, to keep the error tight and precise.
- Since
4.0.0
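The trailing-specifier example might be sketched as follows (a sketch: the `L` suffix handling is our own illustration, only the lexer calls come from this class):

```scala
import parsley.Parsley
import parsley.character.char
import parsley.token.Lexer
import parsley.token.descriptions.LexicalDesc

val lexer = new Lexer(LexicalDesc.plain)

// a Scala-style long literal like 1024L: the suffix must be attached with no
// intervening whitespace, so the nonlexeme integer is used, and the compound
// token is then turned into a lexeme manually so whitespace follows the whole
val long: Parsley[BigInt] =
  lexer.lexeme(lexer.nonlexeme.numeric.natural.decimal <~ char('L'))
```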
- object space
This object is concerned with special treatment of whitespace.
For the vast majority of cases, the functionality within this object shouldn't be needed, as whitespace is consistently handled by lexeme and fully. However, for grammars where whitespace is significant (like indentation-sensitive languages), this object provides some more fine-grained control over how whitespace is consumed by the parsers within lexeme.
- Since
4.0.0
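As a sketch (assuming the whiteSpace parser this module exposes in parsley 4.x), the whitespace parser can be run explicitly at exactly the points the grammar permits it:

```scala
import parsley.Parsley
import parsley.token.Lexer
import parsley.token.descriptions.LexicalDesc

val lexer = new Lexer(LexicalDesc.plain)

// the parser the lexer itself uses to skip whitespace (and comments, if the
// description configures any); Unit-valued, as the whitespace is discarded
val ws: Parsley[Unit] = lexer.space.whiteSpace
```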
This is the documentation for Parsley.

Package structure

The parsley package contains the Parsley class, as well as the Result, Success, and Failure types. In addition to these, it also contains the following packages and "modules" (a module is defined as being an object which mocks a package):

- parsley.Parsley contains the bulk of the core "function-style" combinators.
- parsley.combinator contains many helpful combinators that simplify some common parser patterns.
- parsley.character contains the combinators needed to read characters and strings, as well as combinators to match specific sub-sets of characters.
- parsley.debug contains debugging combinators, helpful for identifying faults in parsers.
- parsley.extension contains syntactic sugar combinators exposed as implicit classes.
- parsley.io contains extension methods to run parsers with input sourced from IO sources.
- parsley.expr contains the following sub modules:
  - parsley.expr.chain contains combinators used in expression parsing.
  - parsley.expr.precedence is a builder for expression parsers built on a precedence table.
  - parsley.expr.infix contains combinators used in expression parsing, but with more permissive types than their equivalents in chain.
  - parsley.expr.mixed contains combinators that can be used for expression parsing, but where different fixities may be mixed on the same level: this is rare in practice.
- parsley.implicits contains several implicits to add syntactic sugar to the combinators. These are sub-categorised into the following sub modules:
  - parsley.implicits.character contains implicits to allow you to use character and string literals as parsers.
  - parsley.implicits.combinator contains implicits related to combinators, such as the ability to make any parser into a Parsley[Unit] automatically.
  - parsley.implicits.lift enables postfix application of the lift combinator onto a function (or value).
  - parsley.implicits.zipped enables both a reversed form of lift where the function appears on the right and is applied on a tuple (useful when type inference has failed), as well as a .zipped method for building tuples out of several combinators.
- parsley.errors contains modules to deal with error messages, their refinement and generation.
  - parsley.errors.combinator provides combinators that can be used both to produce more detailed errors and to refine existing errors.
  - parsley.errors.tokenextractors provides mixins for common token extraction strategies during error message generation: these can be used to avoid implementing unexpectedToken in the ErrorBuilder.
- parsley.lift contains functions which lift functions that work on regular types to those which now combine the results of parsers returning those same types. These are ubiquitous.
- parsley.ap contains functions which allow for the application of a parser returning a function to several parsers returning each of the argument types.
- parsley.registers contains combinators that interact with the context-sensitive functionality in the form of registers.
- parsley.token contains the Lexer class that provides a host of helpful lexing combinators when provided with the description of a language.
- parsley.position contains parsers for extracting position information.
- parsley.genericbridges contains some basic implementations of the Parser Bridge pattern (see Design Patterns for Parser Combinators in Scala, or the parsley wiki): these can be used before more specialised generic bridge traits can be constructed.
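As a small illustration of how a couple of these modules combine (a sketch, assuming parsley 4.x import paths):

```scala
import parsley.Parsley
import parsley.character.digit
import parsley.implicits.character.stringLift

// the implicit import lifts the string literals into parsers, so a
// parenthesised digit can be written almost as it reads
val paren: Parsley[Char] = "(" *> digit <~ ")"
```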