Valid parser states.
Access a byte range as a string.
Because the underlying data are UTF-8 encoded, i and k must fall on Unicode character boundaries. The resulting String is also not guaranteed to have length (k - i), since a multi-byte sequence decodes to fewer chars than bytes.
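The length caveat can be seen in a minimal sketch, assuming the data are held in a plain Array[Byte]. The names ByteRangeSketch and at, and the signature shown, are hypothetical stand-ins for illustration, not the parser's actual accessor.

```scala
import java.nio.charset.StandardCharsets.UTF_8

object ByteRangeSketch {
  // Hypothetical stand-in: decode the bytes in [i, k) as UTF-8.
  // Both i and k are assumed to sit on character boundaries.
  def at(data: Array[Byte], i: Int, k: Int): String =
    new String(data, i, k - i, UTF_8)
}
```

For the five-character string "h\u00e9llo" (six bytes in UTF-8), at(data, 0, 3) covers three bytes but yields the two-character string "h\u00e9".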
Reads a byte as a single Char. The byte must be valid ASCII (this method is used to parse JSON values like numbers, constants, or delimiters, which are known to be within ASCII).
Return true iff 'i' is at or beyond the end of the input (EOF).
This is a specialized accessor for the case where our underlying data are bytes not chars.
The checkpoint() method is used to allow some parsers to store their progress.
Should be called when parsing is finished.
Generate a Char from the hex digits of "\u1234" (i.e. "1234").
NOTE: This can only generate characters from the Basic Multilingual Plane (U+0000 to U+FFFF), which is why it returns a Char rather than an Int.
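As a rough illustration of the conversion, the following sketch folds four hex digits into a single Char. It uses Character.digit for the digit lookup, which is an assumption made for brevity; the name DescapeSketch is hypothetical and the parser's own lookup may differ.

```scala
object DescapeSketch {
  // Turn exactly four hex digits (the part after "\u") into a Char.
  def descape(s: CharSequence): Char = {
    require(s.length() == 4, "expected exactly four hex digits")
    var x = 0
    var i = 0
    while (i < 4) {
      val d = Character.digit(s.charAt(i), 16) // -1 if not a hex digit
      require(d >= 0, s"not a hex digit: ${s.charAt(i)}")
      x = (x << 4) | d // shift in four bits per digit
      i += 1
    }
    x.toChar // four hex digits never exceed 0xFFFF, so this always fits
  }
}
```

Four hex digits carry at most 16 bits, which is exactly the width of a Char; characters outside the basic plane need two such escapes (a surrogate pair).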
Used to generate error messages with character info and offsets.
Used to generate messages for internal errors.
This should only be used in situations where a possible bug in the parser was detected. For errors in user-provided JSON, use die().
Return true iff the bytes/chars from 'i' until 'j' are equal to 'str'.
Return true iff the byte/char at 'i' is equal to 'c'.
Parse the JSON document into a single JSON value.
The parser considers documents like '333', 'true', and '"foo"' to be valid, as well as more traditional documents like [1,2,3,4,5]. However, multiple top-level objects are not allowed.
Parse and return the next JSON value and the position beyond it.
Parse the JSON constant "false".
Parse the JSON constant "null".
Parse the given number, and add it to the given context.
We don't actually instantiate a number here, but rather pass the string off for future use. Facades can choose to be lazy and just store the string. This ends up being much faster and has the nice side effect that we know exactly how the user represented the number.
Parse the given number, and add it to the given context.
This method is a bit slower than parseNum() because it has to be sure it doesn't run off the end of the input.
Normally (when operating in rparse in the context of an outer array or object) we don't need to worry about this and can just grab characters, because if we run out of characters that would indicate bad input. This is for cases where the number could possibly be followed by a valid EOF.
This method has all the same caveats as parseNum().
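The cost of the EOF-safe variant comes from guarding every character access with a bounds check. The sketch below shows that idea on a String input; NumScanSketch and scanNumber are hypothetical names, and the scanner only finds where the number ends (returning -1 if it is malformed) rather than adding anything to a context.

```scala
object NumScanSketch {
  // Return the index just past the JSON number starting at i, or -1 if malformed.
  // Every access is guarded by a length check, so a number at EOF is safe.
  def scanNumber(s: String, i: Int): Int = {
    val n = s.length
    def digitAt(k: Int): Boolean = k < n && s.charAt(k) >= '0' && s.charAt(k) <= '9'
    var j = i
    if (j < n && s.charAt(j) == '-') j += 1
    if (!digitAt(j)) return -1
    if (s.charAt(j) == '0') j += 1 // a leading zero stands alone
    else while (digitAt(j)) j += 1
    if (j < n && s.charAt(j) == '.') {
      j += 1
      if (!digitAt(j)) return -1 // a fraction needs at least one digit
      while (digitAt(j)) j += 1
    }
    if (j < n && (s.charAt(j) == 'e' || s.charAt(j) == 'E')) {
      j += 1
      if (j < n && (s.charAt(j) == '+' || s.charAt(j) == '-')) j += 1
      if (!digitAt(j)) return -1 // an exponent needs at least one digit
      while (digitAt(j)) j += 1
    }
    j
  }
}
```

A variant that is only ever called inside an array or object could drop the `k < n` checks, since running out of input there already indicates bad JSON.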
Parse the string according to JSON rules, and add to the given context.
This method expects the data to be in UTF-8 and accesses it as bytes.
See if the string has any escape sequences. If not, return the end of the string. If so, bail out and return -1.
This method expects the data to be in UTF-8 and accesses it as bytes. Thus we can just ignore any bytes with the highest bit set.
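A minimal sketch of this fast-path scan, assuming the input is an Array[Byte] and that i points just past the opening quote. The names StringScanSketch and scanSimple are hypothetical, and the exceptions thrown here stand in for whatever error reporting the parser actually uses.

```scala
object StringScanSketch {
  // Return the index just past the closing quote, or -1 if an escape is found.
  def scanSimple(data: Array[Byte], i: Int): Int = {
    var j = i
    while (j < data.length) {
      val c = data(j) & 0xff // view the byte as an unsigned value
      if (c == 34) return j + 1 // '"' ends the string
      if (c == 92) return -1    // '\' means the slower escape-aware path is needed
      if (c < 32) throw new IllegalArgumentException(s"control char ($c) in string at $j")
      j += 1 // bytes >= 0x80 belong to multi-byte sequences and are skipped as-is
    }
    throw new IllegalArgumentException("unterminated string")
  }
}
```

Skipping high-bit bytes is safe because every byte of a multi-byte UTF-8 sequence is 0x80 or above, so none of them can be mistaken for a quote (0x22) or a backslash (0x5C).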
Parse the JSON constant "true".
If the cursor 'i' is past the 'curr' buffer, we want to clear the current byte buffer, do a swap, load some more data, and continue.
Tail-recursive parsing method to do the bulk of JSON parsing.
This single method manages parser states, data, etc. Except for parsing non-recursive values (like strings, numbers, and constants), all important work happens in this loop (or in methods it calls, like reset()).
Currently the code is optimized to make use of switch statements. Future work should consider whether this is better or worse than manually constructed if/else statements or something else. Also, it may be possible to reorder some cases for speed improvements.
Swap the curr and next arrays/buffers/counts.
We'll call this in response to certain reset() calls. Specifically, when the index provided to reset is no longer in the 'curr' buffer, we want to clear that data and swap the buffers.
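The interplay between reset() and swap() can be sketched with two fixed-size byte arrays fed from an InputStream. This is a simplified model under stated assumptions: the class name TwoBufferSketch, the helper methods fill, byteAt and atEof, and the use of InputStream.readNBytes (Java 9+) are all illustrative choices, not the parser's actual fields or I/O.

```scala
import java.io.InputStream

final class TwoBufferSketch(in: InputStream, bufSize: Int) {
  private var curr = new Array[Byte](bufSize)
  private var next = new Array[Byte](bufSize)
  private var ncurr = fill(curr) // valid bytes in curr
  private var nnext = fill(next) // valid bytes in next

  // Fill buf as far as possible; returns the number of bytes read (0 at EOF).
  private def fill(buf: Array[Byte]): Int = in.readNBytes(buf, 0, buf.length)

  // Swap the two buffers and their counts; the old curr becomes scratch space.
  private def swap(): Unit = {
    val tmp = curr; curr = next; next = tmp
    ncurr = nnext
  }

  // If i has moved past curr, swap, load more data, and rebase the index.
  def reset(i: Int): Int =
    if (i >= bufSize) {
      swap()
      nnext = fill(next)
      i - bufSize
    } else i

  def byteAt(i: Int): Byte =
    if (i < bufSize) curr(i) else next(i - bufSize)

  def atEof(i: Int): Boolean = i >= ncurr + nnext
}
```

With a buffer size of 4 over the input "abcdefghij", index 5 initially refers to 'f' in the next buffer; after reset(5) returns 1, the same byte is found at index 1 of curr.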
Basic file parser.
Given a file name, this parser opens the file, reads the data in 256K chunks, and parses it.
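The chunked reading can be sketched on its own with a FileChannel and a reusable ByteBuffer. This is only an illustration of the I/O pattern, assuming "256K" means 256 * 1024 bytes; ChunkedReadSketch and foreachChunk are hypothetical names, and no parsing is done here.

```scala
import java.nio.ByteBuffer
import java.nio.channels.FileChannel
import java.nio.file.{Path, StandardOpenOption}

object ChunkedReadSketch {
  val ChunkSize: Int = 256 * 1024 // assumed meaning of "256K"

  // Call f once per chunk; the buffer is reused, so f must not retain it.
  def foreachChunk(path: Path)(f: ByteBuffer => Unit): Unit = {
    val ch = FileChannel.open(path, StandardOpenOption.READ)
    try {
      val buf = ByteBuffer.allocate(ChunkSize)
      while (ch.read(buf) != -1) {
        if (!buf.hasRemaining) { // chunk is full: hand it over and start again
          buf.flip()
          f(buf)
          buf.clear()
        }
      }
      buf.flip() // whatever is left is the final, partial chunk
      if (buf.hasRemaining) f(buf)
    } finally ch.close()
  }
}
```

Reusing one buffer keeps memory use bounded by the chunk size regardless of how large the file is.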