Interface CharStreamScanner

All Known Implementing Classes:
AbstractCharStreamScanner, CharReaderScanner, CharSequenceScanner

public interface CharStreamScanner
This is the interface for a scanner that can be used to parse a stream or sequence of characters.
  • Field Details

    • EOS

      static final char EOS
      The NUL character '\0' used to indicate the end of stream (EOS).
      ATTENTION: Do not confuse and mix '\0' with '0'.
  • Method Details

    • hasNext

      boolean hasNext()
      This method determines if there is at least one more character available.
      Returns:
      true if there is at least one character available, false if the end of data has been reached.
    • next

      char next()
      This method reads the current character from the stream and advances the index to the next character. Check that a character is available before calling this method. Otherwise, if your stream may contain the NUL character ('\0'), you cannot distinguish whether the end of the stream was reached or a NUL character was actually read.
      Returns:
      the next character or EOS if none is available.
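      The ambiguity described for next() can be made concrete with a minimal sketch. TinyScanner is a hypothetical stand-in over a String, not one of the implementing classes listed above; it only mimics the documented contract of hasNext(), next() and EOS.

```java
// Hypothetical stand-in that mimics the documented hasNext()/next() contract.
final class TinyScanner {
  static final char EOS = '\0';
  private final String data;
  private int index;

  TinyScanner(String data) {
    this.data = data;
  }

  boolean hasNext() {
    return this.index < this.data.length();
  }

  char next() {
    return hasNext() ? this.data.charAt(this.index++) : EOS;
  }

  // Checks hasNext() first, so an embedded NUL is not mistaken for the end.
  static int countSafely(String text) {
    TinyScanner scanner = new TinyScanner(text);
    int count = 0;
    while (scanner.hasNext()) {
      scanner.next();
      count++;
    }
    return count;
  }

  // Compares against EOS only, so the loop stops early at an embedded NUL.
  static int countNaively(String text) {
    TinyScanner scanner = new TinyScanner(text);
    int count = 0;
    while (scanner.next() != EOS) {
      count++;
    }
    return count;
  }

  public static void main(String[] args) {
    String text = "ab\0cd";
    System.out.println(countSafely(text));  // 5
    System.out.println(countNaively(text)); // 2
  }
}
```

      The naive loop is only safe when the input is known to contain no NUL character.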
    • peek

      char peek()
      This method reads the current character without incrementing the index.
      Returns:
      the current character or EOS if none is available.
    • getPosition

      int getPosition()
      Returns:
      the position in the sequence to scan, or in other words the number of characters that have been read. Will initially be 0. Please note that this API is designed for scanning textual content (for parsers). Therefore the int range of about 2.1 billion characters is considered a suitable limit.
    • readDigit

      default int readDigit()
      This method reads the next character if it is a digit. Else the state remains unchanged.
      Returns:
      the numeric value of the next Latin digit (e.g. 0 if '0') or -1 if the next character is no Latin digit.
    • readDigit

      int readDigit(int radix)
      This method reads the next character if it is a digit within the given radix. Else the state remains unchanged.
      Parameters:
      radix - the radix that defines the range of the digits. See Integer.parseInt(String, int). E.g. 10 to read any Latin digit (see readDigit()), 8 to read octal digits, 16 to read hexadecimal digits.
      Returns:
      the numeric value of the next digit within the given radix or -1 if the next character is no such digit.
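      The "consume only on success" behaviour of readDigit(int) can be sketched as follows. DigitSketch is a hypothetical stand-in over a String; that letters are accepted in both upper and lower case for radixes above 10 is an assumption of this sketch, not something the text above states.

```java
// Hypothetical stand-in; only mimics the documented readDigit(int radix) contract.
final class DigitSketch {
  private final String data;
  private int index;

  DigitSketch(String data) {
    this.data = data;
  }

  int getPosition() {
    return this.index;
  }

  // Value of c as a Latin digit (or letter digit), or -1 if it is not one within radix.
  static int digitValue(char c, int radix) {
    int value;
    if ((c >= '0') && (c <= '9')) {
      value = c - '0';
    } else if ((c >= 'a') && (c <= 'z')) {
      value = c - 'a' + 10; // assumption: lower case letters count as digits
    } else if ((c >= 'A') && (c <= 'Z')) {
      value = c - 'A' + 10; // assumption: upper case letters count as digits
    } else {
      return -1;
    }
    return (value < radix) ? value : -1;
  }

  // Consumes the current character only if it is a digit within radix.
  int readDigit(int radix) {
    if (this.index >= this.data.length()) {
      return -1;
    }
    int value = digitValue(this.data.charAt(this.index), radix);
    if (value >= 0) {
      this.index++;
    }
    return value;
  }

  public static void main(String[] args) {
    DigitSketch scanner = new DigitSketch("7f9");
    System.out.println(scanner.readDigit(8));  // 7
    System.out.println(scanner.readDigit(8));  // -1, 'f' is no octal digit
    System.out.println(scanner.readDigit(16)); // 15
    System.out.println(scanner.getPosition()); // 2
  }
}
```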
    • readLong

      long readLong(int maxDigits) throws NumberFormatException
      This method reads the long starting at the current position by reading as many Latin digits as available but at maximum the given maxDigits and returns its parsed value.
      ATTENTION:
      This method does NOT treat signs (+ or -). To handle them, scan the sign yourself beforehand and negate the result as needed.
      Parameters:
      maxDigits - is the maximum number of digits that will be read. The value has to be positive (greater than zero). Use 19 or higher to be able to read any long number.
      Returns:
      the parsed number.
      Throws:
      NumberFormatException - if the current position does NOT point to a number.
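      Since readLong(int) leaves sign handling to the caller, the pattern "scan the sign, read the digits, negate" might look like the sketch below. It works on a plain String instead of a scanner so that it stays self-contained; readSignedLong is a hypothetical helper, not part of this interface.

```java
// Hypothetical helper illustrating sign handling around a digits-only reader.
final class SignedLongSketch {

  // Parses an optional sign followed by up to maxDigits Latin digits at the start of text.
  static long readSignedLong(String text, int maxDigits) {
    int i = 0;
    boolean negative = false;
    if ((i < text.length()) && ((text.charAt(i) == '-') || (text.charAt(i) == '+'))) {
      negative = (text.charAt(i) == '-');
      i++;
    }
    int start = i;
    while ((i < text.length()) && ((i - start) < maxDigits)
        && (text.charAt(i) >= '0') && (text.charAt(i) <= '9')) {
      i++;
    }
    if (i == start) {
      throw new NumberFormatException("No digit at index " + start);
    }
    // Long.parseLong also rejects digit runs that overflow a long.
    // Limitation of this sketch: Long.MIN_VALUE cannot be read, because its magnitude overflows.
    long value = Long.parseLong(text.substring(start, i));
    return negative ? -value : value;
  }

  public static void main(String[] args) {
    System.out.println(readSignedLong("-42abc", 19)); // -42
    System.out.println(readSignedLong("123456", 3));  // 123
  }
}
```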
    • readDouble

      default double readDouble() throws NumberFormatException
      This method reads the double value (decimal number) starting at the current position by reading as many matching characters as available and returns its parsed value.
      Returns:
      the parsed number.
      Throws:
      NumberFormatException - if the current position does NOT point to a number.
    • readFloat

      default float readFloat() throws NumberFormatException
      This method reads the float value (decimal number) starting at the current position by reading as many matching characters as available and returns its parsed value.
      Returns:
      the parsed number.
      Throws:
      NumberFormatException - if the current position does NOT point to a number.
    • consumeDecimal

      String consumeDecimal()
      Consumes the characters of a decimal number (double or float).
      Returns:
      the decimal number as String.
    • read

      String read(int count)
      This method reads the next count characters and returns them as a string. If fewer characters are available, the returned string will be shorter than count and only contain the available characters.
      Parameters:
      count - is the number of characters to read. You may use Integer.MAX_VALUE to read until the end of data if the data-size is suitable.
      Returns:
      a string with the given number of characters or all available characters if less than count. Will be the empty string if no character is available at all.
    • expectUnsafe

      default boolean expectUnsafe(String expected)
      This method skips all next characters as long as they equal the corresponding character of the expected string.
      If a character differs, this method stops and the parser points to the first character that differs from expected. Except for the latter circumstance, this method behaves like the following code:
       read(expected.length()).equals(expected)
       
      ATTENTION:
      Be aware that if the very first character already differs, this method will NOT change the state of the scanner. So take care NOT to produce infinite loops.
      Parameters:
      expected - is the expected string.
      Returns:
      true if the expected string was successfully consumed from this scanner, false otherwise.
    • expectUnsafe

      boolean expectUnsafe(String expected, boolean ignoreCase)
      This method skips all next characters as long as they equal the corresponding character of the expected string.
      If a character differs, this method stops and the parser points to the first character that differs from expected. Except for the latter circumstance, this method behaves like the following code:
       read(expected.length()).equals[IgnoreCase](expected)
       
      ATTENTION:
      Be aware that if the very first character already differs, this method will NOT change the state of the scanner. So take care NOT to produce infinite loops.
      Parameters:
      expected - is the expected string.
      ignoreCase - if true the case of the characters is ignored when compared.
      Returns:
      true if the expected string was successfully consumed from this scanner, false otherwise.
    • expectStrict

      default boolean expectStrict(String expected)
      This method acts as expectUnsafe(String, boolean) but if the expected String is NOT completely present, no character is consumed and the state of the scanner remains unchanged.
      Attention:
      This method requires lookahead. For implementations that are backed by an underlying stream (or reader) the length of the expected String shall not exceed the available lookahead size (buffer capacity given at construction time). Otherwise the method may fail.
      Parameters:
      expected - is the expected string.
      Returns:
      true if the expected string was successfully consumed from this scanner, false otherwise.
    • expectStrict

      default boolean expectStrict(String expected, boolean ignoreCase)
      This method acts as expectUnsafe(String, boolean) but if the expected String is NOT completely present, no character is consumed and the state of the scanner remains unchanged.
      Attention:
      This method requires lookahead. For implementations that are backed by an underlying stream (or reader) the length of the expected String shall not exceed the available lookahead size (buffer capacity given at construction time). Otherwise the method may fail.
      Parameters:
      expected - is the expected string.
      ignoreCase - if true the case of the characters is ignored when compared.
      Returns:
      true if the expected string was successfully consumed from this scanner, false otherwise.
    • expectStrict

      boolean expectStrict(String expected, boolean ignoreCase, boolean lookahead)
      This method acts as expectUnsafe(String, boolean) but if the expected String is NOT completely present, no character is consumed and the state of the scanner remains unchanged.
      Attention:
      This method requires lookahead. For implementations that are backed by an underlying stream (or reader) the length of the expected String shall not exceed the available lookahead size (buffer capacity given at construction time). Otherwise the method may fail.
      Parameters:
      expected - is the expected string.
      ignoreCase - if true the case of the characters is ignored when compared.
      lookahead - if true the state of the scanner remains unchanged even if the expected String has been found, false otherwise.
      Returns:
      true if the expected string was found (and consumed from this scanner unless lookahead is true), false otherwise.
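      The difference between expectUnsafe and expectStrict shows up when the input only partially matches. The sketch below uses ExpectSketch, a hypothetical stand-in over a String that follows the behaviour described above; it is not the code of any implementing class.

```java
// Hypothetical stand-in contrasting partial consumption with all-or-nothing matching.
final class ExpectSketch {
  private final String data;
  private int index;

  ExpectSketch(String data) {
    this.data = data;
  }

  int getPosition() {
    return this.index;
  }

  // Consumes matching characters one by one and stops at the first mismatch.
  boolean expectUnsafe(String expected) {
    for (int i = 0; i < expected.length(); i++) {
      if ((this.index >= this.data.length()) || (this.data.charAt(this.index) != expected.charAt(i))) {
        return false; // everything matched so far stays consumed
      }
      this.index++;
    }
    return true;
  }

  // Consumes expected only if it is completely present, otherwise leaves the position untouched.
  boolean expectStrict(String expected) {
    if (this.data.startsWith(expected, this.index)) {
      this.index += expected.length();
      return true;
    }
    return false;
  }

  public static void main(String[] args) {
    ExpectSketch unsafe = new ExpectSketch("truck");
    System.out.println(unsafe.expectUnsafe("true") + " @ " + unsafe.getPosition()); // false @ 3

    ExpectSketch strict = new ExpectSketch("truck");
    System.out.println(strict.expectStrict("true") + " @ " + strict.getPosition()); // false @ 0
  }
}
```

      After a failed expectUnsafe the scanner may sit in the middle of a token, so alternatives should be tried with expectStrict when the position matters.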
    • expectOne

      boolean expectOne(char expected)
      This method checks that the next character is equal to the given expected character.
      If the current character was as expected, the parser points to the next character. Otherwise its position will remain unchanged.
      Parameters:
      expected - is the expected character.
      Returns:
      true if the current character is the same as expected, false otherwise.
    • expectOne

      default boolean expectOne(CharFilter expected)
      This method checks that the next character is accepted by the given CharFilter.
      If the current character was as expected, the parser points to the next character. Otherwise its position will remain unchanged.
      Parameters:
      expected - is the CharFilter accepting the expected chars.
      Returns:
      true if the current character is accepted, false otherwise.
    • requireOne

      default void requireOne(char expected)
      This method verifies that the next character is equal to the given expected character.
      If the current character was as expected, the parser points to the next character. Otherwise an exception is thrown indicating the problem.
      Parameters:
      expected - is the expected character.
    • require

      void require(String expected, boolean ignoreCase)
      This method verifies that the expected string gets consumed from this scanner with respect to ignoreCase. Otherwise an exception is thrown indicating the problem.
      This method behaves functionally equivalent to the following code:
       if (!scanner.expectUnsafe(expected, ignoreCase)) {
         throw new IllegalStateException(...);
       }
       
      Parameters:
      expected - is the expected string.
      ignoreCase - if true the case of the characters is ignored during comparison.
    • requireOne

      default int requireOne(CharFilter filter)
      Parameters:
      filter - the CharFilter accepting the expected characters to skip.
      Returns:
      the actual number of characters that have been skipped.
      Throws:
      IllegalStateException - if less than 1 or more than 1000 accepted characters have been consumed.
    • requireOneOrMore

      default int requireOneOrMore(CharFilter filter)
      Parameters:
      filter - the CharFilter accepting the expected characters to skip.
      Returns:
      the actual number of characters that have been skipped.
      Throws:
      IllegalStateException - if less than 1 or more than 1000 accepted characters have been consumed.
    • require

      default int require(CharFilter filter, int min)
      Parameters:
      filter - the CharFilter accepting the expected characters to skip.
      min - the minimum required number of skipped characters.
      Returns:
      the actual number of characters that have been skipped.
      Throws:
      IllegalStateException - if less than min or more than 1000 accepted characters have been consumed.
    • require

      default int require(CharFilter filter, int min, int max)
      Parameters:
      filter - the CharFilter accepting the expected characters to skip.
      min - the minimum required number of skipped characters.
      max - the maximum number of skipped characters.
      Returns:
      the actual number of characters that have been skipped.
      Throws:
      IllegalStateException - if less than min or more than max accepted characters have been consumed.
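      One possible reading of the min/max contract is sketched below. RequireSketch and its nested CharFilter interface are hypothetical stand-ins. The sketch assumes that "more than max" is detected by checking whether the character following max accepted characters would also be accepted, without consuming it; the text above does not say how an implementation detects this.

```java
// Hypothetical stand-in for the documented require(filter, min, max) contract.
final class RequireSketch {

  // Stand-in for the library's CharFilter; the method name accept is an assumption.
  interface CharFilter {
    boolean accept(char c);
  }

  private final String data;
  private int index;

  RequireSketch(String data) {
    this.data = data;
  }

  // Skips up to max accepted characters and fails if fewer than min or more than max are present.
  int require(CharFilter filter, int min, int max) {
    int count = 0;
    while ((count < max) && (this.index < this.data.length())
        && filter.accept(this.data.charAt(this.index))) {
      this.index++;
      count++;
    }
    // Assumption: one more accepted character right after max counts as "more than max".
    boolean tooMany = (this.index < this.data.length()) && filter.accept(this.data.charAt(this.index));
    if ((count < min) || tooMany) {
      throw new IllegalStateException("Required " + min + ".." + max + " characters, skipped " + count);
    }
    return count;
  }

  public static void main(String[] args) {
    CharFilter digits = c -> (c >= '0') && (c <= '9');
    System.out.println(new RequireSketch("2024-01").require(digits, 4, 4)); // 4
  }
}
```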
    • skipUntil

      boolean skipUntil(char stop)
      This method skips all next characters until the given stop character or the end is reached. If the stop character was reached, this scanner will point to the next character after stop when this method returns.
      Parameters:
      stop - is the character to read until.
      Returns:
      true if the first occurrence of the given stop character has been passed, false if there is no such character.
    • skipUntil

      boolean skipUntil(char stop, char escape)
      This method skips all next characters until the given stop character or the end of the string to parse is reached. Unlike skipUntil(char), this method will read over the stop character if it is escaped with the given escape character.
      Parameters:
      stop - is the character to read until.
      escape - is the character used to escape the stop character (e.g. '\\').
      Returns:
      true if the first occurrence of the given stop character has been passed, false if there is no such character.
    • readUntil

      String readUntil(char stop, boolean acceptEnd)
      This method reads all next characters until the given stop character or the end is reached.
      After the call of this method, the current index will point to the next character after the (first) stop character or to the end if NO such character exists.
      Parameters:
      stop - is the character to read until.
      acceptEnd - if true the end of data will be treated as stop, too.
      Returns:
      the string with all read characters excluding the stop character or null if there was no stop character and acceptEnd is false.
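      The effect of acceptEnd can be illustrated with a small key/value example. ReadUntilSketch is a hypothetical stand-in over a String that follows the description above, including the point that the index moves to the end when no stop character exists.

```java
// Hypothetical stand-in for the documented readUntil(char stop, boolean acceptEnd) contract.
final class ReadUntilSketch {
  private final String data;
  private int index;

  ReadUntilSketch(String data) {
    this.data = data;
  }

  String readUntil(char stop, boolean acceptEnd) {
    int found = this.data.indexOf(stop, this.index);
    if (found < 0) {
      // No stop character: everything is consumed, the result depends on acceptEnd.
      String rest = this.data.substring(this.index);
      this.index = this.data.length();
      return acceptEnd ? rest : null;
    }
    String result = this.data.substring(this.index, found);
    this.index = found + 1; // point behind the stop character
    return result;
  }

  public static void main(String[] args) {
    ReadUntilSketch scanner = new ReadUntilSketch("key=value");
    System.out.println(scanner.readUntil('=', false)); // key
    System.out.println(scanner.readUntil(';', true));  // value
    System.out.println(new ReadUntilSketch("abc").readUntil(';', false)); // null
  }
}
```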
    • readUntil

      String readUntil(CharFilter filter, boolean acceptEnd)
      This method reads all next characters until the first character accepted by the given filter or the end is reached.
      After the call of this method, the current index will point to the first accepted stop character or to the end if NO such character exists.
      Parameters:
      filter - is used to decide where to stop.
      acceptEnd - true if the end of data should be treated like the stop character and the rest of the text will be returned, false otherwise (to return null if the end of data was reached and the scanner has been consumed).
      Returns:
      the string with all read characters not accepted by the given CharFilter or null if there was no accepted character and acceptEnd is false.
    • readUntil

      default String readUntil(CharFilter filter, boolean acceptEnd, String stop)
      This method reads all next characters until the first character accepted by the given filter, the given stop String or the end is reached.
      After the call of this method, the current index will point to the first accepted stop character, or to the first character of the given stop String or to the end if NO such character exists.
      Parameters:
      filter - is used to decide where to stop.
      acceptEnd - true if the end of data should be treated like the stop character and the rest of the text will be returned, false otherwise (to return null if the end of data was reached and the scanner has been consumed).
      stop - the String where to stop consuming data. Should be at least two characters long (otherwise accept by CharFilter instead).
      Returns:
      the string with all read characters not accepted by the given CharFilter or until the given stop String was detected. If end of data was reached without a stop signal the entire rest of the data is returned or null if acceptEnd is false.
    • readUntil

      default String readUntil(CharFilter filter, boolean acceptEnd, String stop, boolean ignoreCase)
      This method reads all next characters until the first character accepted by the given filter, the given stop String or the end is reached.
      After the call of this method, the current index will point to the first accepted stop character, or to the first character of the given stop String or to the end if NO such character exists.
      Parameters:
      filter - is used to decide where to stop.
      acceptEnd - true if the end of data should be treated like the stop character and the rest of the text will be returned, false otherwise (to return null if the end of data was reached and the scanner has been consumed).
      stop - the String where to stop consuming data. Should be at least two characters long (otherwise accept by CharFilter instead).
      ignoreCase - if true the case of the characters is ignored when compared with characters from stop String.
      Returns:
      the string with all read characters not accepted by the given CharFilter or until the given stop String was detected. If the end of data was reached without a stop signal the entire rest of the data is returned or null if acceptEnd is false.
    • readUntil

      String readUntil(CharFilter filter, boolean acceptEnd, String stop, boolean ignoreCase, boolean trim)
      This method reads all next characters until the first character accepted by the given filter, the given stop String or the end is reached.
      After the call of this method, the current index will point to the first accepted stop character, or to the first character of the given stop String or to the end if NO such character exists.
      Parameters:
      filter - is used to decide where to stop.
      acceptEnd - true if the end of data should be treated like the stop character and the rest of the text will be returned, false otherwise (to return null if the end of data was reached and the scanner has been consumed).
      stop - the String where to stop consuming data. Should be at least two characters long (otherwise accept by CharFilter instead).
      ignoreCase - if true the case of the characters is ignored when compared with characters from stop String.
      trim - true if the result should be trimmed, false otherwise.
      Returns:
      the string with all read characters not accepted by the given CharFilter or until the given stop String was detected. If the end of data was reached without hitting stop the entire rest of the data is returned or null if acceptEnd is false. The result will be trimmed if trim is true.
    • readUntil

      String readUntil(char stop, boolean acceptEnd, char escape)
      This method reads all next characters until the given (un-escaped) stop character or the end is reached.
      Extending readUntil(char, boolean), this method allows the stop character to appear in the input string when it is preceded by the given escape character. After the call of this method, the current index will point to the next character after the (first) stop character or to the end if NO such character exists.
      This method is especially useful when quoted strings should be parsed. E.g.:
       CharStreamScanner scanner = getScanner();
       doSomething();
       char c = scanner.next();
       if ((c == '"') || (c == '\'')) {
         char escape = c; // may also be something like '\\'
         String quote = scanner.readUntil(c, false, escape);
       } else {
         doOtherThings();
       }
       
      Parameters:
      stop - is the character to read until.
      acceptEnd - if true the end of data will be treated as stop, too.
      escape - is the character used to escape the stop character. To add an occurrence of the escape character it has to be duplicated (occur twice). The escape character may also be equal to the stop character. If other regular characters are escaped the escape character is simply ignored.
      Returns:
      the string with all read characters excluding the stop character or null if there was no stop character and acceptEnd is false.
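      The escape rules listed for the escape parameter can be sketched as a static helper over a String. EscapedReadSketch is hypothetical and reflects one reading of those rules: a doubled escape yields the escape character, escape may equal stop, and an escape before any other character is dropped.

```java
// Hypothetical helper following the documented escape rules of readUntil(stop, acceptEnd, escape).
final class EscapedReadSketch {

  // Returns the text up to the first un-escaped stop, or null if none exists and acceptEnd is false.
  static String readUntil(String data, char stop, boolean acceptEnd, char escape) {
    StringBuilder result = new StringBuilder();
    int i = 0;
    while (i < data.length()) {
      char c = data.charAt(i++);
      if (c == escape) {
        boolean hasFollower = i < data.length();
        if (escape == stop) {
          if (hasFollower && (data.charAt(i) == stop)) {
            result.append(stop); // doubled stop/escape is a literal occurrence
            i++;
          } else {
            return result.toString(); // single occurrence is the real stop
          }
        } else if (hasFollower) {
          result.append(data.charAt(i++)); // escaped character is kept, the escape is dropped
        }
      } else if (c == stop) {
        return result.toString();
      } else {
        result.append(c);
      }
    }
    return acceptEnd ? result.toString() : null;
  }

  public static void main(String[] args) {
    System.out.println(readUntil("it''s' rest", '\'', false, '\''));  // it's
    System.out.println(readUntil("a\\\"b\" tail", '"', false, '\\')); // a"b
  }
}
```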
    • readUntil

      String readUntil(char stop, boolean acceptEnd, CharScannerSyntax syntax)
      This method reads all next characters until the given stop character or the end of the string to parse is reached. In contrast to readUntil(char, boolean), this method scans the input using the given syntax, which e.g. allows escaping the stop character.
      After the call of this method, the current index will point to the next character after the (first) stop character or to the end of the string if NO such character exists.
      Parameters:
      stop - is the character to read until.
      acceptEnd - if true the end of data will be treated as stop, too.
      syntax - contains the characters specific for the syntax to read.
      Returns:
      the string with all read characters excluding the stop character or null if there was no stop character.
    • readUntil

      String readUntil(CharFilter filter, boolean acceptEnd, CharScannerSyntax syntax)
      This method reads all next characters until the given CharFilter accepts the current character as stop character or the end of data is reached. In contrast to readUntil(char, boolean), this method scans the input using the given syntax, which e.g. allows escaping the stop character.
      After the call of this method, the current index will point to the next character after the (first) stop character or to the end of the string if NO such character exists.
      Parameters:
      filter - is used to decide where to stop.
      acceptEnd - if true the end of data will be treated as stop, too.
      syntax - contains the characters specific for the syntax to read.
      Returns:
      the string with all read characters excluding the stop character or null if there was no stop character.
    • readWhile

      default String readWhile(CharFilter filter)
      This method reads all next characters that are accepted by the given filter.
      After the call of this method, the current index will point to the next character that was NOT accepted by the given filter or to the end if NO such character exists.
      Parameters:
      filter - is used to decide which characters should be accepted.
      Returns:
      a string with all characters accepted by the given filter. Will be the empty string if no character was accepted.
    • readWhile

      String readWhile(CharFilter filter, int max)
      This method reads all next characters that are accepted by the given filter.
      After the call of this method, the current index will point to the next character that was NOT accepted by the given filter. If the next max characters or the characters left until the end of this scanner are accepted, only that amount of characters are skipped.
      Parameters:
      filter - is used to decide which characters should be accepted.
      max - is the maximum number of characters that should be read.
      Returns:
      a string with all characters accepted by the given filter limited to the length of max and the end of this scanner. Will be the empty string if no character was accepted.
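      The interplay of filter, max and end of data can be sketched with a static helper over a String. ReadWhileSketch and its nested CharFilter interface are hypothetical stand-ins; the method name accept is an assumption.

```java
// Hypothetical helper for the documented readWhile(filter, max) contract.
final class ReadWhileSketch {

  // Stand-in for the library's CharFilter; the method name accept is an assumption.
  interface CharFilter {
    boolean accept(char c);
  }

  // Returns the accepted characters starting at start, limited by max and by the end of data.
  static String readWhile(String data, int start, CharFilter filter, int max) {
    int end = start;
    while ((end < data.length()) && ((end - start) < max) && filter.accept(data.charAt(end))) {
      end++;
    }
    return data.substring(start, end);
  }

  public static void main(String[] args) {
    CharFilter letters = c -> ((c >= 'a') && (c <= 'z')) || ((c >= 'A') && (c <= 'Z'));
    System.out.println(readWhile("abc123", 0, letters, Integer.MAX_VALUE)); // abc
    System.out.println(readWhile("abc123", 0, letters, 2));                 // ab
  }
}
```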
    • skip

      int skip(int count)
      This method skips the number of next characters given by count.
      Parameters:
      count - is the number of characters to skip. You may use Integer.MAX_VALUE to skip until the end of data if the data-size is suitable.
      Returns:
      the total number of characters that have been skipped. Typically equal to count. Will be less in case the end of data was reached.
    • skipOver

      default boolean skipOver(String substring)
      This method reads all next characters until the given substring has been detected.
      After the call of this method, the current index will point to the next character after the first occurrence of substring or to the end of data if the given substring was NOT found.
      Parameters:
      substring - is the substring to search and skip over starting at the current index.
      Returns:
      true if the given substring occurred and has been passed and false if the end of the string has been reached without any occurrence of the given substring.
    • skipOver

      default boolean skipOver(String substring, boolean ignoreCase)
      This method reads all next characters until the given substring has been detected.
      After the call of this method, the current index will point to the next character after the first occurrence of substring or to the end of data if the given substring was NOT found.
      Parameters:
      substring - is the substring to search and skip over starting at the current index.
      ignoreCase - if true the case of the characters is ignored when compared with characters from substring.
      Returns:
      true if the given substring occurred and has been passed and false if the end of the string has been reached without any occurrence of the given substring.
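      A minimal sketch of the skipOver behaviour, including the move to the end of data when nothing is found, might look like this. SkipOverSketch is a hypothetical stand-in over a String; rest() is a helper added only to make the resulting position visible.

```java
// Hypothetical stand-in for the documented skipOver(substring, ignoreCase) contract.
final class SkipOverSketch {
  private final String data;
  private int index;

  SkipOverSketch(String data) {
    this.data = data;
  }

  boolean skipOver(String substring, boolean ignoreCase) {
    int last = this.data.length() - substring.length();
    for (int i = this.index; i <= last; i++) {
      if (this.data.regionMatches(ignoreCase, i, substring, 0, substring.length())) {
        this.index = i + substring.length(); // point behind the first occurrence
        return true;
      }
    }
    this.index = this.data.length(); // not found: everything has been consumed
    return false;
  }

  // Helper of this sketch only: the characters that have not been consumed yet.
  String rest() {
    return this.data.substring(this.index);
  }

  public static void main(String[] args) {
    SkipOverSketch scanner = new SkipOverSketch("<!-- note --> body");
    System.out.println(scanner.skipOver("-->", false)); // true
    System.out.println("[" + scanner.rest() + "]");     // [ body]
  }
}
```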
    • skipOver

      boolean skipOver(String substring, boolean ignoreCase, CharFilter stopFilter)
      This method consumes all next characters until the given substring has been detected, a character was accepted by the given CharFilter or the end of data was reached.
      After the call of this method this scanner will point to the next character after the first occurrence of substring, to the stop character or to end of data.
      Parameters:
      substring - is the substring to search and skip over starting at the current index.
      ignoreCase - if true the case of the characters is ignored when compared with characters from substring.
      stopFilter - is the filter used to detect stop characters. If such character was detected, the skip is stopped and the parser points to the character after the stop character. The substring should NOT contain a stop character.
      Returns:
      true if the given substring occurred and has been passed and false if a stop character has been detected or the end of the string has been reached without any occurrence of the given substring or stop character.
    • skipWhile

      int skipWhile(char c)
      This method reads all next characters that are identical to the character given by c.
      E.g. use skipWhile(' ') to skip all blanks from the current index. After the call of this method, the current index will point to the next character that is different from the given character c or to the end if NO such character exists.
      Parameters:
      c - is the character to read over.
      Returns:
      the number of characters that have been skipped.
    • skipWhile

      default int skipWhile(CharFilter filter)
      This method reads all next characters that are accepted by the given filter.
      After the call of this method, the current index will point to the next character that was NOT accepted by the given filter or to the end if NO such character exists.
      Parameters:
      filter - is used to decide which characters should be accepted.
      Returns:
      the number of characters accepted by the given filter that have been skipped.
    • skipWhile

      int skipWhile(CharFilter filter, int max)
      This method reads all next characters that are accepted by the given filter.
      After the call of this method, the current index will point to the next character that was NOT accepted by the given filter. If the next max characters or the characters left until the end of this scanner are accepted, only that amount of characters are skipped.
      Parameters:
      filter - is used to decide which characters should be accepted.
      max - is the maximum number of characters that may be skipped.
      Returns:
      the number of skipped characters.
    • skipWhileAndPeek

      default char skipWhileAndPeek(CharFilter filter)
      Behaves like the following code:
       skipWhile(filter);
       return peek();
       
      Parameters:
      filter - is used to decide which characters should be accepted.
      Returns:
      the first character that was not accepted by the given CharFilter. Only the accepted characters have been consumed, this scanner still points to the returned character.
    • skipWhileAndPeek

      default char skipWhileAndPeek(CharFilter filter, int max)
      Behaves like the following code:
       skipWhile(filter, max);
       return peek();
       
      Parameters:
      filter - is used to decide which characters should be accepted.
      max - is the maximum number of characters that may be skipped.
      Returns:
      the first character that was not accepted by the given CharFilter. Only the accepted characters have been consumed, this scanner still points to the returned character.
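      A typical use of skipWhileAndPeek is to skip whitespace and then decide what to parse based on the next character. The sketch below uses SkipAndPeekSketch, a hypothetical stand-in over a String with a stand-in CharFilter interface; it assumes, in line with peek(), that EOS is returned when everything was skipped.

```java
// Hypothetical stand-in showing skipWhile(filter) followed by peek().
final class SkipAndPeekSketch {

  // Stand-in for the library's CharFilter; the method name accept is an assumption.
  interface CharFilter {
    boolean accept(char c);
  }

  static final char EOS = '\0';
  private final String data;
  private int index;

  SkipAndPeekSketch(String data) {
    this.data = data;
  }

  int getPosition() {
    return this.index;
  }

  int skipWhile(CharFilter filter) {
    int start = this.index;
    while ((this.index < this.data.length()) && filter.accept(this.data.charAt(this.index))) {
      this.index++;
    }
    return this.index - start;
  }

  char peek() {
    return (this.index < this.data.length()) ? this.data.charAt(this.index) : EOS;
  }

  // Skips the accepted characters and returns the first one that was not accepted.
  char skipWhileAndPeek(CharFilter filter) {
    skipWhile(filter);
    return peek();
  }

  public static void main(String[] args) {
    SkipAndPeekSketch scanner = new SkipAndPeekSketch("   {x}");
    char c = scanner.skipWhileAndPeek(ch -> ch == ' ');
    System.out.println(c + " @ " + scanner.getPosition()); // { @ 3
  }
}
```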
    • readLine

      default String readLine()
      Returns:
      a String with the data until the end of the current line or the end of the data. Will be null if the end has already been reached and hasNext() returns false.
    • readLine

      String readLine(boolean trim)
      Parameters:
      trim - true if the result should be trimmed, false otherwise.
      Returns:
      a String with the data until the end of the current line (trimmed if trim is true) or the end of the data. Will be null if the end has already been reached and hasNext() returns false.
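      The null-at-end contract of readLine can be sketched as follows. ReadLineSketch is a hypothetical stand-in over a String. Which line terminators are recognized is not stated above; this sketch assumes "\n", "\r" and "\r\n".

```java
// Hypothetical stand-in for the documented readLine(boolean trim) contract.
final class ReadLineSketch {
  private final String data;
  private int index;

  ReadLineSketch(String data) {
    this.data = data;
  }

  String readLine(boolean trim) {
    if (this.index >= this.data.length()) {
      return null; // end already reached
    }
    int start = this.index;
    while ((this.index < this.data.length())
        && (this.data.charAt(this.index) != '\n') && (this.data.charAt(this.index) != '\r')) {
      this.index++;
    }
    String line = this.data.substring(start, this.index);
    if (this.index < this.data.length()) {
      // Assumption: "\r\n" counts as a single line terminator.
      char terminator = this.data.charAt(this.index++);
      if ((terminator == '\r') && (this.index < this.data.length())
          && (this.data.charAt(this.index) == '\n')) {
        this.index++;
      }
    }
    return trim ? line.trim() : line;
  }

  public static void main(String[] args) {
    ReadLineSketch scanner = new ReadLineSketch("  a \r\nb\n");
    System.out.println("[" + scanner.readLine(true) + "]");  // [a]
    System.out.println("[" + scanner.readLine(false) + "]"); // [b]
    System.out.println(scanner.readLine(false));             // null
  }
}
```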
    • readJavaStringLiteral

      default String readJavaStringLiteral()
      Reads and parses a Java String literal value according to JLS 3.10.6.
      As a complex example for the input "Hi \"\176\477\579•∑\"\n" this scanner would return the String output Hi "~'7/9•∑" followed by a newline character.
      Returns:
      the parsed Java String literal value or null if not pointing to a String literal.
    • readJavaStringLiteral

      String readJavaStringLiteral(boolean tolerant)
      Reads and parses a Java String literal value according to JLS 3.10.6.
      As a complex example for the input "Hi \"\176\477\579•∑\"\n" this scanner would return the String output Hi "~'7/9•∑" followed by a newline character.
      Parameters:
      tolerant - true if invalid escape sequences should be tolerated (as '?'), false to throw an exception in such case.
      Returns:
      the parsed Java String literal value or null if not pointing to a String literal.
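      The octal part of the example above (\176\477\579 becoming ~'7/9) follows from the Java rule that an octal escape has at most three digits, and three digits only if the first one is 0 to 3. The sketch below decodes only octal escapes to show this; OctalEscapeSketch is a hypothetical helper and deliberately ignores all other escape sequences.

```java
// Hypothetical helper that decodes only octal escapes to illustrate their length rule.
final class OctalEscapeSketch {

  static boolean isOctal(char c) {
    return (c >= '0') && (c <= '7');
  }

  // Replaces octal escapes (\d, \dd, or \ddd with a first digit of 0-3); copies everything else.
  static String decodeOctalEscapes(String raw) {
    StringBuilder out = new StringBuilder();
    int i = 0;
    while (i < raw.length()) {
      char c = raw.charAt(i);
      if ((c == '\\') && (i + 1 < raw.length()) && isOctal(raw.charAt(i + 1))) {
        // Three digits are only allowed if the first digit is 0-3 (value fits into 0..255).
        int maxDigits = (raw.charAt(i + 1) <= '3') ? 3 : 2;
        int value = 0;
        int digits = 0;
        i++; // skip the backslash
        while ((digits < maxDigits) && (i < raw.length()) && isOctal(raw.charAt(i))) {
          value = (value * 8) + (raw.charAt(i) - '0');
          i++;
          digits++;
        }
        out.append((char) value);
      } else {
        out.append(c);
        i++;
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    // The Java literal below holds the raw text \176\477\579
    System.out.println(decodeOctalEscapes("\\176\\477\\579")); // ~'7/9
  }
}
```

      Here \176 is '~', \47 is the single quote followed by a plain 7, and \57 is '/' followed by a plain 9.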
    • readJavaCharLiteral

      default Character readJavaCharLiteral()
      Reads and parses a Java Character literal value according to JLS 3.10.6.
      Examples are given in the following table:
      literal    result  comment
      'a'        a       regular char
      '\''       '       escaped char
      '\176'     ~       escaped octal representation
      '\u2022'   •       escaped unicode representation
      Returns:
      the parsed Java Character literal value or null if not pointing to a Character literal.
    • readJavaCharLiteral

      Character readJavaCharLiteral(boolean tolerant)
      Reads and parses a Java Character literal value according to JLS 3.10.6.
      Examples are given in the following table:
      literal    result  comment
      'a'        a       regular char
      '\''       '       escaped char
      '\176'     ~       escaped octal representation
      '\u2022'   •       escaped unicode representation
      Parameters:
      tolerant - true if an invalid char literal should be tolerated (as '?'), false to throw an exception in such case.
      Returns:
      the parsed Java Character literal value or null if not pointing to a Character literal.
    • getBufferParsed

      String getBufferParsed()
      Returns:
      the String with the characters that have already been parsed but are still available in the underlying buffer. May be used for debugging or error messages.
    • getBufferToParse

      String getBufferToParse()
      Returns:
      the String with the characters that have not yet been parsed but are available in the underlying buffer. May be used for debugging or error messages.