Class CharSequenceScanner

java.lang.Object
io.github.mmm.scanner.AbstractCharStreamScanner
io.github.mmm.scanner.CharSequenceScanner
All Implemented Interfaces:
CharStreamScanner

public class CharSequenceScanner extends AbstractCharStreamScanner
This class represents a String, or more precisely a sequence of characters (char[]), together with a position in that sequence.
It offers various methods for scanning the sequence. The scanner is designed to be fast on long sequences and therefore internally converts Strings to a char array instead of repeatedly calling String.charAt(int).
ATTENTION:
This implementation is NOT thread-safe (intended by design).
Since:
1.0.0
  • Constructor Details

    • CharSequenceScanner

      public CharSequenceScanner(CharSequence charSequence)
      The constructor.
      Parameters:
      charSequence - is the string to scan.
    • CharSequenceScanner

      public CharSequenceScanner(String string)
      The constructor.
      Parameters:
      string - is the string to parse.
    • CharSequenceScanner

      public CharSequenceScanner(char[] characters)
      The constructor.
      Parameters:
      characters - is an array containing the characters to scan.
    • CharSequenceScanner

      public CharSequenceScanner(char[] characters, int offset, int length)
      The constructor.
      Parameters:
      characters - is an array containing the characters to scan.
      offset - is the index of the first char to scan in characters (typically 0 to start at the beginning of the array).
      length - is the number of characters to scan from characters starting at offset (typically characters.length - offset).
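      As a rough illustration of how offset and length select the region to scan, the following sketch uses only the JDK. The class WindowSketch and its helper window are hypothetical names for this example and are not part of the library.

```java
/** Hypothetical helper for illustration; not part of the scanner library. */
class WindowSketch {

  /** Returns the characters selected by the given offset and length. */
  static String window(char[] characters, int offset, int length) {
    // String(char[], int, int) throws an IndexOutOfBoundsException for an invalid region.
    return new String(characters, offset, length);
  }

  public static void main(String[] args) {
    char[] data = "key=value".toCharArray();
    // Select only "value": start at index 4 and take the remaining characters.
    System.out.println(window(data, 4, data.length - 4));
  }
}
```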
  • Method Details

    • charAt

      public char charAt(int index)
      Parameters:
      index - is the index of the requested character.
      Returns:
      the character at the given index.
    • getPosition

      public int getPosition()
      Returns:
      the position in the sequence to scan, in other words the number of characters that have been read. Will initially be 0. Please note that this API is designed for scanning textual content (for parsers). Therefore the int range (about 2.1 billion characters) is considered a suitable limit.
    • getLength

      public int getLength()
      Returns:
      the total length of the string to parse.
    • substring

      public String substring(int start, int end)
      Parameters:
      start - the start index, inclusive.
      end - the end index, exclusive.
      Returns:
      the specified substring.
    • getReplaced

      public String getReplaced(String substitute, int start, int end)
      This method gets the original string where the substring specified by start and end is replaced by substitute.
      Parameters:
      substitute - is the string used as replacement.
      start - is the inclusive start index of the substring to replace.
      end - is the exclusive end index of the substring to replace.
      Returns:
      the original string with the specified substring replaced by substitute.
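      The documented contract (inclusive start, exclusive end) can be sketched with plain JDK calls. ReplaceSketch and replaced are hypothetical names for this example; how the library implements the method internally is not shown here.

```java
/** Hypothetical helper for illustration; mirrors the documented contract only. */
class ReplaceSketch {

  /** Returns original with the range [start, end) replaced by substitute. */
  static String replaced(String original, String substitute, int start, int end) {
    StringBuilder sb = new StringBuilder();
    sb.append(original, 0, start);                 // part before the replaced range
    sb.append(substitute);                         // the replacement
    sb.append(original, end, original.length());   // part after the replaced range
    return sb.toString();
  }

  public static void main(String[] args) {
    // Characters 6..10 ("World") are replaced.
    System.out.println(replaced("Hello World", "there", 6, 11));
  }
}
```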
    • appendSubstring

      public void appendSubstring(StringBuilder appendable, int start, int end)
      This method appends the substring specified by start and end to the given buffer.
      This avoids the overhead of creating a new string and copying the char array.
      Parameters:
      appendable - is the buffer where to append the substring to.
      start - the start index, inclusive.
      end - the end index, exclusive.
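      A minimal sketch of why this can be cheaper than substring, assuming the scanner keeps its data in a char[] as described in the class comment. AppendSketch is a hypothetical name; StringBuilder.append(char[], int, int) is the standard JDK method.

```java
/** Hypothetical helper for illustration; not part of the scanner library. */
class AppendSketch {

  /** Appends buffer[start, end) to appendable without creating an intermediate String. */
  static void appendSubstring(char[] buffer, StringBuilder appendable, int start, int end) {
    // append(char[] str, int offset, int len) copies straight from the array.
    appendable.append(buffer, start, end - start);
  }

  public static void main(String[] args) {
    char[] buffer = "2024-01-15".toCharArray();
    StringBuilder sb = new StringBuilder("year=");
    appendSubstring(buffer, sb, 0, 4);
    System.out.println(sb);
  }
}
```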
    • getCurrentIndex

      public int getCurrentIndex()
      This method gets the current position in the stream to scan. It will initially be 0. In other words this method returns the number of characters that have already been consumed.
      Returns:
      the current index position.
    • setCurrentIndex

      public void setCurrentIndex(int index)
      This method sets the current index.
      Parameters:
      index - is the next index position to set. The value has to be greater than or equal to 0 and less than or equal to getLength().
    • hasNext

      public boolean hasNext()
      Description copied from interface: CharStreamScanner
      This method determines if there is at least one more character available.
      Specified by:
      hasNext in interface CharStreamScanner
      Overrides:
      hasNext in class AbstractCharStreamScanner
      Returns:
      true if there is at least one character available, false if the end of data has been reached.
    • next

      public char next()
      Description copied from interface: CharStreamScanner
      This method reads the current character from the stream and increments the index, stepping to the next character. You should check whether a character is available before calling this method. Otherwise, if your stream may contain the NUL character ('\0'), you cannot distinguish whether the end of the stream was reached or you actually read the NUL character.
      Specified by:
      next in interface CharStreamScanner
      Overrides:
      next in class AbstractCharStreamScanner
      Returns:
      the next character or CharStreamScanner.EOS if none is available.
    • peek

      public char peek()
      Description copied from interface: CharStreamScanner
      This method reads the current character without incrementing the index.
      Specified by:
      peek in interface CharStreamScanner
      Overrides:
      peek in class AbstractCharStreamScanner
      Returns:
      the current character or CharStreamScanner.EOS if none is available.
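      The interplay of hasNext, next and peek, and the NUL ambiguity mentioned for next(), can be sketched as follows. CursorSketch is a hypothetical stand-in, not the library class, and the EOS value '\0' is an assumption derived from the NUL remark above.

```java
/** Hypothetical stand-in for illustration; not the real CharSequenceScanner. */
class CursorSketch {

  static final char EOS = '\0'; // assumption: end of stream is signalled with NUL

  private final char[] buffer;
  private int index;

  CursorSketch(String text) {
    this.buffer = text.toCharArray();
  }

  boolean hasNext() {
    return this.index < this.buffer.length;
  }

  char peek() {
    return hasNext() ? this.buffer[this.index] : EOS; // does not move the index
  }

  char next() {
    return hasNext() ? this.buffer[this.index++] : EOS; // moves the index by one
  }

  /** Counts all characters by checking hasNext() first, so embedded NULs are not mistaken for the end. */
  static int count(String text) {
    CursorSketch scanner = new CursorSketch(text);
    int n = 0;
    while (scanner.hasNext()) {
      scanner.next();
      n++;
    }
    return n;
  }

  public static void main(String[] args) {
    // The text contains a NUL in the middle; a loop ending on next() == EOS would stop too early.
    System.out.println(count("a\0b"));
  }
}
```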
    • peek

      public String peek(int count)
      This method peeks the number of next characters given by count and returns them as string. If fewer characters are available, the returned string will be shorter than count and only contain the available characters. Unlike AbstractCharStreamScanner.read(int) this method does NOT consume the characters and will therefore NOT change the state of this scanner.
      Parameters:
      count - is the number of characters to peek. You may use Integer.MAX_VALUE to peek until the end of text (EOT) if the data-size is suitable.
      Returns:
      a string with the given number of characters or all available characters if less than count. Will be the empty string if no character is available at all.
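      The clamping behaviour described above can be sketched like this. PeekSketch is a hypothetical helper that takes the buffer and index as arguments instead of holding them as scanner state.

```java
/** Hypothetical helper for illustration; not part of the scanner library. */
class PeekSketch {

  /** Returns up to count characters starting at index, without modifying any state. */
  static String peek(char[] buffer, int index, int count) {
    int available = buffer.length - index;
    int length = Math.min(count, available); // also safe for Integer.MAX_VALUE
    if (length <= 0) {
      return ""; // nothing available
    }
    return new String(buffer, index, length);
  }

  public static void main(String[] args) {
    char[] buffer = "hello".toCharArray();
    System.out.println(peek(buffer, 3, 10));               // only two characters are left
    System.out.println(peek(buffer, 0, Integer.MAX_VALUE)); // everything until the end
  }
}
```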
    • stepBack

      public void stepBack()
      This method decrements the index by one. If the index is 0 this method will have no effect.
      E.g. use this method if you have read one character too many.
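      A typical case is a loop that only notices the end of a token after reading past it. The sketch below uses a hypothetical StepBackSketch class (not the library class) with the same no-effect-at-0 rule.

```java
/** Hypothetical stand-in for illustration; not the real CharSequenceScanner. */
class StepBackSketch {

  private final char[] buffer;
  private int index;

  StepBackSketch(String text) {
    this.buffer = text.toCharArray();
  }

  boolean hasNext() {
    return this.index < this.buffer.length;
  }

  char next() {
    return hasNext() ? this.buffer[this.index++] : '\0';
  }

  void stepBack() {
    if (this.index > 0) { // no effect at index 0
      this.index--;
    }
  }

  int getCurrentIndex() {
    return this.index;
  }

  /** Reads leading ASCII digits and un-reads the first non-digit character. */
  String readDigits() {
    StringBuilder sb = new StringBuilder();
    while (hasNext()) {
      char c = next();
      if (c < '0' || c > '9') {
        stepBack(); // one character too many was read
        break;
      }
      sb.append(c);
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    StepBackSketch scanner = new StepBackSketch("42px");
    System.out.println(scanner.readDigits() + " " + scanner.getCurrentIndex());
  }
}
```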
    • readUntil

      public String readUntil(CharFilter filter, boolean acceptEot)
      Description copied from interface: CharStreamScanner
      This method reads all next characters until the first character accepted by the given filter or the end is reached.
      After the call of this method, the current index will point to the first accepted stop character or to the end if NO such character exists.
      Specified by:
      readUntil in interface CharStreamScanner
      Overrides:
      readUntil in class AbstractCharStreamScanner
      Parameters:
      filter - is used to decide where to stop.
      acceptEot - true if end of data should be treated like the stop character so that the rest of the text is returned, false otherwise (to return null if the end of data was reached and the scanner has been consumed).
      Returns:
      the string with all read characters not accepted by the given CharFilter or null if there was no accepted character and acceptEot is false.
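      The effect of acceptEot can be sketched as follows. ReadUntilSketch is a hypothetical stand-in, and java.util.function.IntPredicate is used in place of CharFilter so that the example needs only the JDK.

```java
import java.util.function.IntPredicate;

/** Hypothetical stand-in for illustration; IntPredicate replaces CharFilter here. */
class ReadUntilSketch {

  private final char[] buffer;
  private int index;

  ReadUntilSketch(String text) {
    this.buffer = text.toCharArray();
  }

  int getCurrentIndex() {
    return this.index;
  }

  String readUntil(IntPredicate stop, boolean acceptEot) {
    int start = this.index;
    while (this.index < this.buffer.length) {
      if (stop.test(this.buffer[this.index])) {
        // The index stays on the stop character.
        return new String(this.buffer, start, this.index - start);
      }
      this.index++;
    }
    // End of text reached without a stop character; the scanner is now fully consumed.
    return acceptEot ? new String(this.buffer, start, this.index - start) : null;
  }

  public static void main(String[] args) {
    IntPredicate semicolon = c -> c == ';';
    System.out.println(new ReadUntilSketch("a=1;b=2").readUntil(semicolon, false)); // stops before ';'
    System.out.println(new ReadUntilSketch("b=2").readUntil(semicolon, true));      // rest of the text
    System.out.println(new ReadUntilSketch("b=2").readUntil(semicolon, false));     // null
  }
}
```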
    • expectRestWithLookahead

      protected boolean expectRestWithLookahead(char[] stopChars, boolean ignoreCase, Runnable appender, boolean skip)
      Specified by:
      expectRestWithLookahead in class AbstractCharStreamScanner
      Parameters:
      stopChars - the stop String as char[], given in lower case if ignoreCase is true.
      ignoreCase - true to (also) compare chars in lower case, false otherwise.
      appender - an optional lambda to run before shifting buffers to append data.
      skip - true to update buffers and offset such that on success this scanner points after the expected stop String, false otherwise (to not consume any character in any case).
      Returns:
      true if the stop String (stopChars) was found and consumed, false otherwise (and no data consumed).
    • expectStrict

      public boolean expectStrict(String expected, boolean ignoreCase, boolean lookahead)
      Description copied from interface: CharStreamScanner
      This method acts like CharStreamScanner.expectUnsafe(String, boolean) but if the expected String is NOT completely present, no character is consumed and the state of the scanner remains unchanged.
      Attention:
      This method requires lookahead. For implementations that are backed by an underlying stream (or reader) the length of the expected String shall not exceed the available lookahead size (buffer capacity given at construction time). Otherwise the method may fail.
      Parameters:
      expected - is the expected string.
      ignoreCase - if true the case of the characters is ignored when compared.
      lookahead - if true the state of the scanner remains unchanged even if the expected String has been found, false otherwise.
      Returns:
      true if the expected string was found (and, unless lookahead is true, consumed from this scanner), false otherwise.
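      The all-or-nothing behaviour and the lookahead flag can be sketched like this. ExpectSketch is a hypothetical stand-in, and comparing via Character.toLowerCase is an assumption about how ignoreCase might be realised.

```java
/** Hypothetical stand-in for illustration; not the real CharSequenceScanner. */
class ExpectSketch {

  private final char[] buffer;
  private int index;

  ExpectSketch(String text) {
    this.buffer = text.toCharArray();
  }

  int getCurrentIndex() {
    return this.index;
  }

  boolean expectStrict(String expected, boolean ignoreCase, boolean lookahead) {
    int length = expected.length();
    if (length > this.buffer.length - this.index) {
      return false; // not enough characters left, nothing consumed
    }
    for (int i = 0; i < length; i++) {
      char actual = this.buffer[this.index + i];
      char wanted = expected.charAt(i);
      boolean same = (actual == wanted)
          || (ignoreCase && Character.toLowerCase(actual) == Character.toLowerCase(wanted));
      if (!same) {
        return false; // mismatch, index is untouched
      }
    }
    if (!lookahead) {
      this.index += length; // consume only on a complete match
    }
    return true;
  }

  public static void main(String[] args) {
    ExpectSketch scanner = new ExpectSketch("SELECT *");
    System.out.println(scanner.expectStrict("select", true, true));   // found, nothing consumed
    System.out.println(scanner.expectStrict("select", false, false)); // case differs, nothing consumed
    System.out.println(scanner.expectStrict("select", true, false));  // found and consumed
    System.out.println(scanner.getCurrentIndex());
  }
}
```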
    • getTail

      protected String getTail()
      This method gets the tail of this scanner without changing the state.
      Returns:
      the tail of this scanner.
    • getTail

      protected String getTail(int maximum)
      This method gets the tail of this scanner limited (truncated) to the given maximum number of characters without changing the state.
      Parameters:
      maximum - is the maximum number of characters to return from the tail.
      Returns:
      the tail of this scanner.
    • require

      public void require(String expected, boolean ignoreCase)
      Description copied from interface: CharStreamScanner
      This method verifies that the expected string gets consumed from this scanner with respect to ignoreCase. Otherwise an exception is thrown indicating the problem.
      This method behaves functionally equivalent to the following code:
       if (!scanner.expectUnsafe(expected, ignoreCase)) {
         throw new IllegalStateException(...);
       }
       
      Specified by:
      require in interface CharStreamScanner
      Overrides:
      require in class AbstractCharStreamScanner
      Parameters:
      expected - is the expected string.
      ignoreCase - if true the case of the characters is ignored during comparison.
    • readWhile

      public String readWhile(CharFilter filter, int max)
      Description copied from interface: CharStreamScanner
      This method reads all next characters that are accepted by the given filter.
      After the call of this method, the current index will point to the next character that was NOT accepted by the given filter. If the next max characters, or all characters left until the end of this scanner, are accepted, only that many characters are read.
      Specified by:
      readWhile in interface CharStreamScanner
      Overrides:
      readWhile in class AbstractCharStreamScanner
      Parameters:
      filter - is used to decide which characters should be accepted.
      max - is the maximum number of characters that should be read.
      Returns:
      a string with all characters accepted by the given filter limited to the length of max and the end of this scanner. Will be the empty string if no character was accepted.
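      The interaction of the filter with max can be sketched as follows. ReadWhileSketch is a hypothetical stand-in, java.util.function.IntPredicate is used in place of CharFilter, and max is assumed to be non-negative.

```java
import java.util.function.IntPredicate;

/** Hypothetical stand-in for illustration; IntPredicate replaces CharFilter here. */
class ReadWhileSketch {

  private final char[] buffer;
  private int index;

  ReadWhileSketch(String text) {
    this.buffer = text.toCharArray();
  }

  int getCurrentIndex() {
    return this.index;
  }

  String readWhile(IntPredicate filter, int max) {
    int start = this.index;
    // Stop at whichever comes first: max characters or the end of the buffer.
    int end = (int) Math.min((long) this.index + max, this.buffer.length);
    while (this.index < end && filter.test(this.buffer[this.index])) {
      this.index++;
    }
    return new String(this.buffer, start, this.index - start);
  }

  public static void main(String[] args) {
    ReadWhileSketch scanner = new ReadWhileSketch("12345abc");
    IntPredicate digit = Character::isDigit;
    System.out.println(scanner.readWhile(digit, 3));                 // limited by max
    System.out.println(scanner.readWhile(digit, Integer.MAX_VALUE)); // limited by the filter
    System.out.println(scanner.getCurrentIndex());
  }
}
```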
    • getOriginalString

      public String getOriginalString()
      This method gets the original string to parse.
      Returns:
      the original string.