java.lang.Object
io.github.mmm.scanner.AbstractCharStreamScanner
io.github.mmm.scanner.CharSequenceScanner
- All Implemented Interfaces:
TextFormatProcessor, TextPosition, CharStreamScanner, AutoCloseable
This class represents a String or better a sequence of characters (char[]) together with a position in that sequence. It has various useful methods for scanning the sequence. This scanner is designed to be fast on long sequences and therefore internally converts Strings to a char array instead of frequently calling String.charAt(int).
ATTENTION:
This implementation is NOT thread-safe (intended by design).
- Since:
- 1.0.0
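The design note above (convert the String to a char array once, then index the array) can be illustrated with a minimal sketch. MiniScanner is a hypothetical name invented for this illustration; it is not the library's implementation and only mirrors the idea of keeping a char[] together with a position.

```java
// Hypothetical sketch, not the library's code: MiniScanner is an invented name.
final class MiniScanner {

    private final char[] buffer; // copied once from the String
    private int offset;          // current position in buffer

    MiniScanner(String string) {
        this.buffer = string.toCharArray(); // one-time conversion
    }

    boolean hasNext() {
        return this.offset < this.buffer.length;
    }

    char next() {
        // plain array access instead of calling String.charAt(int) repeatedly
        return this.buffer[this.offset++];
    }

    public static void main(String[] args) {
        MiniScanner scanner = new MiniScanner("a,b,c");
        int commas = 0;
        while (scanner.hasNext()) {
            if (scanner.next() == ',') {
                commas++;
            }
        }
        System.out.println(commas); // prints 2
    }
}
```

Like the class documented here, such a sketch keeps mutable state (the offset) and is therefore not safe to share between threads.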
-
Field Summary
Fields inherited from class io.github.mmm.scanner.AbstractCharStreamScanner
buffer, column, limit, line, offset
Fields inherited from interface io.github.mmm.scanner.CharStreamScanner
EOS
-
Constructor Summary
Constructors
Constructor: Description
CharSequenceScanner(char[] characters): The constructor.
CharSequenceScanner(char[] characters, int offset, int length): The constructor.
CharSequenceScanner(char[] characters, int offset, int length, TextFormatMessageHandler messageHandler): The constructor.
CharSequenceScanner(char[] characters, int offset, int length, TextFormatMessageHandler messageHandler, int line, int column): The constructor.
CharSequenceScanner(char[] characters, TextFormatMessageHandler messageHandler): The constructor.
CharSequenceScanner(char[] characters, TextFormatMessageHandler messageHandler, int line, int column): The constructor.
CharSequenceScanner(CharSequence charSequence): The constructor.
CharSequenceScanner(CharSequence charSequence, TextFormatMessageHandler messageHandler): The constructor.
CharSequenceScanner(String string): The constructor.
CharSequenceScanner(String string, TextFormatMessageHandler messageHandler): The constructor.
CharSequenceScanner(String string, TextFormatMessageHandler messageHandler, int line, int column): The constructor.
-
Method Summary
Modifier and Type, Method: Description
void appendSubstring(StringBuilder appendable, int start, int end)
char charAt(int index)
void close()
boolean expect(String expected, boolean ignoreCase, boolean lookahead, int off): This method determines if the given expected String is completely present at the current position.
protected boolean expectRestWithLookahead(char[] stopChars, boolean ignoreCase, Runnable appender, boolean skip)
int getCurrentIndex(): This method gets the current position in the stream to scan.
int getLength()
String getOriginalString(): This method gets the original string to parse.
int getPosition()
String getReplaced(String substitute, int start, int end): This method gets the original string where the substring specified by start and end is replaced by substitute.
protected String getTail(): This method gets the tail of this scanner without changing the state.
protected String getTail(int maximum): This method gets the tail of this scanner limited (truncated) to the given maximum number of characters without changing the state.
boolean hasNext(): This method determines if there is at least one more character available.
char next(): This method reads the current character from the stream and increments the index stepping to the next character.
char peek(): This method reads the current character without consuming characters and will therefore not change the state of this scanner.
char peek(int lookaheadOffset): Like CharStreamScanner.peek() but with further lookahead. Attention: This method requires lookahead.
String peekString(int count): This method peeks the number of next characters given by count and returns them as string.
String peekWhile(CharFilter filter, int maxLen)
String readUntil(CharFilter filter, boolean acceptEot): This method reads all next characters until the first character accepted by the given filter or the end is reached.
String readWhile(CharFilter filter, int min, int max)
void require(String expected, boolean ignoreCase): This method verifies that the expected string gets consumed from this scanner with respect to ignoreCase.
void setCurrentIndex(int index): This method sets the current index.
String substring(int start, int end)
Methods inherited from class io.github.mmm.scanner.AbstractCharStreamScanner
addMessage, append, builder, eot, expectOne, expectOne, expectUnsafe, fill, getAppended, getBufferParsed, getBufferToParse, getColumn, getLine, getMessages, handleChar, isEob, isEos, isEot, read, read, readDigit, readDouble, readFloat, readInteger, readJavaCharLiteral, readJavaNumberLiteral, readJavaStringLiteral, readLine, readLong, readNumber, readUnsignedLong, readUntil, readUntil, readUntil, readUntil, readUntil, requireMin, reset, setOffset, skip, skipNewLine, skipOver, skipUntil, skipUntil, skipWhile, skipWhile, toString, verifyLookahead
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface io.github.mmm.scanner.CharStreamScanner
expect, expect, expect, expect, expectOne, expectUnsafe, peekUntil, readBoolean, readBoolean, readBoolean, readDigit, readDouble, readFloat, readInteger, readJavaCharLiteral, readJavaStringLiteral, readLine, readLong, readUntil, readUntil, readUntil, readWhile, require, require, requireOne, requireOne, requireOneOrMore, skipOver, skipOver, skipWhile, skipWhileAndPeek, skipWhileAndPeek
Methods inherited from interface io.github.mmm.base.text.TextFormatProcessor
addError, addInfo, addMessage, addWarning
-
Constructor Details
-
CharSequenceScanner
public CharSequenceScanner(CharSequence charSequence)
The constructor.
- Parameters:
charSequence - is the string to scan.
-
CharSequenceScanner
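public CharSequenceScanner(CharSequence charSequence, TextFormatMessageHandler messageHandler)
The constructor.
- Parameters:
charSequence - is the string to scan.
messageHandler - the TextFormatMessageHandler.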
-
CharSequenceScanner
public CharSequenceScanner(String string)
The constructor.
- Parameters:
string - is the string to parse.
-
CharSequenceScanner
public CharSequenceScanner(String string, TextFormatMessageHandler messageHandler)
The constructor.
- Parameters:
string - is the string to parse.
messageHandler - the TextFormatMessageHandler.
-
CharSequenceScanner
public CharSequenceScanner(String string, TextFormatMessageHandler messageHandler, int line, int column)
The constructor.
- Parameters:
string - is the string to parse.
messageHandler - the TextFormatMessageHandler.
line - the initial line.
column - the initial column.
-
CharSequenceScanner
public CharSequenceScanner(char[] characters)
The constructor.
- Parameters:
characters - is an array containing the characters to scan.
-
CharSequenceScanner
public CharSequenceScanner(char[] characters, TextFormatMessageHandler messageHandler)
The constructor.
- Parameters:
characters - is an array containing the characters to scan.
messageHandler - the TextFormatMessageHandler.
-
CharSequenceScanner
public CharSequenceScanner(char[] characters, TextFormatMessageHandler messageHandler, int line, int column)
The constructor.
- Parameters:
characters - is an array containing the characters to scan.
messageHandler - the TextFormatMessageHandler.
line - the initial line.
column - the initial column.
-
CharSequenceScanner
public CharSequenceScanner(char[] characters, int offset, int length)
The constructor.
- Parameters:
characters - is an array containing the characters to scan.
offset - is the index of the first char to scan in characters (typically 0 to start at the beginning of the array).
length - is the number of characters to scan from characters starting at offset (typically characters.length - offset).
-
CharSequenceScanner
public CharSequenceScanner(char[] characters, int offset, int length, TextFormatMessageHandler messageHandler)
The constructor.
- Parameters:
characters - is an array containing the characters to scan.
offset - is the index of the first char to scan in characters (typically 0 to start at the beginning of the array).
length - is the number of characters to scan from characters starting at offset (typically characters.length - offset).
messageHandler - the TextFormatMessageHandler.
-
CharSequenceScanner
public CharSequenceScanner(char[] characters, int offset, int length, TextFormatMessageHandler messageHandler, int line, int column)
The constructor.
- Parameters:
characters - is an array containing the characters to scan.
offset - is the index of the first char to scan in characters (typically 0 to start at the beginning of the array).
length - is the number of characters to scan from characters starting at offset (typically characters.length - offset).
messageHandler - the TextFormatMessageHandler.
line - the initial line.
column - the initial column.
-
-
Method Details
-
charAt
public char charAt(int index)
- Parameters:
index - is the index of the requested character.
- Returns:
- the character at the given index.
- See Also:
-
getPosition
public int getPosition()
- Returns:
- the position in the sequence to scan or in other words the number of characters that have been read. Will initially be 0. Please note that this API is designed for scanning textual content (for parsers). Therefore we consider about 2.1 billion characters (the range of int) a suitable limit.
-
getLength
public int getLength()
- Returns:
- the total length of the string to parse.
- See Also:
-
substring
public String substring(int start, int end)
- Parameters:
start - the start index, inclusive.
end - the end index, exclusive.
- Returns:
- the specified substring.
- See Also:
-
getReplaced
public String getReplaced(String substitute, int start, int end)
This method gets the original string where the substring specified by start and end is replaced by substitute.
- Parameters:
substitute - is the string used as replacement.
start - is the inclusive start index of the substring to replace.
end - is the exclusive end index of the substring to replace.
- Returns:
- the original string with the specified substring replaced by substitute.
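The index convention described above (inclusive start, exclusive end) can be sketched with plain String operations. This is only an illustration of the documented result under that reading, not the library's implementation; ReplaceSketch and replaced are hypothetical names.

```java
// Hypothetical sketch of the documented result of getReplaced, using only java.lang.String.
final class ReplaceSketch {

    /** Returns original with the range [start, end) replaced by substitute. */
    static String replaced(String original, String substitute, int start, int end) {
        // text before the replaced range + substitute + text after the replaced range
        return original.substring(0, start) + substitute + original.substring(end);
    }

    public static void main(String[] args) {
        // "World" occupies the indices 6 (inclusive) to 11 (exclusive)
        System.out.println(replaced("Hello World", "there", 6, 11)); // prints Hello there
    }
}
```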
-
appendSubstring
public void appendSubstring(StringBuilder appendable, int start, int end)
This method appends the substring specified by start and end to the given buffer. This avoids the overhead of creating a new string and copying the char array.
- Parameters:
appendable - is the buffer where to append the substring to.
start - the start index, inclusive.
end - the end index, exclusive.
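The point about avoiding an intermediate string can be sketched with the standard StringBuilder.append(char[], int, int) overload, which copies a range of a char array directly. AppendSketch is a hypothetical name and the explicit buffer parameter is an assumption made to keep the sketch self-contained; the real method works on the scanner's internal array.

```java
// Hypothetical sketch: appends a char[] range without creating a temporary String.
final class AppendSketch {

    static void appendSubstring(char[] buffer, StringBuilder appendable, int start, int end) {
        // StringBuilder.append(char[], offset, len) takes a length, not an end index
        appendable.append(buffer, start, end - start);
    }

    public static void main(String[] args) {
        char[] buffer = "key=value".toCharArray();
        StringBuilder sb = new StringBuilder("x=");
        appendSubstring(buffer, sb, 4, 9); // the range [4, 9) is "value"
        System.out.println(sb); // prints x=value
    }
}
```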
-
getCurrentIndex
public int getCurrentIndex()
This method gets the current position in the stream to scan. It will initially be 0. In other words this method returns the number of characters that have already been consumed.
- Returns:
- the current index position.
-
setCurrentIndex
public void setCurrentIndex(int index)
This method sets the current index.
- Parameters:
index - is the next index position to set. The value has to be greater than or equal to 0 and less than or equal to getLength().
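A common use of reading and setting the current index is backtracking: remember the index, try to consume something, and restore the index on failure. The sketch below shows that pattern on a stand-in class. IndexSketch and tryConsume are hypothetical names, and throwing IndexOutOfBoundsException for an invalid index is an assumption, since the text above only states the valid range.

```java
// Hypothetical sketch of backtracking with getCurrentIndex()/setCurrentIndex(int).
final class IndexSketch {

    private final char[] buffer;
    private int offset;

    IndexSketch(String string) {
        this.buffer = string.toCharArray();
    }

    int getCurrentIndex() {
        return this.offset;
    }

    void setCurrentIndex(int index) {
        // valid range is 0..length (inclusive); the exception type is an assumption
        if ((index < 0) || (index > this.buffer.length)) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        this.offset = index;
    }

    /** Consumes keyword if it is next; otherwise rewinds to where it started. */
    boolean tryConsume(String keyword) {
        int mark = getCurrentIndex();
        for (int i = 0; i < keyword.length(); i++) {
            if ((this.offset >= this.buffer.length) || (this.buffer[this.offset++] != keyword.charAt(i))) {
                setCurrentIndex(mark); // backtrack
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        IndexSketch scanner = new IndexSketch("true!");
        System.out.println(scanner.tryConsume("trap") + " " + scanner.getCurrentIndex());
        System.out.println(scanner.tryConsume("true") + " " + scanner.getCurrentIndex());
    }
}
```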
-
hasNext
public boolean hasNext()
Description copied from interface: CharStreamScanner
This method determines if there is at least one more character available.
- Specified by:
hasNext in interface CharStreamScanner
- Overrides:
hasNext in class AbstractCharStreamScanner
- Returns:
true if there is at least one character available, false if the end of data has been reached.
-
next
public char next()
Description copied from interface: CharStreamScanner
This method reads the current character from the stream and increments the index stepping to the next character. You should check if a character is available before calling this method. Otherwise, if your stream may contain the NUL character ('\0'), you cannot distinguish whether the end of the stream was reached or you actually read the NUL character.
- Specified by:
next in interface CharStreamScanner
- Overrides:
next in class AbstractCharStreamScanner
- Returns:
- the next character or CharStreamScanner.EOS if none is available.
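The advice to check availability before reading can be sketched as follows. NextSketch is a hypothetical stand-in, and defining EOS as '\0' is an assumption inferred from the NUL remark above rather than taken from the library source.

```java
// Hypothetical sketch: why hasNext() should guard next() when NUL may occur in the data.
final class NextSketch {

    static final char EOS = '\0'; // assumption: end-of-stream marker equals NUL

    private final char[] buffer;
    private int offset;

    NextSketch(String string) {
        this.buffer = string.toCharArray();
    }

    boolean hasNext() {
        return this.offset < this.buffer.length;
    }

    char next() {
        return hasNext() ? this.buffer[this.offset++] : EOS;
    }

    /** Counts all characters, including NUL, by guarding next() with hasNext(). */
    static int countChars(String text) {
        NextSketch scanner = new NextSketch(text);
        int count = 0;
        while (scanner.hasNext()) {
            scanner.next();
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // A loop like "while (scanner.next() != EOS)" would stop after 'a' here.
        System.out.println(countChars("a\0b")); // prints 3
    }
}
```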
-
peek
public char peek()
Description copied from interface: CharStreamScanner
This method reads the current character without consuming characters and will therefore not change the state of this scanner.
- Specified by:
peek in interface CharStreamScanner
- Overrides:
peek in class AbstractCharStreamScanner
- Returns:
- the current character or CharStreamScanner.EOS if none is available.
-
peek
public char peek(int lookaheadOffset)
Description copied from interface: CharStreamScanner
Like CharStreamScanner.peek() but with further lookahead.
Attention:
This method requires lookahead. For implementations that are backed by an underlying stream (or reader) the given lookaheadOffset shall not exceed the available lookahead size (buffer capacity given at construction time). Otherwise the method may fail.
- Parameters:
lookaheadOffset - the lookahead offset. If 0 this method will behave like CharStreamScanner.peek(). In case of 1 it will return the character after the next one and so forth.
- Returns:
- the peeked character at the given lookaheadOffset or CharStreamScanner.EOS if no such character exists.
-
peekString
public String peekString(int count)
This method peeks the number of next characters given by count and returns them as string. If fewer characters are available, the returned string will be shorter than count and only contain the available characters. Unlike AbstractCharStreamScanner.read(int) this method does NOT consume the characters and will therefore NOT change the state of this scanner.
- Parameters:
count - is the number of characters to peek. You may use Integer.MAX_VALUE to peek until the end of text (EOT) if the data-size is suitable.
- Returns:
- a string with the given number of characters or all available characters if less than count. Will be the empty string if no character is available at all.
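The truncation rule described above (return at most count characters, fewer if the input ends first, and leave the position untouched) can be sketched like this. PeekSketch is a hypothetical stand-in, not the library class, and it assumes count is not negative.

```java
// Hypothetical sketch of peekString semantics: bounded, non-consuming lookahead.
final class PeekSketch {

    private final char[] buffer;
    private int offset;

    PeekSketch(String string) {
        this.buffer = string.toCharArray();
    }

    void skip() {
        if (this.offset < this.buffer.length) {
            this.offset++;
        }
    }

    String peekString(int count) {
        int available = this.buffer.length - this.offset;
        int length = Math.min(count, available); // also safe for Integer.MAX_VALUE
        return new String(this.buffer, this.offset, length); // offset is not modified
    }

    public static void main(String[] args) {
        PeekSketch scanner = new PeekSketch("abc");
        System.out.println(scanner.peekString(2));                 // prints ab
        System.out.println(scanner.peekString(Integer.MAX_VALUE)); // prints abc
    }
}
```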
-
peekWhile
public String peekWhile(CharFilter filter, int maxLen)
- Parameters:
filter - the CharFilter accepting only the characters to peek.
maxLen - the maximum number of characters to peek (to get as lookahead without modifying this stream).
- Returns:
- a String with the peeked characters of the given maxLen or less if a character was hit that is not accepted by the given filter or the end-of-text has been reached before. The state of this stream remains unchanged.
- See Also:
-
readUntil
public String readUntil(CharFilter filter, boolean acceptEot)
Description copied from interface: CharStreamScanner
This method reads all next characters until the first character accepted by the given filter or the end is reached.
After the call of this method, the current index will point to the first accepted stop character or to the end if NO such character exists.
- Specified by:
readUntil in interface CharStreamScanner
- Overrides:
readUntil in class AbstractCharStreamScanner
- Parameters:
filter - is used to decide where to stop.
acceptEot - if true, the end of data is treated like the stop character and the rest of the text is returned, false otherwise (to return null if the end of data was reached and the scanner has been consumed).
- Returns:
- the string with all read characters not accepted by the given CharFilter or null if there was no accepted character and acceptEot is false.
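The stop-character and acceptEot rules above can be sketched on a stand-in class. ReadUntilSketch and its nested CharPredicate interface are hypothetical names introduced so that the example compiles without the library; CharPredicate only plays the role of the library's CharFilter here.

```java
// Hypothetical sketch of readUntil semantics; not the library's implementation.
final class ReadUntilSketch {

    /** Stand-in for the library's CharFilter (assumption for this sketch). */
    interface CharPredicate {
        boolean accept(char c);
    }

    private final char[] buffer;
    private int offset;

    ReadUntilSketch(String string) {
        this.buffer = string.toCharArray();
    }

    int getCurrentIndex() {
        return this.offset;
    }

    String readUntil(CharPredicate filter, boolean acceptEot) {
        int start = this.offset;
        while (this.offset < this.buffer.length) {
            if (filter.accept(this.buffer[this.offset])) {
                // stop character found: it is not consumed and not part of the result
                return new String(this.buffer, start, this.offset - start);
            }
            this.offset++;
        }
        // end reached without a stop character: everything has been consumed
        return acceptEot ? new String(this.buffer, start, this.offset - start) : null;
    }

    public static void main(String[] args) {
        ReadUntilSketch scanner = new ReadUntilSketch("key=value");
        System.out.println(scanner.readUntil(c -> c == '=', false)); // prints key
        System.out.println(scanner.getCurrentIndex());               // prints 3
    }
}
```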
-
expectRestWithLookahead
protected boolean expectRestWithLookahead(char[] stopChars, boolean ignoreCase, Runnable appender, boolean skip)
- Specified by:
expectRestWithLookahead in class AbstractCharStreamScanner
- Parameters:
stopChars - the stop String as char[]. If ignoreCase is true, in lower case.
ignoreCase - true to (also) compare chars in lower case, false otherwise.
appender - an optional lambda to run before shifting buffers to append data.
skip - true to update buffers and offset such that on success this scanner points after the expected stop String, false otherwise (to not consume any character in any case).
- Returns:
true if the stop String (stopChars) was found and consumed, false otherwise (and no data consumed).
- See Also:
-
expect
public boolean expect(String expected, boolean ignoreCase, boolean lookahead, int off)
Description copied from interface: CharStreamScanner
This method determines if the given expected String is completely present at the current position. It will only consume characters and change the state if lookahead is false and the expected String was found (entirely).
Attention:
This method requires lookahead. For implementations that are backed by an underlying stream (or reader) the length of the expected String shall not exceed the available lookahead size (buffer capacity given at construction time). Otherwise the method may fail.
- Parameters:
expected - the expected String to search for.
ignoreCase - if true the case of the characters is ignored when compared, false otherwise.
lookahead - if true the state of the scanner remains unchanged even if the expected String has been found, false otherwise (expected String is consumed on match).
off - the number of characters that have already been peeked and after which the given String is expected. Will typically be 0. If lookahead is false and the expected String was found these characters will be skipped together with the expected String.
- Returns:
true if the expected string was successfully found, false otherwise.
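How ignoreCase, lookahead and off interact can be sketched on a stand-in class. ExpectSketch is a hypothetical name; the comparison via Character.toLowerCase is an assumption about how case-insensitive matching could be done, not a statement about the library's implementation.

```java
// Hypothetical sketch of expect(expected, ignoreCase, lookahead, off) semantics.
final class ExpectSketch {

    private final char[] buffer;
    private int offset;

    ExpectSketch(String string) {
        this.buffer = string.toCharArray();
    }

    int getCurrentIndex() {
        return this.offset;
    }

    boolean expect(String expected, boolean ignoreCase, boolean lookahead, int off) {
        int start = this.offset + off; // skip the characters that were already peeked
        int length = expected.length();
        if (start + length > this.buffer.length) {
            return false; // not enough characters left
        }
        for (int i = 0; i < length; i++) {
            char actual = this.buffer[start + i];
            char wanted = expected.charAt(i);
            if (ignoreCase) {
                actual = Character.toLowerCase(actual);
                wanted = Character.toLowerCase(wanted);
            }
            if (actual != wanted) {
                return false; // mismatch: state stays unchanged
            }
        }
        if (!lookahead) {
            this.offset = start + length; // consume off + expected.length() characters
        }
        return true;
    }

    public static void main(String[] args) {
        ExpectSketch scanner = new ExpectSketch("SELECT *");
        System.out.println(scanner.expect("select", true, true, 0));  // match, nothing consumed
        System.out.println(scanner.getCurrentIndex());
        System.out.println(scanner.expect("select", true, false, 0)); // match, consumed
        System.out.println(scanner.getCurrentIndex());
    }
}
```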
-
getTail
protected String getTail()
This method gets the tail of this scanner without changing the state.
- Returns:
- the tail of this scanner.
-
getTail
protected String getTail(int maximum)
This method gets the tail of this scanner limited (truncated) to the given maximum number of characters without changing the state.
- Parameters:
maximum - is the maximum number of characters to return from the tail.
- Returns:
- the tail of this scanner.
-
require
public void require(String expected, boolean ignoreCase)
Description copied from interface: CharStreamScanner
This method verifies that the expected string gets consumed from this scanner with respect to ignoreCase. Otherwise an exception is thrown indicating the problem.
This method behaves functionally equivalent to the following code:
if (!scanner.expectUnsafe(expected, ignoreCase)) { throw new IllegalStateException(...); }
- Specified by:
require in interface CharStreamScanner
- Overrides:
require in class AbstractCharStreamScanner
- Parameters:
expected - is the expected string.
ignoreCase - if true the case of the characters is ignored during comparison.
-
readWhile
public String readWhile(CharFilter filter, int min, int max)
Description copied from interface: CharStreamScanner
This method reads all next characters that are accepted by the given filter.
After the call of this method, the current index will point to the next character that was NOT accepted by the given filter. If the next max characters or the characters left until the end of this scanner are accepted, only that number of characters is consumed.
- Specified by:
readWhile in interface CharStreamScanner
- Overrides:
readWhile in class AbstractCharStreamScanner
- Parameters:
filter - used to decide which characters should be accepted.
min - the minimum number of characters expected.
max - the maximum number of characters that should be read.
- Returns:
- a string with all characters accepted by the given filter limited to the length of max and the end of this scanner. Will be the empty string if no character was accepted.
- See Also:
-
getOriginalString
public String getOriginalString()
This method gets the original string to parse.
- Returns:
- the original string.
- See Also:
-
close
public void close()
-