public final class CharStreams extends Object
The primary interface for creating CharStreams
from a variety of sources as of 4.7. The motivation was to support
Unicode code points > U+FFFF. ANTLRInputStream and ANTLRFileStream
are now deprecated in favor of the streams created by this interface.
DEPRECATED: new ANTLRFileStream("myinputfile")
NEW: CharStreams.fromFileName("myinputfile")
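To see why code points above U+FFFF motivated the change, it helps to look at how Java stores them. The sketch below uses only the JDK, not the ANTLR runtime, and the class name CodePointDemo is hypothetical. It shows that a supplementary character occupies two 16-bit char values yet counts as a single code point, which is the unit a code-point-based stream would index.

```java
// Hypothetical demo class; uses only the JDK, not the ANTLR runtime.
final class CodePointDemo {
    /** Number of 16-bit UTF-16 code units, the unit a char-based stream indexes. */
    static int codeUnits(String s) {
        return s.length();
    }

    /** Number of Unicode code points, the unit a code-point-based stream indexes. */
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // U+1F600 lies above U+FFFF, so UTF-16 encodes it as a surrogate pair.
        String text = "a" + new String(Character.toChars(0x1F600));
        System.out.println(codeUnits(text));  // 3
        System.out.println(codePoints(text)); // 2
    }
}
```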
WARNING: If you use both the deprecated and the new streams, you will see
a nontrivial performance degradation. The speed hit occurs because the
Lexer's internal code goes from monomorphic to megamorphic
dynamic dispatch when it gets characters from the input stream. Java's
just-in-time (JIT) compiler is then unable to perform the same optimizations,
so stick with either the old or the new streams if performance is
a primary concern. See the extreme debugging and spelunking
needed to identify this issue in our timing rig:
https://github.com/antlr/antlr4/pull/1781
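The dispatch issue in the warning above can be pictured with a small sketch. All types here (SymbolSource, ByteBackedSource, CharBackedSource, DispatchDemo) are hypothetical stand-ins, not ANTLR classes, and the code makes no attempt to measure the slowdown. It only shows what it means for several implementations to flow through one call site.

```java
// Hypothetical types for illustration; these are not ANTLR classes.
interface SymbolSource {
    int la(int index);
    int size();
}

final class ByteBackedSource implements SymbolSource {
    private final byte[] data;
    ByteBackedSource(byte[] data) { this.data = data; }
    public int la(int index) { return data[index] & 0xFF; }
    public int size() { return data.length; }
}

final class CharBackedSource implements SymbolSource {
    private final char[] data;
    CharBackedSource(char[] data) { this.data = data; }
    public int la(int index) { return data[index]; }
    public int size() { return data.length; }
}

final class DispatchDemo {
    /**
     * source.la(i) is a single call site. If only one implementation ever
     * reaches it, the JIT can treat the call as monomorphic and inline it.
     * The more implementations that flow through it, the less the JIT can assume.
     */
    static long sum(SymbolSource source) {
        long total = 0;
        for (int i = 0; i < source.size(); i++) {
            total += source.la(i);
        }
        return total;
    }

    public static void main(String[] args) {
        // Two different receiver types now pass through the same call site.
        System.out.println(sum(new ByteBackedSource(new byte[] {1, 2, 3})));
        System.out.println(sum(new CharBackedSource(new char[] {4, 5, 6})));
    }
}
```

Both calls give correct results. The cost described in the warning is purely one of speed, which is why it is easy to miss without a timing rig.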
The ANTLR character streams still buffer all the input when you create
the stream, as they have done for ~20 years. If you need unbuffered
access, please note that it becomes challenging to create
parse trees. The parse tree has to point to tokens which will either
point into a stale location in an unbuffered stream or you have to copy
the characters out of the buffer into the token. That defeats the purpose
of unbuffered input. Per the ANTLR book, unbuffered streams are primarily
useful for processing infinite streams *during the parse.*
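The trade-off above can be sketched with a token that stores only indexes into the input. TokenSpan is a hypothetical class, not ANTLR's token type, and the inclusive start/stop convention is an assumption made for the illustration. The text can be recovered lazily only while the whole input remains buffered; with an unbuffered stream the characters would have to be copied into the token up front.

```java
// Hypothetical sketch; TokenSpan is not an ANTLR class.
final class TokenSpan {
    final int start; // index of the first character, inclusive
    final int stop;  // index of the last character, inclusive

    TokenSpan(int start, int stop) {
        this.start = start;
        this.stop = stop;
    }

    /** Recovers the token text lazily; valid only while the whole input is still buffered. */
    String text(String bufferedInput) {
        return bufferedInput.substring(start, stop + 1);
    }

    public static void main(String[] args) {
        String input = "x = 42;";
        TokenSpan number = new TokenSpan(4, 5);
        System.out.println(number.text(input)); // 42
    }
}
```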
The new streams also use 8-bit buffers when possible, so this new
interface supports character streams that use half as much memory
as the old ANTLRFileStream, which assumed 16-bit characters.
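One way to picture "8-bit buffers when possible" is to pick the narrowest storage that fits the largest code point in the input. The sketch below is a guess at the general idea, not ANTLR's actual implementation; BufferWidthDemo and bytesPerCodePoint are hypothetical names.

```java
// Hypothetical sketch of width selection; not ANTLR's actual implementation.
final class BufferWidthDemo {
    /** Returns the storage needed per code point, in bytes: 1, 2, or 4. */
    static int bytesPerCodePoint(String s) {
        int max = s.codePoints().max().orElse(0);
        if (max <= 0xFF) {
            return 1; // every code point fits in 8 bits
        }
        if (max <= 0xFFFF) {
            return 2; // fits in 16 bits
        }
        return 4;     // at least one code point above U+FFFF
    }

    public static void main(String[] args) {
        System.out.println(bytesPerCodePoint("hello"));  // 1
        System.out.println(bytesPerCodePoint("\u4E2D")); // 2
        System.out.println(bytesPerCodePoint(new String(Character.toChars(0x1F600)))); // 4
    }
}
```

For input where every code point fits in 8 bits, one byte per character is half of what a 16-bit-per-character buffer would need.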
A big shout out to Ben Hamilton (github bhamiltoncx) for his superhuman
efforts across all targets to get true Unicode 3.1 support for U+10FFFF.

Modifier and Type | Method and Description |
---|---|
static CharStream | fromChannel(ReadableByteChannel channel): Creates a CharStream given an opened ReadableByteChannel containing UTF-8 bytes. |
static CharStream | fromChannel(ReadableByteChannel channel, Charset charset): Creates a CharStream given an opened ReadableByteChannel and the charset of the bytes contained in the channel. |
static CodePointCharStream | fromChannel(ReadableByteChannel channel, Charset charset, int bufferSize, CodingErrorAction decodingErrorAction, String sourceName, long inputSize) |
static CodePointCharStream | fromChannel(ReadableByteChannel channel, int bufferSize, CodingErrorAction decodingErrorAction, String sourceName): Creates a CharStream given an opened ReadableByteChannel containing UTF-8 bytes. |
static CharStream | fromFileName(String fileName): Creates a CharStream given a string containing a path to a UTF-8 file on disk. |
static CharStream | fromFileName(String fileName, Charset charset): Creates a CharStream given a string containing a path to a file on disk and the charset of the bytes contained in the file. |
static CharStream | fromPath(Path path): Creates a CharStream given a path to a UTF-8 encoded file on disk. |
static CharStream | fromPath(Path path, Charset charset): Creates a CharStream given a path to a file on disk and the charset of the bytes contained in the file. |
static CodePointCharStream | fromReader(Reader r): Creates a CharStream given a Reader. |
static CodePointCharStream | fromReader(Reader r, String sourceName): Creates a CharStream given a Reader and its source name. |
static CharStream | fromStream(InputStream is): Creates a CharStream given an opened InputStream containing UTF-8 bytes. |
static CharStream | fromStream(InputStream is, Charset charset): Creates a CharStream given an opened InputStream and the charset of the bytes contained in the stream. |
static CharStream | fromStream(InputStream is, Charset charset, long inputSize) |
static CodePointCharStream | fromString(String s): Creates a CharStream given a String. |
static CodePointCharStream | fromString(String s, String sourceName) |
public static CharStream fromPath(Path path) throws IOException
Creates a CharStream given a path to a UTF-8 encoded file on disk.
Reads the entire contents of the file into the result before returning.
Throws: IOException
public static CharStream fromPath(Path path, Charset charset) throws IOException
Creates a CharStream given a path to a file on disk and the
charset of the bytes contained in the file.
Reads the entire contents of the file into the result before returning.
Throws: IOException
public static CharStream fromFileName(String fileName) throws IOException
Creates a CharStream given a string containing a path to a UTF-8 file on disk.
Reads the entire contents of the file into the result before returning.
Throws: IOException
public static CharStream fromFileName(String fileName, Charset charset) throws IOException
Creates a CharStream given a string containing a path to a file on disk
and the charset of the bytes contained in the file.
Reads the entire contents of the file into the result before returning.
Throws: IOException
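The phrase "reads the entire contents of the file into the result before returning" means the file-based factories load eagerly. A minimal JDK-only sketch of that behavior follows. WholeFileDemo and readAll are hypothetical names, and the result is a plain String rather than a CharStream, so this shows the loading pattern only, not what ANTLR does internally.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper; shows eager loading with JDK calls only.
final class WholeFileDemo {
    /** Loads every byte of the file, then decodes it with the given charset. */
    static String readAll(Path path, Charset charset) throws IOException {
        byte[] bytes = Files.readAllBytes(path);
        return new String(bytes, charset);
    }

    /** String-path convenience overload that assumes UTF-8, like the one-argument factories. */
    static String readAll(String fileName) throws IOException {
        return readAll(Paths.get(fileName), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("charstreams-demo", ".txt");
        try {
            Files.write(tmp, "h\u00E9llo".getBytes(StandardCharsets.UTF_8));
            System.out.println(readAll(tmp.toString()).length()); // 5
        } finally {
            Files.delete(tmp);
        }
    }
}
```

Because the whole file is in memory once the call returns, later file changes do not affect the loaded text.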
public static CharStream fromStream(InputStream is) throws IOException
Creates a CharStream given an opened InputStream containing UTF-8 bytes.
Reads the entire contents of the InputStream into
the result before returning, then closes the InputStream.
Throws: IOException
public static CharStream fromStream(InputStream is, Charset charset) throws IOException
Creates a CharStream given an opened InputStream and the
charset of the bytes contained in the stream.
Reads the entire contents of the InputStream into
the result before returning, then closes the InputStream.
Throws: IOException
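The "read everything, then close" contract described for fromStream can be sketched with plain JDK I/O. DrainStreamDemo and drainAndClose are hypothetical names, and the method returns a String rather than a CharStream. The point is the ownership rule: the callee closes the stream, so the caller should not reuse it afterwards.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; mirrors the documented contract, not ANTLR's code.
final class DrainStreamDemo {
    /** Reads the stream to the end, decodes the bytes, and closes the stream. */
    static String drainAndClose(InputStream is, Charset charset) throws IOException {
        // try-with-resources closes the stream even if reading fails.
        try (InputStream in = is) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return new String(out.toByteArray(), charset);
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream("a + b".getBytes(StandardCharsets.UTF_8));
        System.out.println(drainAndClose(in, StandardCharsets.UTF_8)); // a + b
    }
}
```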
public static CharStream fromStream(InputStream is, Charset charset, long inputSize) throws IOException
Throws: IOException
public static CharStream fromChannel(ReadableByteChannel channel) throws IOException
Creates a CharStream given an opened ReadableByteChannel containing UTF-8 bytes.
Reads the entire contents of the channel into
the result before returning, then closes the channel.
Throws: IOException
public static CharStream fromChannel(ReadableByteChannel channel, Charset charset) throws IOException
Creates a CharStream given an opened ReadableByteChannel and the
charset of the bytes contained in the channel.
Reads the entire contents of the channel into
the result before returning, then closes the channel.
Throws: IOException
public static CodePointCharStream fromReader(Reader r) throws IOException
Creates a CharStream given a Reader. Closes the reader before returning.
Throws: IOException
public static CodePointCharStream fromReader(Reader r, String sourceName) throws IOException
Creates a CharStream given a Reader and its
source name. Closes the reader before returning.
Throws: IOException
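A Reader delivers 16-bit char values, so producing code points from it means joining surrogate pairs after reading. The JDK-only sketch below drains a Reader, closes it as the description above says, and returns the text as code points. ReaderDrainDemo and codePointsFrom are hypothetical names, and an int[] stands in for the stream that ANTLR would return.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical helper; illustrates the contract, not ANTLR's implementation.
final class ReaderDrainDemo {
    /** Reads all characters, closes the reader, and returns the text as code points. */
    static int[] codePointsFrom(Reader r) throws IOException {
        try (Reader in = r) {
            StringBuilder sb = new StringBuilder();
            char[] chunk = new char[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                sb.append(chunk, 0, n);
            }
            // codePoints() joins each surrogate pair into one int value.
            return sb.toString().codePoints().toArray();
        }
    }

    public static void main(String[] args) throws IOException {
        String text = "a" + new String(Character.toChars(0x1F600));
        System.out.println(codePointsFrom(new StringReader(text)).length); // 2
    }
}
```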
public static CodePointCharStream fromString(String s)
Creates a CharStream given a String.
public static CodePointCharStream fromString(String s, String sourceName)
public static CodePointCharStream fromChannel(ReadableByteChannel channel, int bufferSize, CodingErrorAction decodingErrorAction, String sourceName) throws IOException
Creates a CharStream given an opened ReadableByteChannel containing UTF-8 bytes.
Reads the entire contents of the channel into
the result before returning, then closes the channel.
Throws: IOException
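The decodingErrorAction parameter has the JDK type java.nio.charset.CodingErrorAction. The sketch below shows how a JDK CharsetDecoder reacts to each of the three standard actions when it meets an invalid UTF-8 byte. That ANTLR passes the action to its decoder in this way is an assumption, and DecodingErrorDemo and decodeUtf8 are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper built on the JDK decoder; not part of ANTLR.
final class DecodingErrorDemo {
    /** Decodes UTF-8 bytes, applying the given action to malformed or unmappable input. */
    static String decodeUtf8(byte[] bytes, CodingErrorAction action) throws CharacterCodingException {
        return StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action)
                .decode(ByteBuffer.wrap(bytes))
                .toString();
    }

    public static void main(String[] args) throws CharacterCodingException {
        byte[] bad = {'a', (byte) 0xFF, 'b'}; // 0xFF never appears in valid UTF-8

        // REPLACE substitutes U+FFFD for the bad byte, giving three characters.
        System.out.println(decodeUtf8(bad, CodingErrorAction.REPLACE).length()); // 3

        // IGNORE drops the bad byte silently.
        System.out.println(decodeUtf8(bad, CodingErrorAction.IGNORE)); // ab

        // REPORT makes decoding fail with a CharacterCodingException.
        try {
            decodeUtf8(bad, CodingErrorAction.REPORT);
        } catch (CharacterCodingException e) {
            System.out.println("decoding failed");
        }
    }
}
```

REPORT suits input that must be valid, while REPLACE keeps character positions roughly aligned with the original bytes, which can help when reporting error locations.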
public static CodePointCharStream fromChannel(ReadableByteChannel channel, Charset charset, int bufferSize, CodingErrorAction decodingErrorAction, String sourceName, long inputSize) throws IOException
Throws: IOException
Copyright © 1992–2020 ANTLR. All rights reserved.