public abstract class StreamScanner extends ByteBasedScanner
Modifier and Type | Field and Description |
---|---|
protected XmlCharTypes | _charTypes - This is a simple container object that is used to access the decoding tables for characters. |
protected InputStream | _in - Underlying InputStream to use for reading content. |
protected byte[] | _inputBuffer |
protected int[] | _quadBuffer - This buffer is used for name parsing. |
protected ByteBasedPNameTable | _symbols - For now, symbol table contains prefixed names. |
Fields inherited from class ByteBasedScanner
_inputEnd, _inputPtr, _tmpChar, BYTE_a, BYTE_A, BYTE_AMP, BYTE_APOS, BYTE_C, BYTE_CR, BYTE_D, BYTE_EQ, BYTE_EXCL, BYTE_g, BYTE_GT, BYTE_HASH, BYTE_HYPHEN, BYTE_l, BYTE_LBRACKET, BYTE_LF, BYTE_LT, BYTE_m, BYTE_NULL, BYTE_o, BYTE_p, BYTE_P, BYTE_q, BYTE_QMARK, BYTE_QUOT, BYTE_RBRACKET, BYTE_s, BYTE_S, BYTE_SEMICOLON, BYTE_SLASH, BYTE_SPACE, BYTE_t, BYTE_T, BYTE_TAB, BYTE_u, BYTE_x
Fields inherited from class XmlScanner
_attrCollector, _attrCount, _cfgCoalescing, _cfgLazyParsing, _config, _currElem, _currNsCount, _currRow, _currToken, _defaultNs, _depth, _entityPending, _isEmptyTag, _lastNsContext, _lastNsDecl, _nameBuffer, _nsBindingCache, _nsBindingCount, _nsBindings, _nsBindMisses, _pastBytesOrChars, _publicId, _rowStartOffset, _startColumn, _startRawOffset, _startRow, _systemId, _textBuilder, _tokenIncomplete, _tokenName, _xml11, CDATA_STR, INT_0, INT_9, INT_a, INT_A, INT_AMP, INT_APOS, INT_COLON, INT_CR, INT_EQ, INT_EXCL, INT_f, INT_F, INT_GT, INT_HYPHEN, INT_LBRACKET, INT_LF, INT_LT, INT_NULL, INT_QMARK, INT_QUOTE, INT_RBRACKET, INT_SLASH, INT_SPACE, INT_TAB, INT_z, MAX_UNICODE_CHAR, TOKEN_EOI
Fields inherited from interface XmlConsts
CHAR_CR, CHAR_LF, CHAR_NULL, CHAR_SPACE, STAX_DEFAULT_OUTPUT_ENCODING, STAX_DEFAULT_OUTPUT_VERSION, XML_DECL_KW_ENCODING, XML_DECL_KW_STANDALONE, XML_DECL_KW_VERSION, XML_SA_NO, XML_SA_YES, XML_V_10, XML_V_10_STR, XML_V_11, XML_V_11_STR, XML_V_UNKNOWN
Fields inherited from interface XMLStreamConstants
ATTRIBUTE, CDATA, CHARACTERS, COMMENT, DTD, END_DOCUMENT, END_ELEMENT, ENTITY_DECLARATION, ENTITY_REFERENCE, NAMESPACE, NOTATION_DECLARATION, PROCESSING_INSTRUCTION, SPACE, START_DOCUMENT, START_ELEMENT
Constructor and Description |
---|
StreamScanner(ReaderConfig cfg, InputStream in, byte[] buffer, int ptr, int last) |
Modifier and Type | Method and Description |
---|---|
protected void | _closeSource() |
protected int | _nextEntity() - Helper method used to isolate things that need to be (re)set in cases where |
protected void | _releaseBuffers() |
protected PName | addPName(int hash, int[] quads, int qlen, int lastQuadBytes) |
protected int | checkInTreeIndentation(int c) - Note: consecutive white space is only considered indentation if the following token seems like a tag (start/end). |
protected int | checkPrologIndentation(int c) |
protected int | handleCharEntity() |
protected int | handleEndElement() - Note that this method is currently also shareable for all ASCII-based encodings, and at least between UTF-8 and ISO-Latin1. |
protected abstract int | handleEntityInText(boolean inAttr) |
protected abstract int | handleStartElement(byte b) - Parsing of start element requires parsing of the element name (and attribute names), and is thus encoding-specific. |
protected boolean | loadAndRetain(int nrOfChars) |
protected boolean | loadMore() |
protected byte | loadOne() |
protected byte | loadOne(int type) |
protected byte | nextByte() |
protected byte | nextByte(int tt) |
int | nextFromProlog(boolean isProlog) |
int | nextFromTree() |
protected PName | parsePName(byte b) - This method can (for now?) be shared between all ASCII-based encodings, since it only does coarse validity checking; the real checks are done in a different method. |
protected PName | parsePNameLong(int q, int[] quads) |
protected PName | parsePNameMedium(int i2, int q1) |
protected PName | parsePNameSlow(byte b) |
protected abstract String | parsePublicId(byte quoteChar) |
protected abstract String | parseSystemId(byte quoteChar) |
protected byte | skipInternalWs(boolean reqd, String msg) |
Methods inherited from class ByteBasedScanner
addUTFPName, decodeCharForError, getCurrentColumnNr, getCurrentLocation, getEndingByteOffset, getEndingCharOffset, getStartingByteOffset, getStartingCharOffset, markLF, markLF, reportInvalidInitial, reportInvalidOther, setStartLocation
Methods inherited from class XmlScanner
bindName, bindNs, checkImmutableBinding, close, decodeAttrBinaryValue, decodeAttrValue, decodeAttrValues, decodeElements, findAttrIndex, findOrCreateBinding, finishCData, finishCharacters, finishComment, finishDTD, finishPI, finishSpace, finishToken, fireSaxCharacterEvents, fireSaxCommentEvent, fireSaxEndElement, fireSaxPIEvent, fireSaxSpaceEvents, fireSaxStartElement, getAttrCollector, getAttrCount, getAttrLocalName, getAttrNsURI, getAttrPrefix, getAttrPrefixedName, getAttrQName, getAttrType, getAttrValue, getAttrValue, getConfig, getCurrentLineNr, getDepth, getDTDPublicId, getDTDSystemId, getEndLocation, getInputPublicId, getInputSystemId, getName, getNamespacePrefix, getNamespaceURI, getNamespaceURI, getNamespaceURI, getNonTransientNamespaceContext, getNsCount, getPrefix, getPrefixes, getQName, getStartLocation, getText, getText, getTextCharacters, getTextCharacters, getTextLength, handleInvalidXmlChar, hasEmptyStack, isAttrSpecified, isEmptyTag, isTextWhitespace, loadMoreGuaranteed, loadMoreGuaranteed, reportDoubleHyphenInComments, reportDuplicateNsDecl, reportEntityOverflow, reportEofInName, reportIllegalCDataEnd, reportIllegalNsDecl, reportIllegalNsDecl, reportInputProblem, reportInvalidNameChar, reportInvalidNsIndex, reportInvalidXmlChar, reportMissingPISpace, reportMultipleColonsInName, reportPrologProblem, reportPrologUnexpChar, reportPrologUnexpElement, reportTreeUnexpChar, reportUnboundPrefix, reportUnexpandedEntityInAttr, reportUnexpectedEndTag, resetForDecoding, skipCData, skipCharacters, skipCoalescedText, skipComment, skipPI, skipSpace, skipToken, throwInvalidSpace, throwNullChar, throwUnexpectedChar, verifyXmlChar
protected InputStream _in
protected byte[] _inputBuffer
protected final XmlCharTypes _charTypes
protected final ByteBasedPNameTable _symbols
protected int[] _quadBuffer
public StreamScanner(ReaderConfig cfg, InputStream in, byte[] buffer, int ptr, int last)
protected void _releaseBuffers()
_releaseBuffers in class XmlScanner
protected void _closeSource() throws IOException
_closeSource in class ByteBasedScanner
Throws: IOException
protected abstract int handleEntityInText(boolean inAttr) throws XMLStreamException
Throws: XMLStreamException
protected abstract String parsePublicId(byte quoteChar) throws XMLStreamException
Throws: XMLStreamException
protected abstract String parseSystemId(byte quoteChar) throws XMLStreamException
Throws: XMLStreamException
public final int nextFromProlog(boolean isProlog) throws XMLStreamException
nextFromProlog in class XmlScanner
Throws: XMLStreamException
public final int nextFromTree() throws XMLStreamException
nextFromTree in class XmlScanner
Throws: XMLStreamException
protected int _nextEntity()
protected final int handleCharEntity() throws XMLStreamException
Throws: XMLStreamException
protected abstract int handleStartElement(byte b) throws XMLStreamException
Throws: XMLStreamException
protected final int handleEndElement() throws XMLStreamException
Throws: XMLStreamException
protected final PName parsePName(byte b) throws XMLStreamException
Some notes about the assumptions the implementation makes:
Throws: XMLStreamException
protected PName parsePNameMedium(int i2, int q1) throws XMLStreamException
Throws: XMLStreamException
protected final PName parsePNameLong(int q, int[] quads) throws XMLStreamException
Throws: XMLStreamException
protected final PName parsePNameSlow(byte b) throws XMLStreamException
Throws: XMLStreamException
protected final PName addPName(int hash, int[] quads, int qlen, int lastQuadBytes) throws XMLStreamException
Throws: XMLStreamException
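The name-parsing methods above pass names around as int "quads" (quads, qlen, lastQuadBytes). The following is a minimal sketch of one way such packing could work, assuming each quad holds up to four consecutive name bytes in big-endian order and that lastQuadBytes counts the bytes stored in the final quad. The class QuadPackingSketch and its pack and unpack methods are hypothetical and are not part of this class or library; the actual layout may differ.

```java
/** Hypothetical illustration of byte-to-quad packing; not part of StreamScanner. */
final class QuadPackingSketch {

    /** Packed form of a name: the quads, how many are used, and bytes in the last one. */
    static final class Packed {
        final int[] quads;
        final int qlen;
        final int lastQuadBytes;

        Packed(int[] quads, int qlen, int lastQuadBytes) {
            this.quads = quads;
            this.qlen = qlen;
            this.lastQuadBytes = lastQuadBytes;
        }
    }

    /** Packs name bytes four at a time into ints (assumed big-endian within a quad). */
    static Packed pack(byte[] name) {
        if (name.length == 0) {
            throw new IllegalArgumentException("empty name");
        }
        int qlen = (name.length + 3) / 4;
        int[] quads = new int[qlen];
        for (int i = 0; i < name.length; i++) {
            // Shift earlier bytes up and append the next byte as an unsigned value.
            quads[i >> 2] = (quads[i >> 2] << 8) | (name[i] & 0xFF);
        }
        int lastQuadBytes = name.length - (qlen - 1) * 4;
        return new Packed(quads, qlen, lastQuadBytes);
    }

    /** Reverses pack(): recovers the original bytes from the quads. */
    static byte[] unpack(int[] quads, int qlen, int lastQuadBytes) {
        int len = (qlen - 1) * 4 + lastQuadBytes;
        byte[] out = new byte[len];
        for (int i = 0; i < len; i++) {
            int quadIndex = i >> 2;
            int bytesInQuad = (quadIndex == qlen - 1) ? lastQuadBytes : 4;
            int shift = (bytesInQuad - 1 - (i & 3)) * 8;
            out[i] = (byte) (quads[quadIndex] >> shift);
        }
        return out;
    }
}
```

Packing names this way lets a symbol table compare and hash a name a few ints at a time instead of byte by byte, which is presumably why a hash and a quad array are passed to addPName together.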
protected byte skipInternalWs(boolean reqd, String msg) throws XMLStreamException
Throws: XMLStreamException
protected final int checkInTreeIndentation(int c) throws XMLStreamException
Note: consecutive white space is only considered indentation if the following token seems like a tag (start/end). This is so that if a CDATA section follows, it can be coalesced in coalescing mode. Although we could check whether coalescing mode is enabled, doing so should seldom have a significant effect either way, and skipping the check removes one possible source of problems in coalescing mode.
Throws: XMLStreamException
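The rule in the note above can be sketched as a standalone check. This is an illustration only, assuming that "seems like a tag" means a '<' that is not followed by '!' (comment or CDATA) or '?' (processing instruction). The class IndentationSketch and the method indentationLength are hypothetical names, and the real method works on the scanner's internal buffer rather than on an array passed in.

```java
/** Hypothetical illustration of the indentation rule; not part of StreamScanner. */
final class IndentationSketch {

    /**
     * Returns the length of the white space run starting at 'start' if it is
     * followed by something that looks like a start or end tag, or -1 if the
     * run should be handled as ordinary text instead.
     */
    static int indentationLength(byte[] buf, int start, int end) {
        int ptr = start;
        // Skip the run of linefeeds, spaces and tabs.
        while (ptr < end && (buf[ptr] == '\n' || buf[ptr] == ' ' || buf[ptr] == '\t')) {
            ptr++;
        }
        // Two more bytes are needed: '<' and the byte that follows it.
        if (ptr + 1 >= end) {
            return -1;
        }
        if (buf[ptr] != '<') {
            return -1; // character data follows, so this is not indentation
        }
        byte next = buf[ptr + 1];
        if (next == '!' || next == '?') {
            // Assumption: CDATA, comments and processing instructions are not
            // treated as tags, so a following CDATA section can still be coalesced.
            return -1;
        }
        return ptr - start;
    }
}
```

Returning -1 for the CDATA case leaves the white space as part of the surrounding text, which is what allows it to be merged with a following CDATA section when coalescing is enabled.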
protected final int checkPrologIndentation(int c) throws XMLStreamException
Throws: XMLStreamException
protected final boolean loadMore() throws XMLStreamException
loadMore in class XmlScanner
Throws: XMLStreamException
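The fields _in, _inputBuffer, _inputPtr and _inputEnd suggest a conventional refill loop behind loadMore() and nextByte(). The sketch below shows that general pattern under stated assumptions: the class BufferRefillSketch, its field names, the pastBytes counter and the drain helper are all hypothetical, and the real methods report problems as XMLStreamException rather than IOException.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical illustration of a buffer refill loop; not part of StreamScanner. */
final class BufferRefillSketch {
    private final InputStream in;
    private final byte[] inputBuffer;
    private int inputPtr;   // next byte to hand out
    private int inputEnd;   // one past the last valid byte in the buffer
    private long pastBytes; // bytes consumed in earlier buffer loads (assumption)

    BufferRefillSketch(InputStream in, int bufferSize) {
        if (bufferSize < 1) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        this.in = in;
        this.inputBuffer = new byte[bufferSize];
    }

    /** Refills the buffer from the stream; returns false at end of input. */
    boolean loadMore() throws IOException {
        pastBytes += inputEnd;
        inputPtr = 0;
        inputEnd = 0;
        int count = in.read(inputBuffer, 0, inputBuffer.length);
        if (count < 1) {
            return false;
        }
        inputEnd = count;
        return true;
    }

    /** Returns the next byte, refilling first if the buffer is exhausted. */
    byte nextByte() throws IOException {
        if (inputPtr >= inputEnd && !loadMore()) {
            throw new EOFException("unexpected end of input");
        }
        return inputBuffer[inputPtr++];
    }

    /** Absolute offset of the next byte to be read. */
    long currentOffset() {
        return pastBytes + inputPtr;
    }

    /** Reads a whole byte array through a small buffer, to exercise the refill path. */
    static byte[] drain(byte[] data, int bufferSize) {
        BufferRefillSketch s = new BufferRefillSketch(new ByteArrayInputStream(data), bufferSize);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            while (s.inputPtr < s.inputEnd || s.loadMore()) {
                out.write(s.inputBuffer[s.inputPtr++]);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

For example, calling drain on the bytes of "<a/>" with a buffer size of 2 forces two refills before end of input is reported.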
protected final byte nextByte(int tt) throws XMLStreamException
Throws: XMLStreamException
protected final byte nextByte() throws XMLStreamException
Throws: XMLStreamException
protected final byte loadOne() throws XMLStreamException
Throws: XMLStreamException
protected final byte loadOne(int type) throws XMLStreamException
Throws: XMLStreamException
protected final boolean loadAndRetain(int nrOfChars) throws XMLStreamException
Throws: XMLStreamException
Copyright © 2019 FasterXML. All rights reserved.