scala.xml.parsing

MarkupParser

trait MarkupParser extends MarkupParserCommon with TokenTests

An XML parser.

Parses XML 1.0, invokes callback methods of a MarkupHandler and returns whatever the markup handler returns. Use ConstructingParser if you just want to parse XML to construct instances of scala.xml.Node.

While XML elements are returned, DTD declarations - if handled - are collected using side-effects.

known subclasses: XhtmlParser, ConstructingParser

Inherits

  1. MarkupParserCommon
  2. TokenTests
  3. AnyRef
  4. Any

Type Members

  1. type InputType = Source

  2. type PositionType = Int

Value Members

  1. def appendText(pos: Int, ts: NodeBuffer, txt: String): Unit

  2. def attrDecl(): Unit

    <! attlist := ATTLIST

  3. var ch: Char

    holds the next character

  4. def checkPubID(s: String): Boolean

  5. def checkSysID(s: String): Boolean

  6. def content(pscope: NamespaceBinding): NodeSeq

    content1 ::= '<' content1 | '&' charref ...

  7. def content1(pscope: NamespaceBinding, ts: NodeBuffer): Unit

    '<' content1 ::= ...

  8. var curInput: Source

  9. def document(): Document

    [22] prolog      ::= XMLDecl? Misc* (doctypedecl Misc*)?
    [23] XMLDecl     ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'
    [24] VersionInfo ::= S 'version' Eq ("'" VersionNum "'" | '"' VersionNum '"')
    [25] Eq          ::= S? '=' S?
    [26] VersionNum  ::= '1.0'
    [27] Misc        ::= Comment | PI | S

  10. var dtd: DTD

  11. def element(pscope: NamespaceBinding): NodeSeq

  12. def element1(pscope: NamespaceBinding): NodeSeq

    '<' element ::= xmlTag1 '>' { xmlExpr | '{' simpleExpr '}' } ETag | xmlTag1 '/' '>'

  13. def elementDecl(): Unit

    <! element := ELEMENT

  14. def entityDecl(): Unit

    <! entity := ENTITY

  15. var eof: Boolean

  16. def equals(arg0: Any): Boolean

    This method is used to compare the receiver object (this) with the argument object (arg0) for equivalence.

    The default implementation of this method is an equivalence relation:

    • It is reflexive: for any instance x of type Any, x.equals(x) should return true.
    • It is symmetric: for any instances x and y of type Any, x.equals(y) should return true if and only if y.equals(x) returns true.
    • It is transitive: for any instances x, y, and z of type AnyRef if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) should return true.

    If you override this method, you should verify that your implementation remains an equivalence relation. Additionally, when overriding this method it is often necessary to override hashCode to ensure that objects that are "equal" (o1.equals(o2) returns true) hash to the same Int (o1.hashCode.equals(o2.hashCode)).

    arg0

    the object to compare against this object for equality.

    returns

    true if the receiver object is equivalent to the argument; false otherwise.

    definition classes: AnyRef ⇐ Any
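
    A minimal sketch of an override that respects this contract, assuming a hypothetical EntityKey class that is not part of scala.xml. Both methods are derived from the same two fields, so instances that compare equal always hash alike.

    ```scala
    object EqualsSketch {
      // Hypothetical class, used only to illustrate the equals/hashCode contract.
      final class EntityKey(val name: String, val systemId: String) {
        // Reflexive, symmetric and transitive: only the two fields are compared.
        override def equals(other: Any): Boolean = other match {
          case that: EntityKey => name == that.name && systemId == that.systemId
          case _               => false
        }

        // Built from the same fields as equals, so equal keys share a hash code.
        override def hashCode: Int = 31 * name.hashCode + systemId.hashCode
      }
    }
    ```

    The final modifier keeps subclasses from adding fields that would break symmetry.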
  17. var extIndex: Int

  18. def extSubset(): Unit

  19. def externalID(): ExternalID

    externalID ::= SYSTEM S syslit | PUBLIC S pubid S syslit

  20. def externalSource(systemLiteral: String): Source

  21. def hashCode(): Int

    Returns a hash code value for the object.

    The default hashing algorithm is platform dependent.

    Note that it is allowed for two objects to have identical hash codes (o1.hashCode.equals(o2.hashCode)) yet not be equal (o1.equals(o2) returns false). A degenerate implementation could always return 0. However, it is required that if two objects are equal (o1.equals(o2) returns true) that they have identical hash codes (o1.hashCode.equals(o2.hashCode)). Therefore, when overriding this method, be sure to verify that the behavior is consistent with the equals method.

    definition classes: AnyRef ⇐ Any
  22. def initialize: MarkupParser with MarkupHandler

    As the current code requires you to call nextch once manually after construction, this method formalizes that suboptimal reality.

  23. var inpStack: List[Source]

    stack of inputs

  24. val input: Source

  25. def intSubset(): Unit

    "rec-xml/#ExtSubset" pe references may not occur within markup declarations

  26. def isAlpha(c: Char): Boolean

    These are 99% sure to be redundant but refactoring on the safe side.

    definition classes: TokenTests
  27. def isAlphaDigit(c: Char): Boolean

  28. def isName(s: String): Boolean

    Name ::= ( Letter | '_' ) (NameChar)*

    see [5] of XML 1.0 specification

    definition classes: TokenTests
  29. def isNameChar(ch: Char): Boolean

    NameChar ::= Letter | Digit | '.' | '-' | '_' | ':' | CombiningChar | Extender

    see [4] and Appendix B of XML 1.0 specification

    definition classes: TokenTests
  30. def isNameStart(ch: Char): Boolean

    NameStart ::= ( Letter | '_' ) where Letter means in one of the Unicode general categories { Ll, Lu, Lo, Lt, Nl }

    We do not allow a name to start with ':'. see [3] and Appendix B of XML 1.0 specification

    definition classes: TokenTests
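
    The rule above can be sketched with java.lang.Character.getType, which reports a character's Unicode general category. NameStartSketch and its member names are hypothetical; this illustrates the stated rule and is not the TokenTests source.

    ```scala
    object NameStartSketch {
      // Unicode general categories Ll, Lu, Lo, Lt and Nl, as reported by Character.getType.
      private val letterCategories: Set[Int] = Set(
        Character.LOWERCASE_LETTER.toInt,
        Character.UPPERCASE_LETTER.toInt,
        Character.OTHER_LETTER.toInt,
        Character.TITLECASE_LETTER.toInt,
        Character.LETTER_NUMBER.toInt
      )

      // "Letter" in the sense used by the NameStart rule above.
      def isLetter(ch: Char): Boolean =
        letterCategories.contains(Character.getType(ch))

      // NameStart ::= ( Letter | '_' ); ':' is deliberately not accepted.
      def isNameStart(ch: Char): Boolean = ch == '_' || isLetter(ch)
    }
    ```

    Under this rule a digit, a hyphen or a colon cannot open a name, while an underscore or any letter can.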
  31. def isPubIDChar(ch: Char): Boolean

  32. def isValidIANAEncoding(ianaEncoding: Seq[Char]): Boolean

    Returns true if the encoding name is a valid IANA encoding. This method does not verify that there is a decoder available for this encoding, only that the characters are valid for an IANA encoding name.

    ianaEncoding

    The IANA encoding name.

    definition classes: TokenTests
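
    A rough sketch of a character-level check of this kind, assuming the rule applied is the EncName production of XML 1.0 ([81] EncName ::= [A-Za-z] ([A-Za-z0-9._] | '-')*). EncNameSketch and looksLikeEncName are hypothetical names; the actual TokenTests implementation may differ.

    ```scala
    object EncNameSketch {
      private def isAsciiLetter(c: Char): Boolean =
        (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')

      private def isAsciiDigit(c: Char): Boolean = c >= '0' && c <= '9'

      // Checks only the shape of the name: an ASCII letter first, then
      // letters, digits, '.', '_' or '-'. No decoder lookup is attempted.
      def looksLikeEncName(name: Seq[Char]): Boolean =
        name.nonEmpty &&
          isAsciiLetter(name.head) &&
          name.tail.forall { c =>
            isAsciiLetter(c) || isAsciiDigit(c) || c == '.' || c == '_' || c == '-'
          }
    }
    ```

    A made-up name such as "no-such-charset" passes this check, which matches the note that decoder availability is not verified.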
  33. def lookahead(): BufferedIterator[Char]

    Create a lookahead reader which does not influence the input

  34. def markupDecl(): Unit

  35. def markupDecl1(): Any

  36. def nextch: Char

    Assigns the next character to ch and advances the input.

  37. def normalizeAttributeValue(attval: String): String

    For the moment, replaces only character references; see section 3.3.3 of the XML 1.0 specification. Precondition: cbuf is empty.

  38. def notationDecl(): Unit

    'N' notationDecl ::= "OTATION"

  39. def parseDTD(): Unit

    parses document type declaration and assigns it to instance variable dtd.

    <! parseDTD ::= DOCTYPE name ... >

  40. def pop(): Unit

  41. var pos: Int

    holds the position in the source file

  42. val preserveWS: Boolean

    if true, does not remove surplus whitespace

    attributes: abstract
  43. def prolog(): (Option[String], Option[String], Option[Boolean])

    <? prolog ::= xml S? // this is a bit more lenient than necessary...

  44. def pubidLiteral(): String

  45. def push(entityName: String): Unit

  46. def pushExternal(systemId: String): Unit

  47. def reportSyntaxError(str: String): Unit

  48. def reportSyntaxError(pos: Int, str: String): Unit

  49. def reportValidationError(pos: Int, str: String): Unit

  50. def returning[T](x: T)(f: (T) ⇒ Unit): T

  51. def systemLiteral(): String

    system literal, terminated by either ' or ". SystemLiteral ::= ' { _ } ' | " { _ } "

  52. def textDecl(): (Option[String], Option[String])

    prolog, but without standalone

  53. var tmppos: Int

    holds temporary values of pos

  54. def toString(): String

    Returns a string representation of the object.

    The default representation is platform dependent.

    definition classes: AnyRef ⇐ Any
  55. def xAttributeValue(): String

    attribute value, terminated by either ' or ". value may not contain <. AttValue ::= ' { _ } ' | " { _ } "

  56. def xAttributes(pscope: NamespaceBinding): (MetaData, NamespaceBinding)

    parse attribute and create namespace scope, metadata [41] Attributes ::= { S Name Eq AttValue }

  57. def xCharData: NodeSeq

    '<! CharData ::= [CDATA[ ( {char} - {char}"]]>"{char} ) ']]>'

    see [18] of XML 1.0 specification

  58. def xCharRef(ch: () ⇒ Char, nextch: () ⇒ Unit): String

    CharRef ::= "&#" '0'..'9' {'0'..'9'} ";" | "&#x" '0'..'9'|'A'..'F'|'a'..'f' { hexdigit } ";"

    see [66]
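
    To illustrate the grammar, here is a small stand-alone decoder that takes a complete reference as a String. It is only a sketch: xCharRef itself reads characters through the ch and nextch callbacks, and CharRefSketch and decode are hypothetical names.

    ```scala
    object CharRefSketch {
      private def isDecDigit(c: Char): Boolean = c >= '0' && c <= '9'

      private def isHexDigit(c: Char): Boolean =
        isDecDigit(c) || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F')

      // Decodes "&#65;" or "&#x41;" to the character it names; None if malformed.
      def decode(ref: String): Option[String] = {
        // Pick the digit run and the radix from the prefix ("&#x" must be tested first).
        val (rest, radix) =
          if (ref.startsWith("&#x")) (ref.drop(3), 16)
          else if (ref.startsWith("&#")) (ref.drop(2), 10)
          else ("", 10)

        if (!rest.endsWith(";")) None
        else {
          val digits = rest.dropRight(1)
          val wellFormed =
            digits.nonEmpty &&
              digits.forall(c => if (radix == 16) isHexDigit(c) else isDecDigit(c))
          if (!wellFormed) None
          else {
            // parseInt throws on overflow, so very long digit runs yield None.
            val codePoint =
              try Some(Integer.parseInt(digits, radix))
              catch { case _: NumberFormatException => None }
            codePoint
              .filter(cp => Character.isValidCodePoint(cp))
              .map(cp => new String(Character.toChars(cp)))
          }
        }
      }
    }
    ```

    The result is a String rather than a Char because a reference may name a code point outside the Basic Multilingual Plane, which needs two UTF-16 code units.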

  59. def xComment: NodeSeq

    Comment ::= '<!--' ((Char - '-') | ('-' (Char - '-')))* '-->'

    see [15]

  60. def xEQ: Unit

    scan [S] '=' [S]

    definition classes: MarkupParserCommon
  61. def xEndTag(n: String): Unit

    [42] '<' xmlEndTag ::= '<' '/' Name S? '>'

  62. def xEntityValue(): String

    entity value, terminated by either ' or ". value may not contain <. AttValue ::= ' { _ } ' | " { _ } "

  63. def xHandleError(that: Char, msg: String): Unit

  64. def xName: String

    Name ::= (Letter | '_' | ':') (NameChar)*

    see [5] of XML 1.0 specification

  65. def xProcInstr: NodeSeq

    '<?' ProcInstr ::= Name [S ({Char} - ({Char} '?>' {Char}))] '?>'

    see [16]

  66. def xSpace: Unit

    scan [3] S ::= (#x20 | #x9 | #xD | #xA)+

    definition classes: MarkupParserCommon
  67. def xSpaceOpt: Unit

    skip optional space S?

    definition classes: MarkupParserCommon
  68. def xText: String

    parse character data. precondition: xEmbeddedBlock == false (we are not in a scala block)

  69. def xToken(that: Seq[Char]): Unit

  70. def xToken(that: Char): Unit

  71. def xmlProcInstr(): MetaData

    <? prolog ::= xml S ... ?>