net.sf.mmm.util.xml.base
Class XmlUtilImpl

java.lang.Object
  extended by net.sf.mmm.util.component.base.AbstractComponent
      extended by net.sf.mmm.util.xml.base.XmlUtilImpl
All Implemented Interfaces:
XmlUtil

@Singleton
@Named
public class XmlUtilImpl
extends AbstractComponent
implements XmlUtil

This utility class contains methods that help to deal with XML.

Since:
1.0.2
Author:
Joerg Hohwiller (hohwille at users.sourceforge.net)
See Also:
DomUtilImpl

Field Summary
private static Map<String,Character> ENTITY_MAP
           
private static XmlUtil instance
           
 
Fields inherited from interface net.sf.mmm.util.xml.api.XmlUtil
NAMESPACE_PREFIX_SCHEMA, NAMESPACE_PREFIX_SCHEMA_INSTANCE, NAMESPACE_PREFIX_SVG, NAMESPACE_PREFIX_XHTML, NAMESPACE_PREFIX_XINCLUDE, NAMESPACE_PREFIX_XLINK, NAMESPACE_PREFIX_XML, NAMESPACE_PREFIX_XML_EVENTS, NAMESPACE_PREFIX_XMLNS, NAMESPACE_PREFIX_XPATH_FUNCTIONS, NAMESPACE_PREFIX_XSLT, NAMESPACE_URI_MATHML, NAMESPACE_URI_RELAXNG_ANNOTATION, NAMESPACE_URI_RELAXNG_STRUCTURE, NAMESPACE_URI_SCHEMA, NAMESPACE_URI_SCHEMA_INSTANCE, NAMESPACE_URI_SVG, NAMESPACE_URI_XHTML, NAMESPACE_URI_XINCLUDE, NAMESPACE_URI_XLINK, NAMESPACE_URI_XML, NAMESPACE_URI_XML_EVENTS, NAMESPACE_URI_XMLNS, NAMESPACE_URI_XPATH_FUNCTIONS, NAMESPACE_URI_XSLT
 
Constructor Summary
XmlUtilImpl()
          The constructor.
 
Method Summary
 Reader createXmlReader(InputStream inputStream)
          This method creates a Reader from the given inputStream that uses the encoding specified in the (potential) XML header of the InputStream's content.
 Reader createXmlReader(InputStream inputStream, Charset defaultCharset)
          This method creates a Reader from the given inputStream that uses the encoding specified in the (potential) XML header of the InputStream's content.
 String escapeXml(String string, boolean escapeQuotations)
          This method escapes the given string for usage in XML (or HTML, etc.).
 void escapeXml(String string, Writer writer, boolean escapeQuotations)
          This method writes the given string to the writer while escaping special characters for XML (or HTML, etc.).
 ParserState extractPlainText(String htmlFragment, StringBuilder buffer, ParserState parserState)
          This method extracts the plain text from the given htmlFragment and appends it to the given buffer.
static XmlUtil getInstance()
          This method gets the singleton instance of this XmlUtilImpl.
 Character resolveEntity(String entityName)
          This method resolves an HTML entity given by entityName.
 
Methods inherited from class net.sf.mmm.util.component.base.AbstractComponent
doInitialize, doInitialized, getInitializationState, initialize
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

instance

private static XmlUtil instance
See Also:
getInstance()

ENTITY_MAP

private static final Map<String,Character> ENTITY_MAP
See Also:
resolveEntity(String)
Constructor Detail

XmlUtilImpl

public XmlUtilImpl()
The constructor.

Method Detail

getInstance

public static XmlUtil getInstance()
This method gets the singleton instance of this XmlUtilImpl.
This design is the best compromise between easy access (via this indirection you have direct, static access to all offered functionality) and IoC-style design which allows extension and customization.
For IoC usage, simply ignore all static getInstance() methods and construct new instances via the container-framework of your choice (like plexus, pico, springframework, etc.). To wire up the dependent components everything is properly annotated using common-annotations (JSR-250). If your container does NOT support this, you should consider using a better one.

Returns:
the singleton instance.
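
The two access styles described above can be illustrated with a minimal sketch. GreeterSketch and its members are hypothetical names invented for this example. How XmlUtilImpl actually creates and initializes its instance is not shown on this page, so the lazy creation below is an assumption.

```java
import java.util.Objects;

/** Hypothetical component standing in for a utility such as XmlUtilImpl. */
class GreeterSketch {

  // Shared instance used only by the static access path (assumption: created lazily).
  private static GreeterSketch instance;

  private final String greeting;

  /** Accessible constructor, so a container or a test can build its own instance. */
  GreeterSketch(String greeting) {
    this.greeting = Objects.requireNonNull(greeting);
  }

  /** Static access path: creates a default instance on first use and then reuses it. */
  static synchronized GreeterSketch getInstance() {
    if (instance == null) {
      instance = new GreeterSketch("Hello");
    }
    return instance;
  }

  String greet(String name) {
    return this.greeting + ", " + name + "!";
  }
}
```

Code that wants convenience calls GreeterSketch.getInstance(), while a container-managed application ignores the static method and lets the container call the constructor, which keeps the component replaceable.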

createXmlReader

public Reader createXmlReader(InputStream inputStream)
This method creates a Reader from the given inputStream that uses the encoding specified in the (potential) XML header of the InputStream's content. If no XML header is specified, the default encoding is used.

Specified by:
createXmlReader in interface XmlUtil
Parameters:
inputStream - a fresh (not yet read) input stream that is expected to deliver the content of an XML document.
Returns:
a reader on the given inputStream that honors the encoding specified in the (potential) XML header.

createXmlReader

public Reader createXmlReader(InputStream inputStream,
                              Charset defaultCharset)
This method creates a Reader from the given inputStream that uses the encoding specified in the (potential) XML header of the InputStream's content. If no XML header is specified, the given defaultCharset is used.

Specified by:
createXmlReader in interface XmlUtil
Parameters:
inputStream - a fresh (not yet read) input stream that is expected to deliver the content of an XML document.
defaultCharset - the Charset used if no encoding was specified via an XML header.
Returns:
a reader on the given inputStream that honors the encoding specified in the (potential) XML header.
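
The idea behind both createXmlReader variants can be sketched with plain JDK classes. This is not the library's implementation. XmlReaderSketch, HEADER_LIMIT and readAll are hypothetical names, and the sketch assumes an ASCII-compatible encoding without a byte order mark, so that the XML declaration can be read before the real charset is known.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class XmlReaderSketch {

  // Assumption: the XML declaration fits into the first 256 bytes.
  private static final int HEADER_LIMIT = 256;

  // Matches the encoding pseudo-attribute of an XML declaration at the very start.
  private static final Pattern ENCODING = Pattern.compile(
      "^<\\?xml[^>]*?\\sencoding\\s*=\\s*[\"']([A-Za-z][A-Za-z0-9._\\-]*)[\"']");

  static Reader createXmlReader(InputStream inputStream, Charset defaultCharset) {
    try {
      BufferedInputStream in = new BufferedInputStream(inputStream, HEADER_LIMIT);
      in.mark(HEADER_LIMIT);
      byte[] head = new byte[HEADER_LIMIT];
      int length = in.readNBytes(head, 0, HEADER_LIMIT); // requires Java 9+
      in.reset(); // rewind so the reader sees the document from its first byte
      // ISO-8859-1 maps every byte to one char, which is enough to inspect the header.
      String prefix = new String(head, 0, length, StandardCharsets.ISO_8859_1);
      Matcher matcher = ENCODING.matcher(prefix);
      Charset charset = defaultCharset;
      if (matcher.find() && Charset.isSupported(matcher.group(1))) {
        charset = Charset.forName(matcher.group(1));
      }
      return new InputStreamReader(in, charset);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Helper for demonstrations: drains and closes the reader, returning its content. */
  static String readAll(Reader reader) {
    try (Reader r = reader) {
      StringBuilder result = new StringBuilder();
      char[] chunk = new char[1024];
      int count;
      while ((count = r.read(chunk)) != -1) {
        result.append(chunk, 0, count);
      }
      return result.toString();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

With this sketch, a document declared as ISO-8859-1 is decoded as ISO-8859-1 even when UTF-8 is passed as the default, while a document without a declaration falls back to the default charset.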

escapeXml

public String escapeXml(String string,
                        boolean escapeQuotations)
This method escapes the given string for usage in XML (or HTML, etc.).

Specified by:
escapeXml in interface XmlUtil
Parameters:
string - the string to escape.
escapeQuotations - if true, the ASCII quotation characters (apos '\'' and quot '"') are also escaped; if false, quotations are left untouched. Set this to true if you are writing the value of an attribute.
Returns:
the escaped string.
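
The effect of the escapeQuotations flag can be shown with a minimal sketch. XmlEscapeSketch is a hypothetical class, not the library's code. Which characters XmlUtilImpl escapes beyond the five handled here, and how it treats null, is not stated on this page.

```java
class XmlEscapeSketch {

  /** Escapes &, < and > always, and the two ASCII quotation characters only on request. */
  static String escapeXml(String string, boolean escapeQuotations) {
    StringBuilder out = new StringBuilder(string.length() + 16);
    for (int i = 0; i < string.length(); i++) {
      char c = string.charAt(i);
      switch (c) {
        case '&':
          out.append("&amp;");
          break;
        case '<':
          out.append("&lt;");
          break;
        case '>':
          out.append("&gt;");
          break;
        case '"':
          out.append(escapeQuotations ? "&quot;" : "\"");
          break;
        case '\'':
          out.append(escapeQuotations ? "&apos;" : "'");
          break;
        default:
          out.append(c);
      }
    }
    return out.toString();
  }
}
```

For element content, passing false keeps quotation marks readable. For attribute values, passing true prevents a quote in the data from terminating the attribute.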

escapeXml

public void escapeXml(String string,
                      Writer writer,
                      boolean escapeQuotations)
This method writes the given string to the writer while escaping special characters for XML (or HTML, etc.).

Specified by:
escapeXml in interface XmlUtil
Parameters:
string - the string to escape.
writer - the Writer to which the escaped string is written.
escapeQuotations - if true, the ASCII quotation characters (apos '\'' and quot '"') are also escaped; if false, quotations are left untouched. Set this to true if you are writing the value of an attribute.
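
A writer-based variant avoids building an intermediate string, which helps when streaming larger output. The following is a hedged sketch with hypothetical names (XmlWriterEscapeSketch, attribute). Wrapping IOException in an UncheckedIOException is an assumption of the sketch, since this page does not say how the real method reports write failures.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

class XmlWriterEscapeSketch {

  /** Writes the string to the writer, replacing special characters on the fly. */
  static void escapeXml(String string, Writer writer, boolean escapeQuotations) {
    try {
      for (int i = 0; i < string.length(); i++) {
        char c = string.charAt(i);
        if (c == '&') {
          writer.write("&amp;");
        } else if (c == '<') {
          writer.write("&lt;");
        } else if (c == '>') {
          writer.write("&gt;");
        } else if (c == '"' && escapeQuotations) {
          writer.write("&quot;");
        } else if (c == '\'' && escapeQuotations) {
          writer.write("&apos;");
        } else {
          writer.write(c);
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Usage example: renders name="value" with the value escaped for an attribute. */
  static String attribute(String name, String value) {
    StringWriter writer = new StringWriter();
    writer.write(name);
    writer.write("=\"");
    escapeXml(value, writer, true); // true, because the value sits inside quotes
    writer.write('"');
    return writer.toString();
  }
}
```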

resolveEntity

public Character resolveEntity(String entityName)
This method resolves an HTML entity given by entityName.

Specified by:
resolveEntity in interface XmlUtil
Parameters:
entityName - is the bare name of the entity (e.g. "amp" or "uuml"). Please note that entity-names are case-sensitive.
Returns:
the value of the entity or null if no entity exists for the given entityName.
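
Entity resolution of this kind amounts to a case-sensitive map lookup, as the ENTITY_MAP field suggests. The sketch below is an illustration only. EntitySketch is a hypothetical class, and its table holds a handful of well-known HTML entities, not the library's full table.

```java
import java.util.HashMap;
import java.util.Map;

class EntitySketch {

  // Small excerpt of HTML entities; keys are case-sensitive entity names.
  private static final Map<String, Character> ENTITY_MAP = new HashMap<>();

  static {
    ENTITY_MAP.put("amp", '&');
    ENTITY_MAP.put("lt", '<');
    ENTITY_MAP.put("gt", '>');
    ENTITY_MAP.put("quot", '"');
    ENTITY_MAP.put("apos", '\'');
    ENTITY_MAP.put("nbsp", '\u00A0');
    ENTITY_MAP.put("uuml", '\u00FC'); // lower case u with diaeresis
    ENTITY_MAP.put("Uuml", '\u00DC'); // upper case U with diaeresis
  }

  /** Returns the character for the bare entity name, or null if the name is unknown. */
  static Character resolveEntity(String entityName) {
    return ENTITY_MAP.get(entityName);
  }
}
```

Because "uuml" and "Uuml" name different characters, the lookup must not normalize case, and callers need to handle the null result for unknown names.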

extractPlainText

public ParserState extractPlainText(String htmlFragment,
                                    StringBuilder buffer,
                                    ParserState parserState)
This method extracts the plain text from the given htmlFragment and appends it to the given buffer. This includes removing tags, un-escaping entities and parsing CDATA sections. Unlike DOM parsers, this method is completely fault tolerant, fast, and uses a minimum amount of memory.
ATTENTION:
Be aware that the caller is responsible for reading the HTML with the proper encoding (according to Content-Type from HTTP header and/or META tag).

Specified by:
extractPlainText in interface XmlUtil
Parameters:
htmlFragment - the HTML fragment to parse.
buffer - the buffer to which the plain text is appended.
parserState - the state returned by a previous call, used to continue with subsequent htmlFragments of the same HTML document, or null for a fresh start.
Returns:
the state at the end of htmlFragment. Pass it as the parserState argument on a subsequent call to continue parsing.
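
The role of the returned state can be illustrated with a much reduced sketch that only strips tags. PlainTextSketch and its State class are hypothetical stand-ins (State is not the library's ParserState), and entity un-escaping and CDATA handling are deliberately left out.

```java
class PlainTextSketch {

  /** Hypothetical, reduced parser state: remembers only whether a tag is still open. */
  static final class State {
    boolean insideTag;
  }

  /**
   * Appends all characters outside of tags to the buffer. A tag that is cut off at the
   * end of one fragment is continued in the next call via the returned state.
   */
  static State extractPlainText(String htmlFragment, StringBuilder buffer, State state) {
    State current = (state == null) ? new State() : state;
    for (int i = 0; i < htmlFragment.length(); i++) {
      char c = htmlFragment.charAt(i);
      if (current.insideTag) {
        if (c == '>') {
          current.insideTag = false; // tag closed, text may follow
        }
      } else if (c == '<') {
        current.insideTag = true; // tag opened, skip until '>'
      } else {
        buffer.append(c);
      }
    }
    return current;
  }
}
```

A caller reading a document in chunks passes null for the first chunk and the previously returned state for every later chunk, so a tag split across two chunks is still removed completely.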


Copyright © 2001-2010 mmm-Team. All Rights Reserved.