java.lang.Object
  org.htmlparser.visitors.NodeVisitor
    org.opencms.util.CmsHtmlParser
public class CmsHtmlParser
extends org.htmlparser.visitors.NodeVisitor
implements I_CmsHtmlNodeVisitor
Base utility class for OpenCms NodeVisitor implementations, which provides some frequently used utility functions.
This base implementation is only a "pass through" class: the content is parsed, but the generated result is identical to the input.
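To illustrate the "pass through" idea, here is a simplified, self-contained model. It is not the OpenCms or htmlparser API: the callbacks take plain strings instead of `Tag`, `Text` and `Remark` nodes, and the class names `EchoVisitorSketch` and `TextOnlySketch` as well as the `replay` helper are hypothetical. It only assumes that each callback writes its input to the result buffer when echo mode is on.

```java
/**
 * Simplified stand-in for a pass-through visitor (hypothetical, not the
 * OpenCms class). Callbacks receive pre-split HTML pieces as plain Strings.
 */
class EchoVisitorSketch {

    /** Models m_echo: when true, all content is written to the result by default. */
    protected final boolean echo;

    /** Models m_result: the buffer that collects the output. */
    protected final StringBuilder result = new StringBuilder();

    EchoVisitorSketch(boolean echo) {
        this.echo = echo;
    }

    void visitTag(String tagHtml) {
        if (echo) {
            result.append(tagHtml);
        }
    }

    void visitStringNode(String text) {
        if (echo) {
            result.append(text);
        }
    }

    void visitRemarkNode(String remarkHtml) {
        if (echo) {
            result.append(remarkHtml);
        }
    }

    void visitEndTag(String tagHtml) {
        if (echo) {
            result.append(tagHtml);
        }
    }

    String getResult() {
        return result.toString();
    }

    /** Feeds the pieces of "<p>Hi<!-- note --></p>" through the callbacks in document order. */
    static String replay(EchoVisitorSketch visitor) {
        visitor.visitTag("<p>");
        visitor.visitStringNode("Hi");
        visitor.visitRemarkNode("<!-- note -->");
        visitor.visitEndTag("</p>");
        return visitor.getResult();
    }
}

/** Hypothetical subclass: echo is off, but text nodes are always kept. */
class TextOnlySketch extends EchoVisitorSketch {

    TextOnlySketch() {
        super(false);
    }

    @Override
    void visitStringNode(String text) {
        result.append(text);
    }
}
```

In this model, a visitor created with echo on reproduces its input, while a subclass that turns echo off only has to override the callbacks whose content it wants to keep.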
| Field Summary | |
|---|---|
| protected boolean | m_echo - Indicates if "echo" mode is on, that is all content is written to the result by default. |
| protected List<String> | m_noAutoCloseTags - List of upper case tag name strings of tags that should not be auto-corrected if closing divs are missing. |
| protected StringBuffer | m_result - The buffer to write the output to. |
| protected static String[] | TAG_ARRAY - The array of supported tag names. |
| protected static List<String> | TAG_LIST - The list of supported tag names. |
| Constructor Summary | |
|---|---|
| CmsHtmlParser() | Creates a new instance of the html converter with echo mode set to false. |
| CmsHtmlParser(boolean echo) | Creates a new instance of the html converter. |
| Method Summary | |
|---|---|
| protected String | collapse(String string) - Collapses HTML whitespace in the given String. |
| protected org.htmlparser.PrototypicalNodeFactory | configureNoAutoCorrectionTags() - Internally degrades Composite tags that do have children in the DOM tree to simple single tags. |
| String | getConfiguration() - Returns the configuration String of this visitor or the empty String if it was not provided before. |
| List<String> | getNoAutoCloseTags() - Returns a list of upper case tag names for which parsing / visiting will not correct missing closing tags. |
| String | getResult() - Returns the text extraction result. |
| String | getTagHtml(org.htmlparser.Tag tag) - Returns the HTML for the given tag itself (not the tag content). |
| String | process(String html, String encoding) - Extracts the text from the given html content, assuming the given html encoding. |
| void | setConfiguration(String configuration) - Sets a configuration String for this visitor. |
| void | setNoAutoCloseTags(List<String> noAutoCloseTagList) - Sets a list of upper case tag names for which parsing / visiting should not correct missing closing tags. |
| void | visitEndTag(org.htmlparser.Tag tag) - Visitor method (callback) invoked when a closing Tag is encountered. |
| void | visitRemarkNode(org.htmlparser.Remark remark) - Visitor method (callback) invoked when a remark Tag (HTML comment) is encountered. |
| void | visitStringNode(org.htmlparser.Text text) - Visitor method (callback) invoked when a text node is encountered. |
| void | visitTag(org.htmlparser.Tag tag) - Visitor method (callback) invoked when a starting Tag is encountered. |
| Methods inherited from class org.htmlparser.visitors.NodeVisitor |
|---|
| beginParsing, finishedParsing, shouldRecurseChildren, shouldRecurseSelf |

| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
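protected List<String> m_noAutoCloseTags
List of upper case tag name strings of tags that should not be auto-corrected if closing divs are missing.
protected static final String[] TAG_ARRAY
The array of supported tag names.
protected static final List<String> TAG_LIST
The list of supported tag names.
protected boolean m_echo
Indicates if "echo" mode is on, that is all content is written to the result by default.
protected StringBuffer m_result
The buffer to write the output to.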
| Constructor Detail |
|---|
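public CmsHtmlParser()
Creates a new instance of the html converter with echo mode set to false.
public CmsHtmlParser(boolean echo)
Creates a new instance of the html converter.
Parameters: echo - indicates if "echo" mode is on, that is all content is written to the result
| Method Detail |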
|---|
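protected org.htmlparser.PrototypicalNodeFactory configureNoAutoCorrectionTags()
Internally degrades Composite tags that do have children in the DOM tree to simple single tags.
See Also: setNoAutoCloseTags(List)
public String getConfiguration()
Description copied from interface: I_CmsHtmlNodeVisitor
Returns the configuration String of this visitor or the empty String if it was not provided before.
Specified by: getConfiguration in interface I_CmsHtmlNodeVisitor
See Also: I_CmsHtmlNodeVisitor.getConfiguration()
public String getResult()
Description copied from interface: I_CmsHtmlNodeVisitor
Returns the text extraction result.
Specified by: getResult in interface I_CmsHtmlNodeVisitor
See Also: I_CmsHtmlNodeVisitor.getResult()
public String getTagHtml(org.htmlparser.Tag tag)
Returns the HTML for the given tag itself (not the tag content).
Parameters: tag - the tag to create the HTML for
public String process(String html, String encoding) throws org.htmlparser.util.ParserException
Description copied from interface: I_CmsHtmlNodeVisitor
Extracts the text from the given html content, assuming the given html encoding.
Specified by: process in interface I_CmsHtmlNodeVisitor
Parameters: html - the content to extract the plain text from
Parameters: encoding - the encoding to use
Throws: org.htmlparser.util.ParserException - if something goes wrong
See Also: I_CmsHtmlNodeVisitor.process(java.lang.String, java.lang.String)
public void setConfiguration(String configuration)
Description copied from interface: I_CmsHtmlNodeVisitor
Sets a configuration String for this visitor. This will most likely be done with data from an xsd, custom jsp tag, ...
Specified by: setConfiguration in interface I_CmsHtmlNodeVisitor
Parameters: configuration - the configuration of this visitor to set.
See Also: I_CmsHtmlNodeVisitor.setConfiguration(java.lang.String)
public void visitEndTag(org.htmlparser.Tag tag)
Description copied from interface: I_CmsHtmlNodeVisitor
Visitor method (callback) invoked when a closing Tag is encountered.
Specified by: visitEndTag in interface I_CmsHtmlNodeVisitor
Overrides: visitEndTag in class org.htmlparser.visitors.NodeVisitor
Parameters: tag - the tag that is ended.
See Also: I_CmsHtmlNodeVisitor.visitEndTag(org.htmlparser.Tag)
public void visitRemarkNode(org.htmlparser.Remark remark)
Description copied from interface: I_CmsHtmlNodeVisitor
Visitor method (callback) invoked when a remark Tag (HTML comment) is encountered.
Specified by: visitRemarkNode in interface I_CmsHtmlNodeVisitor
Overrides: visitRemarkNode in class org.htmlparser.visitors.NodeVisitor
Parameters: remark - the remark Tag to visit.
See Also: I_CmsHtmlNodeVisitor.visitRemarkNode(org.htmlparser.Remark)
public void visitStringNode(org.htmlparser.Text text)
Description copied from interface: I_CmsHtmlNodeVisitor
Visitor method (callback) invoked when a text node is encountered.
Specified by: visitStringNode in interface I_CmsHtmlNodeVisitor
Overrides: visitStringNode in class org.htmlparser.visitors.NodeVisitor
Parameters: text - the text that is visited.
See Also: I_CmsHtmlNodeVisitor.visitStringNode(org.htmlparser.Text)
public void visitTag(org.htmlparser.Tag tag)
Description copied from interface: I_CmsHtmlNodeVisitor
Visitor method (callback) invoked when a starting Tag is encountered.
Specified by: visitTag in interface I_CmsHtmlNodeVisitor
Overrides: visitTag in class org.htmlparser.visitors.NodeVisitor
Parameters: tag - the tag that is visited.
See Also: I_CmsHtmlNodeVisitor.visitTag(org.htmlparser.Tag)
protected String collapse(String string)
Collapses HTML whitespace in the given String.
Parameters: string - the string to collapse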
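The documentation does not spell out what collapsing means in detail. The following is a minimal sketch, not the OpenCms implementation, assuming that runs of space, tab, line feed, carriage return and form feed are reduced to a single space and that leading and trailing whitespace is dropped. The class name `WhitespaceCollapseSketch` and the helper `isHtmlWhitespace` are hypothetical.

```java
/** Hypothetical illustration of HTML whitespace collapsing (not the OpenCms code). */
final class WhitespaceCollapseSketch {

    private WhitespaceCollapseSketch() {
        // static helpers only
    }

    /** Assumption: these five characters count as HTML whitespace. */
    static boolean isHtmlWhitespace(char c) {
        return c == ' ' || c == '\t' || c == '\n' || c == '\r' || c == '\f';
    }

    /** Reduces each whitespace run to one space and drops leading/trailing runs. */
    static String collapse(String input) {
        StringBuilder out = new StringBuilder(input.length());
        boolean pendingSpace = false;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (isHtmlWhitespace(c)) {
                // Remember the gap only if visible content was already emitted,
                // so leading whitespace never produces a space.
                pendingSpace = out.length() > 0;
            } else {
                if (pendingSpace) {
                    out.append(' ');
                    pendingSpace = false;
                }
                out.append(c);
            }
        }
        // A pending space at the end is never written, which trims the tail.
        return out.toString();
    }
}
```

Deferring the space until the next visible character arrives is what removes trailing whitespace without a separate trimming pass.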
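public List<String> getNoAutoCloseTags()
Returns a list of upper case tag names for which parsing / visiting will not correct missing closing tags.
public void setNoAutoCloseTags(List<String> noAutoCloseTagList)
Sets a list of upper case tag names for which parsing / visiting should not correct missing closing tags.
Specified by: setNoAutoCloseTags in interface I_CmsHtmlNodeVisitor
Parameters: noAutoCloseTagList - a list of upper case tag names for which parsing / visiting should not correct missing closing tags to set.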
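Because the list is documented as containing upper case tag names, callers may want to normalize their input first. The sketch below shows one way to do that; the class `NoAutoCloseTagsSketch` and the method `toUpperCaseTagNames` are hypothetical helpers, not part of OpenCms, and whether the parser itself normalizes case is not stated here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper that prepares a tag name list for setNoAutoCloseTags. */
final class NoAutoCloseTagsSketch {

    private NoAutoCloseTagsSketch() {
        // static helpers only
    }

    /** Trims each name, skips blank entries and converts the rest to upper case. */
    static List<String> toUpperCaseTagNames(List<String> tagNames) {
        List<String> normalized = new ArrayList<>(tagNames.size());
        for (String name : tagNames) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                // Locale.ROOT avoids locale-specific case mappings (e.g. Turkish dotless i).
                normalized.add(trimmed.toUpperCase(Locale.ROOT));
            }
        }
        return normalized;
    }
}
```

The returned list could then be passed to `setNoAutoCloseTags(List<String>)` on a parser instance.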