public final class Encode extends Object
<input value="<%=Encode.forHtml(value)%>" />
There are two versions of each contextual encoding method. The first takes a String argument and returns the encoded version as a String. The second version writes the encoded version directly to a Writer.
Please make sure to read and understand the context that the method encodes for. Encoding for the incorrect context will likely lead to exposing a cross-site scripting vulnerability.
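The pairing described above can be illustrated with a minimal sketch. This is not the library's implementation: the class PairedEncoderSketch and the method name forHtmlSketch are hypothetical, and the five-character mapping is an assumption taken from the forHtml table later in this document.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

/** Illustrative only: mimics the String/Writer method pairing, not the library's code. */
final class PairedEncoderSketch {

    /** Writer variant: streams the encoded output without building an intermediate String. */
    static void forHtmlSketch(Writer out, String input) throws IOException {
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.write("&amp;"); break;
                case '<':  out.write("&lt;");  break;
                case '>':  out.write("&gt;");  break;
                case '"':  out.write("&#34;"); break;
                case '\'': out.write("&#39;"); break;
                default:   out.write(c);
            }
        }
    }

    /** String variant: delegates to the Writer variant through an in-memory buffer. */
    static String forHtmlSketch(String input) {
        StringWriter buffer = new StringWriter(input.length() + 16);
        try {
            forHtmlSketch(buffer, input);
        } catch (IOException e) {
            // A StringWriter never performs real I/O, so this should be unreachable.
            throw new AssertionError("StringWriter does not throw IOException", e);
        }
        return buffer.toString();
    }

    public static void main(String[] args) throws IOException {
        String viaString = forHtmlSketch("<a href=\"x\">Tom & Jerry</a>");
        StringWriter viaWriter = new StringWriter();
        forHtmlSketch(viaWriter, "<a href=\"x\">Tom & Jerry</a>");
        System.out.println(viaString.equals(viaWriter.toString())); // prints "true"
    }
}
```

Both variants produce the same characters; the Writer form is mainly useful when the output is already being streamed, for example to a response writer.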
Modifier and Type | Method and Description |
---|---|
static String | forCDATA(String input): Encodes data for an XML CDATA section. |
static void | forCDATA(Writer out, String input): See forCDATA(String) for description of encoding. |
static String | forCssString(String input): Encodes for CSS strings. |
static void | forCssString(Writer out, String input): See forCssString(String) for description of encoding. |
static String | forCssUrl(String input): Encodes for CSS URL contexts. |
static void | forCssUrl(Writer out, String input): See forCssUrl(String) for description of encoding. |
static String | forHtml(String input): Encodes for (X)HTML text content and text attributes. |
static void | forHtml(Writer out, String input): See forHtml(String) for description of encoding. |
static String | forHtmlAttribute(String input): This method encodes for HTML text attributes. |
static void | forHtmlAttribute(Writer out, String input): See forHtmlAttribute(String) for description of encoding. |
static String | forHtmlContent(String input): This method encodes for HTML text content. |
static void | forHtmlContent(Writer out, String input): See forHtmlContent(String) for description of encoding. |
static String | forHtmlUnquotedAttribute(String input): Encodes for unquoted HTML attribute values. |
static void | forHtmlUnquotedAttribute(Writer out, String input): See forHtmlUnquotedAttribute(String) for description of encoding. |
static String | forJava(String input): Encodes for a Java string. |
static void | forJava(Writer out, String input): See forJava(String) for description of encoding. |
static String | forJavaScript(String input): Encodes for a JavaScript string. |
static void | forJavaScript(Writer out, String input): See forJavaScript(String) for description of encoding. |
static String | forJavaScriptAttribute(String input): This method encodes for JavaScript strings contained within HTML script attributes (such as onclick). |
static void | forJavaScriptAttribute(Writer out, String input): See forJavaScriptAttribute(String) for description of encoding. |
static String | forJavaScriptBlock(String input): This method encodes for JavaScript strings contained within HTML script blocks. |
static void | forJavaScriptBlock(Writer out, String input): See forJavaScriptBlock(String) for description of encoding. |
static String | forJavaScriptSource(String input): This method encodes for JavaScript strings contained within a JavaScript or JSON file. |
static void | forJavaScriptSource(Writer out, String input): See forJavaScriptSource(String) for description of encoding. |
static String | forUri(String input): Performs percent-encoding of a URL according to RFC 3986. |
static void | forUri(Writer out, String input): See forUri(String) for description of encoding. |
static String | forUriComponent(String input): Performs percent-encoding for a component of a URI, such as a query parameter name or value, path or query-string. |
static void | forUriComponent(Writer out, String input): See forUriComponent(String) for description of encoding. |
static String | forXml(String input): Encoder for XML and XHTML. |
static void | forXml(Writer out, String input): See forXml(String) for description of encoding. |
static String | forXmlAttribute(String input): Encoder for XML and XHTML attribute content. |
static void | forXmlAttribute(Writer out, String input): See forXmlAttribute(String) for description of encoding. |
static String | forXmlComment(String input): Encoder for XML comments. |
static void | forXmlComment(Writer out, String input): See forXmlComment(String) for description of encoding. |
static String | forXmlContent(String input): Encoder for XML and XHTML text content. |
static void | forXmlContent(Writer out, String input): See forXmlContent(String) for description of encoding. |

public static String forHtml(String input)
Encodes for (X)HTML text content and text attributes. Since this method encodes for both contexts, it may be slightly less efficient to use this method over the methods targeted towards the specific contexts (forHtmlAttribute(String) and forHtmlContent(String)). In general this method should be preferred unless you are really concerned with saving a few bytes or are writing a framework that utilizes this package.
<div><%=Encode.forHtml(unsafeData)%></div>
<input value="<%=Encode.forHtml(unsafeData)%>" />
Input | Result |
---|---|
“&” | “&amp;” |
“<” | “&lt;” |
“>” | “&gt;” |
“"” | “&#34;” |
“'” | “&#39;” |

The encoding of the greater-than sign (>) is not strictly required, but is included for maximum compatibility.
Numeric encoding is used for the double-quote character (") as it is shorter than the also valid &quot;.
Characters that are not valid according to the XML specification are replaced by a space character. In particular, only #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] are considered valid.
input - the data to encode
public static void forHtml(Writer out, String input) throws IOException
See forHtml(String) for description of encoding. This version writes directly to a Writer without an intervening string.
out - where to write encoded output
input - the input string to encode
IOException - if thrown by writer
public static String forHtmlContent(String input)
This method encodes for HTML text content. It does not escape quotation characters and is thus unsafe for use with HTML attributes. Use either forHtml or forHtmlAttribute for those contexts.
<div><%=Encode.forHtmlContent(unsafeData)%></div>
Input | Result |
---|---|
“&” | “&amp;” |
“<” | “&lt;” |
“>” | “&gt;” |

The single-quote character (') and double-quote character (") do not require encoding in HTML blocks, unlike other HTML contexts.
The encoding of the greater-than sign (>) is not strictly required, but is included for maximum compatibility.
Numeric encoding is used for the double-quote character (") as it is shorter than the also valid &quot;.
Characters that are not valid according to the XML specification are replaced by a space character. In particular, only #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] are considered valid.
input - the input to encode
public static void forHtmlContent(Writer out, String input) throws IOException
See forHtmlContent(String) for description of encoding. This version writes directly to a Writer without an intervening string.
out - where to write encoded output
input - the input string to encode
IOException - if thrown by writer
public static String forHtmlAttribute(String input)
This method encodes for HTML text attributes.
<input value="<%=Encode.forHtmlAttribute(unsafeData)%>" />
Input | Result |
---|---|
“&” | “&amp;” |
“<” | “&lt;” |
“"” | “&#34;” |
“'” | “&#39;” |

Both the single-quote character (') and the double-quote character (") are encoded so this is safe for HTML attributes with either enclosing character.
The encoding of the greater-than sign (>) is not required for attributes.
Numeric encoding is used for the double-quote character (") as it is shorter than the also valid &quot;.
Characters that are not valid according to the XML specification are replaced by a space character. In particular, only #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] are considered valid.
input - the input to encode
public static void forHtmlAttribute(Writer out, String input) throws IOException
See forHtmlAttribute(String) for description of encoding. This version writes directly to a Writer without an intervening string.
out - where to write encoded output
input - the input string to encode
IOException - if thrown by writer
public static String forHtmlUnquotedAttribute(String input)
Encodes for unquoted HTML attribute values. forHtml(String) or forHtmlAttribute(String) should usually be preferred over this method as quoted attributes are XHTML compliant.
When using this method, the caller is not required to provide quotes around the attribute (since it is encoded for such context). The caller should make sure that the attribute value does not abut unsafe characters--and thus should usually err on the side of including a space character after the value.
Use of this method is discouraged as quoted attributes are generally more compatible and safer. Also note that no attempt has been made to optimize this encoding, though it is still probably faster than other encoding libraries.
<input value=<%=Encode.forHtmlUnquotedAttribute(input)%> >
Input | Result |
---|---|
U+0009 (horizontal tab) | “&#9;” |
U+000A (line feed) | “&#10;” |
U+000C (form feed) | “&#12;” |
U+000D (carriage return) | “&#13;” |
U+0020 (space) | “&#32;” |
“&” | “&amp;” |
“<” | “&lt;” |
“>” | “&gt;” |
“"” | “&#34;” |
“'” | “&#39;” |
“/” | “&#47;” |
“=” | “&#61;” |
“`” | “&#96;” |
U+0085 (next line) | “&#133;” |
U+2028 (line separator) | “&#8232;” |
U+2029 (paragraph separator) | “&#8233;” |
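
The table above can be read as a simple per-character mapping. The following is a rough sketch of that mapping only, not the library's implementation: UnquotedAttributeSketch is a hypothetical name, the numeric character references are assumed to be the decimal code points of the listed characters, and handling of characters outside the table (such as invalid XML characters) is left out.

```java
/** Illustrative only: covers just the rows of the table, nothing else. */
final class UnquotedAttributeSketch {

    static String encode(String input) {
        StringBuilder sb = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                // Named entities for the three characters that have short names.
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;");  break;
                case '>': sb.append("&gt;");  break;
                // Whitespace and attribute delimiters become decimal references.
                case '\t': case '\n': case '\f': case '\r': case ' ':
                case '"': case '\'': case '/': case '=': case '`':
                case '\u0085': case '\u2028': case '\u2029':
                    sb.append("&#").append((int) c).append(';');
                    break;
                default:
                    sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("a b=c")); // prints "a&#32;b&#61;c"
    }
}
```

Because spaces and the equals sign are encoded, the resulting value cannot end the attribute or start a new one, which is what makes omitting the quotes possible.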
The following characters are not encoded: 0-9, a-z, A-Z, “!”, “#”, “$”, “%”, “(”, “)”, “*”, “+”, “,”, “-”, “.”, “[”, “\”, “]”, “^”, “_”, “}”.
input - the attribute value to be encoded.
public static void forHtmlUnquotedAttribute(Writer out, String input) throws IOException
See forHtmlUnquotedAttribute(String) for description of encoding. This version writes directly to a Writer without an intervening string.
out - where to write encoded output
input - the input string to encode
IOException - if thrown by writer
public static String forCssString(String input)
Encodes for CSS strings.
<div style="background: url('<%=Encode.forCssString(...)%>');">
<style type="text/css"> background: url('<%=Encode.forCssString(...)%>'); </style>
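The notes for this method describe a hexadecimal escape format with a one-character look-ahead. A minimal sketch of that format follows, under stated assumptions: CssEscapeSketch is a hypothetical name, the set of escaped characters is taken from the list in the notes, and the extra handling of a following literal space is an addition based on CSS syntax (one whitespace character after a hex escape is consumed by the parser), not something the text states.

```java
/** Illustrative only: shortest-hex CSS escapes with a one-character look-ahead. */
final class CssEscapeSketch {

    /** True for the characters the notes list as requiring encoding in a CSS string. */
    static boolean needsEscape(int cp) {
        return cp <= 0x1f || cp == 0x7f || cp == 0x2028 || cp == 0x2029
                || "\"'\\<&()/>".indexOf(cp) >= 0;
    }

    static boolean isAsciiHexDigit(int cp) {
        return (cp >= '0' && cp <= '9') || (cp >= 'a' && cp <= 'f') || (cp >= 'A' && cp <= 'F');
    }

    static String encode(String input) {
        StringBuilder sb = new StringBuilder(input.length() + 16);
        int i = 0;
        while (i < input.length()) {
            int cp = input.codePointAt(i);
            i += Character.charCount(cp);
            if (!needsEscape(cp)) {
                sb.appendCodePoint(cp);
                continue;
            }
            // Shortest hexadecimal form, never zero padded.
            sb.append('\\').append(Integer.toHexString(cp));
            // Look ahead one character: add a separating space only when needed.
            if (i < input.length()) {
                int next = input.codePointAt(i);
                // Assumption beyond the text: a literal space is treated like a hex digit here.
                if (isAsciiHexDigit(next) || next == ' ') {
                    sb.append(' ');
                }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("'1")); // prints \27 1
        System.out.println(encode("'x")); // prints \27x
        System.out.println(encode("\t")); // prints \9
    }
}
```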
The following characters are encoded using hexadecimal encodings: U+0000 - U+001f, “"”, “'”, “\”, “<”, “&”, “(”, “)”, “/”, “>”, U+007f, line separator (U+2028), paragraph separator (U+2029).
Any character requiring encoding is encoded as \xxx where xxx is the shortest hexadecimal representation of its Unicode code point (after decoding surrogate pairs if necessary). This encoding is never zero padded. Thus, for example, the tab character is encoded as \9, not \0009.
A space is appended to an encoded sequence when the next input character would otherwise be read as part of the hexadecimal sequence. Thus “'1” is encoded as “\27 1”, and not as “\271”. If a space is not necessary, it is not included, thus “'x” is encoded as “\27x”, and not as “\27 x”.
input - the input to encode
public static void forCssString(Writer out, String input) throws IOException
See forCssString(String) for description of encoding. This version writes directly to a Writer without an intervening string.
out - where to write encoded output
input - the input string to encode
IOException - if thrown by writer
public static String forCssUrl(String input)
Encodes for CSS URL contexts. The context must be surrounded by "url(" and ")". It is safe for use in both style blocks and attributes in HTML.
Note: this does not do any checking on the quality or safety of the URL itself. The caller should ensure that the URL is safe for embedding (e.g. input validation) by other means.
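One possible form of the input validation mentioned above is a scheme allow-list applied before encoding. The sketch below is an assumption about what such a check might look like and is not part of this library: the class and method names are hypothetical, and allowing only absolute http and https URLs is an illustrative policy.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

/** Illustrative only: a conservative scheme allow-list, applied before any encoding. */
final class UrlSchemeCheckSketch {

    /** Returns true only for syntactically valid absolute http or https URLs. */
    static boolean hasAllowedScheme(String url) {
        try {
            String scheme = new URI(url.trim()).getScheme();
            if (scheme == null) {
                return false; // relative reference: rejected by this particular policy
            }
            String lower = scheme.toLowerCase(Locale.ROOT);
            return lower.equals("http") || lower.equals("https");
        } catch (URISyntaxException e) {
            return false; // anything that does not parse is rejected
        }
    }

    public static void main(String[] args) {
        System.out.println(hasAllowedScheme("https://www.owasp.org/")); // prints "true"
        System.out.println(hasAllowedScheme("javascript:alert(1)"));    // prints "false"
    }
}
```

Encoding would then be applied only to values that pass the check; the encoder itself makes a URL syntactically safe for the context but cannot judge where the URL leads.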
<div style="background:url(<%=Encode.forCssUrl(...)%>);">
<style type="text/css"> background: url(<%=Encode.forCssUrl(...)%>); </style>
The following characters are encoded using hexadecimal encodings: U+0000 - U+001f, “"”, “'”, “\”, “<”, “&”, “/”, “>”, U+007f, line separator (U+2028), paragraph separator (U+2029).
Any character requiring encoding is encoded as \xxx where xxx is the shortest hexadecimal representation of its Unicode code point (after decoding surrogate pairs if necessary). This encoding is never zero padded. Thus, for example, the tab character is encoded as \9, not \0009.
A space is appended to an encoded sequence when the next input character would otherwise be read as part of the hexadecimal sequence. Thus “'1” is encoded as “\27 1”, and not as “\271”. If a space is not necessary, it is not included, thus “'x” is encoded as “\27x”, and not as “\27 x”.
input - the input to encode
public static void forCssUrl(Writer out, String input) throws IOException
See forCssUrl(String) for description of encoding. This version writes directly to a Writer without an intervening string.
out - where to write encoded output
input - the input string to encode
IOException - if thrown by writer
public static String forUri(String input)
Performs percent-encoding of a URL according to RFC 3986. It may be better to use URI instead. Note: this is a particularly dangerous context to put untrusted content in, as for example a "javascript:" URL provided by a malicious user would be "properly" escaped, and still execute.
The following characters are not encoded:
U+20: ! # $ & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; = ?
U+40: @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ ] _
U+60: a b c d e f g h i j k l m n o p q r s t u v w x y z ~
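The single-quote character (') is not encoded.
The output should be further encoded for the containing context, for example: <a href="<%=Encode.forHtmlAttribute(Encode.forUri(uri))%>">...</a>. (Note, the single-quote character (') is not encoded.)
URL encoding is an encoding for bytes, not Unicode. The input string is first encoded as a sequence of UTF-8 bytes, and each byte is then encoded as %xx where xx is the two-digit hexadecimal representation of the byte. (The implementation does this as one step for performance.)
Invalid characters (e.g. partial or invalid surrogate pairs) are replaced with a hyphen (-) character.
input - the input to encode
public static void forUri(Writer out, String input) throws IOException
See forUri(String) for description of encoding. This version writes directly to a Writer without an intervening string.
out - where to write encoded output
input - the input string to encode
IOException - if thrown by writer
public static String forUriComponent(String input)
Performs percent-encoding for a component of a URI, such as a query parameter name or value, path or query-string.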
<a href="http://www.owasp.org/<%=Encode.forUriComponent(...)%>?query#fragment">
<a href="/search?value=<%=Encode.forUriComponent(...)%>&order=1#top">
The following characters are not encoded:
U+20: - . 0 1 2 3 4 5 6 7 8 9
U+40: @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z _
U+60: a b c d e f g h i j k l m n o p q r s t u v w x y z ~
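The general shape of component percent-encoding can be sketched as follows. This is an approximation, not the library's code: PercentEncodeSketch is a hypothetical name, the unencoded set used here (letters, digits, and - . _ ~) is a deliberately conservative subset of the list above, and upper-case hexadecimal digits are an assumption. Unpaired surrogates are replaced with a hyphen, following the notes for this method.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative only: UTF-8 percent-encoding with a conservative set of unencoded characters. */
final class PercentEncodeSketch {

    private static final String HEX = "0123456789ABCDEF";

    /** Letters, digits, and the four marks - . _ ~ pass through unchanged. */
    static boolean isUnencoded(int cp) {
        return (cp >= 'a' && cp <= 'z') || (cp >= 'A' && cp <= 'Z')
                || (cp >= '0' && cp <= '9')
                || cp == '-' || cp == '.' || cp == '_' || cp == '~';
    }

    static String encode(String input) {
        StringBuilder sb = new StringBuilder(input.length() + 16);
        int i = 0;
        while (i < input.length()) {
            int cp = input.codePointAt(i); // decodes valid surrogate pairs
            i += Character.charCount(cp);
            if (isUnencoded(cp)) {
                sb.append((char) cp);
            } else if (cp >= 0xD800 && cp <= 0xDFFF) {
                sb.append('-'); // unpaired surrogate: replaced with a hyphen
            } else {
                // Encode the code point as UTF-8, then each byte as %XX.
                byte[] utf8 = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
                for (byte b : utf8) {
                    sb.append('%')
                      .append(HEX.charAt((b >> 4) & 0xF))
                      .append(HEX.charAt(b & 0xF));
                }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("a b"));   // prints "a%20b"
        System.out.println(encode("x&y=z")); // prints "x%26y%3Dz"
    }
}
```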
Unlike forUri(String), this method is safe to be used in most containing contexts, including: HTML/XML, CSS, and JavaScript contexts.
URL encoding is an encoding for bytes, not Unicode. The input string is first encoded as a sequence of UTF-8 bytes, and each byte is then encoded as %xx where xx is the two-digit hexadecimal representation of the byte. (The implementation does this as one step for performance.)
Invalid characters (e.g. partial or invalid surrogate pairs) are replaced with a hyphen (-) character.
input - the input to encode
public static void forUriComponent(Writer out, String input) throws IOException
See forUriComponent(String) for description of encoding. This version writes directly to a Writer without an intervening string.
out - where to write encoded output
input - the input string to encode
IOException - if thrown by writer
public static String forXml(String input)
Encoder for XML and XHTML. See forHtml(String) for a description of the encoding and context.
input - the input to encode
See also: forHtml(String)
public static void forXml(Writer out, String input) throws IOException
See forXml(String) for description of encoding. This version writes directly to a Writer without an intervening string.
out - where to write encoded output
input - the input string to encode
IOException - if thrown by writer
public static String forXmlContent(String input)
Encoder for XML and XHTML text content. See forHtmlContent(String) for description of encoding and context.
input - the input to encode
See also: forHtmlContent(String)
public static void forXmlContent(Writer out, String input) throws IOException
See forXmlContent(String) for description of encoding. This version writes directly to a Writer without an intervening string.
out - where to write encoded output
input - the input string to encode
IOException - if thrown by writer
public static String forXmlAttribute(String input)
Encoder for XML and XHTML attribute content. See forHtmlAttribute(String) for description of encoding and context.
input - the input to encode
See also: forHtmlAttribute(String)
public static void forXmlAttribute(Writer out, String input) throws IOException
See forXmlAttribute(String) for description of encoding. This version writes directly to a Writer without an intervening string.
out - where to write encoded output
input - the input string to encode
IOException - if thrown by writer
public static String forXmlComment(String input)
Encoder for XML comments. (X)HTML comments may be interpreted by browsers as something other than a comment (e.g. <!--[if IE]>). For (X)HTML it is recommended that unsafe content never be included in a comment.
The caller must provide the comment start and end sequences.
This method replaces all invalid XML characters with spaces, and replaces the "--" sequence (which is invalid in XML comments) with "-~" (hyphen-tilde). This encoding behavior may change in future releases. If the comments need to be decoded, the caller will need to come up with their own encode/decode system.
out.println("<?xml version='1.0'?>");
out.println("<data>");
out.println("<!-- "+Encode.forXmlComment(comment)+" -->");
out.println("</data>");
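The two replacements described for this method can be sketched as follows. This is an approximation rather than the library's implementation: XmlCommentSketch is a hypothetical name, the validity check follows the XML 1.0 character ranges, and edge cases such as a hyphen at the very end of the input are not treated specially here.

```java
/** Illustrative only: invalid XML characters become spaces and "--" becomes "-~". */
final class XmlCommentSketch {

    /** XML 1.0 ranges: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]. */
    static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    static String encode(String input) {
        StringBuilder sb = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            int cp = input.codePointAt(i);
            i += Character.charCount(cp);
            if (!isValidXmlChar(cp)) {
                sb.append(' ');
            } else if (cp == '-' && sb.length() > 0 && sb.charAt(sb.length() - 1) == '-') {
                // A hyphen directly after an emitted hyphen would form "--": write a tilde instead.
                sb.append('~');
            } else {
                sb.appendCodePoint(cp);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("<!-- " + encode("a--b") + " -->"); // prints "<!-- a-~b -->"
    }
}
```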
input - the input to encode
public static void forXmlComment(Writer out, String input) throws IOException
See forXmlComment(String) for description of encoding. This version writes directly to a Writer without an intervening string.
out - where to write encoded output
input - the input string to encode
IOException - if thrown by writer
public static String forCDATA(String input)
Encodes data for an XML CDATA section. If the input contains a terminating "]]>", it will be replaced by "]]>]]<![CDATA[>".
As with all XML contexts, characters that are invalid according to the XML specification will be replaced by a space character. The caller must provide the CDATA section boundaries.
<xml-data><![CDATA[<%=Encode.forCDATA(...)%>]]></xml-data>
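The "]]>" replacement works by closing the current CDATA section, emitting "]]" as ordinary text, and opening a new section that starts with ">". A minimal sketch, assuming the hypothetical names CdataSketch, encode, and wrap, and omitting the replacement of invalid XML characters for brevity:

```java
/** Illustrative only: splits a CDATA section wherever the input contains its terminator. */
final class CdataSketch {

    /** Replaces each terminator so that it can no longer end the enclosing section early. */
    static String encode(String input) {
        return input.replace("]]>", "]]>]]<![CDATA[>");
    }

    /** The caller supplies the section boundaries, as the description requires. */
    static String wrap(String input) {
        return "<![CDATA[" + encode(input) + "]]>";
    }

    public static void main(String[] args) {
        System.out.println(wrap("a]]>b")); // prints "<![CDATA[a]]>]]<![CDATA[>b]]>"
    }
}
```

An XML parser reading the wrapped output sees a section containing "a", the text "]]", and a section containing ">b", which together give back the original "a]]>b".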
input - the input to encode
public static void forCDATA(Writer out, String input) throws IOException
See forCDATA(String) for description of encoding. This version writes directly to a Writer without an intervening string.
out - where to write encoded output
input - the input string to encode
IOException - if thrown by writer
public static String forJava(String input)
Encodes for a Java string.
out.println("public class Hello {");
out.println(" public static void main(String[] args) {");
out.println(" System.out.println(\"" + Encode.forJava(message) + "\");");
out.println(" }");
out.println("}");
input - the input to encode
public static void forJava(Writer out, String input) throws IOException
See forJava(String) for description of encoding. This version writes directly to a Writer without an intervening string.
out - where to write encoded output
input - the input string to encode
IOException - if thrown by writer
public static String forJavaScript(String input)
Encodes for a JavaScript string. It is safe for use in HTML script attributes (such as onclick), script blocks, JSON files, and JavaScript source. The caller MUST provide the surrounding quotation characters for the string. Since this performs additional encoding so it can work in all of the JavaScript contexts listed, it may be slightly less efficient than using one of the methods targeted to a specific JavaScript context (forJavaScriptAttribute(String), forJavaScriptBlock(String), forJavaScriptSource(String)). Unless you are interested in saving a few bytes of output or are writing a framework on top of this library, it is recommended that you use this method over the others.
<button onclick="alert('<%=Encode.forJavaScript(data)%>');">
<script type="text/javascript"> var data = "<%=Encode.forJavaScript(data)%>"; </script>
Code Point | Character | Encoded Result | Notes |
---|---|---|---|
U+0008 | BS | \b | Backspace character |
U+0009 | HT | \t | Horizontal tab character |
U+000A | LF | \n | Line feed character |
U+000C | FF | \f | Form feed character |
U+000D | CR | \r | Carriage return character |
U+0022 | " | \x22 | The encoding \" is not used here because it is not safe for use in HTML attributes. (In HTML attributes, it would also be correct to use "\&quot;".) |
U+0026 | & | \x26 | Ampersand character |
U+0027 | ' | \x27 | The encoding \' is not used here because it is not safe for use in HTML attributes. (In HTML attributes, it would also be correct to use "\&#39;".) |
U+002F | / | \/ | This encoding is used to avoid an input sequence "</" from prematurely terminating a </script> block. |
U+005C | \ | \\ | |
U+0000 to U+001F | | \x## | Hexadecimal encoding is used for characters in this range that were not already mentioned above. |
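
The rows of the table above translate directly into a character switch. The sketch below covers only those rows and is not the library's implementation: JavaScriptStringSketch is a hypothetical name, and any handling the library may apply to characters outside the table is not modeled.

```java
/** Illustrative only: implements just the rows of the table. */
final class JavaScriptStringSketch {

    static String encode(String input) {
        StringBuilder sb = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '\b': sb.append("\\b"); break;
                case '\t': sb.append("\\t"); break;
                case '\n': sb.append("\\n"); break;
                case '\f': sb.append("\\f"); break;
                case '\r': sb.append("\\r"); break;
                // Hex forms keep the output safe inside HTML attributes as well.
                case '"':  sb.append("\\x22"); break;
                case '&':  sb.append("\\x26"); break;
                case '\'': sb.append("\\x27"); break;
                // Escaping the slash keeps "</" from closing a script block early.
                case '/':  sb.append("\\/"); break;
                case '\\': sb.append("\\\\"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters: two-digit hexadecimal form.
                        sb.append(String.format("\\x%02x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("</script>")); // prints <\/script>
        System.out.println(encode("say \"hi\"")); // prints say \x22hi\x22
    }
}
```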
input - the input string to encode
See also: forJavaScriptAttribute(String), forJavaScriptBlock(String)
public static void forJavaScript(Writer out, String input) throws IOException
See forJavaScript(String) for description of encoding. This version writes directly to a Writer without an intervening string.
out - where to write encoded output
input - the input string to encode
IOException - if thrown by writer
public static String forJavaScriptAttribute(String input)
This method encodes for JavaScript strings contained within HTML script attributes (such as onclick). It is NOT safe for use in script blocks. The caller MUST provide the surrounding quotation characters. This method performs the same encode as forJavaScript(String) with the exception that / is not escaped.
Unless you are interested in saving a few bytes of output or are writing a framework on top of this library, it is recommended that you use forJavaScript(String) over this method.
<button onclick="alert('<%=Encode.forJavaScriptAttribute(data)%>');">
input - the input string to encode
See also: forJavaScript(String), forJavaScriptBlock(String)
public static void forJavaScriptAttribute(Writer out, String input) throws IOException
See forJavaScriptAttribute(String) for description of encoding. This version writes directly to a Writer without an intervening string.
out - where to write encoded output
input - the input string to encode
IOException - if thrown by writer
public static String forJavaScriptBlock(String input)
This method encodes for JavaScript strings contained within HTML script blocks. It is NOT safe for use in script attributes (such as onclick). The caller must provide the surrounding quotation characters. This method performs the same encode as forJavaScript(String) with the exception that " and ' are encoded as \" and \' respectively.
Unless you are interested in saving a few bytes of output or are writing a framework on top of this library, it is recommended that you use forJavaScript(String) over this method.
<script type="text/javascript"> var data = "<%=Encode.forJavaScriptBlock(data)%>"; </script>
input - the input string to encode
See also: forJavaScript(String), forJavaScriptAttribute(String)
public static void forJavaScriptBlock(Writer out, String input) throws IOException
See forJavaScriptBlock(String) for description of encoding. This version writes directly to a Writer without an intervening string.
out - where to write encoded output
input - the input string to encode
IOException - if thrown by writer
public static String forJavaScriptSource(String input)
This method encodes for JavaScript strings contained within a JavaScript or JSON file. This method is NOT safe for use in ANY context embedded in HTML. The caller must provide the surrounding quotation characters. This method performs the same encode as forJavaScript(String) with the exception that / and & are not escaped and " and ' are encoded as \" and \' respectively.
Unless you are interested in saving a few bytes of output or are writing a framework on top of this library, it is recommended that you use forJavaScript(String) over this method.
<%@page contentType="text/javascript; charset=UTF-8"%>
var data = "<%=Encode.forJavaScriptSource(data)%>";
This example is serving up JSON data (users of this use-case are encouraged to read up on "JSON Hijacking"):
<%@page contentType="application/json; charset=UTF-8"%>
<% myapp.jsonHijackingPreventionMeasure(); %>
{"data":"<%=Encode.forJavaScriptSource(data)%>"}
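The four JavaScript methods differ only in how they treat the quotation marks, the slash, and the ampersand. The sketch below restates those differences as described in this document. It is an illustration, not the library's implementation: the class, the Context enum, and the escape method are hypothetical names.

```java
/** Illustrative only: how the four JavaScript contexts differ on four characters. */
final class JavaScriptContextSketch {

    /** ALL corresponds to forJavaScript; the others to the context-specific methods. */
    enum Context { ALL, ATTRIBUTE, BLOCK, SOURCE }

    static String escape(char c, Context ctx) {
        // Block and source contexts use backslash-quote; the others use hex forms.
        boolean backslashQuotes = ctx == Context.BLOCK || ctx == Context.SOURCE;
        switch (c) {
            case '"':
                return backslashQuotes ? "\\\"" : "\\x22";
            case '\'':
                return backslashQuotes ? "\\'" : "\\x27";
            case '/':
                // The slash is left alone in attribute and source contexts.
                return (ctx == Context.ATTRIBUTE || ctx == Context.SOURCE) ? "/" : "\\/";
            case '&':
                // Only the source context leaves the ampersand unescaped.
                return ctx == Context.SOURCE ? "&" : "\\x26";
            default:
                return String.valueOf(c);
        }
    }

    public static void main(String[] args) {
        for (Context ctx : Context.values()) {
            System.out.println(ctx + ": " + escape('"', ctx) + " " + escape('\'', ctx)
                    + " " + escape('/', ctx) + " " + escape('&', ctx));
        }
    }
}
```

Seen this way, forJavaScript is the variant whose output is acceptable in every one of the contexts, which is why the text recommends it as the default.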
input - the input string to encode
See also: forJavaScript(String), forJavaScriptAttribute(String), forJavaScriptBlock(String)
public static void forJavaScriptSource(Writer out, String input) throws IOException
See forJavaScriptSource(String) for description of encoding. This version writes directly to a Writer without an intervening string.
out - where to write encoded output
input - the input string to encode
IOException - if thrown by writer
Copyright © 2011-2014 OWASP (Open Web-Application Security Project). All Rights Reserved.