@Beta @GwtCompatible public abstract class UnicodeEscaper extends Escaper
An Escaper that converts literal text into a format safe for inclusion in a particular context (such as an XML document). Typically (but not always), the inverse process of "unescaping" the text is performed automatically by the relevant parser.
For example, an XML escaper would convert the literal string "Foo<Bar>" into "Foo&lt;Bar&gt;" to prevent "<Bar>" from being confused with an XML tag. When the resulting XML document is parsed, the parser API will return this text as the original literal string "Foo<Bar>".
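The conversion in the example above can be illustrated with a minimal standalone sketch. It does not use this library: `XmlEscapeSketch` and its `escape` method are hypothetical names chosen for illustration, and only the three characters `&`, `<`, and `>` are handled.

```java
class XmlEscapeSketch {
    // Hypothetical helper: replaces the three characters handled by this sketch
    // with their XML entity references and copies everything else unchanged.
    static String escape(String literal) {
        StringBuilder out = new StringBuilder(literal.length());
        for (int i = 0; i < literal.length(); i++) {
            char c = literal.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("Foo<Bar>")); // prints Foo&lt;Bar&gt;
    }
}
```

An XML parser reading the escaped text would hand back the original "Foo<Bar>", which is the automatic unescaping the description refers to.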
Note: This class is similar to CharEscaper but with one very important difference. A CharEscaper can only process Java UTF-16 characters in isolation and may not cope when it encounters surrogate pairs. This class facilitates the correct escaping of all Unicode characters.
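To see why processing characters in isolation is a problem, the following sketch uses only the standard `String` and `Character` APIs. The class name `SurrogatePairSketch` and the choice of U+1F600 as sample input are assumptions made for illustration.

```java
class SurrogatePairSketch {
    // U+1F600 lies outside the Basic Multilingual Plane, so Java stores it as two chars.
    static final String GRINNING_FACE = new String(Character.toChars(0x1F600));

    // Counts UTF-16 code units, which is what a char-at-a-time escaper sees.
    static int charCount(String s) {
        return s.length();
    }

    // Counts Unicode code points, which is what a code-point-aware escaper sees.
    static int codePointCount(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String s = "a" + GRINNING_FACE;
        System.out.println(charCount(s));                           // prints 3
        System.out.println(codePointCount(s));                      // prints 2
        System.out.println(Character.isHighSurrogate(s.charAt(1))); // prints true
        System.out.println(Character.isLowSurrogate(s.charAt(2)));  // prints true
    }
}
```

An escaper that looks at `s.charAt(1)` and `s.charAt(2)` separately sees two surrogate halves rather than one character, so any per-character replacement it makes there can corrupt the pair.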
Because there are important reasons, including potential security issues, to handle Unicode correctly, you should favor UnicodeEscaper wherever possible when implementing a new escaper.
A UnicodeEscaper instance is required to be stateless and safe when used concurrently by multiple threads.
Several popular escapers are defined as constants in classes like HtmlEscapers, XmlEscapers, and SourceCodeEscapers. To create your own escapers, extend this class and implement the escape(int) method.
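The extension pattern can be sketched without the library on the classpath. In the block below, `MiniUnicodeEscaper` is a hypothetical stand-in that only mirrors the shape of the contract (a per-code-point `escape(int)` that returns a replacement, or `null` to leave the code point unchanged); it is not the library's implementation, and unlike the real class it performs no surrogate validation. `NonAsciiEscaper` and its `U+XXXX` output format are likewise invented for illustration.

```java
class EscaperSketch {
    // Hypothetical stand-in for the abstract base class.
    static abstract class MiniUnicodeEscaper {
        // Returns the replacement for one code point, or null to leave it unchanged.
        protected abstract char[] escape(int cp);

        // Walks the input one code point at a time, so well-formed surrogate
        // pairs are passed to escape(int) as a single value and never split.
        public String escape(String string) {
            StringBuilder out = new StringBuilder(string.length());
            int i = 0;
            while (i < string.length()) {
                int cp = string.codePointAt(i);
                char[] replacement = escape(cp);
                if (replacement != null) {
                    out.append(replacement);
                } else {
                    out.appendCodePoint(cp);
                }
                i += Character.charCount(cp);
            }
            return out.toString();
        }
    }

    // Hypothetical subclass: rewrites every non-ASCII code point as a U+XXXX marker.
    static class NonAsciiEscaper extends MiniUnicodeEscaper {
        @Override
        protected char[] escape(int cp) {
            if (cp < 0x80) {
                return null; // ASCII passes through untouched
            }
            return String.format("U+%04X", cp).toCharArray();
        }
    }

    public static void main(String[] args) {
        String input = "a" + new String(Character.toChars(0x1F600));
        System.out.println(new NonAsciiEscaper().escape(input)); // prints aU+1F600
    }
}
```

Because the subclass holds no fields, one instance can be shared across threads, in line with the statelessness requirement stated above.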
Modifier and Type | Method and Description |
---|---|
String | escape(String string): Returns the escaped form of a given literal string. |
Methods inherited from class Escaper: asFunction
public String escape(String string)
If you are escaping input in arbitrary successive chunks, then it is not generally safe to use this method. If an input string ends with an unmatched high surrogate character, then this method will throw IllegalArgumentException. You should ensure your input is valid UTF-16 before calling this method.
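One way to check input before calling this method is sketched below, using only standard `Character` methods. `Utf16Check.isWellFormed` is a hypothetical helper, not part of this API. It rejects any string containing an unpaired surrogate, including the case where a chunk boundary falls between the two halves of a pair.

```java
class Utf16Check {
    // Returns true if every surrogate in s belongs to a correctly ordered high/low pair.
    static boolean isWellFormed(CharSequence s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)) {
                // A high surrogate must be followed immediately by a low surrogate.
                if (i + 1 >= s.length() || !Character.isLowSurrogate(s.charAt(i + 1))) {
                    return false;
                }
                i++; // skip the low half of the pair
            } else if (Character.isLowSurrogate(c)) {
                return false; // low surrogate with no preceding high surrogate
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String pair = new String(Character.toChars(0x1F600));
        System.out.println(isWellFormed("abc" + pair));           // prints true
        System.out.println(isWellFormed("abc" + pair.charAt(0))); // prints false
    }
}
```

When input arrives in chunks, a caller could hold back a trailing high surrogate and prepend it to the next chunk, so that each string passed on for escaping is well formed.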
Note: When implementing an escaper it is a good idea to override this method for efficiency by inlining the implementation of nextEscapeIndex(CharSequence, int, int) directly. Doing this for PercentEscaper more than doubled the performance for unescaped strings (as measured by CharEscapersBenchmark).
Specified by: escape in class Escaper
Parameters: string - the literal string to be escaped
Returns: the escaped form of string
Throws:
NullPointerException - if string is null
IllegalArgumentException - if invalid surrogate characters are encountered
Copyright © 2010 - 2020 Adobe. All Rights Reserved