public class DefaultEncoder extends Object implements Encoder
All Implemented Interfaces: Encoder
Modifier and Type | Class and Description |
---|---|
static class | DefaultEncoder.UriSegment |
Constructor and Description |
---|
DefaultEncoder(List<String> codecNames) Instantiates a new DefaultEncoder based on the specified list of codec names. |
Modifier and Type | Method and Description |
---|---|
protected String | buildUrl(Map<DefaultEncoder.UriSegment,String> parseMap) All the parts should be canonicalized by this point. |
String | canonicalize(String input) This method is equivalent to calling Encoder.canonicalize(input, restrictMultiple, restrictMixed). |
String | canonicalize(String input, boolean strict) This method is equivalent to calling Encoder.canonicalize(input, strict, strict). |
String | canonicalize(String input, boolean restrictMultiple, boolean restrictMixed) Canonicalization is simply the operation of reducing a possibly encoded string down to its simplest form. |
String | decodeForHTML(String input) Decodes HTML entities. |
byte[] | decodeFromBase64(String input) Decode data encoded with BASE-64 encoding. |
String | decodeFromJSON(String input) Decode data encoded for JSON strings. |
String | decodeFromURL(String input) Decode from URL. |
String | encodeForBase64(byte[] input, boolean wrap) Encode for Base64. |
String | encodeForCSS(String input) Encode data for use in Cascading Style Sheets (CSS) content. |
String | encodeForDN(String input) Encode data for use in an LDAP distinguished name. |
String | encodeForHTML(String input) Encode data for use in HTML using HTML entity encoding. |
String | encodeForHTMLAttribute(String input) Encode data for use in HTML attributes. |
String | encodeForJavaScript(String input) Encode data for insertion inside a data value or function argument in JavaScript. |
String | encodeForJSON(String input) Encode data for use in JSON strings. |
String | encodeForLDAP(String input) Encode data for use in LDAP queries. |
String | encodeForLDAP(String input, boolean encodeWildcards) Encode data for use in LDAP queries. |
String | encodeForOS(Codec codec, String input) Encode for an operating system command shell according to the selected codec (appropriate codecs include the WindowsCodec and UnixCodec). |
String | encodeForSQL(Codec codec, String input) Encode input for use in a SQL query, according to the selected codec (appropriate codecs include the MySQLCodec and OracleCodec). |
String | encodeForURL(String input) Encode for use in a URL. |
String | encodeForVBScript(String input) Encode data for insertion inside a data value in a Visual Basic script. |
String | encodeForXML(String input) Encode data for use in an XML element. |
String | encodeForXMLAttribute(String input) Encode data for use in an XML attribute. |
String | encodeForXPath(String input) Encode data for use in an XPath query. |
String | getCanonicalizedURI(URI dirtyUri) Get a version of the input URI that will be safe to run regex and other validations against. |
static Encoder | getInstance() |
Map<String,List<String>> | splitQuery(URI uri) The meat of this method was taken from StackOverflow (http://stackoverflow.com/a/13592567/557153). It has been modified to return a canonicalized key and value pairing. |
public static Encoder getInstance()
public String canonicalize(String input)
This method is equivalent to calling Encoder.canonicalize(input, restrictMultiple, restrictMixed).
The default values for restrictMultiple and restrictMixed come from ESAPI.properties:
    Encoder.AllowMultipleEncoding=false
    Encoder.AllowMixedEncoding=false
and the default codecs that are used for canonicalization are the list of codecs that comes from:
    Encoder.DefaultCodecList=HTMLEntityCodec,PercentCodec,JavaScriptCodec
(If the Encoder.DefaultCodecList property is null or not set, these same codecs are used in the same order. Note that you may supply your own codec by using a fully qualified class name of a class that implements org.owasp.esapi.codecs.Codec<T>.)
Specified by:
canonicalize in interface Encoder
Parameters:
input - the text to canonicalize
See Also:
Encoder.canonicalize(String, boolean, boolean), W3C specifications
public String canonicalize(String input, boolean strict)
This method is equivalent to calling Encoder.canonicalize(input, strict, strict).
Specified by:
canonicalize in interface Encoder
Parameters:
input - the text to canonicalize
strict - true if checking for multiple and mixed encoding is desired, false otherwise
See Also:
Encoder.canonicalize(String, boolean, boolean), W3C specifications
public String canonicalize(String input, boolean restrictMultiple, boolean restrictMixed)
Everyone says you shouldn't do validation without canonicalizing the data first. This is easier said than done. The canonicalize method can be used to simplify just about any input down to its most basic form. Note that canonicalize doesn't handle Unicode issues; it focuses on higher-level encoding and escaping schemes. In addition to simple decoding, canonicalize also handles:
Using canonicalize is simple. The default is just...
    String clean = ESAPI.encoder().canonicalize( request.getParameter("input"));
You need to decode untrusted data so that it's safe for ANY downstream interpreter or decoder. For example, if your data goes into a Windows command shell, then into a database, and then to a browser, you're going to need to decode for all of those systems. You can build a custom encoder to canonicalize for your application like this...
    // The DefaultEncoder constructor takes a list of codec names
    List<String> list = new ArrayList<String>();
    list.add( "WindowsCodec" );
    list.add( "MySQLCodec" );
    list.add( "PercentCodec" );
    Encoder encoder = new DefaultEncoder( list );
    String clean = encoder.canonicalize( request.getParameter( "input" ));
or alternatively, you can just customize the Encoder.DefaultCodecList property in the ESAPI.properties file with your preferred codecs; for example:
    Encoder.DefaultCodecList=WindowsCodec,MySQLCodec,PercentCodec
and then use:
    Encoder encoder = ESAPI.encoder();
    String clean = encoder.canonicalize( request.getParameter( "input" ));
as you normally would. However, the downside of the ESAPI.properties file approach is that it does not allow you to vary the list of codecs that are used each time. The downside of using the DefaultEncoder constructor is that your code is now tied to specific reference implementations rather than just interfaces, and those reference implementations are what is most likely to change in ESAPI 3.x.
In ESAPI, the Validator uses the canonicalize method before it does validation. So all you need to do is validate as normal and you'll be protected against a host of encoded attacks.
    String input = request.getParameter( "name" );
    // getValidInput returns the validated String; isValidInput returns a boolean
    String name = ESAPI.validator().getValidInput( "test", input, "FirstName", 20, false);
However, the default canonicalize() method only decodes HTML entity encoding, percent (URL) encoding, and JavaScript encoding. If you'd like to use a custom canonicalizer with your validator, that's pretty easy too.
    // ... set up custom encoder as above
    Validator validator = new DefaultValidator( encoder );
    String input = request.getParameter( "name" );
    String name = validator.getValidInput( "test", input, "name", 20, false);
Although ESAPI is able to canonicalize multiple, mixed, or nested encoding, it's safer to not accept this stuff in the first place. In ESAPI, the default is "strict" mode that throws an IntrusionException if it receives anything not single-encoded with a single scheme. This is configurable in ESAPI.properties using the properties:
    Encoder.AllowMultipleEncoding=false
    Encoder.AllowMixedEncoding=false
This method allows you to override the default behavior by directly specifying whether to restrict multiple or mixed encoding. Even if you disable restrictions, you'll still get warning messages in the log about each multiple encoding and mixed encoding received.
    // disabling strict mode to allow mixed encoding
    String url = ESAPI.encoder().canonicalize( request.getParameter("url"), false, false);
WARNING #1!!! Please note that this method is incompatible with URLs. If there exist any HTML entities that correspond with parameter values in a URL, such as "&para;" in a URL like "https://foo.com/?bar=foo&parameter=wrong", you will get a mixed encoding validation exception.
If you wish to canonicalize a URL/URI, use the method Encoder.getCanonicalizedURI(URI dirtyUri);
WARNING #2!!! Even if you use WindowsCodec or UnixCodec as appropriate, file path names in the input parameter will NOT be canonicalized. If the failure of such file path name canonicalization presents a potential security issue, consider using one of the Validator.getValidDirectoryPath() methods instead of or in addition to this method.
Specified by:
canonicalize in interface Encoder
Parameters:
input - the text to canonicalize
restrictMultiple - true if checking for multiple encoding is desired, false otherwise
restrictMixed - true if checking for mixed encoding is desired, false otherwise
See Also:
Encoder.canonicalize(String), Encoder.getCanonicalizedURI(URI dirtyUri), Validator.getValidDirectoryPath(java.lang.String, java.lang.String, java.io.File, boolean)
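The difference between multiple and mixed encoding can be illustrated with a small, self-contained sketch. This is not ESAPI's implementation: the class CanonicalizeSketch, its two toy decoders, and the use of IllegalArgumentException in place of IntrusionException are all assumptions made for illustration. It decodes repeatedly until the input stops changing, counts how many passes changed something (multiple encoding), and records which schemes decoded anything (mixed encoding).

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Illustrative only: a toy canonicalizer with two hand-written decoders.
// It is NOT ESAPI's implementation; every name here is hypothetical.
final class CanonicalizeSketch {

    /** Final value plus what was detected while decoding. */
    static final class Result {
        final String value;
        final int passes;           // passes in which at least one decoder changed the input
        final Set<String> schemes;  // distinct schemes that decoded something

        Result(String value, int passes, Set<String> schemes) {
            this.value = value;
            this.passes = passes;
            this.schemes = schemes;
        }

        boolean multiple() { return passes > 1; }
        boolean mixed() { return schemes.size() > 1; }
    }

    /** Decodes one layer of %HH sequences; malformed sequences are left as-is. */
    static String decodePercent(String s) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '%' && i + 2 < s.length()
                    && Character.digit(s.charAt(i + 1), 16) >= 0
                    && Character.digit(s.charAt(i + 2), 16) >= 0) {
                out.append((char) Integer.parseInt(s.substring(i + 1, i + 3), 16));
                i += 3;
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    /** Decodes one layer of a few named entities; &amp; goes last so only one layer is removed. */
    static String decodeHtml(String s) {
        return s.replace("&lt;", "<").replace("&gt;", ">")
                .replace("&quot;", "\"").replace("&amp;", "&");
    }

    /** Decodes until nothing changes, tracking passes and schemes. */
    static Result analyze(String input) {
        String current = input;
        int passes = 0;
        Set<String> schemes = new LinkedHashSet<>();
        boolean changed = true;
        while (changed) {
            changed = false;
            String afterPercent = decodePercent(current);
            if (!afterPercent.equals(current)) {
                schemes.add("percent");
                current = afterPercent;
                changed = true;
            }
            String afterHtml = decodeHtml(current);
            if (!afterHtml.equals(current)) {
                schemes.add("html");
                current = afterHtml;
                changed = true;
            }
            if (changed) {
                passes++;
            }
        }
        return new Result(current, passes, schemes);
    }

    /** Strict variant: rejects input according to the two flags. */
    static String canonicalize(String input, boolean restrictMultiple, boolean restrictMixed) {
        Result r = analyze(input);
        if (restrictMultiple && r.multiple()) {
            throw new IllegalArgumentException("Multiple encoding detected");
        }
        if (restrictMixed && r.mixed()) {
            throw new IllegalArgumentException("Mixed encoding detected");
        }
        return r.value;
    }
}
```

In this sketch, "%253Cscript%253E" needs two percent-decoding passes and is flagged as multiple encoding, while "%26lt;script%26gt;" is decoded by both schemes in one pass and is flagged as mixed encoding. Both reduce to "<script>" when the restrictions are disabled.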
public String encodeForHTML(String input)
Note that the following characters: 00-08, 0B-0C, 0E-1F, and 7F-9F
cannot be used in HTML.
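To make the character ranges above concrete, here is a minimal sketch of HTML entity encoding that also screens out the listed control characters. It is an illustration under stated assumptions, not the library's codec: the class HtmlEncodeSketch and its methods are hypothetical, only five markup characters are encoded, and substituting U+FFFD for disallowed characters is a choice made for this sketch.

```java
// Illustrative only: a simplified encoder, not ESAPI's HTML codec.
final class HtmlEncodeSketch {

    /** True for the ranges 00-08, 0B-0C, 0E-1F, and 7F-9F. */
    static boolean isDisallowedInHtml(char c) {
        return c <= 0x08
                || c == 0x0B || c == 0x0C
                || (c >= 0x0E && c <= 0x1F)
                || (c >= 0x7F && c <= 0x9F);
    }

    /** Encodes the most significant markup characters as entities. */
    static String encodeForHtml(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (isDisallowedInHtml(c)) {
                out.append('\uFFFD'); // assumption: substitute the Unicode replacement character
                continue;
            }
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#x27;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

Tab (09), line feed (0A), and carriage return (0D) fall outside the listed ranges, so the sketch passes them through unchanged.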
Specified by:
encodeForHTML in interface Encoder
Parameters:
input - the untrusted data to output encode for HTML
public String decodeForHTML(String input)
Specified by:
decodeForHTML in interface Encoder
Parameters:
input - the String to decode
Returns:
String
public String encodeForHTMLAttribute(String input)
Specified by:
encodeForHTMLAttribute in interface Encoder
Parameters:
input - the untrusted data to output encode for an HTML attribute
public String encodeForCSS(String input)
Specified by:
encodeForCSS in interface Encoder
Parameters:
input - the untrusted data to output encode for CSS
public String encodeForJavaScript(String input)
<script> window.setInterval('<%= EVEN IF YOU ENCODE UNTRUSTED DATA YOU ARE XSSED HERE %>'); </script>
Specified by:
encodeForJavaScript in interface Encoder
Parameters:
input - the untrusted data to output encode for JavaScript
public String encodeForVBScript(String input)
Specified by:
encodeForVBScript in interface Encoder
Parameters:
input - the untrusted data to output encode for VBScript
public String encodeForSQL(Codec codec, String input)
The PreparedStatement interface is the preferred approach. However, if for some reason this is impossible, then this method is provided as a weaker alternative.
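As a rough illustration of the two options this section discusses, the sketch below shows a parameterized query next to the quote-doubling fallback. The class SqlSketch, the table users, and the column last_name are hypothetical names chosen for the example, and the quote-doubling helper is a simplification, not what any ESAPI codec does internally. Doubling quotes alone may not be sufficient for every database dialect.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Illustrative only: hypothetical table and column names.
final class SqlSketch {

    /** Preferred: bind untrusted data as a parameter so it is never parsed as SQL. */
    static boolean userExists(Connection conn, String lastName) throws SQLException {
        String sql = "SELECT 1 FROM users WHERE last_name = ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, lastName);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next();
            }
        }
    }

    /** Weaker fallback: double every single quote before embedding in a string literal. */
    static String doubleSingleQuotes(String input) {
        return input.replace("'", "''");
    }
}
```

With the fallback, an input such as O'Brien becomes O''Brien, so the quote no longer terminates the surrounding SQL string literal.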
The best approach is to make sure any single quotes are doubled (each ' becomes '').
Another possible approach is to use the {escape} syntax described in the
JDBC specification in section 1.5.6.
However, this syntax does not work with all drivers, and requires
modification of all queries.
Specified by:
encodeForSQL in interface Encoder
Parameters:
codec - a Codec that declares which database 'input' is being encoded for (e.g. MySQL, Oracle)
input - the text to encode for SQL
public String encodeForOS(Codec codec, String input)
Specified by:
encodeForOS in interface Encoder
Parameters:
codec - a Codec that declares which operating system 'input' is being encoded for (e.g. Windows, Unix)
input - the text to encode for the command shell
public String encodeForLDAP(String input)
Specified by:
encodeForLDAP in interface Encoder
Parameters:
input - the text to encode for LDAP
public String encodeForLDAP(String input, boolean encodeWildcards)
Specified by:
encodeForLDAP in interface Encoder
Parameters:
input - the text to encode for LDAP
encodeWildcards - whether or not wildcard (*) characters will be encoded.
public String encodeForDN(String input)
Specified by:
encodeForDN in interface Encoder
Parameters:
input - the text to encode for an LDAP distinguished name
public String encodeForXPath(String input)
Specified by:
encodeForXPath in interface Encoder
Parameters:
input - the text to encode for XPath
public String encodeForXML(String input)
The use of a real XML parser is strongly encouraged. However, in the hopefully rare case that you need to make sure that data is safe for inclusion in an XML document and cannot use a parser, this method provides a safe mechanism to do so.
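For comparison, the recommended route of letting a standard XML API handle escaping might look like the sketch below, which uses the JDK's built-in DOM and Transformer classes. The class XmlApiSketch and the method elementWithText are hypothetical names, and the exact serialized form can vary between XML implementations.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Illustrative only: builds a one-element document and serializes it.
final class XmlApiSketch {

    /** Returns an element whose text content is the untrusted value, escaped by the serializer. */
    static String elementWithText(String tagName, String untrusted) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element element = doc.createElement(tagName);
            element.setTextContent(untrusted); // stored as text, never parsed as markup
            doc.appendChild(element);

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | TransformerException e) {
            throw new IllegalStateException("XML serialization failed", e);
        }
    }
}
```

Because the untrusted value is set as text content, the serializer escapes markup characters such as < and & on output, so the caller never has to encode them by hand.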
Specified by:
encodeForXML in interface Encoder
Parameters:
input - the text to encode for XML
public String encodeForXMLAttribute(String input)
The use of a real XML parser is highly encouraged. However, in the hopefully rare case that you need to make sure that data is safe for inclusion in an XML document and cannot use a parser, this method provides a safe mechanism to do so.
Specified by:
encodeForXMLAttribute in interface Encoder
Parameters:
input - the text to encode for use as an XML attribute
public String encodeForURL(String input) throws EncodingException
Specified by:
encodeForURL in interface Encoder
Parameters:
input - the text to encode for use in a URL
Throws:
EncodingException - if encoding fails
public String decodeFromURL(String input) throws EncodingException
Specified by:
decodeFromURL in interface Encoder
Parameters:
input - the text to decode from an encoded URL
Throws:
EncodingException - if decoding fails
public String encodeForBase64(byte[] input, boolean wrap)
Specified by:
encodeForBase64 in interface Encoder
Parameters:
input - the text to encode for Base64
wrap - the encoder will wrap lines every 64 characters of output
public byte[] decodeFromBase64(String input) throws IOException
Specified by:
decodeFromBase64 in interface Encoder
Parameters:
input - the Base64 text to decode
Throws:
IOException
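The effect of the wrap flag can be sketched with the JDK's own java.util.Base64 class. This is an approximation for illustration, not the code behind these methods: the class Base64Sketch is hypothetical, and it assumes a 64-character line length with a newline separator to match the parameter description above.

```java
import java.util.Base64;

// Illustrative only: mirrors the documented wrap behaviour using the JDK encoder.
final class Base64Sketch {

    /** Encodes bytes as Base64, optionally breaking the output into 64-character lines. */
    static String encode(byte[] input, boolean wrap) {
        Base64.Encoder encoder = wrap
                ? Base64.getMimeEncoder(64, new byte[] {'\n'})
                : Base64.getEncoder();
        return encoder.encodeToString(input);
    }

    /** Decodes Base64 text; the MIME decoder ignores line separators. */
    static byte[] decode(String input) {
        return Base64.getMimeDecoder().decode(input);
    }
}
```

Encoding 100 bytes produces 136 Base64 characters, so the wrapped form spans three lines while the unwrapped form is a single line. Either form decodes back to the original bytes.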
public String getCanonicalizedURI(URI dirtyUri) throws IntrusionException
Specified by:
getCanonicalizedURI in interface Encoder
Parameters:
dirtyUri -
Throws:
IntrusionException
protected String buildUrl(Map<DefaultEncoder.UriSegment,String> parseMap)
Parameters:
parseMap - The parts of the URL to put back together.
public Map<String,List<String>> splitQuery(URI uri) throws UnsupportedEncodingException
Parameters:
uri - The URI to analyze.
Throws:
UnsupportedEncodingException
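The general shape of a query splitter of this kind can be sketched with standard library calls only. This is a simplified illustration, not the method's actual body: the class SplitQuerySketch is hypothetical, it performs plain URL decoding with URLDecoder instead of ESAPI canonicalization, and it wraps the checked exception for brevity.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: splits a query string into decoded key/value lists.
final class SplitQuerySketch {

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    /** Maps each parameter name to its values, preserving order; a bare key maps to null. */
    static Map<String, List<String>> splitQuery(URI uri) {
        Map<String, List<String>> params = new LinkedHashMap<>();
        String query = uri.getRawQuery();
        if (query == null || query.isEmpty()) {
            return params;
        }
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue; // skip empty segments such as "a=1&&b=2"
            }
            int idx = pair.indexOf('=');
            String key = idx > 0 ? decode(pair.substring(0, idx)) : decode(pair);
            String value = (idx > 0 && pair.length() > idx + 1)
                    ? decode(pair.substring(idx + 1))
                    : null;
            params.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        }
        return params;
    }
}
```

Splitting on the raw query before decoding matters here: an encoded ampersand (%26) inside a value stays part of that value instead of being mistaken for a parameter separator.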
public String encodeForJSON(String input)
Specified by:
encodeForJSON in interface Encoder
Parameters:
input - the text to escape for JSON string
public String decodeFromJSON(String input)
Specified by:
decodeFromJSON in interface Encoder
Parameters:
input - the JSON string to decode
Copyright © 2023 The Open Web Application Security Project (OWASP). All rights reserved.