java.lang.Object
- org.owasp.html.HtmlPolicyBuilder

```
@NotThreadSafe
public class HtmlPolicyBuilder
extends Object
```
Conveniences for configuring policies for the HtmlSanitizer.
Usage

To create a policy, first construct an instance of this class; then call allow… methods to turn on tags, attributes, and other processing modes; and finally call build(renderer) or toFactory().
```
 // Define the policy.
 Function<HtmlStreamEventReceiver, HtmlSanitizer.Policy> policy
     = new HtmlPolicyBuilder()
         .allowElements("a", "p")
         .allowAttributes("href").onElements("a")
         .toFactory();

 // Sanitize your output.
 HtmlSanitizer.sanitize(myHtml, policy.apply(myHtmlStreamRenderer));
 
```
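When the input and output are both plain strings, the factory can also be applied to a string directly. The following is a minimal sketch, assuming the library's `PolicyFactory.sanitize(String)` method, which this section does not describe; the class name `SanitizeExample` and the sample input are made up for illustration.
```java
import org.owasp.html.HtmlPolicyBuilder;
import org.owasp.html.PolicyFactory;

final class SanitizeExample {
  // Build the factory once; it can be reused for every request.
  static final PolicyFactory POLICY = new HtmlPolicyBuilder()
      .allowElements("a", "p")
      .allowAttributes("href").onElements("a")
      // Without an allowed protocol, absolute URLs in href would be dropped.
      .allowStandardUrlProtocols()
      .toFactory();

  /** Returns untrustedHtml with everything outside the policy removed. */
  static String clean(String untrustedHtml) {
    return POLICY.sanitize(untrustedHtml);
  }

  public static void main(String[] args) {
    String dirty = "<p onclick=\"evil()\">Hi "
        + "<a href=\"https://example.com/\">there</a>"
        + "<script>alert(1)</script></p>";
    // The onclick attribute and the script element are not in the policy,
    // so neither should survive.
    System.out.println(clean(dirty));
  }
}
```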
Embedded Content

Embedded URLs are filtered by protocol. There is a canned policy so you can easily white-list widely used protocols that don't violate the current page's origin. See "Customization" below for ways to do further filtering. If you allow links it might be worthwhile to require rel=nofollow.

This class simply throws out all embedded JS. Use a custom element or attribute policy to allow through signed or otherwise known-safe code. Check out the Caja project if you need a way to contain third-party JS.

This class does not attempt to faithfully parse and sanitize CSS. It does provide one styling option that allows through a few CSS properties that allow textual styling, but that disallow image loading, history stealing, layout breaking, code execution, etc.

Customization

You can easily do custom processing on tags and attributes by supplying your own element policy or attribute policy when calling allow…. E.g. to convert headers into <div>s, you could use an element policy
```
 new HtmlPolicyBuilder()
   .allowElements(
     new ElementPolicy() {
       @Override
       public String apply(String elementName, List<String> attributes) {
         // Record the original header level in a class attribute.
         attributes.add("class");
         attributes.add("header-" + elementName);
         // Returning a different name renames the element.
         return "div";
       }
     },
     "h1", "h2", "h3", "h4", "h5", "h6")
   .build(outputChannel);
 
```
Rules of Thumb

Throughout this class, several rules hold:
- Everything is denied by default. There are disallow… methods, but those reverse allows instead of rolling back overly permissive defaults.
- The order of allows and disallows does not matter. Disallows trump allows whether they occur before or after them. The only method that needs to be called in a particular place is build(org.owasp.html.HtmlStreamEventReceiver). Allows or disallows after build is called have no effect on the already built policy.
- Element and attribute policies are applied in the following order: element specific attribute policy, global attribute policy, element policy. Element policies come last so they can observe all the post-processed attributes, and so they can add attributes that are exempt from attribute policies. Element specific policies go first, so they can normalize content to a form that might be acceptable to a more simplistic global policy.
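
The ordering rule in the last bullet can be illustrated with a small stand-alone model. This is a sketch in plain Java, not the library's implementation: `PolicyOrderSketch`, `NORMALIZE`, and `LETTERS_ONLY` are hypothetical names, and each attribute policy is modeled as a function that returns null to reject a value.
```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.UnaryOperator;

final class PolicyOrderSketch {
  // Hypothetical element-specific attribute policy: normalizes the value.
  static final UnaryOperator<String> NORMALIZE =
      v -> v.trim().toLowerCase(Locale.ROOT);

  // Hypothetical, deliberately simplistic global attribute policy:
  // accepts lowercase letters only and returns null to reject anything else.
  static final UnaryOperator<String> LETTERS_ONLY =
      v -> v.matches("[a-z]+") ? v : null;

  /** Runs one attribute through the three stages in the documented order. */
  static List<String> process(
      String elementName, String attrName, String value,
      UnaryOperator<String> elementSpecific, UnaryOperator<String> global) {
    List<String> attrs = new ArrayList<>();
    String v = elementSpecific.apply(value);   // 1. element-specific attribute policy
    if (v != null) {
      v = global.apply(v);                     // 2. global attribute policy
    }
    if (v != null) {
      attrs.add(attrName);
      attrs.add(v);
    }
    // 3. element policy: sees the surviving attributes and may add new ones
    //    that no attribute policy ever inspects.
    attrs.add("class");
    attrs.add("header-" + elementName);
    return attrs;
  }

  public static void main(String[] args) {
    // Normalizing first lets the simplistic global policy accept the value.
    System.out.println(process("h1", "align", " CENTER ", NORMALIZE, LETTERS_ONLY));
    // prints [align, center, class, header-h1]

    // Without the normalizing step the global policy rejects the raw value.
    System.out.println(
        process("h1", "align", " CENTER ", UnaryOperator.identity(), LETTERS_ONLY));
    // prints [class, header-h1]
  }
}
```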
Thread safety and efficiency

This class is not thread-safe. The resulting policy will not violate its security guarantees as a result of race conditions, but is not thread safe because it maintains state to track whether text inside disallowed elements should be suppressed.
The resulting policy can be reused, but if you use the toFactory() method instead of build(org.owasp.html.HtmlStreamEventReceiver), then binding policies to output channels is cheap so there's no need.
Author:

Mike Samuel

Nested Class Summary

Nested Classes
Modifier and Type	Class	Description
`class`	`HtmlPolicyBuilder.AttributeBuilder`	Builds the relationship between attributes, the values that they may have, and the elements on which they may appear.

Field Summary

Fields
Modifier and Type	Field	Description
`static com.google.common.collect.ImmutableSet<String>`	`DEFAULT_RELS_ON_TARGETTED_LINKS`	These `rel` attribute values prevent leaking information to the linked site, and prevent the linked page from redirecting your page to a phishing site when opened from a third-party link from your site.
`static com.google.common.collect.ImmutableSet<String>`	`DEFAULT_SKIP_IF_EMPTY`	The default set of elements that are removed if they have no attributes.

Constructor Summary

Constructors
Constructor Description

HtmlPolicyBuilder()

Method Summary

Modifier and Type	Method	Description
`HtmlPolicyBuilder.AttributeBuilder`	`allowAttributes(String... attributeNames)`	Returns an object that lets you associate policies with the given attributes, and allow them globally or on specific elements.
`HtmlPolicyBuilder`	`allowCommonBlockElements()`	A canned policy that allows a number of common block elements.
`HtmlPolicyBuilder`	`allowCommonInlineFormattingElements()`	A canned policy that allows a number of common formatting elements.
`HtmlPolicyBuilder`	`allowElements(String... elementNames)`	Allows the named elements.
`HtmlPolicyBuilder`	`allowElements(ElementPolicy policy, String... elementNames)`	Allow the given elements with the given policy.
`HtmlPolicyBuilder`	`allowStandardUrlProtocols()`	A canned URL protocol policy that allows `http`, `https`, and `mailto`.
`HtmlPolicyBuilder`	`allowStyling()`	Convert `style="<CSS>"` to sanitized CSS which allows color, font-size, type-face, and other styling using the default schema; but which does not allow content to escape its clipping context.
`HtmlPolicyBuilder`	`allowStyling(CssSchema whitelist)`	Convert `style="<CSS>"` to sanitized CSS which allows color, font-size, type-face, and other styling using the given schema.
`HtmlPolicyBuilder`	`allowTextIn(String... elementNames)`	Allows text content in the named elements.
`HtmlPolicyBuilder`	`allowUrlProtocols(String... protocols)`	Adds to the set of protocols that are allowed in URL attributes.
`HtmlPolicyBuilder`	`allowUrlsInStyles(AttributePolicy newStyleUrlPolicy)`	Allow URLs in CSS styles.
`HtmlPolicyBuilder`	`allowWithoutAttributes(String... elementNames)`	Assuming the given elements are allowed, allows them to appear without attributes.
`HtmlSanitizer.Policy`	`build(HtmlStreamEventReceiver out)`	Produces a policy based on the allow and disallow calls previously made.
`<CTX> HtmlSanitizer.Policy`	`build(HtmlStreamEventReceiver out, HtmlChangeListener<? super CTX> listener, CTX context)`	Produces a policy based on the allow and disallow calls previously made.
`HtmlPolicyBuilder.AttributeBuilder`	`disallowAttributes(String... attributeNames)`	Reverse an earlier attribute `allow`.
`HtmlPolicyBuilder`	`disallowElements(String... elementNames)`	Disallows the named elements.
`HtmlPolicyBuilder`	`disallowTextIn(String... elementNames)`	Disallows text in elements with the given name.
`HtmlPolicyBuilder`	`disallowUrlProtocols(String... protocols)`	Reverses a decision made by `allowUrlProtocols(java.lang.String...)`.
`HtmlPolicyBuilder`	`disallowWithoutAttributes(String... elementNames)`	Disallows the given elements from appearing without attributes.
`HtmlPolicyBuilder`	`requireRelNofollowOnLinks()`	Adds `rel=nofollow` to links.
`HtmlPolicyBuilder`	`requireRelsOnLinks(String... linkValues)`	Adds `rel="..."` to `<a href="...">` tags beyond those in `DEFAULT_RELS_ON_TARGETTED_LINKS`.
`HtmlPolicyBuilder`	`skipRelsOnLinks(String... linkValues)`	Opts out of some of the `DEFAULT_RELS_ON_TARGETTED_LINKS` from being added to links, and reverses previous calls to requireRelsOnLinks with the given link values.
`PolicyFactory`	`toFactory()`	Like `build(org.owasp.html.HtmlStreamEventReceiver)` but can be reused to create many different policies each backed by a different output channel.
`HtmlPolicyBuilder`	`withPostprocessor(HtmlStreamEventProcessor pp)`	Inserts a post-processor into the pipeline between the policy and the output sink.
`HtmlPolicyBuilder`	`withPreprocessor(HtmlStreamEventProcessor pp)`	Inserts a pre-processor into the pipeline between the lexer and the policy.

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - DEFAULT_SKIP_IF_EMPTY
```
public static final com.google.common.collect.ImmutableSet<String> DEFAULT_SKIP_IF_EMPTY
```
    The default set of elements that are removed if they have no attributes. Since <img> is in this set, by default, a policy will remove <img src=javascript:alert(1337)> because its URL is not allowed and it has no other attributes that would warrant it appearing in the output.
  - DEFAULT_RELS_ON_TARGETTED_LINKS
```
public static final com.google.common.collect.ImmutableSet<String> DEFAULT_RELS_ON_TARGETTED_LINKS
```
    These rel attribute values prevent leaking information to the linked site, and prevent the linked page from redirecting your page to a phishing site when opened from a third-party link from your site.
    
    See Also:
    
    About rel=noopener
- Constructor Detail
  - HtmlPolicyBuilder
```
public HtmlPolicyBuilder()
```
- Method Detail
  - allowElements
```
public HtmlPolicyBuilder allowElements(String... elementNames)
```
    Allows the named elements.
  - disallowElements
```
public HtmlPolicyBuilder disallowElements(String... elementNames)
```
    Disallows the named elements. Elements are disallowed by default, so there is no need to disallow elements, unless you are making an exception based on an earlier allow.
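    The interaction between allows and disallows can be pictured with a tiny stand-alone model. This is only a sketch of the documented behavior (deny by default, disallow wins regardless of call order), not the library's code; `AllowDenySketch` and its methods are hypothetical names.
```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class AllowDenySketch {
  private final Set<String> allowed = new HashSet<>();
  private final Set<String> disallowed = new HashSet<>();

  AllowDenySketch allow(String... names) {
    allowed.addAll(Arrays.asList(names));
    return this;
  }

  AllowDenySketch disallow(String... names) {
    disallowed.addAll(Arrays.asList(names));
    return this;
  }

  /** Denied unless explicitly allowed, and a disallow always wins. */
  boolean permits(String name) {
    return allowed.contains(name) && !disallowed.contains(name);
  }

  public static void main(String[] args) {
    AllowDenySketch allowFirst =
        new AllowDenySketch().allow("p", "b", "script").disallow("script");
    AllowDenySketch disallowFirst =
        new AllowDenySketch().disallow("script").allow("p", "b", "script");
    System.out.println(allowFirst.permits("script"));     // prints false
    System.out.println(disallowFirst.permits("script"));  // prints false
    System.out.println(allowFirst.permits("div"));        // prints false
  }
}
```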
  - allowElements
```
public HtmlPolicyBuilder allowElements(ElementPolicy policy,
                                       String... elementNames)
```
    Allow the given elements with the given policy.
    
    Parameters:
    
    policy - May remove or add attributes, change the element name, or deny the element.
  - allowCommonInlineFormattingElements
```
public HtmlPolicyBuilder allowCommonInlineFormattingElements()
```
    A canned policy that allows a number of common formatting elements.
  - allowCommonBlockElements
```
public HtmlPolicyBuilder allowCommonBlockElements()
```
    A canned policy that allows a number of common block elements.
  - allowTextIn
```
public HtmlPolicyBuilder allowTextIn(String... elementNames)
```
    Allows text content in the named elements. By default, text content is allowed in any allowed elements that can contain character data per the HTML5 spec, but text content is not allowed by default in elements that contain content of other kinds (like JavaScript in <script> elements).
    To write a policy that whitelists <script> or <style> elements, first allowTextIn("script").
  - disallowTextIn
```
public HtmlPolicyBuilder disallowTextIn(String... elementNames)
```
    Disallows text in elements with the given name.
    This is useful when an element contains text that is not meant to be displayed to the end-user. Typically these elements are styled display:none in browsers' default stylesheets, or, like <template>, contain text nodes that are eventually meant for human consumption, but which are created in a separate document fragment.
  - allowWithoutAttributes
```
public HtmlPolicyBuilder allowWithoutAttributes(String... elementNames)
```
    Assuming the given elements are allowed, allows them to appear without attributes.
    
    See Also:
    
    DEFAULT_SKIP_IF_EMPTY, disallowWithoutAttributes(java.lang.String...)
  - disallowWithoutAttributes
```
public HtmlPolicyBuilder disallowWithoutAttributes(String... elementNames)
```
    Disallows the given elements from appearing without attributes.
    
    See Also:
    
    DEFAULT_SKIP_IF_EMPTY, allowWithoutAttributes(java.lang.String...)
  - allowAttributes
```
public HtmlPolicyBuilder.AttributeBuilder allowAttributes(String... attributeNames)
```
    Returns an object that lets you associate policies with the given attributes, and allow them globally or on specific elements.
  - disallowAttributes
```
public HtmlPolicyBuilder.AttributeBuilder disallowAttributes(String... attributeNames)
```
    Reverse an earlier attribute allow.
    For this to have an effect you must call at least one of HtmlPolicyBuilder.AttributeBuilder.globally() and HtmlPolicyBuilder.AttributeBuilder.onElements(java.lang.String...).
    Attributes are disallowed by default, so there is no need to call this with a laundry list of attribute/element pairs.
  - requireRelNofollowOnLinks
```
public HtmlPolicyBuilder requireRelNofollowOnLinks()
```
    Adds rel=nofollow to links.
    
    See Also:
    
    DEFAULT_RELS_ON_TARGETTED_LINKS, skipRelsOnLinks(java.lang.String...)
  - requireRelsOnLinks
```
public HtmlPolicyBuilder requireRelsOnLinks(String... linkValues)
```
    Adds rel="..." to <a href="..."> tags beyond those in DEFAULT_RELS_ON_TARGETTED_LINKS.
    
    See Also:
    
    skipRelsOnLinks(java.lang.String...)
  - skipRelsOnLinks
```
public HtmlPolicyBuilder skipRelsOnLinks(String... linkValues)
```
    Opts out of some of the DEFAULT_RELS_ON_TARGETTED_LINKS from being added to links, and reverses previous calls to requireRelsOnLinks with the given link values.
    
    See Also:
    
    requireRelsOnLinks(java.lang.String...)
  - allowUrlProtocols
```
public HtmlPolicyBuilder allowUrlProtocols(String... protocols)
```
    Adds to the set of protocols that are allowed in URL attributes. For each URL attribute that is allowed, we further constrain it by only allowing the value through if it specifies no protocol, or if it specifies one in the allowedProtocols white-list. This is done regardless of whether any protocols have been allowed, so allowing the attribute "href" globally with the identity policy but not white-listing any protocols, effectively disallows the "href" attribute globally.
    Do not allow any *script such as javascript protocols if you might use this policy with untrusted code.
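    The protocol check described above can be sketched as a stand-alone function. This is a simplified model written for illustration, not the library's implementation: `ProtocolFilterSketch` and `filter` are hypothetical names, and details such as protocol-relative URLs (`//host/path`) are deliberately ignored.
```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class ProtocolFilterSketch {
  /** Returns url if it names no protocol or an allowed one, otherwise null. */
  static String filter(String url, Set<String> allowedProtocols) {
    for (int i = 0; i < url.length(); i++) {
      char c = url.charAt(i);
      if (c == '/' || c == '?' || c == '#') {
        // A path, query, or fragment begins before any ':',
        // so the URL specifies no protocol.
        return url;
      }
      if (c == ':') {
        String protocol = url.substring(0, i).toLowerCase(Locale.ROOT);
        return allowedProtocols.contains(protocol) ? url : null;
      }
    }
    return url;  // No ':' at all, so no protocol.
  }

  public static void main(String[] args) {
    Set<String> standard = new HashSet<>(Arrays.asList("http", "https", "mailto"));
    System.out.println(filter("https://example.com/", standard));  // prints https://example.com/
    System.out.println(filter("JavaScript:alert(1)", standard));   // prints null
    System.out.println(filter("/images/a.png", standard));         // prints /images/a.png
  }
}
```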
  - disallowUrlProtocols
```
public HtmlPolicyBuilder disallowUrlProtocols(String... protocols)
```
    Reverses a decision made by allowUrlProtocols(java.lang.String...).
  - allowStandardUrlProtocols
```
public HtmlPolicyBuilder allowStandardUrlProtocols()
```
    A canned URL protocol policy that allows http, https, and mailto.
  - allowStyling
```
public HtmlPolicyBuilder allowStyling()
```
    Convert style="<CSS>" to sanitized CSS which allows color, font-size, type-face, and other styling using the default schema; but which does not allow content to escape its clipping context.
  - allowStyling
```
public HtmlPolicyBuilder allowStyling(CssSchema whitelist)
```
    Convert style="<CSS>" to sanitized CSS which allows color, font-size, type-face, and other styling using the given schema.
  - allowUrlsInStyles
```
public HtmlPolicyBuilder allowUrlsInStyles(AttributePolicy newStyleUrlPolicy)
```
    Allow URLs in CSS styles. For example, <span style="background-image: url(http://example.com/image.png)">.
    URLs in CSS are typically loaded without user-interaction, the way links are, so a greater degree of scrutiny is warranted.
    
    Parameters:
    
    newStyleUrlPolicy - receives URLs from the CSS that pass the allowed protocol policies, and may return null to veto the URL or the URL to use. URLs will be reported as content in <img src=...>.
  - withPreprocessor
```
public HtmlPolicyBuilder withPreprocessor(HtmlStreamEventProcessor pp)
```
    Inserts a pre-processor into the pipeline between the lexer and the policy. Pre-processors receive HTML events before the policy, so the policy will be applied to anything they add. Pre-processors are not in the TCB since they cannot bypass the policy.
  - withPostprocessor
```
public HtmlPolicyBuilder withPostprocessor(HtmlStreamEventProcessor pp)
```
    Inserts a post-processor into the pipeline between the policy and the output sink. Post-processors can insert events into the stream that are not vetted by the policy, so they are in the TCB.
    Prefer a pre-processor to a post-processor where possible. If you are thinking of doing search/replace on a sanitized string, use a pre- or post-processor instead.
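    Why post-processors are in the TCB while pre-processors are not can be seen in a toy model of the pipeline. This is a sketch in plain Java that stands in for the real event stream with a list of tag names; `PipelineSketch`, `injecting`, and `run` are hypothetical names and do not correspond to library classes.
```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.UnaryOperator;

final class PipelineSketch {
  /** Hypothetical processor that appends one extra tag to the stream. */
  static UnaryOperator<List<String>> injecting(String tag) {
    return in -> {
      List<String> out = new ArrayList<>(in);
      out.add(tag);
      return out;
    };
  }

  /** lexer output -> pre-processor -> policy -> post-processor -> sink */
  static List<String> run(
      List<String> lexed, UnaryOperator<List<String>> pre,
      Set<String> allowedTags, UnaryOperator<List<String>> post) {
    List<String> vetted = new ArrayList<>();
    for (String tag : pre.apply(lexed)) {
      if (allowedTags.contains(tag)) {  // the policy vets everything before it
        vetted.add(tag);
      }
    }
    return post.apply(vetted);          // nothing vets what happens here
  }

  public static void main(String[] args) {
    List<String> lexed = Arrays.asList("p");
    Set<String> allowed = new HashSet<>(Arrays.asList("p"));
    // A tag injected by a pre-processor is still filtered by the policy.
    System.out.println(run(lexed, injecting("script"), allowed, UnaryOperator.identity()));
    // prints [p]
    // A tag injected by a post-processor reaches the sink unvetted.
    System.out.println(run(lexed, UnaryOperator.identity(), allowed, injecting("script")));
    // prints [p, script]
  }
}
```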
  - build
```
public HtmlSanitizer.Policy build(HtmlStreamEventReceiver out)
```
    Produces a policy based on the allow and disallow calls previously made.
    
    Parameters:
    
    out - receives calls to open only tags allowed by previous calls to this object. Typically a HtmlStreamRenderer.
  - build
```
public <CTX> HtmlSanitizer.Policy build(HtmlStreamEventReceiver out,
                                        @Nullable
                                        HtmlChangeListener<? super CTX> listener,
                                        @Nullable
                                        CTX context)
```
    Produces a policy based on the allow and disallow calls previously made.
    
    Parameters:
    
    out - receives calls to open only tags allowed by previous calls to this object. Typically a HtmlStreamRenderer.
    
    listener - is notified of dropped tags and attributes so that intrusion detection systems can be alerted to questionable HTML. If null then no notifications are sent.
    
    context - if (listener != null) then the context value passed with alerts. This can be used to let the listener know from which connection or request the questionable HTML was received.
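    A listener might be wired up roughly as follows. This is a sketch under assumptions: it relies on the library's `HtmlChangeListener` callbacks (`discardedTag` and `discardedAttributes`), `HtmlStreamRenderer.create(StringBuilder, Handler)`, and `Handler.DO_NOTHING`, none of which are described in this section; `AuditedSanitizer`, the audit list, and the request id are made up for illustration.
```java
import java.util.ArrayList;
import java.util.List;
import org.owasp.html.Handler;
import org.owasp.html.HtmlChangeListener;
import org.owasp.html.HtmlPolicyBuilder;
import org.owasp.html.HtmlSanitizer;
import org.owasp.html.HtmlStreamRenderer;

final class AuditedSanitizer {
  /** Sanitizes html and records every dropped tag or attribute in audit. */
  static String sanitize(String html, String requestId, List<String> audit) {
    StringBuilder out = new StringBuilder();
    HtmlStreamRenderer renderer = HtmlStreamRenderer.create(out, Handler.DO_NOTHING);

    HtmlChangeListener<String> listener = new HtmlChangeListener<String>() {
      @Override
      public void discardedTag(String context, String elementName) {
        audit.add(context + ": dropped <" + elementName + ">");
      }

      @Override
      public void discardedAttributes(
          String context, String tagName, String... attributeNames) {
        audit.add(context + ": dropped " + String.join(",", attributeNames)
            + " on <" + tagName + ">");
      }
    };

    // The context value (here the request id) is handed back to the listener
    // with each notification.
    HtmlSanitizer.Policy policy = new HtmlPolicyBuilder()
        .allowElements("p", "b")
        .build(renderer, listener, requestId);
    HtmlSanitizer.sanitize(html, policy);
    return out.toString();
  }

  public static void main(String[] args) {
    List<String> audit = new ArrayList<>();
    String clean = sanitize(
        "<p onclick=\"x()\">hi<script>bad()</script></p>", "req-1", audit);
    System.out.println(clean);
    audit.forEach(System.out::println);
  }
}
```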
  - toFactory
```
public PolicyFactory toFactory()
```
    Like build(org.owasp.html.HtmlStreamEventReceiver) but can be reused to create many different policies each backed by a different output channel.
