posted 17-Mar-2012 | 34 comments

Jul/4/2012: Updated the wrapper code to pre-compile the patterns (making them static) to improve performance by avoiding their re-compilation on each run.

Here is a good and simple anti cross-site scripting (XSS) filter written for Java web applications. What it basically does is remove all suspicious strings from request parameters before returning them to the application. It’s an improvement over my previous post on the topic.

You should configure it as the first filter in your chain (web.xml) and it’s generally a good idea to let it catch every request made to your site.

The implementation consists of two classes. The filter itself is quite simple: it wraps the HTTP request object in a specialized HttpServletRequestWrapper that performs the filtering.

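import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;

public class XSSFilter implements Filter {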

    @Override
    public void init(FilterConfig filterConfig) throws ServletException {
    }

    @Override
    public void destroy() {
    }

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
        throws IOException, ServletException {
        chain.doFilter(new XSSRequestWrapper((HttpServletRequest) request), response);
    }

}

The wrapper overrides the getParameterValues(), getParameter() and getHeader() methods to execute the filtering before returning the desired field to the caller. The actual XSS checking and stripping is performed in the private stripXSS() method.

import java.util.regex.Pattern;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletRequestWrapper;

public class XSSRequestWrapper extends HttpServletRequestWrapper {

    private static final Pattern[] patterns = new Pattern[]{
        // Script fragments
        Pattern.compile("<script>(.*?)</script>", Pattern.CASE_INSENSITIVE),
        // src='...'
        Pattern.compile("src[\r\n]*=[\r\n]*\\\'(.*?)\\\'", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL),
        Pattern.compile("src[\r\n]*=[\r\n]*\\\"(.*?)\\\"", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL),
        // lonely script tags
        Pattern.compile("</script>", Pattern.CASE_INSENSITIVE),
        Pattern.compile("<script(.*?)>", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL),
        // eval(...)
        Pattern.compile("eval\\((.*?)\\)", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL),
        // expression(...)
        Pattern.compile("expression\\((.*?)\\)", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL),
        // javascript:...
        Pattern.compile("javascript:", Pattern.CASE_INSENSITIVE),
        // vbscript:...
        Pattern.compile("vbscript:", Pattern.CASE_INSENSITIVE),
        // onload(...)=...
        Pattern.compile("onload(.*?)=", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL)
    };

    public XSSRequestWrapper(HttpServletRequest servletRequest) {
        super(servletRequest);
    }

    @Override
    public String[] getParameterValues(String parameter) {
        String[] values = super.getParameterValues(parameter);

        if (values == null) {
            return null;
        }

        int count = values.length;
        String[] encodedValues = new String[count];
        for (int i = 0; i < count; i++) {
            encodedValues[i] = stripXSS(values[i]);
        }

        return encodedValues;
    }

    @Override
    public String getParameter(String parameter) {
        String value = super.getParameter(parameter);

        return stripXSS(value);
    }

    @Override
    public String getHeader(String name) {
        String value = super.getHeader(name);
        return stripXSS(value);
    }

    private String stripXSS(String value) {
        if (value != null) {
            // NOTE: It's highly recommended to use the ESAPI library and uncomment the following line to
            // avoid encoded attacks.
            // value = ESAPI.encoder().canonicalize(value);

            // Avoid null characters
            value = value.replaceAll("\0", "");

            // Remove all sections that match a pattern
            for (Pattern scriptPattern : patterns){
                value = scriptPattern.matcher(value).replaceAll("");
            }
        }
        return value;
    }
}
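
To see what the stripping does in isolation, here is a minimal sketch that applies a subset of the patterns above to plain strings, outside of any servlet container. The class name XssStripDemo and its strip() method are made up for this illustration and are not part of the filter.

```java
import java.util.regex.Pattern;

final class XssStripDemo {

    // A subset of the wrapper's patterns, copied from the listing above.
    private static final Pattern[] PATTERNS = {
        Pattern.compile("<script>(.*?)</script>", Pattern.CASE_INSENSITIVE),
        Pattern.compile("</script>", Pattern.CASE_INSENSITIVE),
        Pattern.compile("<script(.*?)>",
            Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL),
        Pattern.compile("javascript:", Pattern.CASE_INSENSITIVE)
    };

    // Same loop as stripXSS(): every match of every pattern is replaced by "".
    static String strip(String value) {
        if (value == null) {
            return null;
        }
        for (Pattern pattern : PATTERNS) {
            value = pattern.matcher(value).replaceAll("");
        }
        return value;
    }
}
```

Calling XssStripDemo.strip() on a value that contains a complete script element removes the element and keeps the surrounding text; values that match none of the patterns come back unchanged.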

Notice the comment about the ESAPI library. I strongly recommend you check it out and try to include it in your projects.

If you want to dig deeper on the topic I suggest you check out the OWASP page about XSS and RSnake’s XSS (Cross Site Scripting) Cheat Sheet.

  • John

    Glorious!  Thank you Ricardo!

  • Vincent

    Actually, you’ve made a great filter that filters almost all XSS attacks. However, you miss some patterns. You’ll find these other patterns at http://ha.ckers.org/xss.html. This will for sure improve your XSS filter !

  • http://ricardozuasti.com/ Ricardo Zuasti

    Vincent, hi. Actually most of the patterns are based on RSnake’s list (I even included a link in the article itself :)). I’ll review it again though to see if I left something out.

  • http://profiles.google.com/sebbalex Alessandro Sebastiani

    Great article!
    By the way, something similar for SQL injection would be very useful for those who don’t use Hibernate!
    Thanks!

  • inProblem

    ";alert('XSS');// this pattern is not included :(

  • chanakya

    Thanks man… it worked like a charm…

  • kumar

    Hi, I am trying to implement the XSS filter. When will getParameterValues be executed?
    How can I validate every form field against cross-site scripting?

  • http://ricardozuasti.com/ Ricardo Zuasti

    Kumar, hi. Once the filter is executed all the server side components that access the request will do so through the wrapper class, therefore using the controlled getParameter and getParameterValues functions.

    This implies that any servlet, JSP page, or other server side component that accesses a POST or GET parameter submitted to your server will be protected against the covered XSS attacks. The only components that may be unprotected are other filters that are configured to be executed before the XSS one in the app chain.

    rgds,
    r.

  • kumar

    Hi Ricardo,
    Thanks for the quick reply.

    We are using Spring Web Flow. I configured the two files (filter and request wrapper) as you suggested. I have a form with 5 fields and I entered hello in one of them. I can see the logs from getParameter and getHeader of the request wrapper, but I am unable to see the logs from getParameterValues, where I think the elimination of patterns from the form fields takes place.
    Can you please help me with this issue?
    Regards
    Surendra Batchu

  • http://ricardozuasti.com/ Ricardo Zuasti

    Kumar, which method of the wrapper gets executed depends on the calling party. Usually a component prefers one way to access parameters and therefore doesn’t use the other(s). If your problematic input (hi) makes it through to the Spring component, maybe it’s because the framework (Spring) is using a method other than getParameter or getParameterValues to access the request data.

    The most likely option that comes to mind is that spring is using the getParameterMap method, if so, the fix is easy, just override that function in the request wrapper class and execute the stripXSS method on each map value before returning the map.

    cheers
    r.
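
A minimal sketch of the getParameterMap idea described above, written as a standalone helper so it can be run without a servlet container. The class name ParameterMapSanitizer and the sanitize() signature are assumptions made for this example, and it relies on Java 8 features (UnaryOperator, lambdas) that were not available when the article was written.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

final class ParameterMapSanitizer {

    // Returns a read-only copy of params in which every non-null value
    // has been passed through the given sanitizer. The input map and its
    // arrays are left untouched.
    static Map<String, String[]> sanitize(Map<String, String[]> params,
                                          UnaryOperator<String> sanitizer) {
        Map<String, String[]> clean = new LinkedHashMap<>();
        for (Map.Entry<String, String[]> entry : params.entrySet()) {
            String[] values = entry.getValue();
            String[] cleaned = null;
            if (values != null) {
                cleaned = new String[values.length];
                for (int i = 0; i < values.length; i++) {
                    cleaned[i] = values[i] == null ? null : sanitizer.apply(values[i]);
                }
            }
            clean.put(entry.getKey(), cleaned);
        }
        return Collections.unmodifiableMap(clean);
    }
}
```

Inside the wrapper, an overridden getParameterMap() could then return ParameterMapSanitizer.sanitize(super.getParameterMap(), this::stripXSS), assuming a Servlet 3.0 container where getParameterMap() is declared as Map<String, String[]>.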

  • Gopi

    It won’t work for event handlers… XSS
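
The article's list only covers onload. As a rough sketch of how other event-handler attributes could be detected, the pattern below looks for any word that starts with "on" and is followed by "=". The class and method names are hypothetical, and the pattern is deliberately crude: it also flags harmless input such as online=true, so it is an illustration of the gap rather than a recommended rule.

```java
import java.util.regex.Pattern;

final class EventHandlerDetector {

    // "on" at a word boundary, at least one more word character,
    // optional whitespace, then "=". Matches onerror=, onmouseover =, ...
    private static final Pattern EVENT_HANDLER =
        Pattern.compile("\\bon\\w+\\s*=", Pattern.CASE_INSENSITIVE);

    static boolean containsEventHandler(String value) {
        return value != null && EVENT_HANDLER.matcher(value).find();
    }
}
```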

  • martin

    Why do you create a new XSSRequestWrapper on every doFilter call? Wouldn’t it improve performance to create the class in the init method? Or am I missing something here?

  • http://ricardozuasti.com/ Ricardo Zuasti

    Martin, each request wrapper instance “holds” a single HTTP request representing a user’s request received by the web server. It’s therefore important that each instance is isolated from the others to keep your app safe (from both a security and a functional consistency perspective).

    cheers,
    r.

  • Tester

    <img src = "/nothing.js" alt=""/> (whitespace other than \r or \n) can bypass your regex pattern
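
The bypass can be reproduced with a few lines. The sketch below compares the article's double-quote src pattern with a variant that uses \s so that spaces and tabs around the equals sign are accepted as well. The class name and the TOLERANT pattern are assumptions for this example, and the variant still does not cover unquoted attribute values.

```java
import java.util.regex.Pattern;

final class SrcAttributePatterns {

    // The article's pattern: only \r and \n are allowed around '='.
    static final Pattern ORIGINAL = Pattern.compile(
        "src[\r\n]*=[\r\n]*\\\"(.*?)\\\"",
        Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL);

    // Variant: \s also accepts spaces and tabs; the back-reference \1
    // requires the closing quote to be the same as the opening one.
    static final Pattern TOLERANT = Pattern.compile(
        "src\\s*=\\s*(['\"])(.*?)\\1",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    static String strip(Pattern pattern, String value) {
        return pattern.matcher(value).replaceAll("");
    }
}
```

With the input from the comment above, ORIGINAL leaves the string unchanged while TOLERANT removes the whole src attribute.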

  • mj kim

    How about the scrscriptipt attack?
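
The nested-tag trick works against any filter that removes a match once and moves on: deleting the inner tag lets the outer fragments join into a new tag. The sketch below shows the effect with a single simplified pattern and one possible mitigation, repeating the removal until the value stops changing. The class, the pattern and the method names are assumptions for this example, not code from the article.

```java
import java.util.regex.Pattern;

final class RepeatedStripDemo {

    // Simplified pattern: any opening or closing script tag.
    private static final Pattern SCRIPT_TAG =
        Pattern.compile("</?script.*?>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // One pass, as in a plain replaceAll-based filter.
    static String stripOnce(String value) {
        return SCRIPT_TAG.matcher(value).replaceAll("");
    }

    // Repeats the pass until nothing changes. Every pass that changes the
    // value also shortens it, so the loop always terminates.
    static String stripUntilStable(String value) {
        String previous;
        do {
            previous = value;
            value = stripOnce(value);
        } while (!value.equals(previous));
        return value;
    }
}
```

For the input <scr<script>ipt>alert(1)</scr</script>ipt>, a single pass leaves a freshly assembled script element behind, while the repeated version reduces it to alert(1).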

  • Eric

    Hi Ricardo, is this code available for all to use freely?

  • http://ricardozuasti.com/ Ricardo Zuasti

    Yes of course… no restrictions whatsoever

  • regex

    Hi Ricardo, anything for the meta refresh attacks?

    Could you please help me build a regex to filter meta http-equiv="refresh"? Thanks in advance.

  • http://ricardozuasti.com/ Ricardo Zuasti

    Hi, it should be fairly easy to build a regex for that. You have many options depending on how aggressive you want to be with the matching. I recommend you use a site/tool to build/check your regex (like http://rubular.com/).

    cheers,
    r.
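
One possible regex for the meta refresh case, as a sketch only. It removes any meta tag whose http-equiv attribute is refresh, with or without quotes. The class name and the exact pattern are assumptions for this example, and like the other patterns it can be bypassed by markup it does not anticipate.

```java
import java.util.regex.Pattern;

final class MetaRefreshStripper {

    // <meta ... http-equiv = "refresh" ...>, with optional quotes around
    // the value and any other attributes before or after it.
    private static final Pattern META_REFRESH = Pattern.compile(
        "<meta\\b[^>]*\\bhttp-equiv\\s*=\\s*['\"]?\\s*refresh\\b[^>]*>",
        Pattern.CASE_INSENSITIVE);

    static String strip(String value) {
        return META_REFRESH.matcher(value).replaceAll("");
    }
}
```

Other meta tags, such as a charset or content-type declaration, are left alone.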

  • Mansingh Shitole

    Hi Ricardo,

    Could you briefly explain the flow and how this filter works? You are not calling getParameterValues() anywhere.

    Only I see the constructor call of XSSRequestWrapper from Filter class. Please explain the entire flow. Thanks in advance.

  • http://ricardozuasti.com/ Ricardo Zuasti

    Mansingh, by wrapping the request, the wrapper gets called every time you access the request. That way the processing code gets executed upon every access to the request parameters.

    Check out the Java Servlet API documentation for an in depth explanation.

    cheers!
    r.

  • Mansingh Shitole

    Ricardo, thanks a ton. Yes got an idea… :)

  • Mansingh Shitole

    Hi Ricardo,

    Implemented the filter as above. Facing an issue while updating form fields in XSSRequestWrapper. Only getParameter and getHeader are called, so I have overridden getParameterMap, but this method is not called either.

    Then I tried to find the request fields in the XSSRequestWrapper constructor, but no form values were found there either.

    Other than form fields, other things are getting stripped successfully, like IP address and header values…

    We are using Liferay portal and JSF framework. Please guide us. Thanks in advance.

  • Mansingh

    Hi Ricardo,

    Added the two lines below for the filter url-pattern in web.xml:
    <dispatcher>FORWARD</dispatcher>
    <dispatcher>REQUEST</dispatcher>

    Now I am able to get the form fields in the XSSRequestWrapper constructor but I am still unable to strip them. Which method do I need to override? For all those form fields no XSSRequestWrapper method is getting called: not getParameter, getParameterValues, or getParameterMap.

    Using JSF framework, any other way to strip those form fields?
    Thanks in advance.

    Regards
    Mansingh

  • http://ricardozuasti.com/ Ricardo Zuasti

    Mansingh, maybe your framework is using attributes instead of parameters? Check if overriding getAttribute and the related methods solves it for you.

    You can check out all the overridable methods in :

    http://docs.oracle.com/javaee/6/api/javax/servlet/http/HttpServletRequestWrapper.html

    cheers,
    r.

  • Raju Bandaru

    Hi Ricardo,

    Implemented the filter as above and I am facing an issue while updating form fields in XSSRequestWrapper. getParameterValues is called in a basic Spring MVC app, but in the actual application (Fatwire) neither getParameterValues nor getParameterMap() is called. I have overridden both methods and called getParameterMap() from the constructor.

    If I call it from the constructor, the validated form values are not set in the map. Kindly suggest why the default getParameterMap() is not invoked, and why the values are not set even from the constructor.

    // Constructor
    public XSSRequestWrapper(HttpServletRequest request) {
        super(request);
        getParameterMap();
    }

    @Override
    public Map<String, String[]> getParameterMap() {
        Map<String, String[]> parameterMap = super.getParameterMap();
        Iterator<String> parameterIterator = parameterMap.keySet().iterator();
        Map<String, String[]> newMap = new LinkedHashMap<String, String[]>();
        while (parameterIterator.hasNext()) {
            String key = parameterIterator.next().toString();
            String[] values = parameterMap.get(key);
            if (values == null) {
                return null;
            }
            String[] newValues = new String[values.length];
            for (int i = 0; i < values.length; i++) {
                newValues[i] = stripXSS(values[i]);
            }
            newMap.put(key, newValues);
        }
        return newMap;
    }

  • Mansingh Shitole

    Hi Ricardo,

    Thanks a ton. We have successfully fixed the XSS issue and the credit goes to you.

    Regards,

  • Ben

    This filter is trivial to bypass, with input such as

    <script>alert(1)</script>

    My advice is to drop malicious content rather than try to strip it, and also do output encoding (which can also be done with a filter I believe)

    Note: I am a Pen-tester, not a developer (my day to day job, is bug-hunting and breaking into web applications, not fixing them up)
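
Output encoding, as suggested above, means escaping data at the moment it is written into a page rather than trying to clean it on the way in. A minimal hand-rolled sketch follows; the class and method names are made up for this example, and it is only suitable for HTML element content and quoted attribute values, not for JavaScript or URL contexts. A maintained encoder, such as the one in the ESAPI library mentioned in the article, is the safer choice in a real application.

```java
final class HtmlEscaper {

    // Replaces the five characters that are significant in HTML text and
    // quoted attribute values with their entity equivalents.
    static String escapeHtml(String input) {
        if (input == null) {
            return "";
        }
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#x27;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

With this approach the input is stored and passed around unchanged, and a value such as a script element is rendered as visible text instead of being executed.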

  • Ben

    So in summary; if anyone thinks they fixed their XSS issue by implementing this, they probably didn’t fix it properly.

  • Ben

    There are many, many options for attackers to bypass this filter.
    Here is another simple example of an XSS that this code doesn’t work for:

    So, anyway, what you should be doing in web-apps is to drop invalid input, and also output-encode everything (just to be sure).

    If a field should only be text or numeric, drop requests that contain something else, and give an error to the user (which does not contain any of the original input).

    Also, don’t rely on filtering in JavaScript on the client-side – I’ve seen many companies doing that, and that is trivial to bypass as well. (Filtering client-side is to prevent ordinary users putting in crap, and does nothing to defend the application from malicious attackers).
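
The "drop invalid input" advice can be sketched as allow-list validation: each field declares what it accepts, and anything else is rejected with an error that does not echo the input back. The class name, the two example rules and their length limits are assumptions chosen for illustration.

```java
import java.util.regex.Pattern;

final class FieldValidator {

    // Example rules: up to 10 digits, or up to 100 letters, digits,
    // spaces and a few punctuation characters.
    private static final Pattern DIGITS = Pattern.compile("\\d{1,10}");
    private static final Pattern PLAIN_TEXT =
        Pattern.compile("[\\p{L}\\p{N} .,'-]{1,100}");

    static boolean isNumeric(String value) {
        return value != null && DIGITS.matcher(value).matches();
    }

    static boolean isPlainText(String value) {
        return value != null && PLAIN_TEXT.matcher(value).matches();
    }

    // Rejects the request instead of repairing the value. The message names
    // the field only, so no attacker-controlled text ends up in the error.
    static void requireNumeric(String fieldName, String value) {
        if (!isNumeric(value)) {
            throw new IllegalArgumentException("Invalid value for field: " + fieldName);
        }
    }
}
```

Because matches() must cover the whole value, a single unexpected character is enough to reject it, which is the opposite of the strip-and-continue approach of the filter.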

  • http://ricardozuasti.com/ Ricardo Zuasti

    Ben, hi. Thanks for all the feedback.

    Some thoughts about it:

    - The patterns indeed don’t cover all possible attacks; that would be virtually impossible. You could however improve them regularly to include new patterns as they become evident.

    - I agree dropping malicious input may be a better idea than filtering it in most cases, but the implementation would be almost the same. You would just stop the chain and raise an error in the filter instead of moving forward. The pros and cons of the code above don’t change imho.

    - I agree this doesn’t completely prevent XSS on a given app, it’s just one more measure, as with all things security related.

    later!
    r.
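
As a sketch of the "stop the chain" variant described above, the same kind of patterns can be used to detect a match instead of replacing it. The class name and the shortened pattern list are assumptions for this example.

```java
import java.util.regex.Pattern;

final class SuspiciousInputDetector {

    // Shortened list for the example; the wrapper's full array could be reused.
    private static final Pattern[] PATTERNS = {
        Pattern.compile("<script(.*?)>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL),
        Pattern.compile("</script>", Pattern.CASE_INSENSITIVE),
        Pattern.compile("javascript:", Pattern.CASE_INSENSITIVE),
        Pattern.compile("vbscript:", Pattern.CASE_INSENSITIVE)
    };

    // True if any pattern occurs anywhere in the value.
    static boolean isSuspicious(String value) {
        if (value == null) {
            return false;
        }
        for (Pattern pattern : PATTERNS) {
            if (pattern.matcher(value).find()) {
                return true;
            }
        }
        return false;
    }
}
```

In doFilter(), a filter could run this check over the request parameters and, on a hit, call sendError(HttpServletResponse.SC_BAD_REQUEST) on the response and return without calling chain.doFilter().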

  • Ben

    Bear in mind that if it doesn’t completely prevent XSS it will be very quick for an attacker to find an input string to exploit the vulnerability (and the filtering would have no impact from the attackers’ or victims’ perspective).

    There are a variety of automated vulnerability scanners out there (I mostly use the Burpsuite commercial version, but some free others also) and using these with a variety of test strings it is very quick (5 mins or so per function) to find ways to exploit any present XSS, bypassing any ineffective filtering if present. Also, I very much doubt that (once deployed) devs will parse their logs and try to “reactively improve their rules” (as you hint at).

    Indeed I would be surprised if you can build an effective defence by filtering in this way – without extensive and very strict rules on what is allowable input (and if you apply that kind of filtering application-wide, that might break some important valid input areas).

    I agree that stopping the chain is the best course of action, with an exception that logs but does not continue – also some of my customers additionally apply mod_security rule-sets (though those have bypasses as well, even in the latest iterations).

    Anyway, I appreciate you are trying to help people, so I don’t want to be too critical. It’s just that judging from the comments below, people seem to feel that this is the solution to their XSS concerns. If this was used on an application I was testing, I would just “smile a wry smile”, create a PoC to pwn the app, and write up the vulnerabilities in the report (I see so much XSS it gets pretty boring TBH).

    Anyway, if anyone is interested, an excellent resource to understand these attacks is the Web Application Hackers Handbook, which I can highly recommend. Also, if you are not familiar with the potential for exploiting XSS and what is at stake, take a look at the BeEF XSS framework.