VladimirKochetkov April 29, 2013 at 15:50

ASP.NET Query Validation: From

“The ability to validate queries in ASP.NET is designed to perform basic input control. It is not intended to make decisions regarding the security of web applications being developed. Only developers themselves can determine what content their code should process. Microsoft recommends that you verify all input received from any source. We strive to facilitate the development of secure applications by our customers, and the request validation functionality was designed and implemented in order to help developers in this direction. For more information about our development recommendations, read the MSDN article: msdn.microsoft.com/en-us/library/ff649487.aspx#pagguidelines0001_inputdatavalidation.”

The official status of the ASP.NET Request Validation according to the Microsoft Security Response Center

Despite this abrupt response from MSRC to a recent report by the Quotium Research Center about yet another way to bypass request validation in ASP.NET, it is worth noting that the feature is nevertheless intended precisely for making decisions about web application security. This is supported by the name of the class that implements the main set of checks (System.Web.CrossSiteScriptingValidation), by its very essence, which is to prevent a certain subset of XSS Type-1 (reflected cross-site scripting) attacks, and by the original article from the developers of the web stack. A separate question is how effectively this functionality has been implemented, and how to turn the existing primitive regular filter into a full-fledged web application firewall that protects against any XSS Type-1 vector.

To answer this question, it is necessary to understand the details of the implementation of query validation in various versions of the .NET Framework, its limitations, well-known workarounds and the possibilities of expanding its functionality.

1. The evolution of ASP.NET request validation

In all versions of ASP.NET (which coincide with versions of the .NET Framework), from v1.1 through v4.5, request validation boils down to searching various elements of the HTTP request for occurrences of strings from a regular set that describes a blacklist of dangerous values. In code, it is implemented as a recognizing automaton, written by hand for performance reasons rather than with standard regular expressions. The set of dangerous values consists of HTML language elements that can violate the integrity of the output document if they are used in it without sufficient preliminary processing.

Request validation first appeared in ASP.NET v1.1 and used a fairly broad blacklist. Request processing was blocked if any query string parameter or form field value matched any of the regular expressions:

(?i:script)\s?\:
(?i:on[a-z])*\s*=
(?i:ex)pression\(

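It is not surprising that developers preferred to disable this feature completely because of the large number of false positives. Therefore, in ASP.NET v2.0 the set of dangerous values was greatly reduced, and in that form it reached v4.5 unchanged (the reduced set corresponds to the checks in the IsDangerousString method shown in section 3).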

During the preparation of ASP.NET v4.0, the developers also announced that the set (?i:script)\s?\: would be returned to the list, but this did not happen in v4.0 or v4.5.

Not only the set of dangerous values changed from version to version, but also the scope of validation and the means available to developers for controlling it. In v2.0 it became possible to disable request validation for individual pages, and v4.0 introduced a new mode, so-called deferred (granular) validation, in which each parameter is checked when the web application code accesses it rather than during request preprocessing. Starting with v4.0, in addition to query string parameters and form field values, the scope of validation also includes:

the values of all elements of Request.Cookies;
the names of uploaded files from Request.Files;
the values of Request.RawUrl, Request.Path and Request.PathInfo.

2. Validation of requests in ASP.NET v4.x

In recent versions of ASP.NET, request validation also includes a number of additional checks that are performed at the earliest stages of the request life cycle. The full list is given in the table:

CHECK	SETTINGS AND VALUES
Length of Request.Path	maxUrlLength attribute in the httpRuntime section. It can be defined globally for the entire application or for individual virtual paths or pages. Blocks processing of an HTTP request whose path is longer than 260 characters (the default). This value can be increased up to the limits defined in IIS or http.sys.
Length of the query string part of Request.RawUrl	maxQueryStringLength attribute in the httpRuntime section. It can be defined globally for the entire application or for individual virtual paths or pages. Blocks processing of an HTTP request whose query string is longer than 2048 characters (the default). This value can be increased up to the limits defined in IIS or http.sys.
Scan of Request.Path for characters that ASP.NET considers potentially dangerous	requestPathInvalidCharacters attribute in the httpRuntime section. It can be defined globally for the entire application or for individual virtual paths or pages. Blocks processing of an HTTP request if its path contains any of the characters: < (XSS attacks), > (XSS attacks), * (file name canonicalization attacks), % (URL decoder attacks), : (NTFS alternate data stream attacks), & (query string parser attacks), \ (file path canonicalization attacks), ? (query string parser attacks). In the attribute value, the invalid characters are listed inside double quotes and separated by commas. The path sequence "\.." is not on this list because IIS v6+ canonicalizes URIs automatically and handles such sequences correctly. In practice, errors caused by a forward slash in the path do not occur either, because during canonicalization forward slashes are replaced with backslashes.
Lookup of the managed configuration for each Request.Path value	relaxedUrlToFileSystemMapping attribute in the httpRuntime section. It can be defined only globally for the entire application. By default this attribute is false, which requires ASP.NET to treat the path component of the URL as a valid file path that complies with NTFS rules. The restriction can be lifted by setting the attribute to true.
Check of Request.QueryString, Request.Form, Request.Files, Request.Cookies, Request.Path, Request.PathInfo and Request.RawUrl for potentially dangerous values	requestValidationMode attribute in the httpRuntime section. It can be defined globally for the entire application or for individual virtual paths or pages. Sets the mode in which requests to the web application are validated. A value of 4.0 (the default) enables deferred granular validation, which takes place when the web application code directly accesses elements within the scope of validation. Setting the attribute to 2.0 restores the validation mode used in previous versions of ASP.NET. requestValidationType attribute in the httpRuntime section. It can be defined only globally for the entire application. Sets the type derived from the RequestValidator class that implements request validation. By default, the System.Web.Util.RequestValidator class is used.

The last check is only the visible tip of the iceberg known as request validation, and it is the part that web application developers can extend.

3. Internal structure of request validation

The source code for the IsValidRequestString method of the System.Web.Util.RequestValidator class, used by default to validate requests in ASP.NET v2.0 +, looks something like this:

protected internal virtual bool IsValidRequestString(
    HttpContext context,
    string value,
    RequestValidationSource requestValidationSource,
    string collectionKey,
    out int validationFailureIndex)
{
    if (requestValidationSource == RequestValidationSource.Headers)
    {
        validationFailureIndex = 0;
        return true;
    }
    return !CrossSiteScriptingValidation.IsDangerousString(value, out validationFailureIndex);
}

It should be noted that even before calling the IsValidRequestString method, all occurrences of the null byte are cut from the string passed in the value parameter. This behavior is implemented in the ValidateString method of the HttpRequest class and cannot be overridden by the developer.

As you can see from the source code, the main functionality for query validation is implemented in the IsDangerousString method of the CrossSiteScriptingValidation class:

internal static bool IsDangerousString(string s, out int matchIndex)
{
    matchIndex = 0;
    int startIndex = 0;
    while (true)
    {
        int num2 = s.IndexOfAny(startingChars, startIndex);
        if (num2 < 0)
        {
            return false;
        }
        if (num2 == (s.Length - 1))
        {
            return false;
        }
        matchIndex = num2;
        char ch = s[num2];
        if (ch != '&')
        {
            if ((ch == '<') && ((IsAtoZ(s[num2 + 1]) || (s[num2 + 1] == '!')) || ((s[num2 + 1] == '/') || (s[num2 + 1] == '?'))))
            {
                return true;
            }
        }
        else if (s[num2 + 1] == '#')
        {
            return true;
        }
        startIndex = num2 + 1;
    }
}
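
Read as a pattern, the conditions in this method appear to correspond to the regular expression <[A-Za-z!/?]|&#. The following self-contained sketch compares that reading with a simplified port of the method. The class and method names are hypothetical, and the contents of startingChars and IsAtoZ, which the listing does not show, are assumed to be the characters '<' and '&' and an ASCII letter test.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DangerousStringSketch
{
    // Assumption: the framework scans only for these two starting characters.
    private static readonly char[] StartingChars = { '<', '&' };

    // Pattern form of the same checks, as read from the listing above.
    private static readonly Regex Pattern = new Regex(@"<[A-Za-z!/?]|&#");

    // Assumption: IsAtoZ is an ASCII letter test.
    private static bool IsAtoZ(char c)
    {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    // Simplified port: true when '<' is followed by a letter, '!', '/' or '?',
    // or when '&' is followed by '#'.
    public static bool IsDangerous(string s)
    {
        int start = 0;
        while (true)
        {
            int i = s.IndexOfAny(StartingChars, start);
            if (i < 0 || i == s.Length - 1)
            {
                return false; // nothing found, or nothing follows the character
            }
            char next = s[i + 1];
            if (s[i] == '<')
            {
                if (IsAtoZ(next) || next == '!' || next == '/' || next == '?')
                {
                    return true;
                }
            }
            else if (next == '#')
            {
                return true;
            }
            start = i + 1;
        }
    }

    // The same decision expressed with the regular expression.
    public static bool MatchesPattern(string s)
    {
        return Pattern.IsMatch(s);
    }
}
```

Under these assumptions both methods should agree on any input, which makes the pattern form convenient for reasoning about what the filter does and does not catch.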

Evidently, this filter is an automaton that recognizes occurrences of strings of a regular set in a given string.

The same class also contains two other methods:


internal static bool IsDangerousUrl(string s)
{
    if (string.IsNullOrEmpty(s))
    {
        return false;
    }
    s = s.Trim();
    int length = s.Length;
    if (((((length > 4) && ((s[0] == 'h') || (s[0] == 'H'))) && ((s[1] == 't') || (s[1] == 'T'))) && (((s[2] == 't') || (s[2] == 'T')) && ((s[3] == 'p') || (s[3] == 'P')))) && ((s[4] == ':') || (((length > 5) && ((s[4] == 's') || (s[4] == 'S'))) && (s[5] == ':'))))
    {
        return false;
    }
    if (s.IndexOf(':') == -1)
    {
        return false;
    }
    return true;
}
internal static bool IsValidJavascriptId(string id)
{
    if (!string.IsNullOrEmpty(id))
    {
        return CodeGenerator.IsValidLanguageIndependentIdentifier(id);
    }
    return true;
}

The first of these two methods, IsDangerousUrl, checks a URL and treats as dangerous every value that does not match the set ^(?i:https?:)|^[^:]*$. The second, IsValidJavascriptId, checks its argument against the grammar rule for language identifiers: ^(?i:[a-z_][a-z0-9_]*)$. Both methods were called from IsDangerousString as part of request validation in ASP.NET v1.1. In all later versions they are used only in some ASP.NET WebForms controls as functional verification methods and do not raise an HttpRequestValidationException.

4. Disadvantages of the standard implementation and ways to bypass it

Obviously, the standard implementation of query validation has several drawbacks that make it really unsuitable for making decisions regarding the security of a web application.

Firstly, the checks examined above can protect only against a limited subset of XSS Type-1 attacks, namely those that require opening a tag. If a reflected XSS attack is possible because a parameter value is embedded inside a tag, an attribute or client script code, standard request validation cannot prevent it.

Secondly, blacklist control, by itself, is not a sufficient measure of security. This is due to the presence of several well-known ways to bypass standard query validation:

The restriction on < followed by a letter, !, / or ? can be bypassed with the <% character combination. In this case, the HTML parser of IE v9- will consider this a valid tag definition. In some cases, if a web application implements Unicode canonicalization of request parameters, the check can also be bypassed with full-width Unicode characters (%uff1cimg%20src%3D%23%20onerror%3Dalert%281%29%2f%uff1e).
The restrictions on (?i:script)\s?\: and (?i:ex)pression\( are bypassed with whitespace characters inside the word script and between expression and the opening parenthesis (java%09script:alert(1) and expression%09(alert(1))).
The restriction on &# does not take into account the existence of named HTML entity references, which can also be used in a number of vectors (javascript%26Tab;:alert(1)). It should also be noted here that the standard ASP.NET HTML decoder (HttpUtility.HtmlDecode) "knows" about only 253 named HTML entity references, while the HTML standard defines considerably more. This makes it possible to pass many HTML entities through to the output document even if the web application HTML-decodes parameter values during input preprocessing.
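
The effect of these vectors on the v2.0+ filter can be illustrated with a short sketch. It assumes that the blacklist is equivalent to the pattern <[A-Za-z!/?]|&# (as read from the IsDangerousString listing in section 3) and works on URL-decoded values. The class name is hypothetical, and the body of the <% payload is made up for illustration.

```csharp
using System.Text.RegularExpressions;

public static class BlacklistBypassSketch
{
    // Assumption: pattern equivalent of the v2.0+ checks in IsDangerousString.
    private static readonly Regex Blacklist = new Regex(@"<[A-Za-z!/?]|&#");

    // URL-decoded forms of vectors of the kinds discussed above.
    public static readonly string[] Vectors =
    {
        "<%img src=# onerror=alert(1)/>",           // "<%" combination (illustrative payload)
        "\uFF1Cimg src=# onerror=alert(1)/\uFF1E",  // full-width angle brackets
        "javascript&Tab;:alert(1)"                  // named entity reference instead of "&#"
    };

    // True when the (already URL-decoded) value would be rejected by the pattern.
    public static bool IsBlocked(string decodedValue)
    {
        return Blacklist.IsMatch(decodedValue);
    }
}
```

Under this assumption none of the three values is rejected, while an ordinary <img ...> tag is.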

But the main drawback of the standard implementation is the stage of request processing at which validation is performed. Even with deferred mode turned on, without information about the contents of the already generated response it is impossible to make correct assumptions about how dangerous a particular parameter is for a particular class of attacks. For example, if a parameter containing HTML markup does not end up in the server's response, it is rather strange to assert that it is potentially dangerous from the XSS Type-1 point of view. This is about the same as asserting that values containing SELECT are dangerous without knowing whether they end up in an SQL query. Following similar logic, the ASP.NET developers should also have included in request validation a search for SQL syntax elements, path traversal sequences, XPath expressions and other character sequences specific to injection into any language, rather than restricting themselves to a small subset of one specific type of XSS attack. Of course, such an approach generates many false positives, which leads to validation being disabled for the entire application and to the appearance of tools that make this effortless (for example, nuget.org/packages/DisableRequestValidation).

Nevertheless, all these shortcomings can be eliminated by taking advantage of the opportunity discussed in the next section.

5. Extension of query validation functionality

Starting with ASP.NET v4.0, developers can extend request validation, up to completely redefining the standard implementation. To do this, it is enough to create a class derived from System.Web.Util.RequestValidator and override its IsValidRequestString method. This method is called whenever the next request parameter has to be checked and takes the following arguments:

HttpContext context - the context of the HTTP request within which the check is performed;
string value - the value to be checked;
RequestValidationSource requestValidationSource - the source to which the checked value belongs;
string collectionKey - name of the checked value in the source;
out int validationFailureIndex - an output parameter containing the offset within value at which a dangerous character was detected, or -1 otherwise.

For example, to eliminate the ability to bypass validation using the combination of characters <%, you can implement the following extension:


using System;
using System.Web;
using System.Web.Util;
namespace RequestValidationExtension
{
    public class BypassRequestValidator : RequestValidator
    {
        public BypassRequestValidator() { }
        protected override bool IsValidRequestString(
            HttpContext context, 
            string value, 
            RequestValidationSource requestValidationSource, 
            string collectionKey, 
            out int validationFailureIndex)
        {
            validationFailureIndex = -1;
            for (var i = 0; i < value.Length; i++)
            {
                if (value[i] == '<' && (i < value.Length - 1) && value[i + 1] == '%')
                {
                    validationFailureIndex = i;
                    return false;
                }
            }
            return base.IsValidRequestString(
                context, 
                value, 
                requestValidationSource, 
                collectionKey, 
                out validationFailureIndex);
        }
    }
}

After that, setting the attribute requestValidationType="RequestValidationExtension.BypassRequestValidator" in the httpRuntime section of the web application configuration yields a web application that is protected against this method of bypassing validation.

6. Improved query validation

By using the ability to extend request validation, it is quite possible to eliminate all the existing problems of the current implementation and obtain a full-fledged XSS Type-1 WAF. To do this, the values of the request parameters must be checked immediately before the response is sent, when it has already been generated and its contents are known. This is a necessary condition for reliably assessing the influence of request parameters on the integrity of the response. However, the ASP.NET request validation architecture does not allow such a check to be performed within the validator itself, so the whole procedure has to be split into three stages.

The first stage is implemented directly in the extension of the standard request validation. At this stage, each parameter being checked is matched against the set ^(?i:[A-Za-z0-9_]+)$ and, if the match fails, the parameter is marked for a later check. In this way, information is collected about all potentially dangerous request parameters for which a check was requested (that is, the parameters actually used by the web application while processing the request). This ensures full integration with the existing request validation architecture and also removes the need to subject known-safe parameters to additional checks.
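
A minimal sketch of the first stage might look as follows. It is not the author's implementation: the class and member names are hypothetical, and the collector is kept independent of System.Web so that it can be read on its own. In a real extension, Inspect would presumably be called from an IsValidRequestString override, with the collected list stored per request.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical first-stage collector, shown only to illustrate the idea.
public sealed class SuspiciousParameterCollector
{
    // Values made only of ASCII letters, digits and underscores are treated as
    // safe. \A and \z are used instead of ^ and $ so that a trailing newline
    // does not slip through.
    private static readonly Regex SafeValue = new Regex(@"\A[A-Za-z0-9_]+\z");

    private readonly List<KeyValuePair<string, string>> pending =
        new List<KeyValuePair<string, string>>();

    // Parameters that must be re-checked once the response is known.
    public IReadOnlyList<KeyValuePair<string, string>> Pending
    {
        get { return pending; }
    }

    // Records the parameter if it is not obviously safe. Always returns true:
    // at this stage the request is allowed to continue.
    public bool Inspect(string collectionKey, string value)
    {
        if (!string.IsNullOrEmpty(value) && !SafeValue.IsMatch(value))
        {
            pending.Add(new KeyValuePair<string, string>(collectionKey, value));
        }
        return true;
    }
}
```

Deferring the decision in this way means that a parameter is never rejected merely for looking suspicious; it is only remembered for comparison with the response.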

The second stage is an ASP.NET output filter that monitors the stream to which all response fragments, produced at various stages of the request life cycle, are written. The content of the entire response is also saved for processing at the third stage.
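
The capturing part of such a filter can be sketched as a pass-through stream. The class name is hypothetical; in ASP.NET an instance would be assigned to HttpResponse.Filter, wrapping the stream that was there before. This is an illustration under those assumptions, not the code of the Proof-of-Concept.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical pass-through filter that keeps a copy of everything written.
public sealed class CapturingFilterStream : Stream
{
    private readonly Stream inner;
    private readonly MemoryStream copy = new MemoryStream();

    public CapturingFilterStream(Stream inner)
    {
        if (inner == null) throw new ArgumentNullException("inner");
        this.inner = inner;
    }

    // Everything written so far, decoded with the given response encoding.
    public string GetCapturedText(Encoding encoding)
    {
        return encoding.GetString(copy.ToArray());
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        copy.Write(buffer, offset, count);   // keep a copy for the later check
        inner.Write(buffer, offset, count);  // forward the fragment unchanged
    }

    public override void Flush() { inner.Flush(); }

    // The filter is write-only; reading and seeking are not supported.
    public override bool CanRead { get { return false; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return true; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override int Read(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
}
```

For simplicity this sketch forwards each fragment immediately. A filter that has to be able to block the response would need to hold the output back until the check at the third stage has completed.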

The third stage, implemented as an HTTP module that handles the EndRequest event, assesses the impact of all parameters collected at the first stage on the integrity of the response captured at the second. If a violation of the integrity of the response is detected, the check is considered failed. The assessment is based on a fuzzy search of the response for so-called insertion points, that is, places where a response fragment approximately matches the value of one of the parameters. The set of insertion points forms an insertion map, which makes it possible to check parameter values for prohibited characters on much firmer grounds, and also to reveal the nature of their influence on the integrity of the response.

The last task is solved using parsers of all languages that can be found in the response (HTML, JavaScript, CSS). Comparison of parse trees obtained by parsing various fragments of the response with data from the insert map gives complete information about which of the nodes of a tree were embedded in the response with the value of one or another parameter.

Verification algorithm detailed description

If the output contains null bytes, then the check fails.
For each element of the parameter list P, a fuzzy search with a threshold of 0.75 is performed for all occurrences of its value in the response text R. The boundaries of each occurrence found define an insertion area. The set of insertion areas forms the insertion map M.
If M is empty, then the check is considered passed.
For each element of M, a check is made for its value matching the regular set
The response text R is parsed by the HTML parser into the tree R'.
If any errors occurred during parsing and their positions intersect with elements of M, then the check fails.
All subsequent steps are repeated for each node N of the tree R' that describes an HTML tag or comment.
If the starting position of N in R intersects with elements of M, then the check is considered failed.
If N describes an HTML tag, then each of its attributes whose position in R intersects with elements of M is checked according to the algorithm described below.
If N describes a <script> tag, then its innerText value (script code) is checked according to the algorithm described below.
If N describes a <style> tag, then its innerText value (style definition code) is checked according to the algorithm described below.
The check is considered passed if it was not failed in the previous steps.
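
The construction of the insertion map can be sketched as follows. The article does not say which fuzzy search algorithm is used, so this example swaps in a simple one as an assumption: a window of the same length as the parameter value slides over the response, and similarity is taken as 1 minus the Levenshtein distance divided by the value length. All names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public struct InsertionArea
{
    public int Start;
    public int Length;
    public double Similarity;
}

public static class InsertionMapSketch
{
    // Classic two-row Levenshtein edit distance.
    private static int EditDistance(string a, string b)
    {
        int[] prev = new int[b.Length + 1];
        int[] curr = new int[b.Length + 1];
        for (int j = 0; j <= b.Length; j++) prev[j] = j;
        for (int i = 1; i <= a.Length; i++)
        {
            curr[0] = i;
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                curr[j] = Math.Min(Math.Min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp;
        }
        return prev[b.Length];
    }

    // Similarity of the window starting at 'start' to the parameter value.
    private static double Similarity(string response, int start, string value)
    {
        string window = response.Substring(start, value.Length);
        return 1.0 - (double)EditDistance(window, value) / value.Length;
    }

    // Finds non-overlapping areas of the response that approximately match value.
    public static List<InsertionArea> Find(string response, string value, double threshold = 0.75)
    {
        var areas = new List<InsertionArea>();
        if (string.IsNullOrEmpty(value) || value.Length > response.Length) return areas;
        int last = response.Length - value.Length;
        int i = 0;
        while (i <= last)
        {
            double sim = Similarity(response, i, value);
            if (sim < threshold) { i++; continue; }
            // Slide forward while the next window is a strictly better match.
            while (i < last)
            {
                double next = Similarity(response, i + 1, value);
                if (next <= sim) break;
                i++;
                sim = next;
            }
            areas.Add(new InsertionArea { Start = i, Length = value.Length, Similarity = sim });
            i += value.Length; // skip past the recorded area
        }
        return areas;
    }
}
```

The quadratic cost per window is acceptable only for a sketch; a production implementation would need a proper approximate string matching algorithm and windows of varying length.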

Algorithm for checking the attributes of HTML elements (accepts the value of attribute A, the insertion map M and the response text R):

If the initial position of A in R has an intersection with the elements of M, then the check is considered failed.
If the position of the value of A in R does not intersect with the elements of M, then the check is considered passed.
If the name A is contained in the list of event handler attributes, then its value is checked according to the algorithm described below.
If the name A matches the element of the list of attributes of the reference type, then the following steps are performed:
1. If the value of A contains the substring "&#" or a named HTML entity reference, then the check is considered failed.
2. If the value of A does not contain ":", then the check is considered passed.
3. The value of A is parsed by the URI parser into an object U.
4. If errors occurred during parsing, then the check is considered failed.
5. If U does not describe an absolute URI, then the check is considered failed.
6. If U describes a URI whose scheme is on the list of dangerous schemes, then the check is considered failed.
If the name A = "style", then its value is checked according to the algorithm described below.
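
The six steps for reference-type attributes can be sketched with the standard System.Uri class. The list of dangerous schemes and the pattern used to detect named entity references are assumptions made for illustration, since the article does not enumerate them; the class name is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ReferenceAttributeCheckSketch
{
    // Assumption: an illustrative list; the article does not enumerate the schemes.
    private static readonly HashSet<string> DangerousSchemes =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "javascript", "vbscript", "data" };

    // Assumption: any "&name;" sequence is treated as a named entity reference.
    private static readonly Regex NamedEntity = new Regex(@"&[A-Za-z][A-Za-z0-9]*;");

    // Returns true when the attribute value passes the check.
    public static bool IsValueAcceptable(string value)
    {
        // Step 1: numeric or named entity references are rejected outright.
        if (value.Contains("&#") || NamedEntity.IsMatch(value)) return false;
        // Step 2: without a colon there can be no scheme.
        if (!value.Contains(":")) return true;
        // Steps 3 and 4: parse the value; a parsing error fails the check.
        Uri uri;
        if (!Uri.TryCreate(value, UriKind.RelativeOrAbsolute, out uri)) return false;
        // Step 5: a value with a colon must be an absolute URI.
        if (!uri.IsAbsoluteUri) return false;
        // Step 6: the scheme must not be on the dangerous list.
        return !DangerousSchemes.Contains(uri.Scheme);
    }
}
```

With these assumptions, page.aspx?id=1 and http://example.com/a pass, while javascript:alert(1) and javascript&Tab;:alert(1) fail.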

Algorithm for checking client script code and the values of event handler attributes (takes the value Vs containing the script code being checked and the value Vm of the element of the set M with which the intersection was detected):

If the length of the longest common substring L of Vs and Vm is less than 7, then the check is considered passed.
The value Vs is parsed by the JavaScript parser into the tree Vs'.
If errors occurred during parsing, then the check is considered failed.
If the number of tokens in Vs' is less than 5 or the number of nodes in Vs' is less than 2, then the check is considered passed.
If the value of L is entirely the value of one token Vt of the tree Vs', then the check is considered passed.
The JavaScript-decoded value of Vt is subjected to a recursive check, as if it were the text of a response fully formed from the parameter Vm.
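
Only the first step of this algorithm can be shown without a JavaScript parser. The sketch below computes the length of the longest common substring by dynamic programming and applies the threshold of 7 from the text; the class and method names are hypothetical.

```csharp
using System;

public static class ScriptCheckSketch
{
    // Length of the longest common substring, computed over common-suffix
    // lengths; two rows of the table are enough.
    public static int LongestCommonSubstringLength(string a, string b)
    {
        int best = 0;
        int[] prev = new int[b.Length + 1];
        int[] curr = new int[b.Length + 1];
        for (int i = 1; i <= a.Length; i++)
        {
            for (int j = 1; j <= b.Length; j++)
            {
                curr[j] = a[i - 1] == b[j - 1] ? prev[j - 1] + 1 : 0;
                if (curr[j] > best) best = curr[j];
            }
            int[] tmp = prev; prev = curr; curr = tmp;
        }
        return best;
    }

    // First step only: a shared fragment shorter than the threshold is treated
    // as too short to matter, so the check passes without parsing.
    public static bool PassesLengthGate(string scriptCode, string parameterValue, int threshold = 7)
    {
        return LongestCommonSubstringLength(scriptCode, parameterValue) < threshold;
    }
}
```

A value such as abc embedded in var x = 'abc'; passes this gate, whereas a longer injected fragment goes on to the parsing steps.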

The algorithm for checking style definition code is exactly the same as the previous one, except that it uses a CSS parser and different threshold values for the elements of the parse tree and for the longest common substring.

The Proof-of-Concept implementation of the described algorithm is available on GitHub. At the time of writing, there are no known ways to bypass this filter. Tests showed no tangible effect on web application performance when the request contains no dangerous values, and response generation that is 7-15% slower otherwise. Given that the Proof-of-Concept version uses third-party parsers that solve a much more general problem than the response validation algorithm requires, an optimized implementation of these components will make it possible to achieve performance sufficient for reliable use of the solution in production environments.

7. Conclusions

The implementation of request validation in current versions of ASP.NET is ineffective and does not solve the problem of protection against XSS Type-1 attacks. However, its current architecture and extension capabilities allow this problem to be solved independently, using the response validation method described in this article.

However, we should not forget that the most effective protection against such attacks is not a third-party bolt-on solution but correct processing of input and output data by the developers themselves. The use of Irv or more complex products (such as mod-security for IIS or .NetIDS) does not relieve the developer of the need to follow the basic rules of secure coding (for example, www.troyhunt.com/2011/12/free-ebook-owasp-top-10-for-net.html or wiki.mozilla.org/WebAppSec/Secure_Coding_Guidelines).
