This article provides an overview of the URL rewrite module and explains the configuration concepts used by the module.
Functionality Overview
The URL rewrite module provides a rule-based rewriting mechanism for changing request URLs before they are processed by a Web server. The module can be used to express URL rewriting logic that can use regular expressions or wildcards and can make rewriting decisions based on HTTP headers and server variables. While the primary purpose of the module is to rewrite requested URLs, it also has functionality to perform redirects, send custom responses, or abort requests based on the logic expressed in the rewrite rules.
The URL rewrite module can be used for a wide variety of tasks: from rewriting search engine friendly URLs to the internal representation, to performing redirection, to blocking access to specified content on a Web site.
Rewrite Rules Overview
The main configuration concept used in the URL rewrite module is the concept of a rewrite rule. A rewrite rule is used to express the logic of what to compare or match the requested URL with and what to do if the comparison was successful.
Conceptually, a rewrite rule consists of the following parts:
- Pattern – The rule pattern is used to specify either the regular expression or a wildcard pattern that will be used to match URL strings.
- Conditions – The optional conditions collection is used to specify additional logical operations to perform if a URL string matched the rule pattern. Within the conditions you can check for certain values of HTTP headers or server variables, or verify if the requested URL corresponds to a file or directory on a physical file system.
- Action – The action is used to specify what to do if the URL string matched the rule pattern and all the rule conditions were evaluated successfully.
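The three parts map onto a rule definition roughly as follows. This is a minimal sketch; the rule name, host name, and target page (article.aspx) are hypothetical and chosen only for illustration.

```xml
<rule name="Example rule">
  <!-- Pattern: matched against the URL path -->
  <match url="^article/([0-9]+)$" />
  <!-- Conditions: optional additional checks, evaluated only if the pattern matched -->
  <conditions>
    <add input="{HTTP_HOST}" pattern="^www\.example\.com$" />
  </conditions>
  <!-- Action: performed when the pattern and all conditions matched -->
  <action type="Rewrite" url="article.aspx?id={R:1}" />
</rule>
```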
Rewrite Rules Scope
Rewrite rules can be defined in two different collections:
- <globalRules> – Rules in this collection can be defined only on the server level. Global rules are used to define server-wide URL rewriting logic. These rules are defined within the applicationHost.config file, and they cannot be overridden or disabled on any lower configuration level. Global rules always operate on the absolute URL path (that is, the requested URI without the server name). These rules are evaluated at a very early stage of the IIS pipeline (the PreBeginRequest event).
- <rules> – Rules in this collection are called distributed rules, and they can be defined on any configuration level. Distributed rules are used to define URL rewriting logic specific to a particular configuration scope. This type of rule can be added on any configuration level by using web.config files or by using a <location> tag within applicationHost.config or web.config files. Distributed rules operate on the URL path relative to the location of the web.config file where they are defined. When distributed rules are defined inside a <location> tag, they operate on the URL path relative to the path specified for that <location> tag. These rules are evaluated on the BeginRequest event of the IIS pipeline.
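As a rough sketch of where the two collections live, the fragments below show a global rule in applicationHost.config and a distributed rule in a web.config file. The host names, rule names, and target page are assumptions made up for this example.

```xml
<!-- Fragment of applicationHost.config (server level) -->
<system.webServer>
  <rewrite>
    <globalRules>
      <rule name="Global example" stopProcessing="true">
        <match url=".*" />
        <conditions>
          <add input="{HTTP_HOST}" pattern="^old\.example\.com$" />
        </conditions>
        <action type="Redirect" url="http://www.example.com{REQUEST_URI}" appendQueryString="false" />
      </rule>
    </globalRules>
  </rewrite>
</system.webServer>

<!-- Fragment of a web.config file (site, application, or directory level) -->
<system.webServer>
  <rewrite>
    <rules>
      <rule name="Distributed example">
        <match url="^news/([0-9]+)$" />
        <action type="Rewrite" url="news.aspx?id={R:1}" />
      </rule>
    </rules>
  </rewrite>
</system.webServer>
```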
Rules Evaluation
Each configuration level in IIS can have zero or more rewrite rules defined. The rules are evaluated in the same order they are specified. The URL rewriter processes through the set of rules by using the following algorithm:
- First, the URL is matched against the pattern of a rule. If it does not match, the URL rewrite module immediately stops processing that rule, and goes on to the next rule.
- If a pattern matches and there are no conditions for the rule, the URL rewrite module performs the action specified for this rule and then goes on to the next rule, where it uses the substituted URL as the input for that rule.
- If a pattern matches and there are conditions for the rule, the URL rewrite module evaluates the conditions. If the evaluation is successful, the specified rule action is performed, and the rewritten URL is given as the input to the rule that follows.
A rule may have the StopProcessing flag turned on. When this flag is turned on and the rule matches, no subsequent rules are processed and the URL produced by this rule is passed to the IIS request pipeline. By default, this flag is turned off.
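The effect of the flag can be sketched with two rules evaluated in order. The rule names, patterns, and target pages here are hypothetical.

```xml
<rules>
  <!-- Rule 1: if this rule matches, the URL is rewritten and rule 2 is not evaluated -->
  <rule name="Product page" stopProcessing="true">
    <match url="^products/([0-9]+)$" />
    <action type="Rewrite" url="product.aspx?id={R:1}" />
  </rule>
  <!-- Rule 2: reached only when rule 1 did not match -->
  <rule name="Product catch-all">
    <match url="^products/(.*)$" />
    <action type="Rewrite" url="catalog.aspx?path={R:1}" />
  </rule>
</rules>
```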
Rules Inheritance
If rules are defined on multiple configuration levels, then the URL rewrite module evaluates the rules in the following order:
- Evaluate all the global rules.
- Evaluate a rule set that includes distributed rules from parent configuration levels as well as rules from current configuration level. The evaluation is performed in a parent-to-child order, which means that parent rules are evaluated first and the rules defined on a last child level are evaluated last.
Preserving Original URL
The URL rewrite module preserves the original requested URL path in the following server variables:
- HTTP_X_ORIGINAL_URL - this server variable contains the original URL in decoded format;
- UNENCODED_URL - this server variable contains the original URL exactly as it was requested by web client, with all original encoding preserved.
Accessing URL Parts from a Rewrite Rule
It is important to understand how certain parts of the URL string can be accessed from a rewrite rule.
For an HTTP URL in this form: http(s)://<host>:<port>/<path>?<querystring>
- The <path> will be matched against the pattern of the rule.
- The <querystring> will be available in the server variable called QUERY_STRING and can be accessed by using a condition within a rule.
- The <host> will be available in the server variable HTTP_HOST and can be accessed by using a condition within a rule.
- The <port> will be available in the server variable SERVER_PORT and can be accessed by using a condition within a rule.
- Server variables SERVER_PORT_SECURE and HTTPS can be used to determine if a secure connection was used. These server variables can be accessed by using a condition within a rule.
- The server variable REQUEST_URI can be used to access the entire requested URL path including the query string.
For example, if a request was made for this URL: http://www.mysite.com/content/default.aspx?tabid=2&subtabid=3 and a rewrite rule was defined on the site level, then:
- The rule pattern will get this URL string as an input: content/default.aspx.
- The QUERY_STRING server variable will contain tabid=2&subtabid=3.
- The HTTP_HOST server variable will contain www.mysite.com.
- The SERVER_PORT server variable will contain 80.
- The SERVER_PORT_SECURE server variable will contain 0 and HTTPS will contain OFF.
- The REQUEST_URI server variable will contain /content/default.aspx?tabid=2&subtabid=3.
Note that the input URL string passed to a distributed rule is always relative to the location of the Web.config file where the rule is defined. For example, if a request was made for http://www.mysite.com/content/default.aspx?tabid=2&subtabid=3 and a rewrite rule was defined in the /content directory, then the rule will get this URL string: default.aspx as an input.
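A rule that combines several of these URL parts might look like the following sketch, which redirects non-secure requests to HTTPS. The rule name is an assumption; the server variables are the ones described above.

```xml
<rule name="Redirect to HTTPS" stopProcessing="true">
  <!-- The path is matched by the rule pattern and captured in {R:1} -->
  <match url="(.*)" />
  <conditions>
    <!-- HTTPS contains OFF when the request did not use a secure connection -->
    <add input="{HTTPS}" pattern="^OFF$" />
  </conditions>
  <!-- The host comes from HTTP_HOST; the query string is appended by default -->
  <action type="Redirect" url="https://{HTTP_HOST}/{R:1}" redirectType="Permanent" />
</rule>
```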
Rewrite Rule Configuration
Rule Pattern
A rewrite rule pattern is used to specify what the current URL path should be matched to. "Current" means the value of the URL path when the rule is applied. If there were any rules that preceded the current rule, they may have matched the original requested URL and modified it. The URL string that is evaluated against the pattern does not include the query string. To include the query string in the rule evaluation, you can use the QUERY_STRING server variable in the rule's condition. For more information, refer to "Using server variables in rewrite rules".
Pattern is specified within a <match> element of a rewrite rule.
Rule pattern syntax
Rule pattern syntax can be specified by using the patternSyntax attribute of a rule. This attribute can be set to one of the following options:
- ECMAScript – Perl compatible (ECMAScript standard compliant) regular expression syntax. This is the default option for any rule. This is an example of the pattern format: "^([_0-9a-zA-Z-]+/)?(wp-.*)"
- Wildcard – Wildcard syntax used in the IIS 7.0 HTTP redirection module. This is an example of a pattern in this format: "/Scripts/*_in.???", where an asterisk ("*") means "match any number of any characters and capture them in a back-reference" and "?" means "match exactly one character" (no back-reference will be created).
The scope of the patternSyntax attribute is per rule, meaning that it applies to the current rule’s pattern and to all patterns used within conditions of that rule.
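As an illustration, a rule that uses wildcard syntax could be written as in the sketch below. The rule name and paths are hypothetical; the asterisk captures the file name in {R:1}.

```xml
<rule name="Wildcard example" patternSyntax="Wildcard">
  <!-- "*" captures any number of characters in back-reference {R:1} -->
  <match url="images/*.png" />
  <action type="Rewrite" url="static/img/{R:1}.png" />
</rule>
```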
Rule pattern properties
A pattern can be negated by using the negate attribute of the <match> element. When this attribute is used, the rule action is performed only if the current URL does NOT match the specified pattern.
By default, pattern matching is case insensitive. To make matching case sensitive, set the ignoreCase attribute of the <match> element to false.
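Both attributes can be combined on one <match> element. The sketch below assumes a hypothetical maintenance page and applies the action to every URL that does not start with admin/, using a case-sensitive comparison.

```xml
<rule name="Maintenance except admin" stopProcessing="true">
  <!-- negate="true": the action runs only when the URL does NOT match the pattern -->
  <!-- ignoreCase="false": "Admin/" and "admin/" are treated as different strings -->
  <match url="^admin/" negate="true" ignoreCase="false" />
  <action type="Rewrite" url="maintenance.htm" />
</rule>
```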
Rule conditions
Rule conditions allow defining additional logic for rule evaluation, which can be based on inputs other than just a current URL string. Any rule can have zero or more conditions. Rule conditions are evaluated after the rule pattern match is successful.
Conditions are defined within a <conditions> collection of a rewrite rule. This collection has an attribute called logicalGrouping that controls how conditions are evaluated. If a rule has conditions, then the rule action will be performed only if rule pattern is matched and:
- All conditions were evaluated to true, provided that logicalGrouping="MatchAll" was used.
- At least one of the conditions was evaluated to true, provided that logicalGrouping="MatchAny" was used.
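For instance, a rule that should apply to either of two host names could use MatchAny, as in this sketch. The host names, rule name, and target path are assumptions.

```xml
<rule name="Either host">
  <match url="^(.*)$" />
  <!-- The action runs if at least one condition evaluates to true -->
  <conditions logicalGrouping="MatchAny">
    <add input="{HTTP_HOST}" pattern="^example\.com$" />
    <add input="{HTTP_HOST}" pattern="^example\.org$" />
  </conditions>
  <action type="Rewrite" url="shared/{R:1}" />
</rule>
```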
A condition is defined by specifying the following properties:
- Condition input – Specifies which item to use as an input for the condition evaluation. Condition input is an arbitrary string that can include server variables and back-references to prior condition patterns and/or to rule patterns.
- Match type – Can be one of these three options:
  - IsFile – This match type is used to check if the input string contains a physical path of a file on a file system. If a condition input string is not specified, then the URL rewrite module will use the physical path of the requested file as a default value for the condition input. This match type can be used only for distributed rules.
  - IsDirectory – This match type is used to check if the input string contains a physical path of a directory on a file system. If a condition input string is not specified, then the URL rewrite module will use the physical path of the requested file as a default value for the condition input. This match type can be used only for distributed rules.
  - Pattern – This match type is used to express a condition where an arbitrary input string is matched against a pattern. A condition pattern can be specified by using either regular expression syntax or wildcard syntax. The type of pattern to use in a condition depends on the value of the patternSyntax attribute defined for the rule to which this condition belongs. This condition type has two related attributes that control pattern matching: pattern, which specifies the pattern to match, and ignoreCase, which controls whether the match is case insensitive.
In addition, the result of the condition evaluation can be negated by using the negate attribute. This can be used to specify a condition that checks if the requested URL is NOT a file:
<add input="{REQUEST_FILENAME}" matchType="IsFile" negate="true" />
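A common way to combine these match types is a rule that rewrites a request only when it does not correspond to an existing file or directory. The following is a sketch of that idea; the rule name and the index.php target are hypothetical.

```xml
<rule name="Front controller" stopProcessing="true">
  <match url="^(.*)$" />
  <conditions logicalGrouping="MatchAll">
    <!-- Both conditions must hold: the request is neither a file nor a directory -->
    <add input="{REQUEST_FILENAME}" matchType="IsFile" negate="true" />
    <add input="{REQUEST_FILENAME}" matchType="IsDirectory" negate="true" />
  </conditions>
  <action type="Rewrite" url="index.php?path={R:1}" />
</rule>
```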
Rule action
A rewrite rule action is performed when the current URL matches the rule pattern and the condition evaluation has succeeded (depending on rule configuration, either all conditions matched or at least one of the conditions matched). There are several types of actions available, and the type attribute of the <action> configuration element specifies which action the rule has to perform. The following sections describe the different action types and the configuration options related to each of them.
Rewrite action
Rewrite action replaces the current URL string with a substitution string. A substitution string must always specify the URL path (for example, contoso/test/default.aspx). Note that substitutions that contain a physical path on a file system (for example, C:\inetpub\wwwroot) are not supported in IIS.
Rewrite action has the following configuration options:
- url – This is the substitution string to use when rewriting the current URL. Substitution URL is a free-form string that can include the following:
  - Back-references to the condition and rule patterns. (For more information, see the section about how to use back-references.)
  - Server variables. (For more information, see the section about how to use server variables.)
- appendQueryString – Specifies whether the query string from the current URL should be preserved during substitution. By default, if the value of the appendQueryString flag is not specified, then it is assumed to be TRUE. This means that the query string from the original URL will be appended to the substituted URL.
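The following sketch shows a Rewrite action that discards the original query string instead of appending it. The rule name and target page are assumptions for illustration.

```xml
<rule name="Search without original query string">
  <match url="^search$" />
  <!-- appendQueryString="false": the query string of the requested URL is not carried over -->
  <action type="Rewrite" url="search.aspx?mode=basic" appendQueryString="false" />
</rule>
```

With the default setting (appendQueryString not specified), the original query string would be appended to the substituted URL.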
Redirect action
Redirect action tells the URL rewrite module to send a redirect response back to the client. The redirect status code (3xx) can be specified as a parameter for this action. The Location field of the response will contain the substitution string that was specified in the rule.
The substitution URL for the redirect rule can be specified either as a relative URL path or as an absolute URI.
Usage of Redirect action implies that no subsequent rules will be evaluated for the current URL after redirection is performed.
Redirect action has the following configuration options:
- url – Uses a substitution string as a redirection URL. A substitution URL is a free-form string that can include back-references to the condition and rule patterns, and server variables.
- appendQueryString – Specifies whether the query string from the current URL should be preserved during substitution. By default, if the appendQueryString flag is not specified, then it is assumed to be TRUE. This means that the query string from the original URL will be appended to the substituted URL.
- redirectType – Specifies the status code to use during redirect:
  - 301 – Permanent
  - 302 – Found
  - 303 – See other
  - 307 – Temporary
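A Redirect action with an explicit status code might be configured as in this sketch. The rule name and URLs are hypothetical.

```xml
<rule name="Moved page" stopProcessing="true">
  <match url="^old-page\.html$" />
  <!-- redirectType="Permanent" sends a 301 response with the Location header set to the url value -->
  <action type="Redirect" url="http://www.example.com/new-page.html" redirectType="Permanent" />
</rule>
```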
CustomResponse action
CustomResponse action causes the URL rewrite module to respond to the HTTP client by using a user-specified status code, subcode, and reason. Usage of CustomResponse action implies that no subsequent rules will be evaluated for the current URL, after this action is performed.
CustomResponse action has the following configuration options:
- statusCode – Specifies the status code to use in response to the client.
- subStatusCode – Specifies the substatus code to use in response to the client.
- statusReason – Specifies the reason phrase to use with the status code.
- statusDescription – Specifies the one line description to put in the body of the response.
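As an example of these options together, the sketch below returns a 403 response for a folder that should not be served. The rule name, folder name, and description text are assumptions.

```xml
<rule name="Block internal folder" stopProcessing="true">
  <match url="^internal/" />
  <action type="CustomResponse" statusCode="403" subStatusCode="0"
          statusReason="Forbidden"
          statusDescription="Access to this folder is not allowed." />
</rule>
```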
AbortRequest action
AbortRequest action causes the URL rewrite module to drop the HTTP connection for the current request. The action does not have any parameters. Usage of this action implies that no subsequent rules will be evaluated for the current URL after this action is performed.
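Because the action takes no parameters, a rule that uses it is short. This sketch, with a hypothetical rule name and pattern, drops the connection for requests that end in .bak.

```xml
<rule name="Abort requests for backup files" stopProcessing="true">
  <match url="\.bak$" />
  <action type="AbortRequest" />
</rule>
```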
None action
None action is used to specify that no action should be performed.
Using server variables in rewrite rules
Server variables provide additional information about the current HTTP request. This information can be used to make rewriting decisions or to compose the rewritten URL. Server variables can be referenced in condition input strings and in rule substitution strings, such as the url attribute of the Rewrite and Redirect actions.
Server variables can be referenced by using the {VARIABLE_NAME} syntax. For example, the following condition uses the QUERY_STRING server variable:
<add input="{QUERY_STRING}" pattern="id=([0-9]+)" />
Server variables can also be used to access HTTP headers from the current request. Any HTTP header supplied by the current request is represented as a server variable that has a name generated in accordance to this naming convention:
- All dash ("-") symbols in the HTTP header name are converted to underscore symbols ("_").
- All letters in the HTTP header name are converted to capital case.
- "HTTP_" prefix is added to the header name.
For example, in order to access the HTTP header “user-agent” from a rewrite rule, you can use the {HTTP_USER_AGENT} server variable.
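The naming convention can be seen with a header that contains a dash. In this sketch, the Accept-Language request header is read through {HTTP_ACCEPT_LANGUAGE}; the rule name and the de/default.aspx target are hypothetical.

```xml
<rule name="German landing page">
  <!-- An empty pattern input corresponds to a request for the root of the rule's scope -->
  <match url="^$" />
  <conditions>
    <!-- Accept-Language becomes HTTP_ACCEPT_LANGUAGE: dashes to underscores, upper case, HTTP_ prefix -->
    <add input="{HTTP_ACCEPT_LANGUAGE}" pattern="^de" />
  </conditions>
  <action type="Rewrite" url="de/default.aspx" />
</rule>
```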
Using back-references in rewrite rules
Parts of rule or condition inputs can be captured in back-references. These can then be used to construct substitution URLs within rule actions or to construct input strings for rule conditions.
Back-references are generated in different ways, depending on which kind of pattern syntax is used for the rule. When ECMAScript pattern syntax is used, a back-reference can be created by putting parentheses around the part of the pattern that must capture the back-reference. For example, the pattern ([0-9]+)/([a-z]+)\.html will capture 07 and article in back-references from this requested URL: 07/article.html. When "Wildcard" pattern syntax is used, back-references are always created when an asterisk symbol (*) is used in the pattern. No back-references are created when "?" is used in the pattern. For example, the pattern */*.html will capture contoso and test in back-references from this requested URL: contoso/test.html.
Usage of back-references is the same regardless of which pattern syntax was used to capture them. Back-references can be used in the following locations within rewrite rules:
- In condition input string
- In rule action, specifically in the url attribute of the Rewrite and Redirect actions
- In key parameter to the rewrite map
Back-references to condition patterns are identified by {C:N} where N is from 0 to 9; back-references to the rule pattern are identified by {R:N} where N is from 0 to 9. Note that for both types of back-references, {R:0} and {C:0} will contain the matched string.
For example in this pattern:
^(www\.)(.*)$
For the string: www.foo.com the back-references will be indexed as follows:
{C:0} - www.foo.com
{C:1} - www.
{C:2} - foo.com
Within a rule action, you can use the back-references to the rule pattern and to the last matched condition of that rule. Within a condition input string, you can use the back-references to the rule pattern and to the previously matched condition.
The following rule example demonstrates how back-references are created and referenced:
<rule name="Rewrite subdomain">
  <match url="^(.+)" /> <!-- rule back-reference is captured here -->
  <conditions>
    <add input="{HTTP_HOST}" matchType="Pattern" pattern="^([^.]+)\.mysite\.com$" /> <!-- condition back-reference is captured here -->
  </conditions>
  <action type="Rewrite" url="{C:1}/{R:1}" /> <!-- rewrite action uses back-references to condition and to rule when rewriting the url -->
</rule>
Interaction with IIS Output Caching
URL rewrite module controls the IIS output cache behavior in order to:
- Optimally utilize kernel mode and user mode output caching of responses for rewritten URLs, thus improving performance of the web application that uses URL rewrite module
- Prevent caching of responses, when caching logic may be violated due to URL rewriting.
The module controls output caching either by altering certain caching properties or by disabling the caching altogether. The module cannot enable output cache, if it has been disabled by IIS configuration or by any other module in IIS pipeline. The output caching is controlled in accordance to the following logic:
1. The module always sets user mode caching setting varyByHeader="HTTP_X_ORIGINAL_URL". This ensures that if user mode cache is enabled, it will take the original URL into account when constructing a key for the cache entry.
2. If a rewrite rule set uses only server variables whose values are either constant throughout the life of the process or derived from the requested URL, then this rule set is considered safe for output caching. This means that the URL rewrite module will not alter the existing caching policy in any way, apart from setting varyByHeader as described above.
The following server variables, when used in rewrite rules, do not cause any effect on output caching policy:
"CACHE_URL",
"DOCUMENT_ROOT",
"HTTP_URL",
"HTTP_HOST",
"PATH_INFO",
"PATH_TRANSLATED",
"QUERY_STRING",
"REQUEST_FILENAME",
"REQUEST_URI",
"SCRIPT_FILENAME",
"SCRIPT_NAME",
"SCRIPT_TRANSLATED",
"UNENCODED_URL",
"URL",
"URL_PATH_INFO",
"APP_POOL_ID",
"APPL_MD_PATH",
"APPL_PHYSICAL_PATH",
"GATEWAY_INTERFACE",
"SERVER_SOFTWARE",
"SSI_EXEC_DISABLED"
3. If a rewrite rule set uses any server variable not mentioned in the above list, then this rule set is considered unsafe for output caching. This means that the URL rewrite module will disable kernel mode caching for all requests, regardless of whether their URLs were rewritten or not. In addition, the module will alter the caching policy for user-mode cache by setting the caching property varyByValue to contain the concatenated string of all server variable values used in the rule set.
String functions
There are three string functions available for changing the values within rewrite rule action as well as any condition input:
- ToLower - returns the input string converted to lower case.
- UrlEncode - returns the input string converted to URL encoded format. This function can be used if substitution URL in rewrite rule contains special characters (for example non-ASCII or URI-unsafe characters).
- UrlDecode - decodes the URL encoded input string. This function can be used to decode a condition input before matching it against a pattern.
The functions can be invoked by using the following syntax:
{function_name:any_string}
Where "function_name" can be "ToLower", "UrlEncode", or "UrlDecode", and "any_string" can be either a literal string or a string built by using server variables or back-references. For example, the following are valid invocations of string functions:
{ToLower:DEFAULT.HTM}
{UrlDecode:{REQUEST_URI}}
{UrlEncode:{R:1}.aspx?p=[résumé]}
The string functions can be used in condition input strings and in rule substitution strings, such as the url attribute of the Rewrite and Redirect actions.
An example of the rule that uses ToLower function:
<rule name="Redirect to canonical url">
  <match url="^(.+)" /> <!-- rule back-reference is captured here -->
  <conditions>
    <!-- Check if the requested domain is not in canonical form -->
    <add input="{HTTP_HOST}" matchType="Pattern" pattern="^www\.mysite\.com$" negate="true" />
  </conditions>
  <!-- Redirect to canonical url and convert URL path to lowercase -->
  <action type="Redirect" url="http://www.mysite.com/{ToLower:{R:1}}" redirectType="Found" />
</rule>
An example of the rule that uses UrlEncode function:
<rules>
  <rule name="UrlEncode example" stopProcessing="true">
    <match url="resume" />
    <action type="Rewrite" url="default.aspx?name={UrlEncode:résumé}" />
  </rule>
</rules>
An example of the rule that uses UrlDecode function:
<rules>
<rule name="UrlDecode example">
<match url="default.aspx" />
<conditions>
<add input="{UrlDecode:{QUERY_STRING}}" pattern="résumé" />
</conditions>
<action type="Rewrite" url="default.aspx?type=resume" />
</rule>
</rules>
Rewrite maps
A rewrite map is an arbitrary collection of name-value pairs that can be used within rewrite rules to generate the substitution URL during rewriting. Rewrite maps are particularly useful when you have a large set of rewrite rules and all of these rules use static strings (that is, when there is no pattern matching used). In those cases, instead of defining a large set of simple rewrite rules, you can put all the mappings between the input URL and the substitution URL into the rewrite map as keys and values. Then, to look up the substitution URL based on the input URL, you will have one rewrite rule that references this rewrite map.
A rewrite map defines a named collection of name-value pair strings, for example:
<rewriteMap name="MyRewriteMap" defaultValue="">
<add key="a.html" value="b.html" />
<add key="c.aspx" value="d.aspx" />
<add key="e.php" value="f.php" />
</rewriteMap>
A rewrite map is uniquely identified by its name, and it can contain zero or more key-value entries. In addition, a rewrite map can specify the default value to use when a key is not found. This is controlled by the defaultValue attribute. If the attribute is not specified, an empty string is used as the default value.
There can be any number of rewrite maps on any configuration level, except the file level. Rewrite maps are located within a collection element <rewriteMaps>.
Rewrite maps are referenced within a rewrite rule by using the following syntax:
{RewriteMapName:Key}
Where the “Key” parameter can be any arbitrary string that can include back-references to rule or condition patterns. For example, the following are valid usages of rewrite map:
{MyRewriteMap:contoso/{R:1}/test/{C:1}}
{MyRewriteMap:a.html}
{MyRewriteMap:{R:1}?{C:1}&contoso=test}
A reference to a rewrite map gets substituted with the value that was looked up by using the key passed as a parameter within a rewrite map reference. If a key was not found, then the default value for that rewrite map will be used.
A rewrite map can be referenced in condition input strings and in rule substitution strings, such as the url attribute of the Rewrite and Redirect actions.
Example 1: With rewrite map defined in the following way:
<rewrite>
  <rewriteMaps>
    <rewriteMap name="StaticRewrites" defaultValue="">
      <add key="/diagnostics" value="/default.aspx?tabid=2&amp;subtabid=29" />
      <add key="/webcasts" value="/default.aspx?tabid=2&amp;subtabid=24" />
      <add key="/php" value="/default.aspx?tabid=7116" />
    </rewriteMap>
  </rewriteMaps>
</rewrite>
And with the rewrite rule defined in the following way:
<rewrite>
  <rules>
    <rule name="Rewrite Rule">
      <match url=".*" />
      <conditions>
        <add input="{StaticRewrites:{REQUEST_URI}}" pattern="(.+)" />
      </conditions>
      <action type="Rewrite" url="{C:1}" />
    </rule>
  </rules>
</rewrite>
The requested URL /diagnostics will be rewritten to /default.aspx?tabid=2&subtabid=29.
The requested URL /webcasts will be rewritten to /default.aspx?tabid=2&subtabid=24.
The requested URL /php will be rewritten to /default.aspx?tabid=7116.
The requested URL /default.aspx will not be rewritten because rewrite map does not contain an element with key=”/default.aspx”; therefore the rewrite map will return an empty string which will not match the condition pattern, hence rule action will not be performed.
Example 2: With rewrite map defined in the following way:
<rewrite>
  <rewriteMaps>
    <rewriteMap name="StaticRedirects" defaultValue="">
      <add key="/default.aspx?tabid=2&amp;subtabid=29" value="/diagnostics" />
      <add key="/default.aspx?tabid=2&amp;subtabid=24" value="/webcasts" />
      <add key="/default.aspx?tabid=7116" value="/php" />
    </rewriteMap>
  </rewriteMaps>
</rewrite>
And with the rewrite rule defined in the following way:
<rewrite>
  <rules>
    <rule name="Redirect rule">
      <match url=".*" />
      <conditions>
        <add input="{StaticRedirects:{REQUEST_URI}}" pattern="(.+)" />
      </conditions>
      <action type="Redirect" url="http://www.contoso.com{C:1}" redirectType="Found" />
    </rule>
  </rules>
</rewrite>
The requested URL /default.aspx?tabid=2&subtabid=29 will be redirected to http://www.contoso.com/diagnostics.
The requested URL /default.aspx?tabid=2&subtabid=24 will be redirected to http://www.contoso.com/webcasts.
The requested URL /default.aspx?tabid=7116 will be redirected to http://www.contoso.com/php.
The requested URL /default.aspx will not be redirected because rewrite map does not contain an element with key=”/default.aspx”; therefore the rewrite map will return an empty string which will not match the condition pattern, hence rule action will not be performed.