How to use Apache mod_rewrite and other modules to have neat URLs.
Reasons for URL tidying
Rewriting URLs is useful for the following reasons:
Easy to remember
It will make URLs easier to remember. For example, you may have a URL
http://example.com/view/index.php?user=bob
It is possible to rewrite it as
http://example.com/~bob
This will allow Bob to easily share his page with others.
Search Engine Optimization
Some search engines dislike URLs that carry GET arguments in the query string. For example, you may have the URL
http://example.com/article.php?type=apache&article_id=231
Some search engines may skip over this, but if we rewrite the URL to
http://example.com/articles/apache/231
then it is more likely to be indexed by the search engines.
Document redirection
If you want to restructure your website, you will need to redirect users from the old URL to the new page. Apache can do this for you, so you don’t need to use the refresh meta tag for this.
It is better to give the browser an HTTP redirect than to have a meta tag do the redirection.
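For a simple one-to-one move, mod_alias may be enough on its own. A minimal sketch, assuming a hypothetical page that has moved from /old-page.html to /new-page.html on example.com:

```apache
# Hypothetical paths, for illustration only.
# mod_alias answers with an HTTP 301 redirect, so no meta refresh tag is needed.
Redirect permanent /old-page.html http://example.com/new-page.html
```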
Using mod_rewrite
Read the official documentation. This will save lots of time in the long run.
If the web host has mod_rewrite enabled, it may allow you to upload a .htaccess file to configure the redirection. See Apache Modules to determine what is enabled on the server.
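If you are unsure whether the module is loaded, one common precaution is to wrap the directives in an IfModule block, so that they are skipped rather than causing a server error when mod_rewrite is absent. A minimal sketch:

```apache
<IfModule mod_rewrite.c>
    # Only read when mod_rewrite is loaded
    RewriteEngine on
</IfModule>
```

The host must also permit rewrite directives in .htaccess files, which normally means AllowOverride FileInfo (or All) in the server configuration.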
RewriteEngine
RewriteEngine on
The rewriting engine must be turned on.
RewriteRule
Rules perform substitutions on URIs. The syntax is
RewriteRule Pattern Substitution [flags]
Multiple rules are applied in sequence. Flags change how a rule is processed, e.g.
L or last indicates rewriting should stop after this rule.
R or redirect will send a redirect to the browser. This is useful when documents have moved.
- is a special substitution which lets the URI pass through without any modification.
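The flags and the - substitution can be combined as in the sketch below. It assumes a .htaccess file in the document root; the paths images/, old.html and new.html are hypothetical.

```apache
RewriteEngine on
# "-" leaves the URI untouched; L stops later rules from being tried for it
RewriteRule ^images/ - [L]
# R answers with a redirect (302 unless a status is given); L ends rewriting here
RewriteRule ^old\.html$ /new.html [R,L]
```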
RewriteCond
RewriteCond affects the RewriteRule immediately following. Subsequent RewriteRules are unaffected.
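As a sketch of that scoping, the fragment below reuses the article URL from the Search Engine Optimization section. The choice of condition (skip the rewrite when the request names an existing file) and the old.html and new.html paths are assumptions made for illustration.

```apache
RewriteEngine on
# Condition: the request does not correspond to an existing file
RewriteCond %{REQUEST_FILENAME} !-f
# Guarded by the condition above
RewriteRule ^articles/([a-z]+)/([0-9]+)$ article.php?type=$1&article_id=$2 [L]
# Not guarded: the condition only applied to the previous rule
RewriteRule ^old\.html$ /new.html [R,L]
```

Several RewriteCond lines may precede one rule, in which case all of them must hold unless the OR flag is used.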
Hiding a parameter to make the URL easier to read
If you want to change this URL:
http://example.com/user/view.php?name=peter
to
http://example.com/~peter
Add a .htaccess file containing the following:
RewriteEngine on
RewriteRule ^~(.*) user/view.php?name=$1
This will allow the PHP script to run with the argument name equal to the given username.
A regular expression is used to do the substitution.
The ^ character is an anchor that matches the start of the line.
~ is the literal ~ we are trying to match in the URL.
(.*) is broken up into three pieces.
. This means any character.
* This follows the . character, and means any number of them.
() This groups it as part of a match, so we can refer to it using $1 for the first match.
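Because (.*) accepts any characters, including slashes, a narrower pattern may be preferable when usernames follow a known format. A minimal sketch, assuming usernames consist only of letters, digits, underscores and hyphens:

```apache
RewriteEngine on
# [A-Za-z0-9_-]+ matches one or more of the assumed username characters;
# $ anchors the end, so nothing may follow the username
RewriteRule ^~([A-Za-z0-9_-]+)$ user/view.php?name=$1 [L]
```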
Moving directories
We’ve reorganised some pages on our website, so to preserve existing links to our site, we used the following rule
RewriteRule kb/app/tomcat/struts/(.*) /kb/prg/java/jsp/struts/$1 [R]
The second argument contains a leading /. Without it, the file's local path on the server was being added to the URL.
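The rule above is unanchored, so it matches the old path wherever it appears in the URL, and a plain [R] sends a temporary redirect. If the move is permanent, a variation along these lines may be a better fit. It is a sketch that assumes the .htaccess file sits in the document root:

```apache
RewriteEngine on
# ^ restricts the match to paths that start with kb/app/tomcat/struts/
# R=301 marks the move as permanent; L stops further rules from running
RewriteRule ^kb/app/tomcat/struts/(.*)$ /kb/prg/java/jsp/struts/$1 [R=301,L]
```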
Domain Name prefix
You can redirect traffic for example.com to www.example.com
RewriteEngine On
RewriteCond %{HTTP_HOST} !^(.*)\.example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
The RewriteCond above ignores requests that already contain a subdomain.
Alternatively, you can redirect all www.example.com traffic to the bare domain.
RewriteEngine On
RewriteCond %{HTTP_HOST} ^www\.example\.com$ [NC]
RewriteRule ^(.*)$ http://example.com/$1 [R=301,L]
To redirect any host that doesn’t match, you could use something like
RewriteEngine On
RewriteCond %{HTTP_HOST} !^example\.com [NC]
RewriteRule ^(.*)$ http://example.com/$1 [R=301,L]
This helps to get rid of the numeric IP being used as the host name.
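Some old clients send no Host header at all, and redirecting them this way may not help. One possible refinement, shown here as a sketch rather than a required change, adds a second condition so that such requests are left alone:

```apache
RewriteEngine On
# Skip requests that carry no Host header at all
RewriteCond %{HTTP_HOST} !^$
# Any host other than example.com, including a numeric IP
RewriteCond %{HTTP_HOST} !^example\.com [NC]
# Both conditions must hold for the redirect to be sent
RewriteRule ^(.*)$ http://example.com/$1 [R=301,L]
```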
Moving Domain
You can also redirect traffic between domains
RewriteEngine On
RewriteRule old/pages/(.*) http://example.com/archive/$1 [R=301]
You can specify a new domain as part of the redirection.
301 means that it has been moved permanently.
If you use [R] instead of [R=301], then it defaults to sending a 302, which means moved temporarily.
Redirecting scheme from https to http
Add the following to your .htaccess
# turn on mod_rewrite
RewriteEngine On
# check that it is https
RewriteCond %{HTTPS} =on
# redirect to the plain http site
RewriteRule ^(.*)$ http://magicmonster.com/$1 [R=301,L]
Instead of using .htaccess, you can also add this to the module conf file.
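In server or virtual host context the pattern is matched against a path that begins with a slash, so the rule usually needs a small adjustment. A sketch under assumptions: the *:443 address and the VirtualHost layout are illustrative, and the SSL directives a real HTTPS host requires are left out.

```apache
<VirtualHost *:443>
    ServerName magicmonster.com
    # SSL directives omitted from this sketch

    RewriteEngine On
    RewriteCond %{HTTPS} =on
    # /? consumes the leading slash, so $1 holds the path without it
    RewriteRule ^/?(.*)$ http://magicmonster.com/$1 [R=301,L]
</VirtualHost>
```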
mod_rewrit
If you have mod_rewrit instead, it isn’t a typo, but an incomplete and undocumented version of mod_rewrite that is meant to be more secure. No support is provided for this.
Troubleshooting
Logging
From version 2.4, turn on logging for the module using
LogLevel alert rewrite:trace3
For version 2.2, if you have access to the Apache configuration, you can edit the httpd.conf configuration file and add logging to mod_rewrite.
RewriteLog /usr/local/apache/logs/mod_rewrite.log
RewriteLogLevel 9
9 is the highest level of logging, while 0 will turn it off. Make sure you turn it off once you’ve figured out the problem.
Directories don’t match
If you are using mod_userdir or another module that does not map the URL directly to a directory path, you may end up with an invalid path. For example, we have placed our .htaccess into the directory
/home/aelst/public_html
on the dev server, so that the URL below
http://dev/~aelst/fish.html
is actually served from
http://dev/~aelst/whale.html
Normally you would use the following .htaccess file:
RewriteEngine on
RewriteRule fish\.html whale.html
However, because of mod_userdir, we get the following error:
Not Found
The requested URL /home/aelst/public_html/whale.html was not found on this server.
We should be translating to the following URL instead
http://dev/~aelst/whale.html
To fix this you can either add the directory path to the rule
RewriteEngine on
RewriteRule fish\.html /~aelst/whale.html
or add a RewriteBase directive to the .htaccess file
RewriteEngine on
RewriteBase /~aelst/
RewriteRule fish\.html whale.html