Redirects that do not work due to PHP variables

Problem:

Redirecting www.yourdomain.com/default.html or index.html to http://www.yourdomain.com/index.php?act=whatever sends the browser to http://www.yourdomain.com/index.php%3fact=whatever instead. In other words, why is "?" replaced with "%3f" (or anything else) in the address line?

Solution:

One of the more powerful tricks of the .htaccess hacker is the ability to rewrite URLs. This enables us to do some mighty manipulations on our links; useful stuff like transforming very long URLs into short, cute URLs, transforming dynamic ?generated=page&URLs into /friendly/flat/links, redirecting missing pages, preventing hot-linking, performing automatic language translation and much, much more.

Make no mistake, mod_rewrite is complex. This is not the subject for a quick bite-size tech-snack, probably not even a week-end crash-course. I've seen guys pull off some real cute stuff with mod_rewrite, but with kudos-hat tipped firmly towards that crazy operator, Ralf S. Engelschall, author of the magic module itself, I have to admit that a great deal of it still seems so much voodoo to me.

The way that rules can work one minute and then seem not to the next, and how browser caches and other in-between network caches interact with rules and with testing rules, are often baffling or maddening. When I feel the need to bend my mind completely out of shape, I mess around with mod_rewrite!

After all this, it does work. While I'm not planning on taking that week-end crash-course any time soon, I have picked up a few wee tricks myself, messing around with webservers and web sites, etc.

The plan here is to just drop some neat stuff: examples, things that have proven useful, stuff that works on a variety of server setups. There are Apaches all over my LAN, and I keep coming across old .htaccess files stuffed with past rewriting experiments that either worked, and I add them to my list, or failed dismally, and I'm surprised that more often these days I can see exactly why!

Nothing here is my own invention. Even the bits I figured out myself were already well documented; I just hadn't understood the documents or could not find them. Sometimes, just looking at the same thing from a different angle can make all the difference, so perhaps this humble stab at URL rewriting might be of some use. I'm writing it for me, of course, but I do get some credit for this:

     # time to get dynamic, see..
     rewriterule ^(.*)\.htm$ $1.php
  

Beginning Rewriting

Whenever you use mod_rewrite (the part of Apache that does all this magic), you need to do this only once per .htaccess file:

     Options +FollowSymLinks
     RewriteEngine on
  

before any rewrite rules. +FollowSymLinks must be enabled for any rules to work (newer Apache versions also accept SymLinksIfOwnerMatch in its place). This is a security requirement of the rewrite engine. Normally it is enabled in the root and you should not have to add it, but it doesn't hurt to do so and I'll insert it into all the examples on this page, just in case.

The next line simply switches on the rewrite engine for that folder. If this directive is in your main .htaccess file, then the rewrite engine is theoretically enabled for your entire site. But it is wise to always add that line before you write any redirections, anywhere.

Note: While some of the directives on this page may appear split onto two lines, in your .htaccess file, they must exist completely on one line.

Simple Rewriting

Simply put, Apache scans all incoming URL requests, checks for matches in our .htaccess file and rewrites those matching URLs to whatever we specify. Something like this:

     # all requests to whatever.htm will be sent to whatever.php:
     Options +FollowSymlinks
     RewriteEngine on
     RewriteRule ^(.*)\.htm$ $1.php [nc]
  

This is handy for anyone updating a site from static HTM to dynamic PHP pages (to catch .html as well, end the pattern with \.html?$ instead), where requests to the old pages are automatically rewritten to our new URLs and no one notices a thing. Visitors and search engines can access your content either way. As an added bonus, this enables us to easily split PHP code and its included html structures into two separate files, which makes editing and updating a breeze. The [nc] part at the end means "No Case", or "case-insensitive", but we'll get to that.

Folks can link to whatever.htm or whatever.php, but they always get the content of whatever.php in their browser, and this works even if whatever.htm doesn't exist! But I'm straying..

As it stands, it's a bit tricky. Folks will still have whatever.htm in their browser address bar and will still keep bookmarking your old .htm URLs. Search engines, too, will keep on indexing your links as .htm. Some have even argued that serving up the same content from two different places could have you penalized by the search engines. This may or may not bother you, but if it does, mod_rewrite can do some more magic.

     # this will do a "real" http redirection:
     Options +FollowSymlinks
     rewriteengine on
     rewriterule ^(.+)\.htm$ http://yourdomain.org/$1.php [r=301,nc]
  

This time we instruct mod_rewrite to send a proper HTTP "permanently moved" redirection, aka "301". Now, instead of just rewriting on-the-fly, the user's browser is actually redirected to the new URL and whatever.php appears in their browser's address bar. Search engines and other spidering entities will automatically update their links to the .php versions. Everyone wins. And you can take your time with the updating, too.
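
This is also the place to deal with the question at the top of this page. A likely cause of "?" turning up as "%3f" is that the "?" has reached the target as an ordinary path character (for example, because it was written with a backslash in front of it, or arrived via a back-reference) rather than as the start of the query string, so it gets hex-escaped. A minimal sketch of a hand-written rule, assuming the target really is index.php?act=whatever and that the rule lives in the site's top-level .htaccess:

     # sketch only: domain and "act=whatever" are the placeholders from the question
     Options +FollowSymLinks
     RewriteEngine On
     # send default.html and index.html to the PHP front page, query string intact
     RewriteRule ^(default|index)\.html$ http://www.yourdomain.com/index.php?act=whatever [R=301,NE,L]

The "?" in the target is typed plainly, which is what marks the start of the query string. NE ("no escape") is the flag Apache documents for stopping mod_rewrite from hex-escaping special characters in the result; it may not be strictly needed here, but it does no harm.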

Not-So-Simple Rewriting

You may have noticed that the above examples use regular expressions to capture parts of the URL. What that simply means is: match the part inside (.+) and use it to construct "$1" in the new URL. In other words, (.+) = $1. You could have multiple (.+) parts and for each, mod_rewrite automatically creates a matching $1, $2, $3, etc., for use in your target URL, so something like:

     # a more complex rewrite rule:
     Options +FollowSymlinks
     RewriteEngine on
     RewriteRule ^files/(.+)/(.+)\.zip$ download.php?section=$1&file=$2 [nc]
  

would allow you to present a link as:

    http://mysite/files/games/hoopy.zip
 

and in the background have that translated to:

     http://mysite/download.php?section=games&file=hoopy
  

which some script could process. You see, many search engines simply do not follow our ?generated=links, so if you generate your pages dynamically, this is useful. However, it is only the dumb search engines that cannot handle these kinds of links. We have to ask ourselves, "Do we really want to be listed by the dumb search engines?" Google will handle a good few parameters in your URL without any problems, and the (hungry hungry) yet-to-actually-have-a-search-engine msn-bot stops at nothing to get that page, sometimes again and again and again.

I personally feel it's the search engines that should strive to keep up with modern web technologies. In other words, we shouldn't have to dumb-down for them. But that is just my opinion. Many users will prefer /files/games/hoopy.zip to /download.php?section=games&file=hoopy. But I don't mind either way. As someone pointed out to me recently, presenting links as /standard/paths means you're less likely to get folks doing typos in typed URLs, so something like:

     #an even more complex rewrite rule:
     Options +FollowSymlinks
     RewriteEngine on
     RewriteRule ^blog/([0-9]+)-([a-z]+) http://yourdomain.org/blog/index.php?archive=$1-$2 [nc]
  

would be a neat trick, enabling anyone to access my blog archives by doing:

     http://yourdomain.org/blog/2003-nov
  

in their browser and have it automagically transformed server-side into:

     http://yourdomain.org/blog/index.php?archive=2003-nov
  

which the blog's index.php would understand. It's easy to see that with a little imagination, and a basic understanding of regular expressions, you can perform some highly cool URL manipulations.

Shortening URLs

One common use of mod_rewrite is to shorten URLs. Shorter URLs are easier to remember and, of course, easier to type. An example:

     # beware the regular expression:
     Options +FollowSymlinks
     RewriteEngine On
     RewriteRule ^grab(.*) /public/files/download/download.php$1
  

This rule would transform this user's URL:

     http://mysite/grab?file=my.zip
  

server-side, into:

     http://mysite/public/files/download/download.php?file=my.zip
  

which is a wee trick I use for my distro machine, among other things. Everyone likes short URLs and so will you. Using this technique, you can move /public/files/download/ to anywhere else in your site and all the old links still work fine. Just alter your .htaccess file to reflect the new location: edit one line and you are done. This is nice because even when stuff is way deep in your site, you can still hand out short, cool links.
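
A note on why the grab rule works, with a slightly tighter sketch. RewriteRule patterns are matched against the URL path only, never the query string, so for /grab?file=my.zip the (.*) captures nothing, and mod_rewrite carries ?file=my.zip over unchanged because the target has no query string of its own. The loose ^grab(.*) would also catch unrelated paths such as /grabber. Assuming nothing else on the site needs to be reached through a path starting with "grab", the rule could be pinned down like this:

     # sketch: match exactly "grab"; the original query string is passed through
     Options +FollowSymlinks
     RewriteEngine On
     RewriteRule ^grab$ /public/files/download/download.php [L]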

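Preventing Hot-Linking

The rule set for this is two RewriteCond lines followed by a RewriteRule. That final line sends matching requests to a replacement image and ends like this:

     http://yourdomain.org/img/hotlink.png [nc]
  

You may see the last line broken into two, but it's all one line (all the directives on this page are). Let's have a wee look at what it does.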

We begin by enabling the rewrite engine, as always.

The first RewriteCond line allows direct requests (not from other pages - an "empty referrer") to pass unmolested. The next line means if the browser did send a referrer header and the word "yourdomain" is not in the domain part of it, then DO rewrite this request.

The all-important final RewriteRule line instructs mod_rewrite to rewrite all matched requests (anything without "yourdomain" in its referrer) asking for gifs, jpegs, or pngs to an alternative image. Mine says "No Hotlinking!". There are loads of ways you can write this rule. Simple is best. You could send a wee message instead, direct them to some evil script, or something.
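
Put together, a rule set matching that description might look like the following. This is a sketch under assumptions, not a definitive version: the two RewriteCond patterns and the image extensions are my own guesses at the details, and yourdomain.org and /img/hotlink.png are placeholders.

     # sketch only: the RewriteCond patterns below are assumptions
     Options +FollowSymlinks
     RewriteEngine On
     # carry on only if a referrer was sent (empty referrers pass untouched)
     RewriteCond %{HTTP_REFERER} !^$
     # carry on only if "yourdomain" is not in the domain part of the referrer
     RewriteCond %{HTTP_REFERER} !^https?://([^/]+\.)?yourdomain\.org [NC]
     # anything left asking for an image gets the replacement image
     RewriteRule \.(gif|jpe?g|png)$ http://yourdomain.org/img/hotlink.png [NC]

Both conditions must hold before the rule fires, so direct requests and requests from your own pages are served as normal.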

httpd.conf

Remember, if you put these rules in the main server conf file (usually httpd.conf) rather than an .htaccess file, you'll need to use ^/... instead of ^... at the beginning of the RewriteRule pattern. In other words, add a slash. But since access to httpd.conf is restricted, this does not apply at MostHost.
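
For illustration, here is the .htm to .php rule from earlier written both ways. It is a sketch only; the two lines belong in different files and would not be used together.

     # in an .htaccess file: the pattern sees the path without a leading slash
     RewriteRule ^(.*)\.htm$ $1.php [NC]

     # in httpd.conf (server or virtual host level): the path starts with a slash
     RewriteRule ^/(.*)\.htm$ /$1.php [NC]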

Inheritance

If you are creating rules in sub-folders of your site, you need to read this.

You will remember how rules in top folders apply to all the folders inside those folders too. We call this "inheritance". Normally, this just works. But if you start creating other rules inside subfolders, you will, in effect, obliterate the rules already applying to that folder due to inheritance, or "descendancy", if you prefer. Not all the rules, just the ones applying to that subfolder. Here is a wee demonstration.

Let's say I have a rule in my main /.htaccess which redirects requests for files ending .htm to their .php equivalent, just like the example at the top of this very page. Now, if for any reason I need to add some rewrite rules to my /osx/.htaccess file, the .htm >> .php redirection will no longer work for the /osx/ subfolder. I'll need to reinsert it, but with a crucial difference.

     # this works fine, site-wide, in my main .htaccess file
     # main (top-level) .htaccess file..
     # requests to file.htm goto file.php
     Options +FollowSymlinks
     rewriteengine on
     rewriterule ^(.*)\.htm$ http://yourdomain.org/$1.php [r=301,nc]
  

Here's my updated /osx/.htaccess file, with the .htm >> .php redirection rule reinserted so that it works in this sub-folder:

     # /osx/.htaccess file..
     Options +FollowSymlinks
     rewriteengine on
     # (some rule that I need here)
     # (some other rule I need here)
     rewriterule ^(.*)\.htm$ http://yourdomain.org/osx/$1.php [r=301,nc]
  

Spot the difference in the subfolder rule. You must add the current path to the new rule. Now it works again. If you remember this, you can go replicating rewrite rules all over the place.

Conclusion

In short, mod_rewrite allows you to send browsers from anywhere to anywhere. You can create rules based not simply on the requested URL, but also on such things as IP address, browser agent (send old browsers to different pages, for instance), and even the time of day; the possibilities are practically limitless.
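
To give a flavour of those conditions, here is a rough sketch using RewriteCond with three of the standard server variables. The page names and the address range are made up for the example; only the variable names (HTTP_USER_AGENT, REMOTE_ADDR, TIME_HOUR, TIME_MIN) come from mod_rewrite itself.

     # sketch only: page names and the address range are hypothetical
     Options +FollowSymlinks
     RewriteEngine On

     # by browser agent: text-mode Lynx gets a plain page
     RewriteCond %{HTTP_USER_AGENT} ^Lynx [NC]
     RewriteRule ^home\.html$ home.text.html [L]

     # by IP address: visitors from the LAN get an internal page
     RewriteCond %{REMOTE_ADDR} ^192\.168\.
     RewriteRule ^home\.html$ home.lan.html [L]

     # by time of day: between 07:00 and 19:00 serve the day page
     RewriteCond %{TIME_HOUR}%{TIME_MIN} >0700
     RewriteCond %{TIME_HOUR}%{TIME_MIN} <1900
     RewriteRule ^home\.html$ home.day.html [L]

Each RewriteCond applies only to the RewriteRule that follows it, and the first rule that matches wins because of the L ("last") flag.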

The ins and outs of mod_rewrite syntax are a topic for a much longer document than this, and if you fancy experimenting with more advanced rewriting rules, I urge you to check out the Apache documentation. If you are running some *nix operating system (in fact, if you have Apache installed on any operating system), there will likely be a copy of the Apache manual on your own machine. Check out these excellent mod_rewrite guides for the juicy syntax bits.

http://www.ilovejackdaniels.com/apache/mod_rewrite-cheat-sheet/
http://httpd.apache.org/docs/1.3/mod/mod_rewrite.html
http://httpd.apache.org/docs/1.3/misc/rewriteguide.html
http://forum.modrewrite.com/