I'm new to mod_rewrite but am trying my best to fix up my site with clean URLs.

Using RewriteRule I can get it so you can type in a clean URL and get to the right page, but what I'm having trouble with is automatically redirecting to the clean URL if a "messy" one is used (which is quite likely due to user-submitted links and content, etc.).

Now, I have this bit of code which I found on another .htaccess forum, and it works in one situation, but not another. I'll explain:

# FORCE CLEAN URLS
RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s/+index\.php\?(.*)=([^\s]+) [NC]
RewriteRule ^ %1\/%2? [R=301,L]

This works fine for an address like www.domain.com/index.php?cmd=login, which automatically redirects to www.domain.com/cmd/login

But the problem comes when there is more than one query, like: www.domain.com/index.php?cmd=view-profile&user=bob

I can't figure out how to make it handle that kind of URL when there could be 3 or more parameters in the query string. I'm not fully competent with regex, so my attempts to amend the snippet have failed so far.

Any help would be appreciated! I would like them to be 301 redirects so that the site can get indexed properly and be SEO compliant no matter what type of clean or messy URL is used, but I'm open to suggestions!

EDIT

After playing around with the regex for a few hours, I've progressed but got stumped again.

If I change the expression and substitution to this:

index\.php\?(.*)=([^\s]+)(&(.*)=([^\s]+))?+
$1/$2/$3/$4/$5

It will match these 3 URLs from index.php onwards:

http://site.com/index.php?cmd=shop&cat=78

http://site.com/index.php?cmd=shop

http://site.com/index.php?cmd=shop&cat=78&product=68

BUT the resulting output varies depending on which one it is. These are my results:

http://site.com/cmd=shop&cat/78///

http://site.com/cmd/shop///

http://site.com/cmd=shop&cat=78&product/68///

I'm not sure how to get it to treat certain parts as optional so it groups properly.

MrLewk
  • Where do you want the more complicated example to be redirected to? /view-profile/bob? – Dave S. Jan 28 '13 at 14:40
  • @DaveSteinberg - yeah, I'd like the other parts of the URL to be handled like that if they are present – MrLewk Jan 28 '13 at 15:17
  • How is this URL scheme, `www.domain.com/index.php?cmd=view-profile&user=bob`, different from those that should keep the query? – Felipe Alameda A Jan 28 '13 at 15:47
  • @faa With the rewrite as it is, it will turn: **?cmd=login** into **/cmd/login/** But it _won't_ turn **?cmd=view-profile&user=bob** into **/cmd/view-profile/user/bob** – MrLewk Jan 28 '13 at 16:24
  • @MrLewk I understand that, you want to remove the query (the string after `?`) no matter the number of parameters. But don't you need to keep other URLs with a query or the idea is to suppress it from all URLs? Doesn't look like a good idea if that's the case, that's why I am asking. – Felipe Alameda A Jan 28 '13 at 16:32
  • @faa Why is it not a good idea? I want to create clean URLs no matter how many or how few parameters are in the query string. The only time I'd probably need the query string to be visible as it is, is in a search result. Or am I misunderstanding your concern? – MrLewk Jan 28 '13 at 16:35
  • @MrLewk I am not sure. The whole idea of clean URLs is to show a "pretty" URL while a resource silently processes the request. For example, `www.domain.com/xxx/yyy` is shown but is silently mapped to `www.domain.com/index.php?xxx=yyyy`. This substitution URL might not show in the browser, but it is generated and holds a query that shouldn't be removed. It doesn't have to show for that to happen. – Felipe Alameda A Jan 28 '13 at 16:47
  • @faa I'm not wanting to remove anything, just tidy it up. The regex I have at the moment can only handle one parameter in the query string ie. www.domain.com/index.php?cmd=login becomes www.domain.com/cmd/login But I don't know how to make it handle extra parameters that are optional and may not always be in the URL, such as the profile example which has as ?cmd=profile and &user=name. I want the parameters to be there (eg: cmd/profile/user/name) - I just don't know enough regex to accomplish that. Nothing needs to be removed, just cleaned up. – MrLewk Jan 28 '13 at 17:04
  • Just add a trailing question mark `?` to the substitution URL. That will remove the whole query. – Felipe Alameda A Jan 28 '13 at 17:06
  • I've edited the question to update and explain better where I'm at now. – MrLewk Jan 28 '13 at 19:50

1 Answer

You'll need a separate rule for each number of name/value pairs. The rule you have can handle one pair; take the same approach for 2 and 3 pairs (and 4 if needed):

# To handle a single name/value pair in the query string:

RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s/+index\.php\?([^&=]+)=([^&\ ]+)(\ |$) [NC]
RewriteRule ^ /%1/%2? [R=301,L]

# To handle 2 pairs:

RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s/+index\.php\?([^&=]+)=([^&\ ]+)&([^&=]+)=([^&\ ]+)(\ |$) [NC]
RewriteRule ^ /%1/%2/%3/%4? [R=301,L]

# To handle 3 pairs:

RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s/+index\.php\?([^&=]+)=([^&\ ]+)&([^&=]+)=([^&\ ]+)&([^&=]+)=([^&\ ]+)(\ |$) [NC]
RewriteRule ^ /%1/%2/%3/%4/%5/%6? [R=301,L]

Basically, for each extra pair you add another &([^&=]+)=([^&\ ]+) before the check for the end of the request, (\ |$), and another /%#/%# to the end of the target URI, where the #'s are the appropriately incremented backreferences.
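
The redirects above only change what the browser shows; the clean URL still has to be mapped back to index.php internally, or the page won't get its parameters. A minimal sketch of that counterpart follows. It assumes the .htaccess file sits in the document root and that index.php reads its parameters from the query string; neither is confirmed in the question, so adjust the paths to your setup.

```apache
# Sketch only: internal rewrites from clean URLs back to index.php.
# Assumption: this file is the .htaccess in the document root.
RewriteEngine On

# Three pairs: /a/b/c/d/e/f -> /index.php?a=b&c=d&e=f
# The two conditions skip real files and directories (e.g. /css/style.css).
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/]+)/([^/]+)/([^/]+)/([^/]+)/([^/]+)/([^/]+)/?$ /index.php?$1=$2&$3=$4&$5=$6 [L]

# Two pairs: /a/b/c/d -> /index.php?a=b&c=d
# RewriteCond lines apply only to the next RewriteRule, so they are repeated.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/]+)/([^/]+)/([^/]+)/([^/]+)/?$ /index.php?$1=$2&$3=$4 [L]

# One pair: /a/b -> /index.php?a=b
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/]+)/([^/]+)/?$ /index.php?$1=$2 [L]
```

Because the redirect rules test %{THE_REQUEST}, which holds the original request line from the browser, these internal rewrites should not trigger the redirects again and cause a loop.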

Jon Lin
  • Thanks, this works and does what I wanted. The URLs are "clean", just not all of them make the page display. index.php?cmd=shop becomes /cmd/shop and shows the right page, but index.php?cmd=shop&cat=78 _does_ become cmd/shop/cat/78 **but** doesn't show the right page, just goes back to the home page – MrLewk Jan 29 '13 at 12:46
  • @MrLewk You have the internal rewrite to take a URI like `/A/B/C/D` and rewrite it back to `/index.php?A=B&C=D`? – Jon Lin Jan 29 '13 at 20:33
  • Yeah I did that but they weren't making it work for some reason, so I'm gonna strip it down and do it line by line to see where it's failing. Thanks for your answer though, it's got me further forward than I was getting myself! – MrLewk Jan 30 '13 at 08:49