When I have multiple RewriteCond chained together, only the capture groups of the last RewriteCond can be referenced with %0-%9.

In the following problem, the parameters in the query string of the URL can be in any order. To parse them into a fancy URL, I would need to match each parameter in the query string individually:

RewriteCond %{QUERY_STRING} param1=([^&]+)
RewriteCond %{QUERY_STRING} param2=([^&]+)
RewriteCond %{QUERY_STRING} param3=([^&]+)
RewriteRule ^foo$ bar/%1/%2/%3/ [R]
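
As a concrete illustration, assume a hypothetical request `/foo?param1=a&param2=b&param3=c`. The comments below sketch what each backreference should hold, going by the documented behaviour that `%N` refers to the last matched RewriteCond. The values are worked out by hand, not taken from a live server:

```apache
# Hypothetical request: /foo?param1=a&param2=b&param3=c
RewriteCond %{QUERY_STRING} param1=([^&]+)
RewriteCond %{QUERY_STRING} param2=([^&]+)
RewriteCond %{QUERY_STRING} param3=([^&]+)
# Only the last condition's groups survive: %1 = "c", while %2 and %3 are empty.
# The substitution should therefore come out as bar/c/// instead of bar/a/b/c/
RewriteRule ^foo$ bar/%1/%2/%3/ [R]
```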

As noted above, this doesn't work. To fix this, I could reference the capture group of the previous RewriteCond in the next RewriteCond and 'propagate' each parameter to the actual RewriteRule:

RewriteCond %{QUERY_STRING} param1=([^&]+)
RewriteCond %1&%{QUERY_STRING} ^([^&]*)&.*param2=([^&]+)
RewriteCond %1&%2&%{QUERY_STRING} ^([^&]*)&([^&]*)&.*param3=([^&]+)
RewriteRule ^foo$ bar/%1/%2/%3/ [R]

This should work, but it gets messier with each additional parameter. Another solution could be to parse one parameter and redirect the client after each one, but that results in a lengthy chain of redirects, which I would like to avoid.

Is there a cleaner way of accessing the capture groups of all RewriteConds in the RewriteRule? For example, is it possible to name them or assign them to a variable so I can reference them somewhere else?

Sumurai8
  • +1 for a great question. I have also faced this problem and would be keen to know an answer. – anubhava Jul 22 '13 at 11:12
  • The mod_rewrite docs say that RewriteCond backreferences only refer to the **last matched RewriteCond** in the current set of conditions, so your second approach may be the only option along this route. Given its complexity, have you thought about setting up a script at `foo` that parses the GET (or POST) parameters and redirects to the appropriate `bar`? – miah Jul 22 '13 at 15:00

2 Answers

You could try constructing the target URL inside the rewrite conditions:

RewriteCond ##%{QUERY_STRING}      (.*)##(|.*&)param1=([^&]+)
RewriteCond %1/%3##%{QUERY_STRING} (.*)##(|.*&)param2=([^&]+)
RewriteCond %1/%3##%{QUERY_STRING} (.*)##(|.*&)param3=([^&]+)
RewriteCond %1/%3##%{QUERY_STRING} (.*)##(|.*&)param4=([^&]+)
RewriteCond %1/%3##%{QUERY_STRING} (.*)##(|.*&)param5=([^&]+)
RewriteCond %1/%3##%{QUERY_STRING} (.*)##(|.*&)param6=([^&]+)

RewriteRule ^foo$ /bar%1/%3? [L,R]

When I try to request:

/foo?param1=a&param2=b&param6=3&param3=4&param5=5&param4=6

I get redirected to:

/bar/a/b/4/6/5/3

Adding any additional required query string parameters won't make it look any more messy than it already is.
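
To see why the chain works, it may help to trace the first two conditions for a hypothetical request `/foo?param1=a&param2=b`. The values in the comments are worked out by hand from the patterns above, not captured from a running server:

```apache
# Hypothetical request: /foo?param1=a&param2=b

# Test string: ##param1=a&param2=b
#   %1 = ""   (everything before the ## delimiter: the path built so far)
#   %2 = ""   (param1 sits at the very start of the query string)
#   %3 = "a"  (value of param1)
RewriteCond ##%{QUERY_STRING}      (.*)##(|.*&)param1=([^&]+)

# Test string: /a##param1=a&param2=b
#   %1 = "/a"         (path built so far, carried over from the previous condition)
#   %2 = "param1=a&"  (whatever precedes param2 in the query string)
#   %3 = "b"          (value of param2)
RewriteCond %1/%3##%{QUERY_STRING} (.*)##(|.*&)param2=([^&]+)

# Substitution: /bar + "/a" + "/" + "b" = /bar/a/b
# The trailing ? drops the original query string from the redirect.
RewriteRule ^foo$ /bar%1/%3? [L,R]
```

Only the last condition's `%1` and `%3` reach the rule, but by then `%1` already holds every earlier value, so nothing is lost. The `(|.*&)` group also makes sure the parameter name starts either at the beginning of the query string or right after an `&`.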

Jon Lin
  • What is the `##` being used for? – Rahil Wazir Oct 11 '14 at 22:11
  • @RahilWazir it's just used as a delimiter to match against. We know we can't have `#`s in URIs, so it's safe to use. – Jon Lin Oct 11 '14 at 22:16
  • So this was necessary? I mean, won't it work without delimiters? (I still don't understand.) – Rahil Wazir Oct 11 '14 at 22:26
  • @RahilWazir you need to be able to separate the match against the query string from the previous match, otherwise the regex just gobbles up everything. – Jon Lin Oct 11 '14 at 22:39
  • À la greedy regex processing, and it's uncommon to have `##` in a URL. Very interesting! So `(|.*&)` is the optional start of the capture, ended by `([^&]+)`, which means `^` start of a `&`? I didn't know you could put `^` halfway, or a plus `+` for magical reasons? – Tomachi Jun 27 '21 at 08:30

After experimenting some more, I found that it is possible to parse all parameters into environment variables and use them that way. I doubt it is very efficient, though, and I think any use case that needs such a construction would be better off using a PHP page router. For fancy URLs, Jon Lin's solution would probably work better. This approach does, however, more or less mimic what I had in mind.

I'll put the code here anyway, for demonstration:

#Parse all query key-value pairs to an environment variable with the q- prefix
RewriteCond %{QUERY_STRING} ^([^=]*)=([^&]*)(&(.*)|$)$
RewriteRule ^(.*)$ $1?%4 [E=q-%1:%2,N]

#If 'param1' existed in the query string...
RewriteCond %{ENV:q-param1} !^$
RewriteRule ^foo$ bar/%{ENV:q-param1} [END]

or even...

#Keep the original query string
RewriteCond %{ENV:qstring} ^$
RewriteCond %{QUERY_STRING} ^(.*)$
RewriteRule .* - [E=qstring:#%1]

#parse the query parameters to environment variables
RewriteCond %{ENV:qstring} ^#([^=]*)=([^&]*)(&(.*)|$)$
RewriteRule ^(.*)$ - [E=q-%1:%2,E=qstring:#%4,N]

#See that the original query string is still intact
RewriteCond %{ENV:q-param1} !^$
RewriteRule ^foo$ bar/%{ENV:q-param1} [QSA]
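
Once the parsing loop has run, the same idea should extend to several parameters in any order. The following is a minimal sketch, assuming the parsing rules above are already in place; `param2` is a hypothetical second parameter, and the `[END]` flag assumes Apache 2.4:

```apache
# Hypothetical extension: require both parameters before rewriting.
# The order in which they appeared in the query string no longer matters,
# because each one was stored in its own q- prefixed environment variable.
RewriteCond %{ENV:q-param1} !^$
RewriteCond %{ENV:q-param2} !^$
RewriteRule ^foo$ bar/%{ENV:q-param1}/%{ENV:q-param2}/ [END]
```

Because the values are read from environment variables rather than from `%N` backreferences, the restriction to the last matched RewriteCond does not apply here.
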
Sumurai8
  • Wowsers, is that thread-safe?! Nice trick. I wonder if it breaks down when you get 200 different queries in a second? Possibly fine so long as no blocking IO can happen. Nifty. – Tomachi Jun 27 '21 at 08:32
  • maybe ENV:q-param1 should be ENV:qstring at the end... – Tomachi Jun 27 '21 at 08:36
  • I have no reason to believe it wouldn't be thread-safe, since I think the environment variable only persists for this request and the request is handled by one thread. As for the last line, that is correct. It is supposed to show that you can do this while not modifying the query string (e.g. `/foo?param1=test` will be rewritten to `/bar/test?param1=test`). Please note that I would not recommend using this in a production environment unless you have tested that it performs well even under load. It somewhat abuses mod_rewrite. – Sumurai8 Jun 27 '21 at 09:45