
I'm using Apache mod_rewrite and I'm looking to dynamically set the Host header with RequestHeader based on the domain from the QUERY_STRING. How would I dynamically set the Host?

Given the following request URL:

https://example.com/p12?url=http://nonssldomain.com/331551/1041505584.jpg?dt=032620151151

What I've tried:

 RewriteCond %{QUERY_STRING} ^url=(.*)$ [NC]   

 #some regex to parse domain from Query_String (Doesn't work)
 RequestHeader set Host ^(?:https?:\/\/)?(?:[^@\n]+@)?(?:www\.)?([^:\/\n]+)
 RewriteRule ^.*/p12$ %1? [P,NC,L]

Would I need to dynamically set a variable first?

MrWhite
codejunkie
  • Yes, you would need to set a variable, as stated in hjpotter92's answer. Since `RequestHeader` (part of mod_headers) and mod_rewrite are different _modules_ they execute independently, not in the order stated in the config file. – MrWhite Sep 09 '16 at 09:17
  • Does a modification of the Host name affect name-based virtual host selection? – ceving Feb 21 '20 at 10:02

2 Answers

1

You have the following statement:

RequestHeader set Host ^(?:https?:\/\/)?(?:[^@\n]+@)?(?:www\.)?([^:\/\n]+)

If you look at the RequestHeader directive, it says:

For set, append, merge and add a value is given as the third argument.

So, your pattern is actually being considered as the value. Instead, what you should do is

  1. Use an <If> directive to see if your request is for p12?url=
  2. Use a SetEnvIf directive inside the <If> block to dynamically set a custom environment variable (say, my_new_host) to the value you want
  3. Add the RequestHeader statement with the %{my_new_host}e as its 3rd argument.

Try the following:

<If "%{QUERY_STRING} =~ m#(?:https?://)?(?:[^@\n]+@)?(?:www\.)?(?<NEW_HOST>[^:/\n]+)#">
    RequestHeader set Host %{MATCH_NEW_HOST}e
</If>
hjpotter92
  • So would it look something like this? RewriteCond %{QUERY_STRING} ^url=(.*?)/(.*)$ [NC] SetEnvIf my_new_host %1/? RequestHeader set Host %{my_new_host}e RewriteRule ^.*p12$ %1/%2? [P,NC,L] – codejunkie Sep 09 '16 at 15:02
  • @George No. Not exactly. Go over the syntax for each of the directives. [`<If>`](https://devdocs.io/apache_http_server/mod/core#if), [`SetEnvIf`](https://devdocs.io/apache_http_server/mod/mod_setenvif#setenvif) – hjpotter92 Sep 09 '16 at 18:24
  • 1
    Does this look any better? This is very confusing :-/ SetEnvIf Request_URI ^.*?url=(?:https?:\/\/)?(?:[^@\n]+@)?(?:www\.)?([^:\/\n]+).* my_new_host=%1? RequestHeader set Host %{my_new_host}e – codejunkie Sep 09 '16 at 19:21
  • The working regex can be found here. https://regex101.com/r/dC0xZ4/1 – codejunkie Sep 09 '16 at 19:22
  • @George You're using Apache 2.2. From apache 2.4+, you could also refer the matched groups of your location contexts. I'll go through the directives for 2.2 first, and try to post an answer asap. – hjpotter92 Sep 09 '16 at 19:36
  • It's possible I'm using 2.4, I'm not really a linux expert. But I'm using the latest ec2 instance on AWS. I'll try and figure out what the exact version is. These are the only examples I've been able to find or at least close to understanding. – codejunkie Sep 09 '16 at 19:38
  • @George `sudo apachectl -v` should give you the version. – hjpotter92 Sep 09 '16 at 19:41
  • I'm pretty sure I'm using Apache 2.4 and will confirm once I'm able to gain access to the server. It looks as if AWS released 2.4 in 2012 and this server was created in 2015. I will confirm though. Is the code I'm presenting 2.4 code? – codejunkie Sep 09 '16 at 19:56
  • @George If you check the syntax for `RequestHeader` directive, you'll notice that apache added `[early|env=[!]varname|expr=expression]]` at the end for 2.4, which wasn't there in 2.2. :) I'll go to sleep now. It is almost 2 AM ~_~ – hjpotter92 Sep 09 '16 at 19:59
  • So are you thinking we could do something like this RequestHeader set Host "expr=%{REQUEST_URI} =~ m#^.*?url=(?:https?:\/\/)?(?:[^@\n]+@)?(?:www\.)?([^:\/\n]+).*#" and do away with the need for SetEnvIf? I saw an example here https://devdocs.io/apache_http_server/expr – codejunkie Sep 09 '16 at 20:09
  • @George Check the edit. – hjpotter92 Sep 10 '16 at 04:58
  • sadly I was wrong about apache being 2.4, it's actually still 2.2 :-/ So I'm assuming the code won't work? – codejunkie Sep 10 '16 at 05:14
  • 1
    I just upgraded apache to 2.4. About to test soon. – codejunkie Sep 10 '16 at 06:13
  • We are super close, the only problem we have now is the regex appears to be incorrect. It needs to just return the domain of the query parameter. So nonssldomain.com should be the only thing set in the Host. I do not think that is happening right now. – codejunkie Sep 10 '16 at 07:05
  • Any idea how to log %{MATCH_NEW_HOST}e? – codejunkie Sep 10 '16 at 08:04
  • @George hmm, you can set it as a new response header, and check there. The logging module would require you to put it in server/vhost level config and isn't worth all this trouble. Most probably, the issue here is that your `QUERY_STRING` is getting the `/` and `?` escaped (`%2F`/`%3F`) etc. type characters. – hjpotter92 Sep 10 '16 at 10:16
  • Could you explain how the QUERY_STRING works? I'm not finding much info on it. I'm wondering if the regex is getting the root domain. Whenever there is a query parameter on other areas of the site that isn't a url= match, this code still appears to run and breaks those areas. For example, https://(null)/cars?make=chevrolet is the result when I click a filter with the query string make=chevrolet. It stripped the main domain for some reason and replaced it with null. The goal is to set the host only when the request is p12?url=, and then replace the host with the param domain. – codejunkie Sep 10 '16 at 19:03
  • @George You can see the request variables here: https://httpd.apache.org/docs/2.4/expr.html#vars – hjpotter92 Sep 10 '16 at 20:05
  • Thanks, I don't believe it is being populated. I'm guessing it's most likely null or %{MATCH_NEW_HOST}e isn't getting its value. Any suggestions? – codejunkie Sep 10 '16 at 20:38
  • I was able to resolve it, please see my answer. Thank you for all your help. You pointed me in the right direction; however, your sample code did not set the variable. – codejunkie Sep 10 '16 at 23:30
1

The solution was as follows:

This regex captures both the full target URL and just its domain: %1 is the entire value of the url parameter, and %2 is only the domain, which we use as the Host. A regex example can be found here: https://regex101.com/r/dC0xZ4/2

 RewriteCond %{QUERY_STRING} url=((?:https?://)?(?:[^@\n]+@)?(?:www\.)?([^:/\n]+).*) [NC]
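
As a rough illustration, assuming the query string reaches mod_rewrite unencoded (as in the sample request from the question), the two captures would resolve like this:

```apache
# Sample query string from the question:
#   url=http://nonssldomain.com/331551/1041505584.jpg?dt=032620151151
#
#   %1 -> http://nonssldomain.com/331551/1041505584.jpg?dt=032620151151
#   %2 -> nonssldomain.com
RewriteCond %{QUERY_STRING} url=((?:https?://)?(?:[^@\n]+@)?(?:www\.)?([^:/\n]+).*) [NC]
```

If the client percent-encodes the parameter (`%2F` for `/`, and so on), the pattern would likely not split out the host this way, because the literal `://` and `/` it relies on are no longer present.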

When the request path ends in /p12 and the rewrite condition matches, we proxy to %1 and store %2 in the new_host environment variable.

  RewriteRule ^.*/p12$ %1 [P,NC,L,E=new_host:%2]

You need to check that the new_host env variable exists; otherwise you're setting your Host to null on every other request.

  <If "-T reqenv('new_host')">
    # Set Host from the new_host variable
    RequestHeader set Host %{new_host}e
  </If>
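
A possible alternative to the <If> wrapper, sketched here and not tested against this setup: mod_headers documents a trailing `env=` condition, so the directive applies only when the named variable is set. The second directive follows the suggestion from the comments to expose the value as a response header while debugging; the header name `X-Debug-New-Host` is made up for this example.

```apache
# Only override Host when the RewriteRule above actually set new_host
RequestHeader set Host "%{new_host}e" env=new_host

# Temporary debugging aid (hypothetical header name): echo the captured
# host back in the response so it can be inspected from the client
Header set X-Debug-New-Host "%{new_host}e" env=new_host
```

Remove the debug header once the captured value looks right.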
codejunkie