
I'm trying to redirect requests for sitemap.xml to a different sitemap file depending on the domain name, e.g. mydomain.sitemap.xml or anotherdomain.sitemap.xml.

So far I have this, but it only works with non-www domains:

RewriteCond %{HTTP_HOST} ^(?!www\.)([^.]+) [NC]
RewriteRule ^sitemap\.xml$ /%1.sitemap.xml [L,NC]

Any suggestions?

Marco

2 Answers


Changing it to this should do what you want:

RewriteCond %{HTTP_HOST} ^(?:www\.)?([^.]+)\.
RewriteRule ^sitemap\.xml$ /%1.sitemap.xml [L,NC]

Update

To explain the regex, here is what it does:

^(?:www\.)?([^.]+)\.

^ matches the start of the string, so it anchors the regex there.

(?:...) is a non-capturing group: it is used only for grouping and does not capture what is matched (so it doesn't use up %1 in this case).

www\. just matches "www."; the dot has to be escaped with a backslash because an unescaped dot matches any character.

The question mark after the parentheses in (?:www\.)? makes the group optional. It either exists or it doesn't, and both are successful matches.

So at this point we're either still at the start of the string, or we're at the point just after "www.".

Now we go on to take everything up to the next dot with ([^.]+)\. which works because:

() is a capturing group, so in this case it captures what it matches to %1.

[^.]+ matches one or more characters that are not a dot. [^.] is a "character class", and inside it the dot does not need to be escaped. The caret ^ at the start negates the class, so it matches any character that is not listed. The + after it says match one or more of these, and do it "greedily" so it matches the longest string it can.

Because the match is greedy, the closing \. is not strictly necessary: [^.]+ will run up to the next dot anyway. I like to put an explicit delimiter in regexes because it makes them easier to read and understand. It does no harm for ordinary domain names, which always have another part after the one we matched, but it does mean a host with no dot at all will not match.
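As a quick sanity check, the pattern can be tried outside Apache. The sketch below uses Python's re module as a stand-in for the regex engine behind mod_rewrite, which is an assumption that holds for the syntax used in this pattern. sitemap_target is a hypothetical helper written for illustration, not something Apache provides.

```python
import re

# Pattern from the RewriteCond above. IGNORECASE is added here so that
# "WWW." is also treated as the optional prefix (the [NC] flag would do
# the same in Apache).
HOST_PATTERN = re.compile(r"^(?:www\.)?([^.]+)\.", re.IGNORECASE)


def sitemap_target(host):
    """Return the path the rule would rewrite to, or None if the condition fails."""
    match = HOST_PATTERN.search(host)
    if match is None:
        return None
    # group(1) plays the role of %1 in the RewriteRule substitution.
    return "/" + match.group(1) + ".sitemap.xml"


if __name__ == "__main__":
    for host in ("mydomain.com", "www.mydomain.com", "www.anotherdomain.org"):
        print(host, "->", sitemap_target(host))
```

Both mydomain.com and www.mydomain.com come out as /mydomain.sitemap.xml, because the optional non-capturing group swallows "www." without using up a capture slot.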

Another option would be to accept either a dot or the end of the string, using (?:\.|$) where the pipe is "alternation", meaning "this or that" (or that, or that, if more are used). The non-capturing group is there to contain the alternation. In that case the regex becomes:

^(?:www\.)?([^.]+)(?:\.|$)

This would then also work for hosts with no dot at all, like "localhost".
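To see the difference between the two endings, here is a small comparison, again using Python's re module as a stand-in for Apache's regex engine (an assumption, though the syntax involved is the same). first_label is a hypothetical helper that returns what would end up in %1.

```python
import re

# Original ending: the captured label must be followed by a literal dot.
WITH_DOT = re.compile(r"^(?:www\.)?([^.]+)\.", re.IGNORECASE)
# Alternative ending: a literal dot OR the end of the string.
DOT_OR_END = re.compile(r"^(?:www\.)?([^.]+)(?:\.|$)", re.IGNORECASE)


def first_label(pattern, host):
    """Return the text captured by group 1 (the %1 equivalent), or None on no match."""
    match = pattern.search(host)
    return match.group(1) if match else None


if __name__ == "__main__":
    for host in ("www.mydomain.com", "localhost"):
        print(host, first_label(WITH_DOT, host), first_label(DOT_OR_END, host))
```

For www.mydomain.com both patterns capture "mydomain". For localhost the first pattern does not match at all, while the second captures "localhost".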

  • Hi @SuperDuperApps, I'm just learning regex. Can you explain yours? – Marco Jun 20 '17 at 17:16
  • Sure, I'll add an explanation to the answer. –  Jun 20 '17 at 17:16
  • @Marco I added an update explaining the regex. If you like my answer you can accept it with the tick at the top left of it. Welcome to SO :) –  Jun 20 '17 at 17:31
1
RewriteCond %{HTTP_HOST} (www\.)?([^.]+) [NC]
RewriteRule ^sitemap\.xml$ /%2.sitemap.xml [L,NC]
Marco
  • See my answer which is a slight improvement on this, although both will work. Same idea. –  Jun 20 '17 at 17:13