
I have an old WordPress site that I am in the process of migrating.

There are about 50,000 URLs that need to be redirected.

For this I am using DBM files, which seem to be working fine. However, during load testing I noticed that I am losing about 0.5 seconds on each request.

Reviewing the logs, it looks like the DBM file that holds the 50,000 entries is being hit on every request.

I reduced the DBM file from 50,000 to 10,000 entries and gained about 0.25 seconds on each request compared with the 50,000-entry file.

I would like to be able to do something like this, but no matter how I mix and match the code I cannot get it working:

<If "%{REQUEST_URI} =~ m#^abc#">
    RewriteMap abcredirects "dbm:/etc/httpd/conf/dbm/abcredirects.dbm"
    <IfModule mod_rewrite.c>
        RewriteEngine On
        RewriteCond ${abcredirects:$1} !=""
        RewriteRule ^(.*) /${abcredirects:$1} [R=301,L]
    </IfModule>
</If>

<If "%{REQUEST_URI} =~ m#^xyz#">
    RewriteMap xyzredirects "dbm:/etc/httpd/conf/dbm/xyzredirects.dbm"
    <IfModule mod_rewrite.c>
        RewriteEngine On
        RewriteCond ${xyzredirects:$1} !=""
        RewriteRule ^(.*) /${xyzredirects:$1} [R=301,L]
    </IfModule>
</If>

Using the above as pseudocode, how can I get this to work?

i.e.

If url example.com/abc12345.htm look in abcredirects.dbm else exit to VirtualHost

If url example.com/xyz12345.htm look in xyzredirects.dbm else exit to VirtualHost

If url example.com/abc12345.htm DO NOT look in xyzredirects.dbm

If url example.com/xyz12345.htm DO NOT look in abcredirects.dbm

If url example.com/hik12345.htm DO NOT look in xyzredirects.dbm or abcredirects.dbm

Unfortunately, I cannot use an .htaccess file.

Apache If statement not working

Donna Delour
  • Presumably this is all happening on the same server (same domain) as the new website? Is the new site also WordPress? – MrWhite Nov 21 '18 at 19:00

1 Answer


I don't think the delay is in the RewriteMap definition, just when the lookup is first called. So the RewriteMaps can be defined at the top of your config.

There wouldn't seem to be any need to use an <If> condition since you can (and should) check the URL-path in the RewriteRule pattern.

Try something like the following in the vHost:

RewriteEngine On

RewriteMap abcredirects "dbm:/etc/httpd/conf/dbm/abcredirects.dbm"
RewriteMap xyzredirects "dbm:/etc/httpd/conf/dbm/xyzredirects.dbm"

RewriteCond ${abcredirects:$1} !=""
RewriteRule ^(/abc.*) /${abcredirects:$1} [R=301,L]

RewriteCond ${xyzredirects:$1} !=""
RewriteRule ^(/xyz.*) /${xyzredirects:$1} [R=301,L]

No need for the <IfModule> containers (unless your site is intended to work without mod_rewrite - but that is probably not the case).

The RewriteRule pattern is processed first. If that fails to match, then the preceding condition (the one that looks up the rewrite map) is skipped and processing moves on to the next RewriteRule.
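To illustrate that ordering, here is an annotated sketch of the two rules for a hypothetical request to /abc12345.htm. It assumes the directives sit directly in the vHost, as above, and that the map names and paths are the ones from the question:

```apache
# Request: /abc12345.htm

# Rule 1: the pattern ^(/abc.*) matches, so only now is the
# preceding RewriteCond evaluated and abcredirects looked up.
RewriteCond ${abcredirects:$1} !=""
RewriteRule ^(/abc.*) /${abcredirects:$1} [R=301,L]

# Rule 2: reached only if rule 1 did not redirect (no map entry).
# The pattern ^(/xyz.*) does not match /abc12345.htm, so the
# RewriteCond is never evaluated and xyzredirects is not looked up.
RewriteCond ${xyzredirects:$1} !=""
RewriteRule ^(/xyz.*) /${xyzredirects:$1} [R=301,L]
```

A request such as /hik12345.htm fails both patterns, so neither map should be consulted.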

If the URL-path (eg. /abc12345.htm) literally consists of the prefix followed by digits and a .htm extension, then include this in the RewriteRule regex to be as specific as possible and prevent unnecessary lookups. For example:

RewriteRule ^(/xyz\d+\.htm)$ /${xyzredirects:$1} [R=301,L]

Make sure your browser cache is clear before testing. It is often easier to test with 302 (temporary) redirects for this reason.
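As a rough sketch of a test setup, the following uses a temporary redirect and turns up mod_rewrite logging so you can check which rules are evaluated. This assumes Apache 2.4 or later (on 2.2 the equivalent directives are RewriteLog and RewriteLogLevel), and the trace level shown is a guess at what is needed to see individual map lookups, so adjust it as required:

```apache
# Testing only: verbose rewrite logging is written to the error log
# and is expensive, so remove this line before load testing.
LogLevel alert rewrite:trace6

RewriteEngine On

# 302 (temporary) so the browser does not cache the redirect while testing.
RewriteCond ${abcredirects:$1} !=""
RewriteRule ^(/abc.*) /${abcredirects:$1} [R=302,L]
```

Once the redirects behave as expected, switch back to R=301 and drop the LogLevel override.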


HOWEVER, if this is all happening on the new website/server then the above is probably not the best approach as it impacts every request, including all requests for new pages and static resources.

With this many URLs it is often preferable to script the redirect once your site has already determined that the request is a 404. Only at this late stage in the request should you look up the new URL in the site's database and trigger the redirect. This way it doesn't impact "normal" site performance.


UPDATE:

the abc urls could be a couple different ways, example.com/dir/dir/abc12345.htm or example.com/abc12345.htm or example.com/dir/dir/abc12345.xml, example.com/dir/abc12345.xml. The only constant in the URL would be abc.

In that case, change the RewriteRule pattern from ^(/abc.*) to something like:

RewriteRule (.*/abc.+\.(?:htm|xml))$ /${abcredirects:$1} [R=301,L]

As mentioned above, if the remainder of the file basename (before the file extension) always consists of digits (0-9) then be more specific, and match \d instead of the dot (.). Or if there are always 5 digits (as in your examples) then use \d{5}.

Note that the above captures the entire URL-path that matches (eg. /dir/dir/abc12345.htm) and this is then passed as the argument to your rewrite map.

RewriteRule (.*/abc\d+\.(?:htm|xml))$ /${abcredirects:$1} [R=301,L]
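Putting the pieces together, a complete vHost fragment might look something like the sketch below. It keeps the RewriteCond checks from the earlier example and assumes the xyz URLs follow the same shape as the abc ones (any directory depth, digits, then .htm or .xml), which the question does not actually confirm:

```apache
RewriteEngine On

RewriteMap abcredirects "dbm:/etc/httpd/conf/dbm/abcredirects.dbm"
RewriteMap xyzredirects "dbm:/etc/httpd/conf/dbm/xyzredirects.dbm"

# Only URLs containing /abc<digits>.htm|xml reach the abc map.
RewriteCond ${abcredirects:$1} !=""
RewriteRule (.*/abc\d+\.(?:htm|xml))$ /${abcredirects:$1} [R=301,L]

# Assumption: xyz URLs have the same structure as the abc URLs.
RewriteCond ${xyzredirects:$1} !=""
RewriteRule (.*/xyz\d+\.(?:htm|xml))$ /${xyzredirects:$1} [R=301,L]
```

The key passed to each map is the full captured URL-path (eg. /dir/dir/abc12345.htm), so the keys in the DBM files need to be stored in that form.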
MrWhite
  • Using this configuration all of the redirects do work. However, when running each request I can see that if abc is in the URL, it will check each DBM file, which is not what I want to happen. – Donna Delour Nov 26 '18 at 14:43
  • Is that by checking the debug log? That is not what is supposed to happen... if `abc` is at the start of the URL then the `RewriteRule` _pattern_ `^(/xyz.*)` _fails_ and processing should go no further on that rule (the preceding `RewriteCond` is not processed and neither is the _substitution_ so nothing should be looked up in the xyz map). – MrWhite Nov 26 '18 at 15:00
  • Please edit your question to add any additional information. – MrWhite Nov 26 '18 at 16:31
  • Where exactly are you placing these directives? It is intended that these directives go directly in a _virtualhost_ context, not inside a `<Directory>` container - the regex will never match in a _directory_ context. If you change the `RewriteRule` _pattern_ to `^(.*)` then all rewrite maps will indeed be looked up - that is not the intention here. – MrWhite Nov 26 '18 at 16:38
  • In your "log" you have `/test1/abc12347.htm`, whereas your question states `/abc12345.htm`. The directives above are intended to match `/abc12345.htm` (as stated in your question), not `/test1/abc12345.htm` - please clarify. – MrWhite Nov 26 '18 at 16:44
  • Thank you for pointing that out. The abc URLs could take a few different forms: `example.com/dir/dir/abc12345.htm`, `example.com/abc12345.htm`, `example.com/dir/dir/abc12345.xml` or `example.com/dir/abc12345.xml`. The only constant in the URL would be `abc`. – Donna Delour Nov 26 '18 at 17:06
  • Is "abc" always followed by digits (0-9)? Is there always a `.htm` or `.xml` file extension? Or a selection of known file extensions? And the URL looked up in your rewrite map is always the complete URL starting with a slash? – MrWhite Nov 26 '18 at 22:31
  • I've updated my answer. – MrWhite Nov 26 '18 at 22:41