I've been tasked with making the URLs for an old website "pretty".

It's a real estate website with a few dozen buildings and maybe 100ish total apartment listings.

Current URLs look like this (I was able to remove the .php extension easily with .htaccess):

example.com/about?building-id=65
example.com/about?apartment-id=35

It gets considerably more complicated than that, with some search pages, different listing views, a contact form page that is passed a building or apartment id to prefill the form with some info, etc.

But for this example...

Ideally the URLs should be

example.com/about/building-name/
example.com/about/building-name/apartment-name

The building name and apartment name are both values stored in the database, with the id#s as the primary keys.

After researching this issue a bit, I've come up with a couple of different approaches:

1) Dynamically generate the .htaccess file whenever something changes in the admin

  • I think I can do this, right?
  • It would explicitly rewrite just about every valid query string (up to a couple hundred, probably).
  • I imagine this will cause some performance issues, and is probably not best practice.
  • If it's the easiest option, I'm OK with it being sloppy and bad practice as long as it works.

2) Create a controller and rewrite all requests to index.php, which would put together all the views from there.

  • This is a bit more out of my comfort zone, but probably considered the best practice?

The site is seriously old and poorly put together (1000s of lines of custom, non-OOP PHP with lots of code duplication). For that matter, it's really 2 different sites sharing one database and most of the code base (the 2nd was created some time ago by copy-pasting the original code base, and the two have since grown apart).


My Question(s)

  • Are these viable options and did I miss any alternative?

  • Which approach should I take?

Zach Lysobey
  • An example with the id will help: `example.com/about/building-name/apartment-name` Where is the id in this example, or is the same name? – Felipe Alameda A Dec 03 '12 at 18:41
  • The ID is numeric, but the name is a string. The names are mapped to the ids in the database with the ID as the primary key. So.. `about?apartment-id=4324` would have to lookup the apartment with that id, get its name, and the associated building id. It would then get the building name from *that* id and build the url like this: `about/building-name/apartment-name` – Zach Lysobey Dec 03 '12 at 18:50
  • They *should always* be unique, though there is nothing restricting them as such. Actually, there may be situations where there are apartments with the same name for different buildings. – Zach Lysobey Dec 03 '12 at 18:55
  • @ZachL As the relation between IDs and Names is something that must be obtained from the database and that's not possible in .htaccess files, unless you want a rule for each name, I guess the best option is to send it to a PHP script for processing. – Felipe Alameda A Dec 03 '12 at 18:58
  • Yea, the concept was that I _would_ create a rule per name dynamically (by regenerating the .htaccess file) each time any of those values was updated in the database. Does that sound like a terrible idea? – Zach Lysobey Dec 03 '12 at 19:00
  • It depends on the modifications to the site ( regenerating) vs the visits (Processing) frequencies. – Felipe Alameda A Dec 03 '12 at 19:05
  • Yea, I was thinking that myself. In all likelihood, the .htaccess will not need to be regenerated too frequently - however it is likely that when it is, the user will be making multiple changes in one sitting, so it would be an issue if the actual generation of this file took more than a few seconds. – Zach Lysobey Dec 03 '12 at 19:29
  • @ZachL That's not the only problem, the way I see it. It's the fact that there is no guarantee there won't be concurrent access and regenerations. In such a case, access behavior will comply with the old rules, not the updated ones, which translates to problems. – Felipe Alameda A Dec 04 '12 at 01:13

1 Answer

The actual generation of the .htaccess file wouldn't be an issue: writing 500 or 1000 lines of text to a file isn't a big deal at all. However, the multitude of rules may well cause a performance hit, as the engine would check the requested URL against each rule on every HTTP request. It's probably not a very big hit, but it seems wasteful to me. Note that Apache, AFAIK, does no optimization on .htaccess rewrite rules or redirects, i.e. it doesn't build one big regex out of the small regexes, etc.
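For what it's worth, generating the rules could look roughly like the sketch below. The function names `slugify` and `build_htaccess_rules` are made up for illustration, the input is assumed to be an id => name array already read from the database, and only building rules are shown (apartment rules would follow the same pattern).

```php
<?php
// Hypothetical helper: turn a display name into a URL-safe slug.
function slugify(string $name): string
{
    $slug = strtolower(trim($name));
    // Collapse every run of non-alphanumeric characters into one hyphen.
    $slug = preg_replace('/[^a-z0-9]+/', '-', $slug);
    return trim($slug, '-');
}

// Build one RewriteRule per building; $buildings maps id => name.
// Patterns have no leading slash because they are meant for .htaccess.
function build_htaccess_rules(array $buildings): string
{
    $lines = ['RewriteEngine On'];
    foreach ($buildings as $id => $name) {
        $slug = slugify($name);
        if ($slug === '') {
            continue; // nothing usable in the name, skip this row
        }
        $lines[] = sprintf(
            'RewriteRule ^about/%s/?$ about.php?building-id=%d [L]',
            $slug,
            $id
        );
    }
    return implode("\n", $lines) . "\n";
}
```

The result could then be written to a temporary file and moved into place with `rename()`, so that a request never sees a half-written .htaccess.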

Option 2, parsing the request in PHP, is actually less work, I think, especially with a huge non-OOP site.

Consider a snippet like this:

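// REQUEST_URI starts with a slash and may carry a query string,
// so keep only the path and trim the slashes before splitting.
$path = trim(parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH), '/');
$uri = explode('/', $path);
if ($uri[0] == 'about') {
    // Translate the names from the URL into the ids the old page expects.
    // (?? needs PHP 7+; the apartment segment may be missing.)
    $id_array = lookup_building_and_apartment_by_name($uri[1] ?? null, $uri[2] ?? null);
    $_GET['building-id'] = $id_array['building'];
    $_GET['apartment-id'] = $id_array['apartment'];
    include('about.php');
}
else if ($uri[0] == 'something_else') {
    // something else.
}
else {
    // 404
}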

Logic like this would serve perfectly well as a controller, and it isn't much work.
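The lookup function itself is left undefined above. A minimal sketch of it, assuming PDO and a hypothetical schema with `buildings(id, slug)` and `apartments(id, building_id, slug)` tables, might look like this. Unlike the two-argument call in the snippet, this version takes the database connection explicitly as its first argument; the real table and column names will certainly differ.

```php
<?php
/**
 * Hypothetical lookup: resolves URL names to numeric ids.
 * Assumes tables buildings(id, slug) and apartments(id, building_id, slug).
 * Returns null when the building or the apartment is unknown.
 */
function lookup_building_and_apartment_by_name(PDO $db, string $building, ?string $apartment = null): ?array
{
    $stmt = $db->prepare('SELECT id FROM buildings WHERE slug = ?');
    $stmt->execute([$building]);
    $buildingId = $stmt->fetchColumn();
    if ($buildingId === false) {
        return null; // unknown building: the caller should send a 404
    }

    $ids = ['building' => (int) $buildingId, 'apartment' => null];

    if ($apartment !== null && $apartment !== '') {
        // Scope the search to the building, since two buildings
        // may contain apartments with the same name.
        $stmt = $db->prepare('SELECT id FROM apartments WHERE building_id = ? AND slug = ?');
        $stmt->execute([$buildingId, $apartment]);
        $apartmentId = $stmt->fetchColumn();
        if ($apartmentId === false) {
            return null; // unknown apartment in this building
        }
        $ids['apartment'] = (int) $apartmentId;
    }

    return $ids;
}
```

Searching for the apartment within its building also sidesteps the case raised in the comments, where apartment names are not unique across buildings.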

Alternatively, you could put a URI-parser snippet at the top of every entry point. Like, at the top of about.php:

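// Keep only the path and trim the slashes, so $uri[0] is 'about'.
$uri = explode('/', trim(parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH), '/'));
$id_array = lookup_building_and_apartment_by_name($uri[1] ?? null, $uri[2] ?? null);
$_GET['building-id'] = $id_array['building'];
$_GET['apartment-id'] = $id_array['apartment'];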

This would enable you to work incrementally, one subpage at a time, and the old links would continue to work as well.
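If the same parsing ends up at the top of several entry points, it may be worth pulling it into one shared function. The sketch below is only an illustration; `path_segments` is a made-up name, and it simply drops the query string, the leading slash, and any empty segments.

```php
<?php
// Hypothetical shared helper: splits a request URI into its
// non-empty, URL-decoded path segments.
function path_segments(string $requestUri): array
{
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path)) {
        return []; // malformed URI or no path component
    }
    $segments = explode('/', trim($path, '/'));
    $segments = array_map('rawurldecode', $segments);
    // Drop empty segments (e.g. from "//") and reindex from 0.
    return array_values(array_filter($segments, 'strlen'));
}
```

Each entry point could then call `path_segments($_SERVER['REQUEST_URI'])` and read the building name from index 1 and the apartment name from index 2.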


Mostly for the record: there's an option 3, you can use capture groups in mod_rewrite rules, like so:

RewriteRule ^/about/([^/]+)/([^/]+)$ about.php?building_name=$1&apt_name=$2 [NC,L]
RewriteRule ^/about/([^/]+)$ about.php?building_name=$1 [NC,L]

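(This can be done with a single rule, but it's considerably easier to read and write like this.) This still requires a PHP-based translation from names to ids in about.php, but has the advantage that you can put these rules in your httpd.conf. Note that the leading slash in the patterns only matches in server or virtual-host context; in a .htaccess file, mod_rewrite strips the directory prefix before matching, so the patterns would have to start with ^about/ instead.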

There's also an option 4: RewriteMap. It's an awesome and powerful tool, and it may be just the thing for you, if you're able to cope with the terseness of the Apache manual. Learning to use it would probably take a lot more time than using the PHP-based solution, but it's a good thing to know, it's an efficient solution, and it's very elegant.

You could easily generate a text-based mapping, then use the mapping in the RewriteRule, like so:

RewriteMap building_map txt:/etc/apache2/buildings.txt
RewriteRule ^/about/([^/]+)$ about.php?building-id=${building_map:$1|0} [NC,L]
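A plain-text map is just one whitespace-separated "key value" pair per line, with `#` starting a comment. Generating it from PHP could look like the sketch below; `build_rewrite_map` is a hypothetical name, and the input is assumed to be a slug => id array already pulled from the database.

```php
<?php
// Builds the contents of a txt: RewriteMap file,
// one "key value" pair per line. $slugToId maps slug => building id.
function build_rewrite_map(array $slugToId): string
{
    $lines = ['# slug building-id (generated file)'];
    foreach ($slugToId as $slug => $id) {
        $lines[] = sprintf('%s %d', $slug, $id);
    }
    return implode("\n", $lines) . "\n";
}
```

The output would be written to the path named in the RewriteMap directive, /etc/apache2/buildings.txt in the example above.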
SáT
  • Wow thanks, this *does* actually appear quite simpler than I had suspected. I'll give a go at implementing this (probably tomorrow). I'll post another comment if I get stuck. – Zach Lysobey Dec 03 '12 at 20:50
  • You're welcome. I added two more options, you may want to look into RewriteMap if you want to learn fantastic things. Also, I now think I screwed up the indexing of the exploded REQUEST_URI (I guess REQUEST_URI begins with a slash, so the 0th element is an empty string). – SáT Dec 03 '12 at 21:53
  • `RewriteMap` looks awesome. Unfortunately it can't be run in `.htaccess`, and this site is on a shared host without access to `httpd.conf`. – Zach Lysobey Dec 09 '12 at 22:08
  • Inspired by your suggestions, but taking a somewhat different approach, I got something working locally, but for some reason it's not working once I upload the changes :-/ If you want to take a peek: http://stackoverflow.com/questions/13897801/my-htaccess-rewriterule-works-on-localhost-mysite-but-not-at-mysite-com-on-a-sha – Zach Lysobey Dec 16 '12 at 02:22