I have a website that will start receiving traffic from different countries in a few months, and which version of the site I show should depend on the visitor's country.

At the moment I am redirecting the traffic using the free MaxMind GeoIP database from PHP.

But I think that with 40k unique users a day and around 100k requests a day, this is going to be really slow.

I thought about doing this with .htaccess, but I suspect each request would take a little longer that way.

My latest idea is to build the home page of the site as static HTML in a different folder (or subdomain) for each country, like us.website.com, and redirect users there, but I do not know which of these approaches gives the fastest user experience.

Server is LAMP (I can choose the distro)

Please help me decide!! Thanks for everything!

Saikios
  • What's the reason for having different sites for different countries? Is it just language? If so, that is available in headers, and no look-up is necessary. Also, MaxMind's database isn't too slow if you use their recommended methods. – Brad Jun 16 '12 at 05:35
  • It's because of the ads, links, and some information on the website customized specially for each country =) – Saikios Jun 16 '12 at 05:39
  • Ah, yeah, you'll want to do IP detection then for sure. I'd recommend sticking with that free database you have. They have a country-only version that is much more lightweight. – Brad Jun 16 '12 at 05:41
  • But is this going to be fast enough for 100K requests, or even 1M? – Saikios Jun 16 '12 at 05:42
  • You don't have much of an alternative. How fast it is depends on your server configuration. Fortunately, it is very scalable, as the data is nearly static. – Brad Jun 16 '12 at 05:46
  • But what do you think about the .htaccess option? – Saikios Jun 16 '12 at 05:47
  • What would you put in .htaccess? You're going to need PHP for this. This isn't a server configuration issue. – Brad Jun 16 '12 at 05:50
  • MaxMind has a dedicated Apache module for this, mod_geoip (http://www.maxmind.com/app/mod_geoip), but I don't know if it's better or not :S – Saikios Jun 16 '12 at 05:53
  • That is completely different. You could give it a try to see if it meets your needs. – Brad Jun 16 '12 at 14:16

2 Answers

A downloaded MaxMind GeoIP Country database (free or paid, it makes no difference) is quite fast when accessed from PHP, even though their PHP code is not optimized (it is quite clearly a bad translation of good C code).

Just time it on your machines (e.g., by the difference of two calls to microtime(true) with a realistic dataset), and you'll probably discover that you can afford accessing the GeoIP DB at the top of your code, in order to switch to country-specific code where needed.
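As a rough illustration of the timing suggested above, here is a minimal sketch. `lookupCountry()` is a hypothetical stand-in for whatever GeoIP call you actually use, and `averageLookupTime()` is a helper name made up for this example; only `microtime(true)` comes from the answer itself.

```php
<?php
// Hypothetical stand-in for the real GeoIP lookup. Replace the body with
// the call you actually use in production before trusting the numbers.
function lookupCountry(string $ip): string
{
    return strncmp($ip, '8.', 2) === 0 ? 'US' : 'XX';
}

// Returns the average number of seconds spent per lookup,
// measured as the difference of two microtime(true) calls.
function averageLookupTime(callable $lookup, array $ips): float
{
    if ($ips === []) {
        return 0.0;
    }
    $start = microtime(true);
    foreach ($ips as $ip) {
        $lookup($ip);
    }
    return (microtime(true) - $start) / count($ips);
}

// Sample addresses only; for a realistic dataset, use IPs from your access log.
$ips = ['8.8.8.8', '192.0.2.10', '198.51.100.7'];
$average = averageLookupTime('lookupCountry', $ips);
printf("%.4f ms per lookup\n", $average * 1000);
```

Run it with a few thousand real addresses rather than three, since a handful of calls mostly measures start-up noise.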

The next step is using a country-code cookie. If the user already has the cookie, use it to switch to country-specific code; otherwise access the GeoIP DB to determine the country code, set the cookie, and switch as usual (this works even if the user doesn't accept cookies, because the lookup simply runs again on each request). Make it a session cookie, since the user might travel. Be careful if you have some page caching: it must not ignore the country-code cookie.
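The cookie-first flow described above could look roughly like the following sketch. The cookie name `country`, the function `resolveCountry()` and the stub `$lookup` are assumptions made for this example, not part of any MaxMind API; the two-letter check is an extra precaution, since cookie values are controlled by the client.

```php
<?php
// Decide the country for this request.
//   $cookies: normally $_COOKIE
//   $lookup:  callable performing the GeoIP lookup (hypothetical stub below)
// Returns [country code, whether the cookie still has to be set].
function resolveCountry(array $cookies, string $ip, callable $lookup): array
{
    $fromCookie = $cookies['country'] ?? '';
    // Only trust the cookie if it looks like a two-letter country code.
    if (is_string($fromCookie) && preg_match('/\A[A-Z]{2}\z/', $fromCookie) === 1) {
        return [$fromCookie, false];
    }
    // No usable cookie: fall back to the GeoIP database.
    return [$lookup($ip), true];
}

// Hypothetical stub; replace with your real GeoIP lookup.
$lookup = function (string $ip): string {
    return 'US';
};

$ip = $_SERVER['REMOTE_ADDR'] ?? '127.0.0.1';
[$country, $needsCookie] = resolveCountry($_COOKIE, $ip, $lookup);

if ($needsCookie && !headers_sent()) {
    // An expiry of 0 makes this a session cookie.
    setcookie('country', $country, 0, '/');
}
// From here on, switch to country-specific code based on $country.
```

A visitor who rejects cookies simply takes the lookup branch on every request, which matches the fallback behaviour described in the answer.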

Your question mentions a redirect, which could be a country-specific header('Location: ...');, but you should probably do without that, since it makes things much more complicated, and increases your traffic a bit.

Walter Tross

Well, I think you're worrying too much, but let me explain:

  1. IP -> country is essentially a hash table, so I expect the lookup to take very little time
  2. it's static information: once resolved, it can be cached
  3. 100k page views with 40k users means about 2.5 page views per user, a "short" navigation history (I explain why this matters below)

For this situation I suggest to:

  1. query the DB from PHP (put the code at the "head" of execution), store/cache the country in a cookie, and serve the request. A simple if(empty($_COOKIE['country'])) tells you whether you need to query the DB or not
  2. avoid the redirect to a different site (us.domain.com): it would take you from 2.5 to 3.5 requests per user (~40% more)
  3. when the site needs more resources, you can either increase your cloud resources OR add a new machine with the same address (www.domain.com) but a new IP and use DNS round robin, which works well with a short navigation history.
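The "~40% more" figure in point 2 can be checked with a few lines of arithmetic. This sketch uses the traffic numbers from the question and assumes that a redirect costs exactly one extra request per user per day.

```php
<?php
$requestsPerDay = 100000; // from the question
$usersPerDay    = 40000;  // from the question

$perUser = $requestsPerDay / $usersPerDay; // 2.5 requests per user
$perUserWithRedirect = $perUser + 1;       // assumption: one extra request per user

// Relative increase in requests caused by the redirect, in percent.
$increasePercent = 100 * ($perUserWithRedirect - $perUser) / $perUser;

printf(
    "%.1f -> %.1f requests per user (+%.0f%%)\n",
    $perUser,
    $perUserWithRedirect,
    $increasePercent
);
```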

PS

If you are going to cluster, sharing $_SESSION across machines will be the real challenge, so you may want to start looking at session managers now

Ivan Buttinoni
  • MaxMind's GeoIP DB is not a hash, it's a [trie](http://en.wikipedia.org/wiki/Trie) – Walter Tross Jun 17 '12 at 07:38
  • Ehm, I didn't really mean that the actual implementation of the GeoIP DB is a hash table, but that the data itself "IS" a hash table, so I expect the performance of the real implementation to be not too far from that of a hash table implementation. – Ivan Buttinoni Jun 17 '12 at 10:59
  • Maybe you mean that the data itself is a map. This map is much better implemented as a prefix tree (i.e., a trie) than as a hash table, though. Sorry for insisting on terminology, it's only that I think that using the right names for things is important (in general). And BTW I should correct my first comment: I should have written "a hash table", not "a hash"... – Walter Tross Jun 17 '12 at 17:46
  • No, you're right, terminology is really important. I simply didn't care which algorithm is inside the GeoIP DB (my fault) because Saikios has no choice: he has to query the DB once per user, so this is a piece of the problem we can forget about, as long as we cache the results of course ;) – Ivan Buttinoni Jun 17 '12 at 23:05