
I've been trying to find a proper GeoIP database for our application, and as far as I can tell MaxMind City is probably the best (though if I'm wrong, please let me know!). I'm currently trying out the free version, which isn't really accurate in the UK, but it will do for testing purposes.

So getting back to the topic:

Our app is written in PHP, and eventually we'll have a lot of incoming traffic. We need the GeoIP database because we have to display the city, region, and country on every page load, as well as save that data for each visitor to a MySQL database. I started thinking that loading the binary data on every page request will make the server consume a lot of RAM and increase processing time. That's why I'm wondering whether it's possible to cache the database in RAM and fetch it with PHP. It's 26 MB (I know that might not seem like a lot), but when we start getting thousands of page requests per second this will cause a performance decrease, which we obviously want to avoid.

The server:

Processor: 8 cores × 2 GHz

RAM: 16384 MB

HDD: 400 GB

Apache + Nginx + Varnish, PHP 5.3, MySQL 5.1

We're quite new to large-scale app development, so if you have any suggestions on how to improve performance in general, or any other tips related to our case, let us know.

Thanks!

Ignas

1 Answer


We are using MaxMind's GeoIP City for an ad-server application (so there is a lot of traffic). Some tests based on MaxMind's own test scripts, which load the database into memory, were not efficient enough for us, because the data was cached separately for every single PHP process. So we decided to put the two tables (ip2location and city) into MySQL (you have to download the CSV files); MySQL will cache the tables in memory as well, because they are really small. With a clustered index on the ip2location table, we can query the loc_id for an IP in roughly 10 ms; after that you only have to fetch the city data for that loc_id, which should be very fast as well.
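
A minimal sketch of this two-step lookup, for illustration only: the answer names the tables `ip2location` and `city` and the `loc_id` column, but the `ip_from`/`ip_to` and `country`/`region`/`city` column names and the connection details are assumptions based on the legacy GeoLite City CSV layout.

```php
<?php
// Hypothetical connection details.
$conn = mysqli_connect('localhost', 'user', 'pass', 'geoip');

$ip = ip2long($_SERVER['REMOTE_ADDR']); // IPv4 address as an integer

// With blocks sorted by ip_from (the clustered index), the containing block
// is the last one starting at or below the address; checking ip_to catches
// addresses that fall into a gap between blocks.
$stmt = mysqli_prepare($conn,
    'SELECT loc_id, ip_to FROM ip2location
     WHERE ip_from <= ? ORDER BY ip_from DESC LIMIT 1');
mysqli_stmt_bind_param($stmt, 'i', $ip);
mysqli_stmt_execute($stmt);
mysqli_stmt_bind_result($stmt, $locId, $ipTo);
$found = mysqli_stmt_fetch($stmt) && $ip <= $ipTo;
mysqli_stmt_close($stmt);

if ($found) {
    // Second lookup: city data for this loc_id.
    $stmt = mysqli_prepare($conn,
        'SELECT country, region, city FROM city WHERE loc_id = ?');
    mysqli_stmt_bind_param($stmt, 'i', $locId);
    mysqli_stmt_execute($stmt);
    mysqli_stmt_bind_result($stmt, $country, $region, $cityName);
    mysqli_stmt_fetch($stmt);
    mysqli_stmt_close($stmt);
    echo "$cityName, $region, $country\n";
}
```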

rabudde
  • Nice! We'll do some testing, but it's a good idea to just let MySQL cache it. Thanks! Don't know why we didn't think to try that before :) – Ignas Feb 22 '12 at 13:38
  • To increase performance we did some manipulation on the `ip2location` table. The "problem" is that some IP blocks are larger than one class-A net. So we split those blocks (maybe 5 or 6 entries in the table) into multiple blocks and added a new column containing the first octet of the IP (so this `unsigned tinyint` column holds values from 0 to 255). There is an index on this additional column, and it is referenced in the `WHERE` clause. That makes the lookup very efficient (see the sketch after these comments). – rabudde Feb 22 '12 at 13:57
  • Thanks for suggesting MySQL instead. I'm building an ad-server application too, and I will need the country code and time zone data. If you have already updated your application to use MaxMind's GeoIP2 database, I would appreciate it if you could share your experience with it. – Ian Y. Dec 17 '19 at 14:13
  • Sorry, I don't work at that company anymore, and I haven't worked with MaxMind for a long time now, so I can't help. – rabudde Dec 17 '19 at 14:51
  • Hi, sir. You suggested that caching inside RAM is not efficient enough because "data was cached for every single PHP process". But even if the data is stored in MySQL, isn't the query code (`$conn = mysqli_connect(......); $result = mysqli_query($conn, "SELECT * FROM ......"); $row = mysqli_fetch_assoc($result);`) executed for every single PHP process (page load) too? I don't understand how that is more efficient or how it enables MySQL to cache the data. I'm not a database expert, so please bear with me if my question is silly. Thanks. – Ian Y. Dec 18 '19 at 15:25
  • At that time I didn't use a real cache like memcached, which would have been a good choice. But whether you store the data in memcached or MySQL, you still have to query it for each page request. IMHO MySQL has a bit more overhead for connection handling, but since the tables are small, MySQL will also cache them in RAM (e.g. when using InnoDB with an appropriate `innodb_buffer_pool_size`). – rabudde Dec 19 '19 at 06:46
  • Thanks. So I guess MySQL is a better choice than memcached in this case. – Ian Y. Dec 20 '19 at 02:42
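
Here is a hedged sketch of the first-octet optimization described in rabudde's comment above. The column name `first_octet`, the exact DDL, and the connection details are illustrative assumptions; the comment only specifies an `unsigned tinyint` column holding the first octet, an index on it, and its use in the `WHERE` clause. Note the prerequisite: any block spanning more than one /8 must first be split into per-/8 rows, otherwise the equality filter would miss it.

```php
<?php
// One-time schema change (names made up for illustration):
//
//   ALTER TABLE ip2location ADD COLUMN first_octet TINYINT UNSIGNED NOT NULL;
//   UPDATE ip2location SET first_octet = ip_from >> 24;
//   CREATE INDEX idx_first_octet ON ip2location (first_octet);

$conn = mysqli_connect('localhost', 'user', 'pass', 'geoip'); // hypothetical

$ip    = ip2long($_SERVER['REMOTE_ADDR']);
$octet = ($ip >> 24) & 0xFF; // first octet of the IPv4 address

// The equality filter on the indexed first_octet column narrows the range
// scan to a single /8 before the ip_from/ip_to check.
$stmt = mysqli_prepare($conn,
    'SELECT loc_id FROM ip2location
     WHERE first_octet = ? AND ip_from <= ? AND ip_to >= ?
     LIMIT 1');
mysqli_stmt_bind_param($stmt, 'iii', $octet, $ip, $ip);
mysqli_stmt_execute($stmt);
mysqli_stmt_bind_result($stmt, $locId);
mysqli_stmt_fetch($stmt);
mysqli_stmt_close($stmt);
```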