6

My application is very database intensive. Currently, I'm running MySQL 5.5.19 and using MyISAM, but I'm in the process of migrating to InnoDB. The only problem left is checksum performance.

My application does about 500-1000 "CHECKSUM TABLE" statements per second in peak times, because the clients GUI is polling the database constantly for changes (it is a monitoring system, so must be very responsive and fast).

With MyISAM, there are Live checksums that are precalculated on table modification and are VERY fast. However, there is no such thing in InnoDB. So, CHECKSUM TABLE is very slow...

I hoped to be able to check the last update time of the table, Unfortunately, this is not available in InnoDB either. I'm stuck now, because tests have shownn that the performance of the application drops drastically...

There are simply too much lines of code that update the tables, so implementing logic in the application to log table changes is out of the question...

The Database ecosystem consists of one master na 3 slaves, so local file checks is not an option. I thought of a method to mimic a checksum cache - a lookup table with two columns - table_name, checksum, and update that table with triggers when changes in a table occurs, but i have around 100 tables to monitor and this means 3 triggers per table = 300 triggers. Hard to maintain, and i'm not sure that this wont be a performance hog again.

So is there any FAST method to detect changes in InnoDB tables?

Thanks!

PeeHaa
  • 71,436
  • 58
  • 190
  • 262
Jacket
  • 844
  • 10
  • 18
  • 1
    duplicate: http://dba.stackexchange.com/questions/9569/fastest-way-to-check-if-innodb-table-has-changed – Sergio Tulentsev Dec 27 '11 at 09:35
  • You may want to check out [this question/answer](http://stackoverflow.com/questions/2785429/how-can-i-determine-when-an-innodb-table-was-last-changed) (if you didn't already do that). – Kohányi Róbert Dec 27 '11 at 09:36
  • Ah, OK, this is my thread also, but i didn't think that this is the same site... excuse me, i'm new here. – Jacket Dec 27 '11 at 09:37
  • And thanks for the link, so it seems that i'm on the right track with triggers, as ugly as it seems... It's a shame that InnoDB lacks such mechanism... – Jacket Dec 27 '11 at 09:44
  • You could add a `last_update TIMESTAMP` column (which by default should automatically store the timestamp of insert or last update) and run a `SELECT MAX(last_update) FROM tableX` instead of the checksum table statements – ypercubeᵀᴹ Dec 27 '11 at 09:50

2 Answers2

3

The simplest way is to add a nullable column with type TIMESTAMP, with the trigger: ON UPDATE CURRENT_TIMESTAMP.

Therefore, the inserts will not change because the column accepts nulls, and you can select only new and changed columns by saying:

SELECT * FROM `table` WHERE `mdate` > '2011-12-21 12:31:22'

Every time you update a row this column will change automatically.

Here are some more informations: http://dev.mysql.com/doc/refman/5.0/en/timestamp.html

To see deleted rows simply create a trigger which is going to log every deletion to another table:

DELIMITER $$
CREATE TRIGGER MyTable_Trigger
AFTER DELETE ON MyTable
FOR EACH ROW
BEGIN
    INSERT INTO MyTable_Deleted VALUES(OLD.id, NOW());
END$$
Cleankod
  • 2,220
  • 5
  • 32
  • 52
2

I think I've found the solution. For some time I was looking at Percona Server to replace my MySQL servers, and now i think there is a good reason for this.

Percona server introduces many new INFORMATION_SCHEMA tables like INNODB_TABLE_STATS, which isn't available in standard MySQL server. When you do:

SELECT rows, modified FROM information_schema.innodb_table_stats WHERE table_schema='db' AND table_name='table'

You get actual row count and a counter. The Official documentation says the following about this field:

If the value of modified column exceeds “rows / 16” or 2000000000, the statistics recalculation is done when innodb_stats_auto_update == 1. We can estimate the oldness of the statistics by this value.

So this counter wraps every once in a while, but you can make a checksum of the number of rows and the counter, and then with every modification of the table you get a unique checksum. E.g.:

SELECT MD5(CONCAT(rows,'_',modified)) AS checksum FROM information_schema.innodb_table_stats WHERE table_schema='db' AND table_name='table';

I was going do upgrade my servers to Percona server anyway so this bounding is not an issue for me. Managing hundreds of triggers and adding fields to tables is a major pain for this application, because it's very late in development.

This is the PHP function I've come up with to make sure that tables can be checksummed whatever engine and server is used:

function checksum_table($input_tables){
    if(!$input_tables) return false; // Sanity check
    $tables = (is_array($input_tables)) ? $input_tables : array($input_tables); // Make $tables always an array
    $where = "";
    $checksum = "";
    $found_tables = array();
    $tables_indexed = array();
    foreach($tables as $table_name){
        $tables_indexed[$table_name] = true; // Indexed array for faster searching
        if(strstr($table_name,".")){ // If we are passing db.table_name
            $table_name_split = explode(".",$table_name);
            $where .= "(table_schema='".$table_name_split[0]."' AND table_name='".$table_name_split[1]."') OR ";
        }else{
            $where .= "(table_schema=DATABASE() AND table_name='".$table_name."') OR ";
        }
    }
    if($where != ""){ // Sanity check
        $where = substr($where,0,-4); // Remove the last "OR"
        $get_chksum = mysql_query("SELECT table_schema, table_name, rows, modified FROM information_schema.innodb_table_stats WHERE ".$where);
        while($row = mysql_fetch_assoc($get_chksum)){
            if($tables_indexed[$row[table_name]]){ // Not entirely foolproof, but saves some queries like "SELECT DATABASE()" to find out the current database
                $found_tables[$row[table_name]] = true;
            }elseif($tables_indexed[$row[table_schema].".".$row[table_name]]){
                $found_tables[$row[table_schema].".".$row[table_name]] = true;
            }
            $checksum .= "_".$row[rows]."_".$row[modified]."_";
        }
    }

    foreach($tables as $table_name){
        if(!$found_tables[$table_name]){ // Table is not found in information_schema.innodb_table_stats (Probably not InnoDB table or not using Percona Server)
            $get_chksum = mysql_query("CHECKSUM TABLE ".$table_name); // Checksuming the old-fashioned way
            $chksum = mysql_fetch_assoc($get_chksum);
            $checksum .= "_".$chksum[Checksum]."_";
        }
    }

    $checksum = sprintf("%s",crc32($checksum)); // Using crc32 because it's faster than md5(). Must be returned as string to prevent PHPs signed integer problems.

    return $checksum;
}

You can use it like this:

// checksum a signle table in the current db
$checksum = checksum_table("test_table");

// checksum a signle table in db other than the current
$checksum = checksum_table("other_db.test_table");

// checksum multiple tables at once. It's faster when using Percona server, because all tables are checksummed via one select.
$checksum = checksum_table(array("test_table, "other_db.test_table")); 

I hope this saves some trouble to other people having the same problem.

Jacket
  • 844
  • 10
  • 18
  • Further story development for those who are interested: http://forum.percona.com/index.php?t=msg&th=2060&goto=7785& – Jacket Jan 14 '12 at 08:00