We have a large (and growing) MySQL db of bird records. At present we have three main tables, which simplified looks as follows:
RECORDS (id, birdid, tripid, gender, age) PRIMARY KEY id
BIRDTRIPS (id, locid, date, starttime, endtime, notes) PRIMARY KEY id
LOCATIONS (id, description, centerlat, centerlng) PRIMARY KEY id
All ids are INTs and date is DATE type.
In order to gain performance i´ve decided to do some denormalization by replicating date and locid into RECORDS:
RECORDS (id, birdid, tripid, gender, age, locid, date) PRIMARY KEY id
This way, many queries will avoid expensive joins to BIRDTRIPS and LOCATIONS.
MySQL has only one clustered index per table, and this is always the primary key. I am tempted
to try PRIMARY KEY (date, id)
on RECORDS to exploit the clustered index for faster range scans on the date column and partitioning of the table. The id is only included in the key because there might be many records on the same date.
Theoretically the primary key is often kind of "wasted" on the id, where a non-clustered UNIQUE index would be sufficient for lookups.
And now my question:
Do any of you have practical experience with this approach? Are there any drawbacks i might have overlooked?