Innodb, clustered indexes, and slow_query_log - hurt by the primary key?

Question

In the last few months we've migrated a few tables from MYiSAM to InnoDB. We did this, in theory, for the row locking advantage, since we are updating individual rows, through multiple web-scraping instances. I now have tens of thousands of slow_queries building up in my slow_query_log (10s). and a lot of full table scans. I am having event very simple updates to one row (updating 4 or 5 columns) take 28 seconds. (our i/o and efficiencies are very good, and failed/aborted attempts are very low < 0.5%)

The two tables we update the most have ID (int 11) as the primary key. In InnoDB the primary key is a clustered key, so are written to disk indexed in the order of ID. BUT our two most important record identifying columns are BillsofLading and Container (both varchar22). Most of our DML queries look up records based on these two columns. We also have indexes on BillsofLading and container. The way I understand it, InnoDB also uses the primary key when creating these two secondary indexes.

So, I could have a record with ID=1, and BillsofLading='z', and another record ID=9 and BillsofLading='a'. With InnoDB indexes, when updating record based on a SELECT where BillsofLading='a', would I not still have to do a full scan to find 'a' since the index is based on the ID?

Thanks in advance for helping with the logic here!

The first thing to do is to run EXPLAIN to find out what the query execution plan is and where the resources are being consumed. Use a GI like Mysql Workbench or Toad to run the explain as they present the results much better. — David Soussan, Mar 11 '15 at 21:59
@DavidSoussan I haven't had any obvious indicators from the EXPLAIN extended. Wondering if it is more an indicator of the application processing, and less so the actual DML statements. — user3242558, Mar 12 '15 at 22:33

score 1 · Answer 1 · answered Mar 11 '15 at 22:22

1

No, your example should not require a full scan, assuming that MySQL is choosing to use your BillsofLading index. You say

The way I understand it, InnoDB also uses the primary key when creating these two secondary indexes.

This is correct, but it does not mean what you think it means.

A primary key (PK) to InnoDB is like a line number to a human being. It's InnoDB's only way to uniquely identify a row. So what a secondary index does is map each value of the target column(s) to all the PKs (i.e. line numbers) that match that value. Thus, MySQL can quickly jump to just the BillsofLading you care about, and then scan just those rows (PKs) for the one you want.

answered Mar 11 '15 at 22:22

dg99

5,456
3
37
49

Ahh, so it finds the `BillsofLading` based on the billoflading index, then uses the primary key (line number) to go to that line for the rest of the columns? Thanks for the explanation. – user3242558 Mar 12 '15 at 22:37
1

Sure. To solve your overall query efficiency problems I would recommend starting with the worst offender in the slow query log (or perhaps the worst offender that is also a relatively simple query) and investigating why in particular that query is having problems. This may shed light on why all the other queries are having problems too. – dg99 Mar 13 '15 at 15:44

score 0 · Answer 2 · answered Mar 12 '15 at 02:36

0

Did you severely decrease key_buffer_size and set innodb_buffer_pool_size to about 70% of available RAM?

There are a number of subtle differences between MyISAM and InnoDB. There are too many to list in this answer, and nothing you have said brings to mind any particular issue. I suggest you review Converting MyISAM to InnoDB to see what might be causing trouble.

answered Mar 12 '15 at 02:36

Rick James

135,179
13
127
222

Good questions. I am looking into these. I know the i/o values look pretty good. – user3242558 Mar 12 '15 at 22:32

Innodb, clustered indexes, and slow_query_log - hurt by the primary key?

2 Answers2