0

I have a MariaDB database that I try to optimize. The current table is 132 Million rows, and starts to be difficult to work with.

Investigating why INSERT are so slow on my table (InnoDB), I notice what looks like duplicates INDEX. Here is the output of SHOW INDEX :

Table   U   Key_name    S_  Column_nam  Co  Cardina     Sub_pa  Packed  Nu  Index_type  Comment     Index_comment   
Twit    0   PRIMARY     1   ID_num      A   123302998   NULL    NULL        BTREE       
Twit    1   User_ID     1   User_ID     A   408288      NULL    NULL        BTREE       
Twit    1   Date_cr     1   Date_cr     A   123302998   NULL    NULL        BTREE       
Twit    1   User_ID_2   1   User_ID     A   515912      NULL    NULL        BTREE       
Twit    1   index_Orig  1   Original    A   61651499    NULL    NULL    YES BTREE   

I don't understand why the two INDEX on User_ID have different cardinality. Would there be a way to remove one of them Safely in order to speed-up INSERT queries?

Thanks.

ylnor
  • 4,531
  • 2
  • 22
  • 39

4 Answers4

0

Related: Can cardinality differ for duplicate indexes in mysql?

Seems to me that you should delete the one with the least-correct cardinality.

Cargo23
  • 3,064
  • 16
  • 25
  • Good related topic indeed. My second problem is that cardinality of User_ID Index should be between 40,000 and 50,000. I don't believe I can trust any of these numbers. – ylnor Aug 17 '17 at 19:00
  • @ylnor did you try optimize and analyse table ? For calculating your cardinality you can use the distinct – Noob Aug 17 '17 at 19:01
  • Thanks, I just ran Optimize, should take a while, I'll update my question after that. – ylnor Aug 17 '17 at 19:07
  • You could also use SELECT DISTINCT on the table. This can help you to calculate if the carnality is roughly correct. – Noob Aug 17 '17 at 19:11
0

Most of InnoDB statistics are rough estimates based on random samples, so they can be pretty variable and imprecise, but still good enough to guide the query optimizer.

Running ANALYZE TABLE will re-calculate based on a new randomly selected set of pages, and then you might see the estimates change. This doesn't take long, and doing this for a big table doesn't take any longer than it does for a small table.

To answer your original question, yes, you can definitely have duplicate indexes, i.e. more than one index for the exact same columns. MySQL does not prevent this. It trusts that you are in charge and you know what you're doing.

You can drop any index you want to. Dropping an index is a fast operation and does not rebuild the table (unless you use a very old version of MySQL). For details, read https://dev.mysql.com/doc/refman/5.5/en/innodb-create-index.html and its subsections.

OPTIMIZE TABLE shouldn't be necessary. It does rebuild the table, including all data and all indexes. It takes a long time. But once it is done, it will also get a re-sampled analysis of index statistics, just like ANALYZE TABLE does.

But I wouldn't expect a large performance change from dropping one out of your four secondary indexes (I'm not counting the possibility of dropping your PRIMARY index, since that is the clustered index, which is the table itself).

You might also like to read https://dev.mysql.com/doc/refman/5.6/en/insert-optimization.html and https://dev.mysql.com/doc/refman/5.6/en/optimizing-innodb-bulk-data-loading.html

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
0

(Comments, is a rather random order.)

  • Cardinality values are based on "random dives" into the index, and hence are approximate. They can be off by a factor of 2 either way.

  • There is no "histogram" (yet) to give clues of uneven distribution.

  • You do have a duplicate index. You can drop either one of them safely. This will save disk space and speed up the Change buffer processing a little.

  • I see no 'composite' indexes. Depending on the queries you have, they may be very important to have.

  • An INSERT into an InnoDB table must first check all UNIQUE keys for the row being inserted being a duplicate. The PK is the only UNIQUE key on this table. The secondary index updates are delayed via the "Change Buffer".

  • Eventually the secondary indexes must be updated. Perhaps this is where you are now. How many inserts per second are you performing? Are they 'batched'? Is each row in a separate transaction (such as via autocommit=ON)? (More on this when you answer.)

  • How much RAM? What is the setting of innodb_buffer_pool_size? Those are critical to performance in all of this discussion.

  • What is the setting of innodb_flush_log_at_trx_commit? This controls an extra disk hit on every transaction.

  • Dropping the PRIMARY KEY is usually not something to do. And I am pretty sure it forces a costly rebuild of the table in any version.

  • OPTIMIZE TABLE is almost never useful for InnoDB. Yeah, it does some defragmentation. But you will quickly expand the table again.

  • Is ID_num AUTO_INCREMENT? If so, the insert into the data (and PK) is quite cheap since it writes to the 'end' of the table.

  • If any/all of the secondary indexes are scattered, then the eventual read-modify-write operation to update the index may be happening a lot and thereby slowing down the system. It would slow down your inserts only in that the Change buffer needs to happen to make room for virtual updates.

Rick James
  • 135,179
  • 13
  • 127
  • 222
0

Duplicate Index has impact on the database performance, you'd better use drop index to fix it:

mysql> alter table your_table_name drop index key_name_of_duplicate_index;

For example in your case:

mysql> alter table Twit drop index User_ID_2;

You could want to understand how the duplicate index comes. Here is one example case to import the duplicate index:

mysql> alter table Twit add index User_ID;
mysql> alter table Twit add index User_ID;

Run this add index command twice will cause duplicate index (could run in different time because somebody forgot that it was already added). The best practice to execute add index is to always use key_name to avoid duplicate index:

mysql> alter table Twit add index User_ID(User_ID);
mysql> alter table Twit add index User_ID(User_ID);

Then the 2nd command will fail.

gary
  • 1,569
  • 3
  • 20
  • 30