5

I am looping through a number of values (1 to 100 for example) and executing a prepared statement inside the loop.

Is there an advantage to using a transaction - committing after the loop ends - compared to direct execution inside the loop?

The values are not dependent on each other, so a transaction is not needed from that point of view.
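To make the comparison concrete, here is a rough sketch of the two variants (using PDO; the table and column names are made up purely to illustrate the question):

```php
<?php
// Variant A: direct execution inside the loop (autocommit after every row).
// Variant B: one transaction around the loop, committed once at the end.
$pdo  = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');
$stmt = $pdo->prepare('INSERT INTO items (value) VALUES (?)');

// Variant A
for ($i = 1; $i <= 100; $i++) {
    $stmt->execute([$i]);
}

// Variant B
$pdo->beginTransaction();
for ($i = 1; $i <= 100; $i++) {
    $stmt->execute([$i]);
}
$pdo->commit();
```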

OMG Ponies
  • 325,700
  • 82
  • 523
  • 502
jeroen
  • 91,079
  • 21
  • 114
  • 132

3 Answers

4

If your queries are INSERTs, the page 7.2.19. Speed of INSERT Statements of the MySQL manual gives two interesting pieces of information, depending on whether you are using a transactional engine or not:

When using a non-transactional engine:

To speed up INSERT operations that are performed with multiple statements for nontransactional tables, lock your tables.

This benefits performance because the index buffer is flushed to disk only once, after all INSERT statements have completed. Normally, there would be as many index buffer flushes as there are INSERT statements. Explicit locking statements are not needed if you can insert all rows with a single INSERT.

And, with a transactional engine:

To obtain faster insertions for transactional tables, you should use START TRANSACTION and COMMIT instead of LOCK TABLES.

So I am guessing using transactions might be a good idea -- but I suppose that could depend on the load on your server, and whether there are multiple users working on the same table at the same moment, and all that...
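To illustrate, the LOCK TABLES variant the manual recommends for non-transactional tables would look roughly like this (just a sketch: $pdo is assumed to be an existing PDO connection, and the table name is made up):

```php
<?php
// Non-transactional table (e.g. MyISAM): hold a write lock for the whole
// batch so the index buffer is flushed only once at the end.
$stmt = $pdo->prepare('INSERT INTO items (value) VALUES (?)');

$pdo->exec('LOCK TABLES items WRITE');
for ($i = 1; $i <= 100; $i++) {
    $stmt->execute([$i]);
}
$pdo->exec('UNLOCK TABLES');
```

For a transactional table, you would replace the LOCK TABLES / UNLOCK TABLES calls with beginTransaction() and commit(), as in the question.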

There is more information on the page I linked to, so don't hesitate to read it ;-)


And, if you are doing update statements :

Another way to get fast updates is to delay updates and then do many updates in a row later. Performing multiple updates together is much quicker than doing one at a time if you lock the table.

So, I'm guessing the same can be said for updates as for inserts.
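In the same spirit, the "delay updates and then do many in a row" idea could look like this (again just a sketch: $pdo is assumed to exist, and computeNewValue() and the table are hypothetical):

```php
<?php
// Gather the new values first, then apply all updates together under a
// table lock instead of updating one row at a time.
$pending = [];
for ($id = 1; $id <= 100; $id++) {
    $pending[$id] = computeNewValue($id);   // hypothetical helper
}

$stmt = $pdo->prepare('UPDATE items SET value = ? WHERE id = ?');
$pdo->exec('LOCK TABLES items WRITE');
foreach ($pending as $id => $value) {
    $stmt->execute([$value, $id]);
}
$pdo->exec('UNLOCK TABLES');
```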


BTW: to be sure, you can try both solutions, benchmarking them with microtime on the PHP side, for instance ;-)
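Something along these lines, where runBatch() stands in for whichever variant you are measuring (a hypothetical helper, not real code from the question):

```php
<?php
// Rough timing skeleton: wrap each variant in microtime(true) calls and
// compare the elapsed times.
$start = microtime(true);
runBatch($pdo);                        // hypothetical: variant with or without a transaction
$elapsed = microtime(true) - $start;
printf("Batch took %.3f seconds\n", $elapsed);
```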

Pascal MARTIN
  • 395,085
  • 80
  • 655
  • 663
  • Thanks a lot for the information, I guess I will just have to try and benchmark. – jeroen Sep 12 '09 at 18:19
  • 2
    I suppose so ^^ Let us know about the results ;-) It might interest other people! And, btw: another solution would be to use one insert query to do several inserts at the same time, reducing the total number of queries -- a bit harder to code, but I've seen great improvements with that, as it means fewer calls from one server to the other. – Pascal MARTIN Sep 12 '09 at 18:23
  • Yes, reducing the number of queries would be the best solution, but like I mentioned to James Black as well, I have never tried that with INSERT ON DUPLICATE KEY UPDATE, don't even know if it's possible. – jeroen Sep 12 '09 at 20:56
3

For a faster time you could do all the inserts in one shot, or group them together, perhaps 5 or 10 at a time, keeping in mind that if one insert fails, the entire batch will fail.

http://www.desilva.biz/mysql/insert.html

A transaction will slow you down, so if you don't need it then don't use it.

A prepared statement would be a good choice though even if you did batch inserts, as you don't have to keep building up the query each time.
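For example, grouping 10 rows into a single prepared INSERT could look roughly like this (a sketch: $pdo is assumed to be an existing PDO connection, and the table/column names are made up):

```php
<?php
// One prepared statement that inserts 10 rows per execution; the 100 values
// split evenly into chunks of 10, so every execute() gets a full set of params.
$rowsPerQuery = 10;
$placeholders = implode(',', array_fill(0, $rowsPerQuery, '(?)'));
$stmt = $pdo->prepare("INSERT INTO items (value) VALUES $placeholders");

foreach (array_chunk(range(1, 100), $rowsPerQuery) as $chunk) {
    $stmt->execute($chunk);
}
```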

James Black
  • 41,583
  • 10
  • 86
  • 166
  • I was thinking about combining them, but I have never tried that with INSERT ON DUPLICATE KEY UPDATE. – jeroen Sep 12 '09 at 18:20
3

I faced the same question when I had to implement an import of (possibly quite long) CSV files (I know you can use the LOAD DATA INFILE syntax for that, but I had to apply some processing to my fields before insertion).

So I ran an experiment with transactions and a file with about 15k rows. The result is that if I insert all records inside a single transaction, it takes only a few seconds and the process is CPU bound. If I don't use any transaction at all, it takes several minutes and the process is I/O bound. By committing every N rows, I got intermediate results.
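The "commit every N rows" middle ground could look roughly like this (a sketch: $pdo and $rows are assumed to exist, and the batch size is arbitrary):

```php
<?php
// Commit once per $batchSize rows instead of per row or per whole file.
$batchSize = 1000;
$stmt = $pdo->prepare('INSERT INTO items (value) VALUES (?)');

$pdo->beginTransaction();
foreach ($rows as $i => $value) {
    $stmt->execute([$value]);
    if (($i + 1) % $batchSize === 0) {
        $pdo->commit();            // flush the batch
        $pdo->beginTransaction();  // start the next one
    }
}
$pdo->commit();                    // commit whatever is left
```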

flm
  • 960
  • 8
  • 14