7

Our database servers (mainly based on the Debian stable packages, currently Wheezy) seem to have about 4 times more load for the same workload on kernel 3.2.0-4-amd64 than on the previous 2.6.32-5-amd64 kernel. With all packages the same and only the booted kernel changed, we can clearly see the difference, and I'm at a loss as to why. The problem is, I don't see that much difference in IO or CPU load.

Setting kernel.sched_min_granularity_ns & kernel.sched_latency_ns back to their 2.6.32 default values helps a little (three times the load instead of four), but not to the level we'd like. As a lot of kernel settings changed, we can hardly just blindly set the new kernel to the old default values of the 2.6 one.

Has anybody else had experience with this? If so, what caused this (and ideally: how could it be solved)?

As it's deep kernel-related, perhaps a difference in sysctl values might be of interest: here is a diff of the 2 (pastebinned to prevent an overly long question).
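For anyone who wants to produce a similar comparison on their own machine, here is a minimal Python sketch. It assumes the output of `sysctl -a` was saved to a text file once under each kernel; the function names `parse_sysctl` and `diff_sysctl` are made up for this illustration.

```python
def parse_sysctl(text):
    """Parse `sysctl -a` style output ("key = value" per line) into a dict."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            # Skip lines without a "key = value" shape (e.g. error messages).
            continue
        settings[key.strip()] = value.strip()
    return settings


def diff_sysctl(old_text, new_text):
    """Return {key: (old_value, new_value)} for every key that differs.

    A value of None means the key only exists under one of the two kernels.
    """
    old = parse_sysctl(old_text)
    new = parse_sysctl(new_text)
    changes = {}
    for key in sorted(set(old) | set(new)):
        if old.get(key) != new.get(key):
            changes[key] = (old.get(key), new.get(key))
    return changes
```

Usage would be along the lines of running `sysctl -a > sysctl-2.6.txt` on one kernel and `sysctl -a > sysctl-3.2.txt` on the other (file names are placeholders), reading both files, and printing the result of `diff_sysctl`. Keys that exist under only one kernel show up with `None` on the other side, which helps separate renamed or new tunables from changed defaults.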

edit: currently we're investigating this SO answer to see if that applies.

Wrikken
  • I'm guessing you've checked that the my.cnf files are the same on both machines? – symcbean Aug 14 '13 at 15:11
  • Yes, quite simply... because on _the same machine_ it's just a matter of booting in 1 kernel or the other. Nothing on the disk is different, this behavior is consistent across boots in 2.6 & 3.2. (And stays the same over a longer period of time, i.e.: running a week in one & running a week in the other yields the same result, so it's not a cold cache/memory issue). – Wrikken Aug 14 '13 at 15:21
  • And ALL packages and all other configuration files are the same? Same hardware configuration, same memory configuration, same hw settings etc.? There is a huge difference in the scheduler between the recent kernel versions, but this difference shouldn't be noticeable. – GioMac Aug 14 '13 at 20:17
  • Everything, as I said, the very same server, only with another kernel, is showing this behavior. I _could_ imagine that the 'default' settings of a kernel wouldn't be ideal for the beast that is our DB machine, but it has never bothered us before... – Wrikken Aug 14 '13 at 20:22
  • even file locations on physical level? :) just try to swap kernel. – GioMac Aug 14 '13 at 20:26
  • That _IS_ what we're doing. Same machine. 2 kernels installed. Boot in one vs. boot in the other results in this. – Wrikken Aug 14 '13 at 20:30
  • Hm, [currently investigating this SO answer](http://stackoverflow.com/questions/12111954/context-switches-much-slower-in-new-linux-kernels) – Wrikken Aug 16 '13 at 12:00
  • Which filesystem type? – Dennis Kaarsemaker Aug 19 '13 at 17:06

5 Answers

2

Linux kernels 3.0 - 3.8 should be avoided or upgraded to address IO performance degradation

Josh Berkus demonstrated a Linux kernel IO performance degradation using a private benchmark workload running against PostgreSQL 9.3 on Ubuntu 12.04 with kernel 3.2.0.

"...you really need to avoid every kernel between 3.0 and 3.8. While RHEL has been sticking to the 2.6 kernels (which have their own issues, but not as bad as this), Ubuntu has released various 3.X kernels for 12.04...upgraded...to kernel 3.13.0, and ran the same exact workload...an 80% reduction in IO. We can thank the smart folks in the Linux FS/MM group for hammering down a whole slew of performance issues."

Please see http://www.databasesoup.com/2014/09/why-you-need-to-avoid-linux-kernel-32.html

  • Oeh, interesting. Debian is still at 3.2 in stable, but I see some newer kernels in wheezy-backports, I'll check them out soon! – Wrikken Oct 30 '14 at 17:28
1

I addressed an issue in the DBA StackExchange about the kernel and journaling. I learned from Percona back in May that a certain flush behavior is actually simulated.

  • You may have to change how journaling is done.
  • You may have to tune InnoDB
RolandoMySQLDBA
  • Very interesting read, thank you, especially the [linked article at mysql performance blog](http://www.mysqlperformanceblog.com/2014/05/23/improve-innodb-performance-write-bound-loads/). I'll check out the file system here, seems they have data=ordered currently. – Wrikken Aug 14 '14 at 07:55
0

Maybe the reported load is simply not correct, like in this bug report: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=693942

Can you see that anything is actually slower? Or does vmstat look like the server is really doing more work? Otherwise I'd assume you've just hit that reported bug. The same happened to me some time ago: the performance of the server was no different, only the reported load average was higher.
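A rough way to check this yourself, sketched in Python: sample `/proc/stat` twice and compare the measured CPU utilization with the load average from `/proc/loadavg`. The function names are made up for this illustration, and counting iowait as idle time is a simplifying assumption.

```python
def parse_loadavg(text):
    """Return the 1-, 5- and 15-minute load averages from /proc/loadavg content."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def parse_cpu_times(stat_text):
    """Return (busy, total) jiffies from the aggregate 'cpu' line of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            # Field order: user nice system idle iowait irq softirq steal ...
            # Guest time is already included in user/nice, so stop at steal.
            values = [int(v) for v in fields[1:9]]
            idle = values[3] + (values[4] if len(values) > 4 else 0)
            total = sum(values)
            return total - idle, total
    raise ValueError("no aggregate 'cpu' line found")


def cpu_utilization(before, after):
    """Fraction of CPU time spent busy between two /proc/stat snapshots."""
    busy0, total0 = parse_cpu_times(before)
    busy1, total1 = parse_cpu_times(after)
    elapsed = total1 - total0
    return (busy1 - busy0) / elapsed if elapsed > 0 else 0.0
```

To use it, read `/proc/stat`, sleep for a few seconds, read it again, and pass both snapshots to `cpu_utilization`. If that figure and the vmstat output look about the same under both kernels while `parse_loadavg` reports a much higher load on 3.2, the difference is more likely in how load is accounted than in actual work done. Keep in mind that the Linux load average also counts tasks in uninterruptible sleep, so it is not a pure CPU measure.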

replay
0

I don't have the reputation to make this a comment.. but as you were upgrading the kernel did you also upgrade the version of MySQL? Can you list which MySQL 5.5.X you are running?

Ironically, bugs in some of the newer versions of MySQL have actually made performance noticeably worse. They've gone on to fix them of course, but it did create a significant red herring for me while making changes in my app.

"InnoDB: The fix for Bug#17699331 caused a high rate of read/write lock creation and destruction which resulted in a performance regression. (Bug #18345645, Bug #71708)"

http://dev.mysql.com/doc/relnotes/mysql/5.6/en/news-5-6-19.html

"InnoDB: A regression introduced by Bug #14329288 would result in a performance degradation when a compressed table does not fit into memory. (Bug #18124788, Bug #71436)"

http://dev.mysql.com/doc/relnotes/mysql/5.6/en/news-5-6-17.html

..etc

It's just the same for 5.5:

"InnoDB: A regression introduced by Bug #14329288 would result in a performance degradation when a compressed table does not fit into memory. (Bug #18124788, Bug #71436)"

http://dev.mysql.com/doc/relnotes/mysql/5.5/en/news-5-5-37.html

Does upgrading to a newer MySQL return it back to reasonable performance?

MySQL does have some kernel specific code in there too:

"asynchronous I/O is not supported on tmpfs in some Linux kernel versions. The workaround was to turn off the innodb_use_native_aio setting or use a different temporary directory. The fix causes InnoDB to turn off the innodb_use_native_aio setting automatically if it detects that the temporary file directory does not support asynchronous I/O. (Bug #13593888, Bug #11765450, Bug #58421)"

http://dev.mysql.com/doc/relnotes/mysql/5.6/en/news-5-6-5.html

So I'd ensure you're running the latest build.

As an aside consider MySQL 5.6.X (which is now officially stable and has been for some time), "For Linux, MySQL 5.6 shows up to a 150% improvement in TPS throughput over MySQL 5.5" http://dev.mysql.com/tech-resources/articles/mysql-5.6-rc.html

Matthew1471
-1

I had huge mysql performance problems moving from debian w/ kernel 2.6 and mysql 5.1 to debian w/ kernel 3.2 and mysql 5.5 (wheezy).

What solved the problem for mysql was barrier=0 in /etc/fstab. Check out https://wiki.archlinux.org/index.php/Ext4
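Before changing anything, it can help to see which options the data partition is actually mounted with. Below is a small Python sketch that parses `/proc/mounts`; the helper names are invented for this example, and the assumption that disabled barriers appear as either `barrier=0` or `nobarrier` may not hold for every kernel version or filesystem.

```python
def parse_mounts(text):
    """Parse /proc/mounts content into a list of dicts, one per mount."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        device, mountpoint, fstype, options = fields[:4]
        mounts.append({
            "device": device,
            "mountpoint": mountpoint,
            "fstype": fstype,
            "options": options.split(","),
        })
    return mounts


def barriers_disabled(options):
    """True if the option list explicitly turns write barriers off."""
    return "nobarrier" in options or "barrier=0" in options


def journal_mode(options):
    """Return the data= journaling mode if it is listed, else None."""
    for opt in options:
        if opt.startswith("data="):
            return opt[len("data="):]
    return None
```

Reading `/proc/mounts`, passing its content to `parse_mounts`, and checking the entry for the MySQL data directory shows whether barriers are on and which `data=` mode is listed. A return value of `None` from `journal_mode` only means the option is not listed, not that journaling is off.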

daniel
  • Barrier=0 will disable barriers and that means you can lose data if the power goes out or something crashes. – Jure1873 Sep 07 '13 at 19:06
  • @Jure1873: hm, as I understand it, it's quite safe for battery backed disks, and who would go without in a serious database setup? – Wrikken Sep 10 '13 at 16:23
  • Well you should be safe, if the disks are battery backed, but I always like to play it safe. If there is a kernel panic or something goes terribly wrong you could run into trouble. – Jure1873 Sep 10 '13 at 18:23