I cannot figure out the reason for a performance difference between two (three?) similar Percona 5.6 instances. I set up a TPC-C-like benchmark using sysbench, and among 3 similar servers, one gets 30-40% better results. The benchmark uses a 2GB test database and the results are reproducible. Some facts:
- during benchmarking all tables are cached by the OS; iowait averages 1-2%
- multiple runs do not change the results, with or without recreating the DB structure
- restarting MySQL or the server does not change the results
- 'show variables' does not reveal any differences between the servers (compared roughly as shown below)
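To compare the configs I simply dumped the variables on each server and diffed the dumps; the file names and credentials here are just placeholders:

```
# dump the runtime configuration on each server, then diff the dumps
mysql -NBe "SHOW GLOBAL VARIABLES" > vars_serverA.txt   # run on server A
mysql -NBe "SHOW GLOBAL VARIABLES" > vars_serverB.txt   # run on server B
diff vars_serverA.txt vars_serverB.txt
```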
I have run out of ideas. All servers are Dell R410s with dual X5570 CPUs, identical 64GB memory configurations and sticks (recently replaced), and 860/870 EVO 500GB SSDs. HT is enabled, OS is Ubuntu 18, Percona 5.6.
As stated before, the benchmark seems predictable (tested many times, also with other configurations). The problem is this particular scenario: of 3 similar servers, 2 perform almost the same and the 3rd one gets 30% better results. One difference is that the faster server has an old, sluggish disk controller with a 256MB cache - but how can that have such an impact if iowait during the tests is negligible?
It's worth mentioning that I took over these servers and did some hardware upgrades, but did not configure them from scratch.
Any help appreciated.
EDIT: Additional info:
The benchmark is mainly based on: https://github.com/Percona-Lab/sysbench-tpcc
I just packaged it with a custom script to make it easier to roll out to the servers. I am using a single thread to get a clearer view of single-core performance differences.
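The single-threaded run is essentially something along these lines (host, credentials and the scale/time values are placeholders, not my exact settings):

```
# prepare the ~2GB dataset once, then run with a single thread
./tpcc.lua --mysql-host=localhost --mysql-user=sbtest --mysql-password=... \
           --mysql-db=tpcc --db-driver=mysql --tables=1 --scale=20 \
           --threads=1 prepare
./tpcc.lua --mysql-host=localhost --mysql-user=sbtest --mysql-password=... \
           --mysql-db=tpcc --db-driver=mysql --tables=1 --scale=20 \
           --threads=1 --time=600 --report-interval=10 run
```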
For ease of writing, let's call the 2 servers performing worse the 'BAD servers' and the one with 30% better results the 'GOOD server'. More info:
- the BAD servers use Samsung 870 EVOs, the GOOD one an 860 EVO
- load average is circa 1.5 on all of them
- tried disabling HT - no impact
- OS caching of the table files was checked with fincore from ftools (see the sketch after this list) - after warm-up, iowait drops to 0-2% for the whole test
- in an fio benchmark (laying out an 8GB test file, 75/25 RW mix - an example invocation is below), the BAD servers get circa 400% better IOPS than the GOOD server - I assume the GOOD server is limited by the old controller, while the BAD servers use direct SATA
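The cache check is roughly this (the path is a placeholder and flag names may differ between fincore versions):

```
# report how much of each InnoDB data file is resident in the page cache
fincore --pages=false --summarize /var/lib/mysql/tpcc/*.ibd
```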
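The fio test is along these lines (not the exact job, but the same idea: 8GB file, 4k random 75/25 read/write mix, direct I/O):

```
fio --name=randrw75 --filename=/var/lib/mysql/fio.test --size=8G \
    --rw=randrw --rwmixread=75 --bs=4k --direct=1 --ioengine=libaio \
    --iodepth=16 --runtime=120 --time_based --group_reporting
```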
However, my question is: why should the controller cache have such a BIG impact (if any?) when iowait is negligible?