
I cannot figure out the reason for the performance difference between two (three?) similar Percona 5.6 instances. I prepared some TPC-C-like benchmarking using sysbench, and among 3 similar servers, one gets 30-40% better results. The benchmark uses a 2GB test database and the results are repeatable. Some facts:

  • during benchmarking all tables are cached by the OS; iowait averages 1-2%
  • performing multiple runs does not change results, with or without recreating the db structure
  • restarting mysql/the server does not change results
  • 'show variables' does not show any differences between the servers
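
To rule out eyeballing errors in the variable comparison, a sketch like the following can diff the full variable list between two servers (hostnames and credentials are placeholders, not the actual setup):

```shell
#!/bin/sh
# Sketch: dump SHOW GLOBAL VARIABLES from each server, sort, and diff.
# Add -u/-p credentials as appropriate for your environment.
dump_vars() {
    mysql -h "$1" -N -B -e 'SHOW GLOBAL VARIABLES' | sort
}

if command -v mysql >/dev/null 2>&1; then
    dump_vars good-server  > good.vars
    dump_vars bad-server-1 > bad1.vars
    # Any output here is a config difference between the two servers
    diff good.vars bad1.vars
else
    echo "mysql client not available; commands shown for illustration"
fi
```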

I have run out of ideas. All servers are Dell R410s with dual X5570 CPUs, 64GB of identical memory configuration and sticks (recently replaced), and 860/870 EVO 500GB SSDs. HT enabled, OS Ubuntu 18.04, Percona 5.6.

As stated before, the benchmark seems predictable (tested many times, also with other configurations). The problem is this particular scenario: of 3 similar servers, 2 perform almost the same and the 3rd gets 30% better results. One difference is that the server with the higher results has an old, sluggish disk controller with a 256MB cache - but how can that have such an impact if iowait during the tests is negligible?

It's worth mentioning that I have taken over these servers; I did some hardware upgrades but did not configure them 'from scratch'.

Any help appreciated.

EDIT: Additional info:

Benchmark is mainly based on: https://github.com/Percona-Lab/sysbench-tpcc

I just packaged it with a custom script to ease propagation - I am using a single thread to get a better view of the differences in single-core performance.
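
A hypothetical wrapper along these lines (option values are illustrative, not my exact script) shows the shape of the single-threaded run:

```shell
#!/bin/sh
# Sketch of a wrapper around Percona-Lab/sysbench-tpcc. Host, user, and
# scale values are placeholders; --threads=1 isolates single-core performance.
HOST=127.0.0.1
CMD="./tpcc.lua --mysql-host=$HOST --mysql-user=sbtest --mysql-db=sbtest \
--threads=1 --tables=10 --scale=10 --time=300 --report-interval=10 \
--db-driver=mysql"

if command -v sysbench >/dev/null 2>&1 && [ -f ./tpcc.lua ]; then
    $CMD prepare   # load the test dataset once
    $CMD run       # repeat this step for each measurement
else
    # Dry run when sysbench / tpcc.lua are not present locally
    echo "would run: $CMD prepare && $CMD run"
fi
```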

For ease of writing, let's call the 2 servers performing worse the 'BAD servers' and the one with 30% better results the 'GOOD server'. So, more info:

  • the bad servers use Samsung 870 EVOs, the good one an 860 EVO
  • load average is circa 1.5 on all of them
  • tried disabling HT - no impact
  • table file caching by the OS was checked with fincore from ftools - after warming up, iowait drops to 0-2% during the whole test
  • in a fio benchmark (laying out an 8GB test file, 75/25 RW test), the bad servers get circa 400% better IOPS results than the good server - I assume the good server is limited by the old controller, while the bad servers use direct SATA
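
The fio test above can be approximated with a job file like this (a sketch - block size, queue depth, and runtime are my assumptions, not the exact job I ran):

```ini
; Approximation of the 8GB 75/25 random read/write test described above.
; Parameter values below are illustrative guesses.
[global]
directory=/var/lib/mysql-bench
size=8g
rw=randrw
rwmixread=75
bs=16k           ; matches the default InnoDB page size
ioengine=libaio
iodepth=1        ; low queue depth, closer to single-threaded OLTP
direct=1
runtime=120
time_based

[oltp-75-25]
```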

However, my question is: why should the controller cache have such a BIG impact (if any?) if iowait is negligible?

  • You have different SSDs, 860 vs 870 - it is not *identical*. You also say nothing about RAID and array configuration. Every small detail matters in benchmarks. If you want to see an almost exact match, everything must be the same – gapsf Sep 29 '22 at 14:56
  • From an 870 review: to this we must add something Samsung mentions outside of the official specifications - the MKX controller provides 30% better performance in small-block reads without a request queue. In other words, under some loads the 870 EVO can be expected to pull ahead in performance. – gapsf Sep 29 '22 at 14:57
  • "has old, sluggish disk controller with 256MB cache" - why do you think this is not the answer to your question? A different RAID controller and different disks are enough to get different results. Compare RAID specs and array configuration: RAID level, cache size, writeback, number of disks, stripe size, readahead – gapsf Sep 29 '22 at 15:03
  • "during benchmarking all tables are cached by OS" - how did you figure out this fact? Maybe you think they don't touch the disks during the bench? There are the sync and dirty* vm kernel parameters, so the kernel writes dirty page cache to disk every 5-15 sec depending on what your distro sets by default. https://docs.kernel.org/admin-guide/sysctl/vm.html So the storage subsystem affects the bench even if "all cached" – gapsf Sep 29 '22 at 15:26
  • "all tables are cached by OS" - If the tables are `ENGINE=InnoDB`, then the caching is done by MySQL in the "buffer_pool", not the OS. – Rick James Sep 29 '22 at 16:58
  • "HT enabled" -- What is the CPU and Load Average? – Rick James Sep 29 '22 at 17:00
  • Is it your benchmark or some canned product? The latter may be trying to find the max number of concurrent threads that can be run. I find this to be a nearly useless way to benchmark. – Rick James Sep 29 '22 at 17:02
  • "disk controller with 256MB cache" -- do the others have no caching? That could make a big difference for _writes_. – Rick James Sep 29 '22 at 17:03
  • So shouldn't the lack of a controller cache have an impact on average IOWAIT? It is - as said before - negligible. Average IOPS during the benchmark measured by iostat is circa 1000 tps on the device used by storage - when I tested the storage using fio, it easily reaches 9k 75/25 RW IOPS on the server without the controller, and 3k on the server with the controller (which does 30% better in the sysbench tpcc-like bench). I will update my first post. – atapidriver Sep 30 '22 at 08:03
  • Additional DB information request, please. From one of the FAST servers and the SLOW server, # cores, any SSD or NVME devices on MySQL Host servers? Post TEXT data on justpaste.it and share the links. From your SSH login root, Text results of: A) SELECT COUNT(*) FROM information_schema.tables; B) SHOW GLOBAL STATUS; after minimum 2 hours UPTIME C) SHOW GLOBAL VARIABLES; D) SHOW FULL PROCESSLIST; E) STATUS; not SHOW STATUS, just STATUS; for server workload tuning analysis to provide differential analysis. – Wilson Hauck Sep 30 '22 at 18:53
