  1. I'm trying out Proxmox 7.3-3 as a hypervisor and installed it using ZFS on my NVMe drive (Samsung_SSD_970_EVO_Plus_2TB_S6S2NS0T505403V)
  2. I also installed an Ubuntu 22.04 VM whose filesystem is ext4
  3. Next, I installed MySQL Server (8.0.32)
  4. I normally don't tune MySQL; its performance is acceptable as-is
  5. However, it took 3 minutes to ingest a 20 MB uncompressed SQL file using `mysql < ...`
  6. This is much slower than when the same Ubuntu guest ran under VirtualBox on Windows 10, where a 20 MB ingest usually took less than 30 seconds

Any idea what I might be missing here?

UPDATE 1

iostat from the guest during the import; the %util column shows 100%. Why would it be at 100?

Device            r/s     rkB/s   rrqm/s  %rrqm r_await rareq-sz     w/s     wkB/s   wrqm/s  %wrqm w_await wareq-sz     d/s     dkB/s   drqm/s  %drqm d_await dareq-sz     f/s f_await  aqu-sz  %util
dm-0             0.00      0.00     0.00   0.00    0.00     0.00 1448.00   6792.00     0.00   0.00    0.67     4.69    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00    0.98 100.00
loop0            0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00    0.00   0.00
loop1            0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00    0.00   0.00
loop2            0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00    0.00   0.00
sda              0.00      0.00     0.00   0.00    0.00     0.00 1075.00   6792.00   373.00  25.76    0.93     6.32    0.00      0.00     0.00   0.00    0.00     0.00  492.00    1.91    1.94 100.00
sdb              0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00    0.00   0.00
sdc              0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00    0.00   0.00
sr0              0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00    0.00   0.00

iostat from the Proxmox host; %util is also 100%

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.44    0.00    0.75    5.77    0.00   92.04

Device            r/s     rkB/s   rrqm/s  %rrqm r_await rareq-sz     w/s     wkB/s   wrqm/s  %wrqm w_await wareq-sz     d/s     dkB/s   drqm/s  %drqm d_await dareq-sz     f/s f_await  aqu-sz  %util
nvme0n1          0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00    0.00   0.00
nvme1n1        484.00      0.00     0.00   0.00    1.77     0.00  814.00  20328.00     0.00   0.00    0.02    24.97    0.00      0.00     0.00   0.00    0.00     0.00  484.00    1.77    1.74 100.00
sda              0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00    0.00   0.00
sdb              0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00    0.00   0.00
sdc              0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00    0.00   0.00
sdd              0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00    0.00   0.00
sde              0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00    0.00   0.00
sdf              0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00    0.00   0.00
sdg              0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00    0.00   0.00
zd0              0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00    0.00   0.00
zd16             0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00    0.00   0.00
zd32             0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00    0.00   0.00
zd48             0.00      0.00     0.00   0.00    0.00     0.00  737.00   4916.00     0.00   0.00    0.00     6.67    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00    0.00 100.00
zd64             0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00    0.00   0.00
zd80             0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00    0.00   0.00

`zpool iostat -v -l 1` from the Proxmox host

                                     capacity     operations     bandwidth    total_wait     disk_wait    syncq_wait    asyncq_wait  scrub   trim
pool                               alloc   free   read  write   read  write   read  write   read  write   read  write   read  write   wait   wait
---------------------------------  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----
rpool                               208G  1.61T     18    495  76.0K  8.00M   98us   25us   98us   25us  394ns  374ns      -      -      -      -
  nvme-eui.0025385521403c96-part3   208G  1.61T     18    495  76.0K  8.00M   98us   25us   98us   25us  394ns  374ns      -      -      -      -
---------------------------------  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----

UPDATE 2

root@pve:~# zfs get all | grep "sync\|logbias"
rpool                     logbias               latency                default
rpool                     sync                  standard               local
rpool/ROOT                logbias               latency                default
rpool/ROOT                sync                  standard               inherited from rpool
rpool/ROOT/pve-1          logbias               latency                default
rpool/ROOT/pve-1          sync                  standard               inherited from rpool
rpool/data                logbias               latency                default
rpool/data                sync                  standard               inherited from rpool
rpool/data/vm-100-disk-0  logbias               latency                default
rpool/data/vm-100-disk-0  sync                  standard               inherited from rpool
rpool/data/vm-100-disk-1  logbias               latency                default
rpool/data/vm-100-disk-1  sync                  standard               inherited from rpool
rpool/data/vm-101-disk-0  logbias               latency                default
rpool/data/vm-101-disk-0  sync                  standard               inherited from rpool
rpool/data/vm-101-disk-1  logbias               latency                default
rpool/data/vm-101-disk-1  sync                  standard               inherited from rpool
rpool/data/vm-102-disk-0  logbias               latency                default
rpool/data/vm-102-disk-0  sync                  standard               inherited from rpool
rpool/data/vm-102-disk-1  logbias               latency                default
rpool/data/vm-102-disk-1  sync                  standard               inherited from rpool
root@pve:~#
bilogic
  • Please add the output of `iostat -x -k 1` from *both* the guest and the host systems when importing your SQL file. For the host system, please also add the output of `zpool iostat -v -l 1` during the same workload. Moreover, is your guest image installed onto a zvol or a normal zfs dataset? – shodanshok Apr 17 '23 at 13:51
  • Pardon me, I'm really new to this: how do I tell if it is a zvol or a ZFS dataset? For `iostat`, it is an ongoing output and I'm not sure how to present it properly. If your concern is that my system is busy, I doubt it, as I was monitoring it via the Proxmox dashboard and the I/O load was less than 15%. Nonetheless, I'm going to try to capture these outputs – bilogic Apr 17 '23 at 14:38
  • I do know that I'm using zfs-thin; could that be the issue here? – bilogic Apr 17 '23 at 14:41
  • Proxmox uses zvols by default, so let's assume your VM really is on a zvol for the moment. – shodanshok Apr 17 '23 at 15:08
  • @shodanshok updated, let's hope we can get this mystery solved! thank you! – bilogic Apr 17 '23 at 15:32
  • Is that SQL all `INSERTs`? What engine (InnoDB)? What is the value of `innodb_buffer_pool_size` on each machine? Are `TEXT` or `BLOB` columns involved? Just looking at I/O may not give you the desired answer. – Rick James Apr 17 '23 at 17:18
  • Your disk is maxed out by sync/flush requests. What NVMe disk model do you have? Please also add the output of `zfs get all | grep "sync\|logbias"` taken from your host machine. – shodanshok Apr 17 '23 at 17:32
  • @shodanshok added the new logs – bilogic Apr 17 '23 at 17:52
  • @shodanshok not gigabyte, I was confused. It is a `Samsung_SSD_970_EVO_Plus_2TB_S6S2NS0T505403V` 2TB – bilogic Apr 17 '23 at 17:59
  • Are the `INSERTs` one row at a time? Or batched (lots of rows in each `INSERT`)? – Rick James Apr 17 '23 at 21:40
  • Rick, thanks for the help. I'm quite sure it's one row at a time, and that's something I can live with since it was generated by a tool. But the answer lay in tweaking ZFS to match my SSD's characteristics. – bilogic Apr 18 '23 at 00:50
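For reference, the batching Rick James hints at can be sketched as below. This is a hypothetical illustration (table and column names are made up, not from the original dump): emitting many rows per `INSERT` means far fewer commits, and therefore far fewer flushes, than one statement per row.

```python
# Sketch: rewrite one-row-at-a-time INSERTs into multi-row batches, so the
# server commits (and flushes) once per batch instead of once per row.
# Table/column names are hypothetical, for illustration only.
def batch_inserts(rows, table="t", cols=("a", "b"), batch=1000):
    """Yield multi-row INSERT statements, up to `batch` rows per statement."""
    for i in range(0, len(rows), batch):
        chunk = rows[i:i + batch]
        values = ", ".join(
            "(" + ", ".join(repr(v) for v in row) + ")" for row in chunk
        )
        yield f"INSERT INTO {table} ({', '.join(cols)}) VALUES {values};"

stmts = list(batch_inserts([(1, "x"), (2, "y"), (3, "z")], batch=2))
print(len(stmts))  # 2 statements instead of 3
```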

1 Answer


Based on the iostat output, we can see your disk struggling with sync/flush requests. This comes down to the consumer-grade nature of your disk, which lacks a powerloss-protected writeback cache. As you can see, this means that even a disk rated for up to 560K random write IOPS really maxes out at ~500 fsync/s.
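As a rough sanity check, assuming roughly one flush per `INSERT` (an assumption, not measured), the observed numbers line up:

```python
# Back-of-envelope: a ~500 fsync/s ceiling explains a 3-minute ingest
# if the dump commits (and thus flushes) roughly once per row.
FSYNCS_PER_SEC = 500           # observed flush ceiling (f/s column in iostat)
INGEST_SECONDS = 3 * 60        # observed: 3 minutes for the import
DUMP_BYTES = 20 * 1024 * 1024  # 20 MB uncompressed SQL file

total_flushes = FSYNCS_PER_SEC * INGEST_SECONDS
print(total_flushes)                # 90000
print(DUMP_BYTES // total_flushes)  # ~233 bytes per flush, i.e. per-row commits
```

~233 bytes per flush is about the size of a single short `INSERT` statement, which is consistent with a row-at-a-time dump being flush-bound rather than bandwidth-bound.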

OK, how can you speed up your server? Some options, ordered by preference:

  • if losing up to 5 s of data is not a concern for this particular Proxmox server, you can disable sync pass-through at the ZFS level by running the following command on your host: `zfs set sync=disabled rpool`. This will speed up all virtual machines running on this server, but, again, remember that you can lose data (up to 5 s by default). However, because ZFS is a CoW filesystem, no general filesystem corruption is possible even with `sync=disabled`

  • if this is not acceptable, you can tune MySQL to avoid flushing on each commit, instead issuing `fsync()` only about once per second. The option you need is `innodb_flush_log_at_trx_commit = 2`

  • you can configure Proxmox (or, better, QEMU itself) to ignore `fsync()` issued against a specific virtual disk by setting that disk's cache option to `unsafe`
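Concretely, the three options above might look like the following. These are config-level sketches, not to be run blindly: the my.cnf path, the VM ID (100), and the disk spec are examples and must be adjusted to your setup.

```shell
# Option 1: disable sync pass-through pool-wide (risk: up to ~5 s of data loss)
zfs set sync=disabled rpool

# Option 2: relax InnoDB durability instead (redo log flushed ~once per second).
# Add to your MySQL config (path may vary), then restart mysqld:
#   [mysqld]
#   innodb_flush_log_at_trx_commit = 2

# Option 3: let QEMU ignore guest fsync() on one virtual disk.
# VM ID and disk spec below are examples; check your VM's hardware config first:
qm set 100 --scsi0 local-zfs:vm-100-disk-0,cache=unsafe
```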

shodanshok
  • Since I have a UPS, I decided to test with `zfs set sync=disabled rpool` first; it brought the whole operation down to 13 seconds. Once again, thank you! – bilogic Apr 17 '23 at 18:33
  • Could you fill me in on whether I'm using a zvol or a normal ZFS dataset? Thanks! – bilogic Apr 17 '23 at 18:39
  • You should be using volumes rather than datasets to store VM images. To be sure, you can run `zfs list -o name,type` on your host machine. – shodanshok Apr 17 '23 at 19:25
  • thanks, it says `volume` for all the `rpool/data/vm-*` – bilogic Apr 17 '23 at 19:29
  • I'd like to ask if my SSD is "suitable" for the current ZFS `sync=disabled` scenario. My concern is whether PVE + ZFS is making any excessive writes that could wear out my SSD faster than when I was using W10 + NTFS to run basically the same services at the same load. Thank you. – bilogic May 06 '23 at 02:46
  • If anything, `sync=disabled` *lowers* SSD wear by virtue of coalescing multiple small writes into bigger ones, limiting write amplification at the NAND level. – shodanshok May 06 '23 at 08:17