I am testing a new ZFS configuration with zstd compression for log storage and for storing highly compressible files. The array is a 5-drive raidz1, tested in a virtual machine on my PC; the VM has direct access to the whole HDDs.

ZFS 2.0.2 is running in a Hyper-V Ubuntu VM and I am copying files from the Windows host via Samba. It's running locally on the PC, so network transfer speeds should not be a problem.

When I transfer a big, compressible file, the transfer itself is extremely bursty. You can see it here: Screenshot

I guess writes are being captured in a TXG, compressed, and then committed to disk. But there is some downtime when the CPU is essentially idle and the HDDs themselves are also not really utilized (which is expected, as CPU is the bottleneck when compressing data).

Can I somehow tune ZFS so that it accepts new data while a TXG is being compressed? Or is this the intended, optimal behaviour? I feel like speeds could be better if ZFS constantly accepted and compressed data.

  • Are you using the ZFS Windows port? Or is ZFS on an underlying Linux host? – shodanshok May 17 '21 at 09:23
  • @shodanshok ZFS is running natively in an Ubuntu VM under Hyper-V and the HDDs are physically passed through, so the VM has direct access. I am running OpenZFS 2.0.2 on the VM. I am copying files from the Windows host via Samba to the Linux VM / ZFS – user3829915 May 17 '21 at 11:23
  • What do the Ubuntu VM cpu/disk activity graphs look like during this time? – Bert May 17 '21 at 15:35
  • @Bert https://imgur.com/a/nwGPW0q – user3829915 May 17 '21 at 18:13

1 Answer

By default, ZFS aggregates and flushes transactions every 5 seconds or every 64 MB of dirty data, whichever limit is reached first.

Transactions are aggregated into transaction groups (TXGs), and up to three TXGs can be "running" at the same time: one in the open state (accepting writes), a second in the quiescing state (closing accepted writes) and a third in the flush phase (i.e. writing to disk). In other words, ZFS does not accept writes only "every 5 seconds", as you seem to describe; rather, it flushes data every 5 seconds unless a high write load is active, in which case it flushes sooner.
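To see which flush thresholds are actually in effect on a given system, the module parameters can be read directly. The following is a minimal sketch, assuming OpenZFS on Linux exposes its tunables under `/sys/module/zfs/parameters`. The parameter names listed are the ones I believe relate to TXG flushing; which of them exist depends on the ZFS version (for example, `zfs_dirty_data_sync` is from older releases and `zfs_dirty_data_sync_percent` from newer ones), so missing files are simply skipped. The helper name `read_txg_tunables` is made up for illustration.

```python
from pathlib import Path

# Assumed location of the OpenZFS module parameters on Linux.
PARAM_DIR = Path("/sys/module/zfs/parameters")

# Tunables related to TXG flushing. Availability varies by ZFS version,
# so any name that is not present on this system is skipped.
TXG_TUNABLES = (
    "zfs_txg_timeout",              # seconds between forced TXG flushes
    "zfs_dirty_data_max",           # upper bound on dirty data, in bytes
    "zfs_dirty_data_sync_percent",  # newer releases: percent of the max that triggers a flush
    "zfs_dirty_data_sync",          # older releases: bytes of dirty data that trigger a flush
)


def read_txg_tunables(param_dir=PARAM_DIR):
    """Return {name: integer value} for every tunable file that exists and parses."""
    values = {}
    for name in TXG_TUNABLES:
        path = Path(param_dir) / name
        try:
            values[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            # Parameter is absent, unreadable, or not a plain integer.
            continue
    return values


if __name__ == "__main__":
    for name, value in read_txg_tunables().items():
        print(f"{name} = {value}")
```

Running this inside the VM while the transfer is bursty shows whether the timeout or the dirty-data threshold is the limit more likely to be hit first.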

shodanshok