
I have a ZFS performance problem. I have done a fair bit of isolating, but I am stuck.

  • Server1 - ZFS host, OpenBSD

  • Server2 - Workhorse that has the ZFS drive mounted, Ubuntu

An untar operation on the ZFS drive takes 2s on Server1. The same operation on the mounted ZFS drive takes 5 minutes (!) on Server2.

Both servers are connected via gigabit LAN and sit right next to each other in a rack. What should I check, and what can I tune?

edit: iperf says "940 Mbits/sec" in both directions. Network speed is not the issue here.

edit2: here are the requested stats. I know that the pool is fine, running zpool clear would clear all issues, the two drives with low cksum errors look healthy, and the errors aren't recent. commands

Sebastian

1 Answer


How are the servers connected and what protocol is in use?
NFS?

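If it's NFS, you may be running into issues with synchronous writes. Latency is different from throughput: iperf shows the link can move bulk data, but an untar creates many small files, and each synchronous write has to reach stable storage on the server before it is acknowledged to the client.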
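
One rough way to see the latency effect from the client side is to compare buffered and synchronous small writes into the mounted directory. This is only a sketch: `write_test` is a hypothetical helper, the mount path in the usage comment is a placeholder, and `oflag=dsync` assumes GNU dd (as shipped on Ubuntu). It approximates the many-small-writes pattern of an untar; it does not reproduce it exactly.

```shell
#!/usr/bin/env bash
# Sketch: write many 4 KiB blocks either buffered or with O_DSYNC per write.
# write_test is a hypothetical helper, not part of any ZFS or NFS tooling.
write_test() {
    local dir=$1          # directory to test, e.g. the NFS mount point
    local mode=$2         # "async" (buffered) or "sync" (O_DSYNC on every write)
    local count=${3:-200} # number of 4 KiB blocks to write
    local file="$dir/ddtest.$$"
    local status

    if [ "$mode" = "sync" ]; then
        # oflag=dsync is a GNU dd extension (assumption: GNU coreutils)
        dd if=/dev/zero of="$file" bs=4k count="$count" oflag=dsync 2>/dev/null
    else
        dd if=/dev/zero of="$file" bs=4k count="$count" 2>/dev/null
    fi
    status=$?

    rm -f "$file"         # clean up the scratch file
    return $status
}

# Usage (the path is a placeholder):
#   time write_test /mnt/zfs async
#   time write_test /mnt/zfs sync
```

If the `sync` run is dramatically slower than the `async` run on the mount but not on the ZFS host itself, that points toward synchronous write latency rather than network throughput.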

If you have access to the server, what are the outputs of:

zpool status, zfs list and zfs get sync


Edit: To test whether this is a ZFS synchronous write issue, temporarily run zfs set sync=disabled on the specific ZFS filesystem that's having the untar issue, then rerun the untar operation.
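
The test above could be wrapped so the previous setting is always restored afterwards. This is a minimal sketch to run on the ZFS host: `run_with_sync_disabled` is a hypothetical helper, and the dataset name and command in the usage comment are placeholders. Only `zfs get -H -o value` and `zfs set` are used.

```shell
#!/usr/bin/env bash
# Sketch: run a command while sync=disabled is set on one dataset,
# then put the previous value back. Hypothetical helper, run on the ZFS host.
run_with_sync_disabled() {
    local dataset=$1; shift
    local previous status

    # Remember the current value (standard, always or disabled)
    previous=$(zfs get -H -o value sync "$dataset") || return 1

    zfs set sync=disabled "$dataset" || return 1

    "$@"                  # the command to run while sync is disabled
    status=$?

    # Restore the old value. Note: this sets the property locally even if it
    # was inherited before; "zfs inherit sync <dataset>" reverts to inheritance.
    zfs set sync="$previous" "$dataset"
    return $status
}

# Usage (dataset and command are placeholders):
#   run_with_sync_disabled tank/data ssh server2 'cd /mnt/zfs && time tar xf test.tar'
```

Since the untar runs on the NFS client, the wrapped command here is assumed to be something like an ssh call to that machine. Keeping sync disabled only for the duration of the test limits the window in which acknowledged writes could be lost on a crash or power failure.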

ewwhite
  • You have a long-running zfs scrub running. Your server also doesn't have an SLOG device. Did you clarify if you're using NFS? – ewwhite Mar 26 '20 at 10:13
  • It's slow even without the scrub running, no SLOG device, yes. Yes, using NFS. Again, the speed of the ZFS pool ITSELF is not the problem, this is fine. – Sebastian Mar 26 '20 at 10:35
  • Disabling the sync took the time down from 5 minutes to 30s, still not close to the 2s on the server itself, but way better. But I am afraid disabling sync isn't the smartest solution? – Sebastian Mar 26 '20 at 10:39
  • You need an SLOG device. – ewwhite Mar 26 '20 at 11:56
  • I don't have any disk slots left, so I can't install an SLOG device as far as I understand. Would switching to iSCSI be of benefit? – Sebastian Mar 26 '20 at 12:08
  • Even without disk slots, how about adding a capacitor-backed NVMe card to the server as an SLOG? Just be sure to put it into a slot that is on the same processor as your controller so that q(p/t)i transfers are not an issue if you have multiple CPUs. – Rowan Hawkins Mar 26 '20 at 21:36