During write I/O, the logs section of zpool iostat -v never shows any ZIL activity. This results in higher-than-expected wait times when writing data to disk (sometimes over 80 ms during contention).

                     capacity     operations    bandwidth
    pool              alloc   free   read  write   read  write
----------------  -----  -----  -----  -----  -----  -----  
storage           1.88T  2.09T      3  3.01K   512K  39.3M
  mirror           961G  1.05T      0  1.97K   128K  20.8M
    mpathf            -      -      0    393      0  20.8M
    mpathg            -      -      0    391   128K  20.6M
  mirror           961G  1.05T      2  1.04K   384K  18.5M
    mpathi            -      -      1    379   256K  21.1M
    mpathj            -      -      0    281   128K  18.3M
logs                  -      -      -      -      -      -
  /zlog/zilcache      0  15.9G      0      0      0      0
cache                 -      -      -      -      -      -
  mpathk           232G     8M      1      0   130K      0
  mpathl           232G     8M      1      0   130K      0
----------------  -----  -----  -----  -----  -----  -----

My /zlog/zilcache device never sees any I/O. It is a file on very fast flash. I can read and write it when I remove it from the pool, but ZFS seems to ignore it.

The device looks available:

  pool: storage
 state: ONLINE
  scan: scrub repaired 0 in 19h31m with 0 errors on Wed Nov 19 07:39:03 2014
config:

    NAME              STATE     READ WRITE CKSUM
    storage           ONLINE       0     0     0
      mirror-0        ONLINE       0     0     0
        mpathf        ONLINE       0     0     0
        mpathg        ONLINE       0     0     0
      mirror-1        ONLINE       0     0     0
        mpathi        ONLINE       0     0     0
        mpathj        ONLINE       0     0     0
    logs
      /zlog/zilcache  ONLINE       0     0     0
    cache
      mpathk          ONLINE       0     0     0
      mpathl          ONLINE       0     0     0

errors: No known data errors

Is there any way to configure ZFS to cache writes to the logs device for faster acknowledgements?

Thanks

user1955162
  • Please describe your hardware in detail. Server type, disks, SSDs, etc. also OS distribution and versions. – ewwhite Dec 10 '14 at 23:36
  • Server is an IBM (Dual CPU Xeon), disks are 4TB SATA disks behind a raid controller, functioning as a JBOD. I have installed the Linux ZFS builds on RHEL 6.5. – user1955162 Dec 10 '14 at 23:44
  • And what are you using for a ZIL device? – ewwhite Dec 11 '14 at 01:00
  • It's a 16 GB file on a PCIe SSD. The SSD is mounted on /zlog – user1955162 Dec 11 '14 at 01:04
  • More details!!!! – ewwhite Dec 11 '14 at 01:13
  • I'd really like to help, but there isn't enough information to make a good recommendation. For instance, depending on your SSD, piping synchronous writes through the ZIL device might not work as expected. – ewwhite Dec 11 '14 at 01:47
  • Looks like you are confusing the ZIL and the L2ARC cache. – drookie Dec 11 '14 at 08:37
  • Sorry, stepped away from work for a moment. I am seeing high I/O wait when writing data to the ZFS pool over NFS. The writes come primarily from a MySQL database, so random I/O is expected. The SSD is a Samsung SSD (250GB 840). I expect the ZIL to write the data and commit the write, releasing the I/O. Am I not understanding the ZIL device? – user1955162 Dec 12 '14 at 08:59
  • Drookie, the L2ARC devices are the cache devices, right? These are the mpathk and mpathl devices, and they are read caches? The writes are meant to go to the /zlog/zilcache device. If I am getting the terminology backwards, my bad. My goal is to speed up writes with /zlog and reads with mpathk and mpathl – user1955162 Dec 12 '14 at 09:00

1 Answer

I believe you are misunderstanding the purpose of the ZIL. You describe it as a write cache, which it is not. No activity on the ZIL might just be normal behavior, depending on what is running on your machine.

Nothing is ever read from the ZIL in normal operation; it is a write-only device. The only exception would occur during a pool import after a crash.

It is only written to when applications perform synchronous writes. Regular I/O, like moving files around, does not use the ZIL.
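To make the distinction concrete, here is a minimal Python sketch of what "synchronous" means from the application side. It is not taken from the question's setup: the helper name timed_writes, the block size and the temporary file are assumptions for illustration. A plain write() returns as soon as the data is buffered, while write() followed by fsync() blocks until the data is on stable storage, and that second pattern is the kind of request that ends up in the ZIL on a ZFS dataset.

```python
import os
import tempfile
import time


def timed_writes(path, count, sync):
    """Write `count` 4 KiB blocks to `path`; fsync after each block if `sync`.

    Returns the elapsed time in seconds. (Hypothetical helper for illustration.)
    """
    block = b"x" * 4096
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    start = time.perf_counter()
    try:
        for _ in range(count):
            os.write(fd, block)
            if sync:
                # Block until the data has reached stable storage.
                os.fsync(fd)
    finally:
        os.close(fd)
    return time.perf_counter() - start


# Demo on a temporary file; run it inside a ZFS dataset to exercise the pool.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "demo.bin")
    buffered = timed_writes(target, 50, sync=False)
    synced = timed_writes(target, 50, sync=True)
    print(f"buffered: {buffered:.4f}s  fsync per write: {synced:.4f}s")
```

Timings depend entirely on the underlying storage, so no expected output is shown. Databases typically issue the second kind of write for their transaction logs.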

You can set sync=always on the dataset to force all writes to behave as if they were synchronous.
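After changing the property, one way to check whether the log device is actually being used is to look at the write-operations column of the logs section in zpool iostat -v. The following Python sketch parses captured output for that; the function name log_write_ops is hypothetical, the sample text is trimmed from the output in the question, and the parsing assumes the column layout shown there (name, alloc, free, read ops, write ops, read bandwidth, write bandwidth).

```python
SAMPLE = """\
storage           1.88T  2.09T      3  3.01K   512K  39.3M
  mirror           961G  1.05T      0  1.97K   128K  20.8M
logs                  -      -      -      -      -      -
  /zlog/zilcache      0  15.9G      0      0      0      0
cache                 -      -      -      -      -      -
  mpathk           232G     8M      1      0   130K      0
"""


def log_write_ops(iostat_text):
    """Return {device name: write-ops field} for devices under 'logs'.

    Values are kept as strings because zpool prints suffixes such as 3.01K.
    """
    in_logs = False
    result = {}
    for line in iostat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        name = fields[0]
        if name == "logs":
            in_logs = True
            continue
        # The logs section ends at the next section header or separator line.
        if in_logs and (name in ("cache", "spares") or name.startswith("-")):
            break
        if in_logs and len(fields) >= 5:
            result[name] = fields[4]  # fifth column: write operations
    return result


print(log_write_ops(SAMPLE))
```

A log device that stays at 0 write operations while the pool is busy suggests that the workload is not issuing synchronous writes.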

user
jlliagre
  • I am trying to speed up write acknowledgements to the NFS client, and my understanding is that the ZIL stores writes to speed up those acknowledgements. The client writes are coming from a MySQL database – user1955162 Dec 12 '14 at 09:01
  • OK, that makes sense. Synchronous writes are expected from a database and should indeed hit the SLOG. It might be an intentional tuning. What does `zfs get sync storage` say? – jlliagre Dec 12 '14 at 09:25
  • `[root@storage ~]# zfs get sync storage NAME PROPERTY VALUE SOURCE storage sync standard default [root@storage ~]# ` Thanks!!! Let me know if you can't read that since the return characters didn't copy over. – user1955162 Dec 12 '14 at 09:29
  • I can read it, but it is better to update your question with this kind of information. I still have no explanation; the property has the expected value. – jlliagre Dec 12 '14 at 09:51
  • I found the issue: for some reason, the NFS share is not sending data synchronously. I set sync=always, and writes started going to the SSD. Many thanks for the input and the pointer on synchronicity – user1955162 Dec 12 '14 at 09:58
  • This is helpful. I benchmarked a 5x raidz2 of 7200 RPM HDDs as 50% slower with `sync=always` turned on, even though the SSD ZIL ran hot for all the writes (confirmed with `zpool iostat -v tank 1`) – Bill McGonigle Feb 27 '20 at 18:04