Today I ran some tests on the L2ARC using the latest ZFS on Linux, 0.7.10. I can see that the L2ARC gets filled with data, but with the default module settings the data residing in the L2ARC is never touched; instead, the reads are served from the vdevs of the main pool. I have seen this behaviour in 0.7.9 as well, and I am not sure whether it is the expected behaviour.
Even if it is the expected behaviour, I think it is odd to spoil the L2ARC with data that is never read.
The test installation is a VM:
- CentOS 7.5 with latest patches
- ZFS on Linux 0.7.10
- 2GB RAM
To speed up the L2ARC population I changed one ZFS module parameter: l2arc_headroom=1024.
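For reference, such module parameters can be changed at runtime through sysfs; this is the same mechanism I use further down for zfs_arc_max and l2arc_noprefetch. A minimal sketch:
# set the L2ARC headroom multiplier at runtime (value used in this test)
echo 1024 > /sys/module/zfs/parameters/l2arc_headroom
# verify the current value
cat /sys/module/zfs/parameters/l2arc_headroom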
Here is how the pool was created and the layout. I know it is rather odd for a real-world setup, but this was intended for L2ARC testing only.
[root@host ~]# zpool create -f tank raidz2 /dev/sda /dev/sdb /dev/sdc cache sdd
[root@host ~]# zpool list -v
NAME         SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
tank        2.95G   333K  2.95G         -     0%     0%  1.00x  ONLINE  -
  raidz2    2.95G   333K  2.95G         -     0%     0%
    sda         -      -      -         -      -      -
    sdb         -      -      -         -      -      -
    sdc         -      -      -         -      -      -
cache           -      -      -         -      -      -
  sdd       1010M    512  1009M         -     0%     0%
Now write some data to a file and look at the device usage.
[root@host ~]# dd if=/dev/urandom of=/tank/testfile bs=1M count=512
512+0 records in
512+0 records out
536870912 bytes (537 MB) copied, 9.03607 s, 59.4 MB/s
[root@host ~]# zpool list -v
NAME         SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
tank        2.95G  1.50G  1.45G         -    10%    50%  1.00x  ONLINE  -
  raidz2    2.95G  1.50G  1.45G         -    10%    50%
    sda         -      -      -         -      -      -
    sdb         -      -      -         -      -      -
    sdc         -      -      -         -      -      -
cache           -      -      -         -      -      -
  sdd       1010M   208M   801M         -     0%    20%
Alright, some of the data has already been moved to the L2ARC, but not all of it. So I read the file a few more times to get it into the L2ARC completely (see the small loop sketch after the output below).
[root@host ~]# dd if=/tank/testfile of=/dev/null bs=512 # until L2ARC is populated with the 512MB testfile
[root@host ~]# zpool list -v
NAME         SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
tank        2.95G  1.50G  1.45G         -    11%    50%  1.00x  ONLINE  -
  raidz2    2.95G  1.50G  1.45G         -    11%    50%
    sda         -      -      -         -      -      -
    sdb         -      -      -         -      -      -
    sdc         -      -      -         -      -      -
cache           -      -      -         -      -      -
  sdd       1010M   512M   498M         -     0%    50%
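If you do not want to re-run dd by hand, a rough loop like the following can repeat the read until the l2_size counter in arcstats stops growing. This is only a sketch and was not part of my test; it assumes the test file and the arcstats path used above.
# re-read the test file until the reported L2ARC size stops growing
prev=0
while true; do
    dd if=/tank/testfile of=/dev/null bs=512 2>/dev/null
    cur=$(awk '$1 == "l2_size" {print $3}' /proc/spl/kstat/zfs/arcstats)
    [ "$cur" -eq "$prev" ] && break
    prev=$cur
done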
Okay, the L2ARC is populated and ready to be read from. But one needs to get rid of the L1ARC (the in-memory ARC) first. I did the following, which seemed to work; a reusable version of this step is sketched after the arc_summary output below.
[root@host ~]# echo $((64*1024*1024)) > /sys/module/zfs/parameters/zfs_arc_max; sleep 5s; echo $((1024*1024*1024)) > /sys/module/zfs/parameters/zfs_arc_max; sleep 5s; arc_summary.py -p1
------------------------------------------------------------------------
ZFS Subsystem Report                            Sun Sep 09 17:03:55 2018
ARC Summary: (HEALTHY)
        Memory Throttle Count:                  0

ARC Misc:
        Deleted:                                20
        Mutex Misses:                           0
        Evict Skips:                            1

ARC Size:                               0.17%   1.75    MiB
        Target Size: (Adaptive)         100.00% 1.00    GiB
        Min Size (Hard Limit):          6.10%   62.48   MiB
        Max Size (High Water):          16:1    1.00    GiB

ARC Size Breakdown:
        Recently Used Cache Size:       96.06%  1.32    MiB
        Frequently Used Cache Size:     3.94%   55.50   KiB

ARC Hash Breakdown:
        Elements Max:                           48
        Elements Current:               100.00% 48
        Collisions:                             0
        Chain Max:                              0
        Chains:                                 0
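For convenience, the L1ARC-dropping step above can be wrapped in a small helper function. This is just a sketch around the exact commands I used; drop_l1arc is a name I made up, and the 64 MiB / 1 GiB values match the zfs_arc_max limits on this VM.
# shrink the ARC limit to force eviction, wait, then restore the old limit
drop_l1arc() {
    echo $((64*1024*1024))   > /sys/module/zfs/parameters/zfs_arc_max
    sleep 5
    echo $((1024*1024*1024)) > /sys/module/zfs/parameters/zfs_arc_max
    sleep 5
    arc_summary.py -p1       # confirm that "ARC Size" is close to zero
}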
Alright, now we are ready to read from the L2ARC (sorry for the long preface, but I thought it was important).
So I ran dd if=/tank/testfile of=/dev/null bs=512 again while watching zpool iostat -v 5 in a second terminal.
To my surprise, the file was read from the normal vdevs instead of the L2ARC, although the file sits in the L2ARC. It is the only file in the filesystem, and there was no other activity during my tests.
              capacity     operations     bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
tank        1.50G  1.45G    736     55  91.9M  96.0K
  raidz2    1.50G  1.45G    736     55  91.9M  96.0K
    sda         -      -    247     18  30.9M  32.0K
    sdb         -      -    238     18  29.8M  32.0K
    sdc         -      -    250     18  31.2M  32.0K
cache           -      -      -      -      -      -
  sdd        512M   498M      0      1  85.2K  1.10K
----------  -----  -----  -----  -----  -----  -----
I then fiddled around with some settings like zfetch_array_rd_sz, zfetch_max_distance, zfetch_max_streams, l2arc_write_boost and l2arc_write_max, setting them to absurdly high values. But nothing changed.
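For anyone reproducing this, the current values of those tunables can be checked in sysfs before changing them. A sketch (parameter names as they exist on ZFS on Linux 0.7):
# print the current values of the prefetch/L2ARC tunables mentioned above
for p in zfetch_array_rd_sz zfetch_max_distance zfetch_max_streams \
         l2arc_write_boost l2arc_write_max; do
    printf '%s=%s\n' "$p" "$(cat /sys/module/zfs/parameters/$p)"
done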
After changing l2arc_noprefetch=0 (default is 1), or zfs_prefetch_disable=1 (default is 0), i.e. toggling either of them away from its default, the reads are served from the L2ARC. To verify, I again got rid of the L1ARC, ran dd if=/tank/testfile of=/dev/null bs=512, and watched zpool iostat -v 5 in a second terminal.
[root@host ~]# echo 0 > /sys/module/zfs/parameters/l2arc_noprefetch
[root@host ~]# echo $((64*1024*1024)) > /sys/module/zfs/parameters/zfs_arc_max; sleep 5s; echo $((1024*1024*1024)) > /sys/module/zfs/parameters/zfs_arc_max; sleep 5s; arc_summary.py -p1
...
[root@host ~]# dd if=/tank/testfile of=/dev/null bs=512
And the result:
              capacity     operations     bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
tank        1.50G  1.45G      0     57    921   102K
  raidz2    1.50G  1.45G      0     57    921   102K
    sda         -      -      0     18      0  34.1K
    sdb         -      -      0     18      0  34.1K
    sdc         -      -      0     19    921  34.1K
cache           -      -      -      -      -      -
  sdd        512M   497M    736      0  91.9M   1023
----------  -----  -----  -----  -----  -----  -----
Now the data is read from the L2ARC, but only after toggling the module parameters mentioned above.
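In case anyone wants to keep that setting: the echo into sysfs does not survive a reboot. On my CentOS 7.5 box the usual way to persist it would be a module option (a sketch, assuming the standard modprobe.d mechanism; if zfs is loaded from the initramfs, the initramfs may need to be regenerated as well):
# persist the setting as a zfs module option
echo 'options zfs l2arc_noprefetch=0' >> /etc/modprobe.d/zfs.conf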
I have also read that an L2ARC can be sized too big, but the threads I found about that topic were referring to performance problems, or to the L2ARC header structures (which are kept in RAM) spoiling the L1ARC. Performance is not my problem here, and as far as I can tell the header overhead of the L2ARC is also not that big:
[root@host ~]# grep hdr /proc/spl/kstat/zfs/arcstats
hdr_size                        4    279712
l2_hdr_size                     4    319488
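Both counters are in bytes, so a quick conversion (just a sketch) shows the combined header overhead is well below a megabyte:
# sum the ARC and L2ARC header sizes and print the total in KiB
awk '$1 == "hdr_size" || $1 == "l2_hdr_size" {sum += $3} END {printf "%.1f KiB\n", sum/1024}' /proc/spl/kstat/zfs/arcstats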
As already mentioned, I am not sure if that is the intended behaviour or if I am missing something.