For zpools that are rather full and/or heavily fragmented, I usually enable metaslab debugging (echo metaslab_debug/W1 | mdb -kw) to avoid space map thrashing and the resulting severe write performance hit. The problem itself seems to be old and well understood. A fix has been rumored to be "in the works" for a while now, as has the defrag API, which presumably should help as well, yet I could not find an "official" approach that fixes it by default in production code.
Is there something I have missed?
Some environment data: my zpools are of moderate size (typically < 10 TB) and mostly present ZFS datasets with zvols using the default block size of 8K (the size actually written is variable, since compression is typically enabled). Over the years, I have seen this problem appear in different versions of Solaris, especially with aged zpools that have seen a lot of data written. Note that this is not the same as the zpool 90% full performance wall, as the space map thrashing due to fragmentation hits at a significantly lower space utilization level (I have seen it occur at 70% on a couple of old pools).