
I run a ZFS scrub once a month, which takes ~24 hours for the 72 TB. From what I can find, the advice on how often to scrub is:

  • very busy pools, once per week
  • not so busy pools, once per month

or run a scrub, measure the time, and adjust accordingly.
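
For reference, such a schedule is just a cron job around zpool scrub. A minimal sketch, with the pool name tank and the timing as placeholders:

    # /etc/crontab entry: scrub the pool "tank" at 02:00 on the 1st of each month
    # (pool name, schedule and binary path are placeholders - adjust to your system)
    0 2 1 * * root /sbin/zpool scrub tank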

But what about online defragmentation?

Question

Is online defragmentation considered good practice like scrubbing is? And if so, how often should I defrag?

Sandra
    Due to the way this question is worded, I feel it necessary to emphasise that a ZFS scrub does not defrag, see Nex7's comment for what a scrub does. The title asks about defrag, most of the body talks about scrub, and the last line asks about defrag, this is why I voted down this question. – BeowulfNode42 May 25 '14 at 00:56

2 Answers


This is not something you need to do, mainly because there's no notion of online defragmentation in ZFS. It's really only possible by copying the pool data to another pool or rewriting it to new storage. Strive to keep your zpools below 70% utilization instead.
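
As a rough way to keep an eye on that threshold, zpool list can show both utilization and free-space fragmentation (the FRAG column is available on newer OpenZFS releases; the pool name tank is a placeholder):

    # Show capacity (CAP) and free-space fragmentation (FRAG) for the pool;
    # "tank" is a placeholder, FRAG requires a reasonably recent OpenZFS
    zpool list -o name,size,alloc,free,cap,frag tank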

ewwhite
    I would say not /nearly/ so often. A scrub simply reads every block on the system. That's all it does. By virtue of how ZFS handles reads, ZFS will automatically repair any bad blocks when they're read. This means a scrub is only necessary to read data that isn't commonly being read anyway. Thus, the asker's comment of very busy pools once per week and not so busy pools once per month is actually backwards. A very busy pool, especially one that reads the majority of the live data regularly, need not scrub that often. A quiet pool that is rarely accessed should scrub more often. – Nex7 Dec 08 '13 at 17:04
  • It's important to at least scrub reasonably often (or, more generally, ensure all of the drive data gets read often) because otherwise, ZFS will have no idea what the status of the data on the disk actually is, and you may find upon resilvering after a disk failure that your redundant drives contain corrupted data that went undetected due to lack of reads (and then restore your data from backup, or just lose it) – Thomas Apr 22 '17 at 10:25
  • Why should one strive to keep the zpool below 70% utilization? – lindhe Jun 27 '22 at 16:13
  • @lindhe The number is higher today. But above some level of free-space fragmentation, performance becomes an issue. – ewwhite Jun 27 '22 at 17:48

I know this is an old question, but I felt I could add a bit more in case you come across this today like I have.

ZFS doesn't have a built-in option for defragmentation. Because of how blocks are allocated, how ZFS is copy-on-write, and how snapshots lock blocks down, you can't really defragment data in place. The only solution I know of is to create a pool of equivalent size, ZFS send/receive the data to it, then destroy the old pool and recreate it.
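
A minimal sketch of that rewrite, assuming a source pool tank and an empty, equally sized pool newtank (both names are placeholders):

    # Take a recursive snapshot of everything in the old pool
    zfs snapshot -r tank@rewrite
    # Replicate the whole pool (datasets, snapshots, properties) to the new one;
    # -R builds a replication stream, -F lets the receive overwrite the target
    zfs send -R tank@rewrite | zfs receive -F newtank
    # Once the copy is verified, retire the old pool
    zpool destroy tank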

Also, it's worth mentioning that you have your scrub schedule backwards. Data you read a lot constantly has its checksums validated, whereas quiescent data sits there rotting without its block/pointer checksums ever being verified.

Generally, most people scrub heavily used datasets no more than once a month (even less often if you know 90%+ of your data is being read regularly, as on a web server).

For data that isn't used often, scrubbing twice a month or once a week is good practice (depending on the number of disks, how much data you have, how old the drives are, etc.). YMMV.
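
Whatever interval you pick, zpool status shows how long the last scrub took and whether it repaired anything, which is the number to watch when adjusting the schedule (pool name is a placeholder):

    # The "scan:" line reports the last scrub's duration, repaired bytes and errors
    zpool status tank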

IOcase