I have two relatively new 4 TB hard drives (WD Data Center Re WD4000FYYZ) formatted as a single btrfs volume with raid1 data and raid1 metadata.
I copied a large binary file (~76 GB) to the volume. Soon after the copy finished, I ran a btrfs scrub, and it reported no errors.
A few months later, a scrub returned an unrecoverable error on that file. The file has not been modified since it was originally copied. The SMART attributes for both drives do not indicate any errors (Current_Pending_Sector or otherwise).
The system with the drives does not have ECC memory.
The only cause I can think of is memory corruption. Suppose another file was being written, and its data checksums were stored in the same block as some of the checksums for the big file. If a bit flipped in memory while that block was being updated, bad data could have polluted one or more of the big file's checksums before the block was written back to both drives.
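To make that hypothesis concrete, here is a toy Python sketch of what I have in mind. It is only a model under my own assumptions, not btrfs's real on-disk layout: the dictionaries standing in for the data mirrors and the shared checksum block, the names `checksum`, `scrub` and `simulate`, the 4096-byte block size, and the use of CRC32 are all made up for illustration.

```python
import zlib

BLOCK_SIZE = 4096  # toy block size; every name in this sketch is hypothetical


def checksum(block: bytes) -> int:
    """CRC32 stands in for the filesystem's data checksum."""
    return zlib.crc32(block)


def scrub(data_mirrors, leaf_mirrors, file_name, n_blocks):
    """Return the block indices for which no mirror passes its checksum."""
    unrecoverable = []
    for i in range(n_blocks):
        key = (file_name, i)
        good = any(
            checksum(data[key]) == leaf[key]
            for data, leaf in zip(data_mirrors, leaf_mirrors)
        )
        if not good:
            unrecoverable.append(i)
    return unrecoverable


def simulate():
    # Data as stored on disk: two identical mirrors (raid1).
    disk = {("big", i): bytes([i]) * BLOCK_SIZE for i in range(4)}
    disk[("other", 0)] = b"a" * BLOCK_SIZE
    data_mirrors = [dict(disk), dict(disk)]

    # One shared "checksum block" covering both files, also mirrored.
    leaf = {key: checksum(block) for key, block in disk.items()}
    leaf_mirrors = [dict(leaf), dict(leaf)]
    first_scrub = scrub(data_mirrors, leaf_mirrors, "big", 4)

    # Months later: only the OTHER file is rewritten.
    new_block = b"b" * BLOCK_SIZE
    for mirror in data_mirrors:
        mirror[("other", 0)] = new_block

    # The shared checksum block is read into memory and updated...
    in_memory = dict(leaf_mirrors[0])
    in_memory[("other", 0)] = checksum(new_block)
    # ...and (assumption) a bit flips in non-ECC RAM inside the big
    # file's entry before the block is written back.
    in_memory[("big", 2)] ^= 1 << 7

    # Both mirrors receive the same, already-corrupted block.
    leaf_mirrors = [dict(in_memory), dict(in_memory)]
    second_scrub = scrub(data_mirrors, leaf_mirrors, "big", 4)
    return first_scrub, second_scrub


if __name__ == "__main__":
    print(simulate())  # ([], [2])
```

In this model the big file's data is untouched on both drives, yet the second scrub flags block 2 as unrecoverable, because both mirrors hold the same bad checksum and neither copy of the data can match it. Whether btrfs can actually fail this way is exactly what I am asking.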
When I migrated to btrfs, I had hoped that once data was written and scrubbed successfully, I could be confident it would remain intact as long as it was not written to (in a raid1/5/6 configuration, of course). Evidently, this is not the case.
Can anyone explain how this could have happened? Also, if I had taken a snapshot of the volume that contained the big file, would I still have had access to the original, uncorrupted data from the snapshot?