I'm experimenting with FSCTL_MOVE_FILE. Mostly everything is working as expected. However, sometimes if I try to re-read (via FSCTL_GET_NTFS_FILE_RECORD) the MFT record I just moved, I'm getting some bad data.
Specifically, if the file record says the $ATTRIBUTE_LIST attribute is non-resident and I use my volume handle to read the data from the disk, I find that the data there is internally inconsistent (record length is greater than the actual length of data).
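For context, the re-read itself is nothing exotic. It's roughly the following (a minimal sketch; ReadMftRecord and its parameters are just names I've made up for this post):

#include <windows.h>
#include <winioctl.h>

// Sketch: read one MFT record via FSCTL_GET_NTFS_FILE_RECORD.
// hVolume is the \\.\X: handle, frn is the file reference number.
BOOL ReadMftRecord(HANDLE hVolume, LONGLONG frn,
                   BYTE* outBuf, DWORD outBufSize, DWORD* recordLen)
{
    NTFS_FILE_RECORD_INPUT_BUFFER in;
    in.FileReferenceNumber.QuadPart = frn;

    // outBuf must be at least sizeof(NTFS_FILE_RECORD_OUTPUT_BUFFER) +
    // BytesPerFileRecordSegment - 1, where BytesPerFileRecordSegment comes
    // from FSCTL_GET_NTFS_VOLUME_DATA (typically 1024).
    DWORD bytes = 0;
    if (!DeviceIoControl(hVolume, FSCTL_GET_NTFS_FILE_RECORD,
                         &in, sizeof(in), outBuf, outBufSize, &bytes, NULL))
        return FALSE;

    NTFS_FILE_RECORD_OUTPUT_BUFFER* out = (NTFS_FILE_RECORD_OUTPUT_BUFFER*)outBuf;
    *recordLen = out->FileRecordLength;   // FILE record starts at out->FileRecordBuffer
    return TRUE;
}

If the record in FileRecordBuffer says $ATTRIBUTE_LIST is non-resident, I then read its clusters through the same volume handle, and that's where the inconsistent data shows up.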
As soon as I saw this happening, the cause was pretty clear: I'm reading the record before the Ntfs driver has finished writing it. Debugging supports this theory. But knowing that doesn't help me solve it. I'm using the synchronous method for the FSCTL_MOVE_FILE call, but apparently the file system can still be updating things in the background. Hmm.
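By the 'synchronous method' I just mean a plain DeviceIoControl call with no OVERLAPPED structure, roughly like this (another sketch; MoveClusters and its arguments are placeholder names):

#include <windows.h>
#include <winioctl.h>

// Sketch of the move itself. hVolume is the \\.\X: handle, hFile is the file
// being moved. With no OVERLAPPED, DeviceIoControl doesn't return until the
// driver reports the move as complete.
BOOL MoveClusters(HANDLE hVolume, HANDLE hFile,
                  LONGLONG startVcn, LONGLONG targetLcn, DWORD clusterCount)
{
    MOVE_FILE_DATA mfd;
    mfd.FileHandle = hFile;
    mfd.StartingVcn.QuadPart = startVcn;
    mfd.StartingLcn.QuadPart = targetLcn;
    mfd.ClusterCount = clusterCount;

    DWORD bytes = 0;
    return DeviceIoControl(hVolume, FSCTL_MOVE_FILE,
                           &mfd, sizeof(mfd), NULL, 0, &bytes, NULL);
}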
In a normal file, I'd be thinking LockFileEx with a shared lock (since I'm just reading). But I'm not sure that has any meaning for volume handles, and I'm even less sure Ntfs uses this mechanism internally to ensure consistency.
Still, it seems like a place to start. But my LockFileEx call against the volume handle is returning ERROR_INVALID_PARAMETER. I'm not seeing which parameter might be in error, unless it's the volume handle itself. Perhaps volume handles just don't support locks? Or maybe there are some special flags I'm supposed to set in CreateFile when opening the volume handle? I've tried enabling the SE_BACKUP_NAME privilege and passing FILE_FLAG_BACKUP_SEMANTICS, but the error remains unchanged.
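For reference, here's roughly how I'm enabling the privilege and opening the handle. This is just the standard AdjustTokenPrivileges sequence (error handling trimmed); LockFileEx fails the same way with or without it:

#include <windows.h>

// Enable SeBackupPrivilege for the current process token.
BOOL EnableBackupPrivilege(void)
{
    HANDLE hToken;
    if (!OpenProcessToken(GetCurrentProcess(),
                          TOKEN_ADJUST_PRIVILEGES | TOKEN_QUERY, &hToken))
        return FALSE;

    TOKEN_PRIVILEGES tp;
    tp.PrivilegeCount = 1;
    tp.Privileges[0].Attributes = SE_PRIVILEGE_ENABLED;
    if (!LookupPrivilegeValue(NULL, SE_BACKUP_NAME, &tp.Privileges[0].Luid))
    {
        CloseHandle(hToken);
        return FALSE;
    }

    // AdjustTokenPrivileges can "succeed" without assigning the privilege,
    // so check GetLastError as well.
    BOOL ok = AdjustTokenPrivileges(hToken, FALSE, &tp, 0, NULL, NULL)
              && GetLastError() == ERROR_SUCCESS;
    CloseHandle(hToken);
    return ok;
}

// And the volume open, with backup semantics:
// HANDLE hVolume = CreateFile(L"\\\\.\\J:", GENERIC_READ,
//                             FILE_SHARE_READ | FILE_SHARE_WRITE, NULL,
//                             OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL);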
Moving forward, I can see a few alternatives here:
- Figure out how to lock sections using a volume handle (and hope the Ntfs driver is doing the same). Seems dubious at this point.
- Figure out how to flush the metadata for the file I just moved (nb: FlushFileBuffers on the MOVE_FILE_DATA.FileHandle didn't help. Maybe flushing the volume handle? See the sketch after this list).
- Is there some 'official' means for reading non-resident data that doesn't involve ReadFile against a volume handle? I didn't find one, but maybe I missed it.
- Wait "a bit" after moving data to let the driver complete updating everything. Yuck.
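On the flush idea, this is what I mean by flushing the volume handle (a sketch; note that FlushFileBuffers against a volume handle needs GENERIC_WRITE access and admin rights):

#include <windows.h>

// Flush the volume's cached data/metadata after the move.
BOOL FlushVolume(LPCWSTR volumePath /* e.g. L"\\\\.\\J:" */)
{
    HANDLE hVolume = CreateFile(volumePath,
                                GENERIC_READ | GENERIC_WRITE,
                                FILE_SHARE_READ | FILE_SHARE_WRITE,
                                NULL, OPEN_EXISTING, 0, NULL);
    if (hVolume == INVALID_HANDLE_VALUE)
        return FALSE;

    BOOL ok = FlushFileBuffers(hVolume);
    CloseHandle(hVolume);
    return ok;
}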
FWIW, here's some test code for doing LockFileEx against a volume handle. Note that you must be running as an administrator to lock volume handles. I'm using J:, since that's my flash drive. The offset of 50000 was picked at random, but it should be less than the size of the drive.
#include <windows.h>

void Lock()
{
    WCHAR path[] = L"\\\\.\\j:";
    HANDLE hRootHandle = CreateFile(path,
                                    GENERIC_READ,
                                    FILE_SHARE_READ | FILE_SHARE_WRITE,
                                    NULL,
                                    OPEN_EXISTING,
                                    0,
                                    NULL);
    if (hRootHandle == INVALID_HANDLE_VALUE)
        return;   // opening the volume requires admin rights

    OVERLAPPED olap;
    memset(&olap, 0, sizeof(olap));
    olap.Offset = 50000;

    // Shared lock (no LOCKFILE_EXCLUSIVE_LOCK), fail immediately if contended.
    // Lock 1k of data at offset 50000.
    BOOL b = LockFileEx(hRootHandle, LOCKFILE_FAIL_IMMEDIATELY, 0, 1024, 0, &olap);
    DWORD j = GetLastError();   // b == FALSE, j == ERROR_INVALID_PARAMETER

    CloseHandle(hRootHandle);
}
The code for seeing the bad data is... rather involved. However, it is readily reproducible. When it fails, I end up trying to read variable-length $ATTRIBUTE_LIST entries that have a length of 0, which results in an infinite loop since it looks like I never finish reading the entire buffer. I'm working around it by bailing out if the length is zero, but I worry about "leftover garbage" in the buffer instead of nice clean zeros. Detecting that would be impossible, so I'm hoping for a better solution.
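To make the failure mode concrete, the walk over the attribute list buffer looks roughly like this (a sketch with my own struct names rather than an official header; the offsets follow the usual NTFS documentation, and the zero-length check is the workaround mentioned above):

#include <windows.h>

// Approximate on-disk layout of one $ATTRIBUTE_LIST entry.
#pragma pack(push, 1)
typedef struct {
    DWORD     AttributeTypeCode;   // e.g. 0x80 for $DATA
    USHORT    RecordLength;        // length of this entry, including the name
    UCHAR     AttributeNameLength;
    UCHAR     AttributeNameOffset;
    ULONGLONG LowestVcn;
    ULONGLONG SegmentReference;    // MFT_SEGMENT_REFERENCE of the child record
    USHORT    AttributeId;
    // WCHAR  AttributeName[];     // variable length
} ATTR_LIST_ENTRY;
#pragma pack(pop)

// Walk the (already-read) attribute list buffer. Without the RecordLength == 0
// check, a torn read leaves the loop spinning on the same offset forever.
void WalkAttributeList(const BYTE* buf, DWORD bufLen)
{
    DWORD offset = 0;
    while (offset + sizeof(ATTR_LIST_ENTRY) <= bufLen)
    {
        const ATTR_LIST_ENTRY* e = (const ATTR_LIST_ENTRY*)(buf + offset);
        if (e->RecordLength == 0)
            break;   // torn/unwritten entry: bail instead of looping forever

        // ... examine e->AttributeTypeCode, e->SegmentReference, etc. ...

        offset += e->RecordLength;
    }
}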
Not surprisingly, there isn't a lot of info out there on any of this. So if someone has some experience here, I could use some insight.
Edit 1:
More things that don't quite work:
- Still no luck on LockFileEx.
- I tried flushing the volume handle (as Paul suggested). And while this works, it more than doubles my execution time. And, strictly speaking, it still doesn't solve the problem. There's still no guarantee that Ntfs isn't going to change things some more between the FlushFileBuffers and FSCTL_GET_NTFS_FILE_RECORD / ReadFile.
- I wondered about the 'RecordChanged' timestamp in the $STANDARD_INFORMATION attribute. However, it isn't being updated by these changes to the $ATTRIBUTE_LIST.
- Fragmenting a file eventually causes an $ATTRIBUTE_LIST to get added, and as fragmentation continues to increase, more $DATA records get added to that list. When a $DATA record gets added, the UpdateSequenceNumber (not the one that's part of the MFT_SEGMENT_REFERENCE, the other one) gets updated. Unfortunately, there's a sequence of events involved in performing this update, and apparently the $ATTRIBUTE_LIST buffer's length gets updated before the UpdateSequenceNumber. So checking whether the UpdateSequenceNumber has changed doesn't help me avoid reading (potentially) bad information.
My next best thought is to see whether Ntfs always zeroes the new bytes before updating the record length (or maybe whenever the record length shrinks?). If I can depend on reading a zero length there (instead of whatever leftover data might occupy those bytes), I can pretend to call this fixed.