I have a program that compares files in two folders. I want to detect if a file has been renamed, determine the newest file (most recently renamed), and update the name on the old file to match.

To accomplish this, I would check to see if the newest file is bit by bit identical to the old one, and if it is, simply rename the old file to match the new one.

The problem is, I have nothing to key on to tell me which file was most recently renamed.

I would love some property like FileInfo.LastModified, but for files that have been renamed.

I've already looked at solutions like FileSystemWatcher, and that is not really what I'm looking for. I would like to be able to run my synchronizer whenever I want, without having to worry about some dedicated process tracking a folder's state.

Any ideas?

Ty Norton

5 Answers

1

If you are running on an NTFS drive, you can enable the change journal, which you can then query for things like rename events. However, you need to be an admin to enable it in the first place, and it will use disk space. Unfortunately, I don't know of any specific C# implementations for reading the journal.

tyranid
1

A: At least on NTFS, you can attach alternate data streams to a file. On your first sync, you can just attach a GUID in an ADS to the source files to tag them.

B: If you don't have write access to the source, store hashes of the files you synced in your target repository. When the source changes, you only have to hash the source files and only compare bit-by-bit if the hashes collide. Depending on the quality and speed of your hash function, this will save you a lot of time.
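A rough sketch of option B, written in Python rather than C# purely for illustration. It assumes the stored hashes live in a dict mapping each target file name to a SHA-256 digest recorded at the last sync; that manifest format and the names `file_digest` and `find_renames` are made up for this example. A hash match is treated only as a hint and is confirmed byte for byte before a rename is proposed.

```python
import filecmp
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_renames(source_dir, target_dir, stored_hashes):
    """Yield (old_target_path, new_name) pairs for probable renames.

    stored_hashes: hypothetical manifest, target file name -> digest
    recorded at the last sync.
    """
    source_dir, target_dir = Path(source_dir), Path(target_dir)

    # Invert the manifest so a digest leads back to the old name(s).
    by_digest = {}
    for name, digest in stored_hashes.items():
        by_digest.setdefault(digest, []).append(name)

    for src in sorted(source_dir.iterdir()):
        if not src.is_file() or src.name in stored_hashes:
            continue  # name already known, so not a rename candidate
        for old_name in by_digest.get(file_digest(src), []):
            if (source_dir / old_name).exists():
                continue  # old name still in the source: a copy, not a rename
            old = target_dir / old_name
            # Equal hashes are only a hint; confirm byte for byte.
            if old.is_file() and filecmp.cmp(src, old, shallow=False):
                yield old, src.name
                break
```

Each yielded pair could then be applied with something like `old.rename(old.with_name(new_name))`. Note that this only detects that a rename happened; it does not say which side was renamed most recently.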

Andras Vass
  • I already store hashes of directories on each end. Matching data bit-by-bit is already possible. I was just hoping I might be able to save myself some bandwidth when syncing over slow networks by moving pre-existing files. It's looking like there's no non-NTFS specific way to do this. – Ty Norton Feb 23 '10 at 17:17
  • This might work over mapped drives (or network shares) as well, provided that the source volume is NTFS. If you have not already found it, this may be of some help: http://www.codeproject.com/KB/cs/ntfsstreams.aspx – Andras Vass Feb 23 '10 at 22:45
0

You could create a config file that holds a list of all expected names within the folder. If a file in the folder is not in that list, you can conclude that it has been renamed. This does add another layer of work, though, since you'd have to update the list every time you add a new file to the folder.
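A minimal sketch of that idea in Python (not C#), assuming the expected names are kept as a JSON array in a manifest file stored outside the watched folder. The manifest format and the function name `unexpected_and_missing` are invented for this example.

```python
import json
from pathlib import Path


def unexpected_and_missing(folder, manifest_path):
    """Compare a folder against a JSON list of expected file names.

    Returns (unexpected, missing): names present but not expected,
    and names expected but no longer present.
    """
    expected = set(json.loads(Path(manifest_path).read_text()))
    actual = {p.name for p in Path(folder).iterdir() if p.is_file()}
    return sorted(actual - expected), sorted(expected - actual)
```

An unexpected name paired with a missing name is a rename candidate, but a brand-new file plus an unrelated deletion looks exactly the same, so the contents would still need to be compared before renaming anything.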

Aaron
0

Filesystems generally do not track this.

Since you seem to be on Windows, you can use GetFileInformationByHandle(). (Sorry, I don't know the C# equivalent.) You can use the "file index" fields in the struct returned to see if files have the same index as something you've seen before. Keep in mind that hardlinks will also have the same index.
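To illustrate the file index idea, here is a hedged sketch in Python (not C#). Python's `os.stat` reports the file index as `st_ino` on Windows when the file system provides one, and the inode number on Unix. The functions `snapshot` and `detect_renames` are hypothetical names, and the sketch assumes a snapshot from the previous run was saved somewhere.

```python
import os


def snapshot(folder):
    """Map (st_dev, st_ino) -> file name for each regular file in folder."""
    ids = {}
    for entry in os.scandir(folder):
        if entry.is_file(follow_symlinks=False):
            # Use os.stat() rather than entry.stat(): on Windows the
            # DirEntry result leaves st_ino and st_dev set to zero.
            st = os.stat(entry.path)
            # Hard links share an ID, so a later name overwrites an earlier one.
            ids[(st.st_dev, st.st_ino)] = entry.name
    return ids


def detect_renames(old_snapshot, new_snapshot):
    """Return sorted (old_name, new_name) pairs whose file ID is unchanged."""
    return sorted(
        (old_name, new_snapshot[file_id])
        for file_id, old_name in old_snapshot.items()
        if file_id in new_snapshot and new_snapshot[file_id] != old_name
    )
```

This only works when the same ID can be observed before and after the rename, so it needs a saved snapshot per folder, and it cannot match a file on one machine to its copy on another.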

Alternatively you could hash file contents somehow.

I don't know precisely what you're trying to do, so I can't tell you whether either of these points makes sense. It could be that the most reasonable answer is, "no, you can't do that."

asveikau
  • I'm pretty sure this is not what the OP is looking for, but it is an interesting idea nonetheless... probably better than the OP's plan of testing if the two files are bit-identical to determine a rename has been performed. – rmeador Feb 22 '10 at 23:28
  • That's a great idea, but there are a lot of problems with the file ID. The Remarks section says, "The identifier that is stored in the nFileIndexHigh and nFileIndexLow members is called the file ID. [So high index & low index => **file ID**] Support for **file ID**s is file system-specific [so, not all filesystems may support it...NTFS probably does, who knows if the rest do?]. File IDs are not guaranteed to be unique over time, because file systems are free to reuse them [but for a snapshot of time they will be, I guess]. In some cases, the file ID for a file can change over time." – Alexandru Nov 07 '14 at 20:29
0

I would compute a CRC (e.g. CRC example) of (all?) the files in the 2 directories, storing the last update time with the CRC value, file name, etc. After that, iterate through the lists finding matches by the CRC and then use the date values to decide what to do.
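Something along these lines, sketched in Python (not C#) with `zlib.crc32`. The record layout and the names `crc32_of`, `index_folder` and `name_mismatches` are assumptions made for this example.

```python
import os
import zlib


def crc32_of(path, chunk_size=1 << 20):
    """Return the CRC-32 of a file's contents, read in chunks."""
    crc = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return crc


def index_folder(folder):
    """Map CRC -> list of (name, mtime) for each regular file in folder."""
    index = {}
    for entry in os.scandir(folder):
        if entry.is_file():
            record = (entry.name, entry.stat().st_mtime)
            index.setdefault(crc32_of(entry.path), []).append(record)
    return index


def name_mismatches(left, right):
    """Return (left_name, left_mtime, right_name, right_mtime) tuples for
    files whose CRC matches across the two indexes but whose names differ."""
    pairs = []
    for crc in left.keys() & right.keys():
        left_names = {name for name, _ in left[crc]}
        right_names = {name for name, _ in right[crc]}
        for l_name, l_mtime in left[crc]:
            if l_name in right_names:
                continue  # same name on both sides, nothing to do
            for r_name, r_mtime in right[crc]:
                if r_name not in left_names:
                    pairs.append((l_name, l_mtime, r_name, r_mtime))
    return sorted(pairs)
```

CRC-32 is only 32 bits wide, so unrelated files can share a value; a match should be confirmed by comparing contents before acting on it. The two timestamps are returned so the caller can apply whatever date rule it wants.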

Paul Kohler
  • I already do this. The problem is that renaming a file doesn't modify any time stamps. – Ty Norton Feb 23 '10 at 17:14
  • Oh, as far as I am aware you can't rename a file without modifying its time-stamp. If you could, it would be a pretty low-level API call (not likely exposed by C#) – Paul Kohler Feb 23 '10 at 21:06