3

I'm trying to write a function in C# that takes a directory path as a parameter and returns a dictionary where the keys are the files directly under that directory and the values are their last modification times. This is easy to do with Directory.GetFiles() and then File.GetLastWriteTime(). However, this means that every file must be accessed, which is too slow for my needs. Is there a way to do this while accessing just the directory? Does the file system even support this kind of requirement?

Edit, after reading some answers: Thank you guys, you are all saying pretty much the same thing: use FileInfo objects. Still, it is just as slow to use Directory.GetFiles() (or Directory.EnumerateFiles()) to get those objects, and I suspect that getting them requires access to every file. If the file system keeps the last modification time of its files only in the files themselves, there can't be a way to extract that info without file access. Is this the case here? Do GetFiles() and EnumerateFiles() of DirectoryInfo access every file, or do they get their info from the directory entry? I know that if I wanted just the file names, I could get them with the Directory class without accessing every file. But getting attributes seems trickier...

Edit, following Henk's response: it seems that it really is faster to use FileInfo objects. I created the following test:

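static void Main(string[] args)
{
    // Requires: using System; using System.IO; using System.Linq;
    Console.WriteLine(DateTime.Now);

    // Approach 1: list the file names, then query each file's timestamp separately.
    foreach (string file in Directory.GetFiles(@"\\169.254.78.161\dir"))
    {
        DateTime x = File.GetLastWriteTime(file);
    }

    Console.WriteLine(DateTime.Now);

    // Approach 2: enumerate FileInfo objects and read the timestamp from each one.
    DirectoryInfo dirInfo2 = new DirectoryInfo(@"\\169.254.78.161\dir");
    var files2 = from f in dirInfo2.EnumerateFiles()
                 select f;
    foreach (FileInfo file in files2)
    {
        DateTime x = file.LastWriteTime;
    }

    Console.WriteLine(DateTime.Now);
}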

For about 800 files, I usually get something like:
31/08/2011 17:14:48
31/08/2011 17:14:51
31/08/2011 17:14:52

Yoni
  • No, this info doesn't require accessing the files, only the directory entries in the MFT. You'll need a faster disk if that's too slow. – Hans Passant Aug 31 '11 at 12:12
  • @Yoni: How/what are you measuring? Did you compare a call to GetFiles to a loop that Opens/Closes each file? I would expect GetFiles() to take about the same time for 100|200 files while the loop would take twice as long. – H H Aug 31 '11 at 13:47
  • I think you are right. I did some testing, see edit above – Yoni Aug 31 '11 at 14:17
  • I'm sorry to say but your test is seriously flawed. Due to caching the 2nd run will always be faster. Caching is hard to eliminate anyway, and the fact that it's over a network also plays a role. Test on a local disk, with alternating runs and use the `System.Diagnostics.Stopwatch` class. – H H Aug 31 '11 at 14:38
  • Funny, I used a network drive because I thought the caching on local drives is much heavier (and also because I intend this function to work on a network drive). I did copy the test directory around, trying to eliminate the caching factor, and I did alternate the order. The results are similar. – Yoni Aug 31 '11 at 16:11
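
A minimal sketch of the kind of measurement H H describes in the comments above: `System.Diagnostics.Stopwatch` instead of `DateTime.Now`, with the order of the two approaches alternating between rounds so that caching does not always favour the same one. The class and method names (`ListingBenchmark`, `ViaFileClass`, `ViaFileInfo`, `Time`, `Run`) are made up for illustration and do not come from the thread. Each approach returns a small checksum of the timestamps so the reads cannot be skipped as dead code.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class ListingBenchmark
{
    // Approach 1: list the names, then ask for each timestamp by path.
    public static long ViaFileClass(string path)
    {
        long checksum = 0;
        foreach (string file in Directory.GetFiles(path))
        {
            checksum ^= File.GetLastWriteTime(file).Ticks;
        }
        return checksum;
    }

    // Approach 2: enumerate FileInfo objects and read the timestamp from each.
    public static long ViaFileInfo(string path)
    {
        long checksum = 0;
        foreach (FileInfo file in new DirectoryInfo(path).EnumerateFiles())
        {
            checksum ^= file.LastWriteTime.Ticks;
        }
        return checksum;
    }

    // Times a single run of one approach.
    public static TimeSpan Time(Func<string, long> approach, string path)
    {
        Stopwatch watch = Stopwatch.StartNew();
        approach(path);
        watch.Stop();
        return watch.Elapsed;
    }

    // Runs both approaches 'rounds' times, swapping which one goes first each round.
    public static void Run(string path, int rounds)
    {
        for (int round = 0; round < rounds; round++)
        {
            TimeSpan viaFile, viaInfo;
            if (round % 2 == 0)
            {
                viaFile = Time(ViaFileClass, path);
                viaInfo = Time(ViaFileInfo, path);
            }
            else
            {
                viaInfo = Time(ViaFileInfo, path);
                viaFile = Time(ViaFileClass, path);
            }
            Console.WriteLine("round {0}: File.GetLastWriteTime {1} ms, FileInfo.LastWriteTime {2} ms",
                round, viaFile.TotalMilliseconds, viaInfo.TotalMilliseconds);
        }
    }
}
```

Calling `ListingBenchmark.Run(@"\\169.254.78.161\dir", 6)` from `Main` would print one line per round. This does not eliminate caching, it only keeps it from consistently helping one side.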

2 Answers

2

I didn't do any timings but your best bet is:

DirectoryInfo di = new DirectoryInfo(myPath);
FileInfo[] files = di.GetFiles();

I think all the FileInfo attributes are available in the directory file records so this should (could) require the minimum I/O.
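
For the dictionary the question asks for, a sketch along these lines might do. `DirectoryTimes` and `GetLastWriteTimes` are hypothetical names, not part of the framework, and using the full path as the key is an assumption about what the caller wants. As far as I know, on .NET 4 and later on Windows the FileInfo objects returned by the enumeration already carry the timestamps from the directory listing, so reading `LastWriteTime` should not need a further query per file, but I have not verified that for every runtime or file system.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class DirectoryTimes
{
    // Hypothetical helper: maps the full path of each file directly under
    // 'path' (no recursion) to its last write time in local time.
    public static Dictionary<string, DateTime> GetLastWriteTimes(string path)
    {
        var result = new Dictionary<string, DateTime>();
        foreach (FileInfo file in new DirectoryInfo(path).EnumerateFiles())
        {
            result[file.FullName] = file.LastWriteTime;
        }
        return result;
    }
}
```

`EnumerateFiles()` streams the entries instead of building the whole array first, which mostly matters for very large directories; `GetFiles()` would work the same way here.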

H H
  • I don't know how GetFiles() is implemented, but according to the times I measure, it seems that it accesses every file to get the info – Yoni Aug 31 '11 at 13:19
1

The only other thing I can think of is using the FileInfo class. As far as I can see, this might help you, or it might read the file as well (read permissions are required).

Random Dev
  • Thank you. I don't know exactly what you mean, but that one somewhat helps: http://msdn.microsoft.com/en-us/library/dd413232.aspx – Yoni Aug 31 '11 at 12:00
  • That's just a way to get to the FileInfo objects, as I said. What I don't know is whether those *Info objects behave the same as your original code internally... but never mind. – Random Dev Aug 31 '11 at 12:18
  • Applying EnumerateFiles() to a DirectoryInfo object will return all the FileInfo objects, which include the last write time that I need. However, it still accesses every file. – Yoni Aug 31 '11 at 13:03
  • That is exactly what I was trying to say with "or it might read the file as well". I guess there is no better way. Maybe some WinAPI wizards have some magic hidden deep down in the Windows bowels, but for the framework I cannot think of another solution. – Random Dev Aug 31 '11 at 13:17