My application needs to open a lot of small files: to read all the data for a certain day it opens 1440 files, one per minute of data, each only a couple of kB in size. This is for a GUI application, so I want the user (== me!) not to have to wait too long.
It turns out that opening the files is rather slow. After investigating, I found that most of the time is spent creating a FileStream (OpenStream = new FileStream) for each file. Example code:
// create stream and reader
// (sw is a running Stopwatch, Tijden collects the per-file open times,
//  and Bestanden is my own storage class -- all declared elsewhere)
FileStream OpenStream;
BinaryReader bReader;
foreach (string file in files)
{
    // does the file exist? then read it and store it
    if (System.IO.File.Exists(file))
    {
        long Start = sw.ElapsedMilliseconds;
        // open the file read-only, otherwise the application can crash
        OpenStream = new FileStream(file, FileMode.Open, FileAccess.Read, FileShare.ReadWrite);
        Tijden.Add(sw.ElapsedMilliseconds - Start);
        bReader = new BinaryReader(OpenStream);
        // read everything in one go; this works well and fast
        // - keep track of whether appending is still possible; if not, stop appending
        blAppend &= Bestanden.Add(file, bReader.ReadBytes((int)OpenStream.Length), blAppend);
        // close the file (this also closes the underlying FileStream)
        bReader.Close();
    }
}
Using the Stopwatch timer, I see that most (> 80%) of the time is spent creating the FileStream for each file. Creating the BinaryReader and actually reading the file (Bestanden.Add) take almost no time.
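For completeness, here is a stripped-down, self-contained version of just the measurement, with everything except the FileStream constructor taken out (the directory path is a placeholder for one of my day directories):

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;

class OpenTimer
{
    static void Main()
    {
        // placeholder path: a directory holding the 1440 one-minute files
        string[] files = Directory.GetFiles(@"D:\data\20160101");

        var sw = Stopwatch.StartNew();
        var openTimes = new List<long>();

        foreach (string file in files)
        {
            long start = sw.ElapsedMilliseconds;
            using (var fs = new FileStream(file, FileMode.Open,
                                           FileAccess.Read, FileShare.ReadWrite))
            {
                // time only the constructor, not the read
                openTimes.Add(sw.ElapsedMilliseconds - start);
            }
        }

        Console.WriteLine("total open time: {0} ms over {1} files",
                          openTimes.Sum(), openTimes.Count);
    }
}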
I'm baffled by this and cannot find a way to speed it up. What can I do to speed up the creation of the FileStreams?
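One workaround I could try is to overlap the opens across threads so the per-file latency is hidden; a rough, untested sketch (using Parallel.ForEach and File.ReadAllBytes instead of my own reader class), though I would still like to understand why the constructor itself is so slow:

using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

// rough sketch: read all files in parallel so the slow opens overlap;
// 'files' is the same file list as in the code above
static ConcurrentDictionary<string, byte[]> ReadAllParallel(IEnumerable<string> files)
{
    var contents = new ConcurrentDictionary<string, byte[]>();
    Parallel.ForEach(files, file =>
    {
        if (File.Exists(file))
        {
            // File.ReadAllBytes opens, reads, and closes in one call
            contents[file] = File.ReadAllBytes(file);
        }
    });
    return contents;
}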
Update to the question:
- this happens on both Windows 7 and Windows 10
- the files are local (on an SSD)
- the directory contains only these 1440 files
- strangely, when reading the (same) files again later, creating the FileStreams suddenly costs almost no time at all; apparently the OS caches something about these files
- even if I close the application and restart it, opening the files "again" also costs almost no time, which makes the performance issue pretty hard to pin down. I had to make a lot of copies of the directory to reproduce the problem over and over (a minimal repro sketch follows below).
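To show the cold-versus-warm effect, this is roughly the kind of repro I used; the directory path is a placeholder for one of the fresh copies:

using System;
using System.Diagnostics;
using System.IO;

class WarmCacheRepro
{
    static long TimeOpens(string dir)
    {
        var sw = Stopwatch.StartNew();
        foreach (string file in Directory.GetFiles(dir))
        {
            using (var fs = new FileStream(file, FileMode.Open,
                                           FileAccess.Read, FileShare.ReadWrite))
            {
                // open only; the read itself is not what is slow
            }
        }
        return sw.ElapsedMilliseconds;
    }

    static void Main()
    {
        // placeholder path: a fresh, never-opened copy of a day directory
        string dir = @"D:\data\copy01";
        Console.WriteLine("first pass (cold):  {0} ms", TimeOpens(dir));
        Console.WriteLine("second pass (warm): {0} ms", TimeOpens(dir));
        // the second pass is near zero, even across application restarts
    }
}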