I have a file containing a number of fixed-length rows, each holding a number. I need to read each row, extract the number, process it, and write the result to a file. Since every row has to be read, the job gets slower as the number of rows grows.

Is there an efficient way of reading each row of the file? I'm using C#.

Jay

4 Answers

File.ReadLines (.NET 4.0+) is probably the most memory-efficient way to do this.

It returns an IEnumerable<string>, meaning that lines are read lazily, in a streaming fashion.

Earlier framework versions don't have streaming in this form, but reading line by line with a StreamReader achieves the same effect.
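For example, a minimal sketch of both approaches (`path` and the `int.Parse` step are placeholders for your actual file and row format):

    using System;
    using System.IO;

    // .NET 4.0+: lines are streamed one at a time, never all in memory at once
    // (path is a placeholder for your file's location)
    foreach (string line in File.ReadLines(path))
    {
        int number = int.Parse(line);  // each fixed-length row holds a number
        // ... process number and write the result out ...
    }

    // Pre-4.0 equivalent: StreamReader reads line by line the same way
    using (StreamReader reader = new StreamReader(path))
    {
        string row;
        while ((row = reader.ReadLine()) != null)
        {
            int number = int.Parse(row);
            // ... process number ...
        }
    }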

Oded

Reading all rows from a file is always at least O(n). When file size starts becoming an issue, it's probably a good time to look at moving the information into a database instead of flat files.
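If the data does end up in a database, a bulk load is usually the fastest way in. A rough sketch using SqlBulkCopy (the connection string, the `Numbers` table, and `input.txt` are all assumptions, not something from the question):

    using System;
    using System.Data;
    using System.Data.SqlClient;
    using System.IO;

    // Build an in-memory table from the flat file...
    // ("input.txt" is a placeholder file name)
    DataTable table = new DataTable();
    table.Columns.Add("Value", typeof(int));
    foreach (string line in File.ReadLines("input.txt"))
        table.Rows.Add(int.Parse(line));

    // ...then push it to SQL Server in one bulk operation
    // (connectionString and the Numbers table are assumed to exist)
    using (SqlConnection connection = new SqlConnection(connectionString))
    {
        connection.Open();
        using (SqlBulkCopy bulk = new SqlBulkCopy(connection))
        {
            bulk.DestinationTableName = "Numbers";
            bulk.WriteToServer(table);
        }
    }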

Ryathal
  • Well, the files are produced by external hardware, and there are actually a large number of them... any way of efficiently reading the files would be appreciated. – Jay Feb 09 '12 at 14:57

Not sure this is the most efficient, but it works well for me: http://msdn.microsoft.com/en-us/library/system.io.fileinfo.aspx

    using System;
    using System.IO;

    // Point a FileInfo at your file
    FileInfo fi1 = new FileInfo(path);

    // Open the file and read it line by line
    using (StreamReader sr = fi1.OpenText())
    {
        string s;
        // Loop until ReadLine returns null (end of file)
        while ((s = sr.ReadLine()) != null)
        {
            // Here is where you handle each row in the file
            Console.WriteLine(s);
        }
    }
David Welker
  • What I do after this, rather than just writing the line to the console, is convert the line into an array and import the data into a database table. It runs very fast through a tab-delimited file, but I hardly ever have to go through more than a couple thousand records with it. – Feb 09 '12 at 15:27
  • Unless you are doing something special with the StreamReader, which in this example you are not, you can just write `foreach(var line in File.ReadLines(path)) { Console.WriteLine(line); }`. – Philip Feb 10 '12 at 15:51

No matter which operating system you're using, there are several layers between your code and the actual storage mechanism. Hard drives and tape drives store files in blocks, which these days are usually around 4K each. If you want to read one byte, the device will still read the entire block into memory; it's just faster that way. The device and the OS may also each keep a cache of blocks. So there's not much you can do to change the standard (highly optimized) file-reading behavior: just read the file as you need it and let the system take care of the rest.
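That said, if you want to cut down on the number of read calls your process makes, you can hand the reader a larger buffer. A small sketch (the 64 KB size is an arbitrary assumption; `path` is a placeholder):

    using System;
    using System.IO;

    // The OS still reads in blocks either way; a larger user-space buffer
    // just means fewer read calls from your process. 65536 is an arbitrary choice.
    using (FileStream stream = new FileStream(path, FileMode.Open,
               FileAccess.Read, FileShare.Read, 65536))
    using (StreamReader reader = new StreamReader(stream))
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            // ... process the row ...
        }
    }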

If the time to process the file is becoming a problem, two options that might help are:

  1. Try to arrange to use shorter files. It sounds like you're processing log files or something -- running your program more frequently might help to at least give the appearance of better performance.

  2. Change the way the data is stored. Again, I understand that the file comes from some external source, but perhaps you can arrange for a job to run that periodically converts the raw file to something you can read more quickly (see the sketch after this list).
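Assuming each row is a fixed-length decimal number (my assumption, not something stated in the question), a one-time conversion job could store the values in binary so later passes skip the text parsing entirely:

    using System;
    using System.IO;

    // One-time conversion: parse each text row once, store as 4-byte ints
    // ("input.txt" and "numbers.bin" are placeholder names)
    using (BinaryWriter writer = new BinaryWriter(File.Create("numbers.bin")))
    {
        foreach (string line in File.ReadLines("input.txt"))
            writer.Write(int.Parse(line));
    }

    // Later passes read the compact binary form directly
    using (BinaryReader reader = new BinaryReader(File.OpenRead("numbers.bin")))
    {
        while (reader.BaseStream.Position < reader.BaseStream.Length)
        {
            int number = reader.ReadInt32();
            // ... process number ...
        }
    }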

Good luck.

Caleb