23

I have a program which reads a text file and splits it into sections.

How can the program be changed so that it skips the first 5 lines of the file while still using a StreamReader to read it?

Could someone please advise on the code? Thanks!

The code:

class Program
{
    static void Main(string[] args)
    {
        TextReader tr = new StreamReader(@"C:\Test\new.txt");

        String SplitBy = "----------------------------------------";

        // Skip first 5 lines of the text file?
        String fullLog = tr.ReadToEnd();

        String[] sections = fullLog.Split(new string[] { SplitBy }, StringSplitOptions.None);

        //String[] lines = sections.Skip(5).ToArray();

        foreach (String r in sections)
        {
            Console.WriteLine(r);
            Console.WriteLine("============================================================");
        }
    }
}
JavaNoob
  • So what's the issue with using the commented-out line? – Ilia G Dec 11 '10 at 18:54
  • It's to show experts that the .Split method does not work. – JavaNoob Dec 11 '10 at 19:00
  • possible duplicate of [C# How to skip lines in Text file after text coverted to array?](http://stackoverflow.com/questions/4417916/c-how-to-skip-lines-in-text-file-after-text-coverted-to-array) – ChrisF Dec 11 '10 at 19:01
  • How does the `Split()` not work? It is very suboptimal on large files of course, but it is functional. – Ilia G Dec 11 '10 at 19:16

6 Answers

28

Try the following:

// Skip 5 lines
for(var i = 0; i < 5; i++) {
  tr.ReadLine();
}

// Read the rest
string remainingText = tr.ReadToEnd();
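
Combined with the splitting step from the question, the loop above could be wrapped up roughly as in the sketch below. The class `LogSections`, the method `ReadSections` and its parameters are hypothetical names chosen for illustration. The early exit on `null` is an addition so that files shorter than the skip count do not cause pointless extra reads.

```csharp
using System;
using System.IO;

static class LogSections
{
    // Hypothetical helper: skips the first `linesToSkip` lines of `reader`,
    // then splits whatever is left on `separator`.
    public static string[] ReadSections(TextReader reader, int linesToSkip, string separator)
    {
        for (var i = 0; i < linesToSkip; i++)
        {
            // ReadLine returns null at end of input; stop early in that case.
            if (reader.ReadLine() == null) break;
        }

        // Read the rest in one go, as in the question.
        string remainingText = reader.ReadToEnd();
        return remainingText.Split(new[] { separator }, StringSplitOptions.None);
    }
}
```

With the question's variables, a call might look like `LogSections.ReadSections(tr, 5, SplitBy)`, ideally with `tr` created inside a `using` block so the file is closed afterwards.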
JaredPar
13

If the lines have a fixed length in bytes (counting the line terminator), then the most efficient way is to seek straight to the line you want:

using( Stream stream = File.Open(fileName, FileMode.Open) )
{
    stream.Seek(bytesPerLine * (myLine - 1), SeekOrigin.Begin);
    using( StreamReader reader = new StreamReader(stream) )
    {
        string line = reader.ReadLine();
    }
}

And if the lines vary in length then you'll have to just read them in a line at a time as follows:

using (var sr = new StreamReader("file"))
{
    for (int i = 1; i <= 5; ++i)
        sr.ReadLine();
}
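
As a rough illustration of the fixed-length case, the sketch below wraps the seek in a helper. The names `FixedWidthLines` and `ReadLineAt` are hypothetical, and the code assumes a single-byte encoding (ASCII here), no byte-order mark, and that `bytesPerLine` includes the line terminator.

```csharp
using System;
using System.IO;
using System.Text;

static class FixedWidthLines
{
    // Hypothetical helper: returns line `lineNumber` (1-based) from a file in
    // which every line occupies exactly `bytesPerLine` bytes, terminator included.
    public static string ReadLineAt(string path, int bytesPerLine, long lineNumber)
    {
        using (var stream = File.OpenRead(path))
        {
            // Jump directly to the first byte of the requested line.
            stream.Seek(bytesPerLine * (lineNumber - 1), SeekOrigin.Begin);

            // The reader is created after the seek, so it has nothing buffered yet.
            using (var reader = new StreamReader(stream, Encoding.ASCII))
            {
                return reader.ReadLine();
            }
        }
    }
}
```

For the question's case, skipping 5 lines would mean asking for line 6. With a multi-byte encoding such as UTF-8, the byte count per line is only fixed if every line contains the same mix of characters, so this shortcut does not apply in general.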
phillip
  • Just a note, unlike the other solutions here this will not bother reading the lines you're skipping over, and should be much faster than the others. – Mathieson Sep 18 '13 at 16:19
8

If you want to do this more than once in your program, it may be a good idea to write a custom class derived from StreamReader that can skip lines.

Something like this could do:

class SkippableStreamReader : StreamReader
{
    public SkippableStreamReader(string path) : base(path) { }

    public void SkipLines(int linecount)
    {
        for (int i = 0; i < linecount; i++)
        {
            this.ReadLine();
        }
    }
}

After this you can use the SkippableStreamReader's SkipLines method to skip lines. Example:

SkippableStreamReader exampleReader = new SkippableStreamReader("file_to_read");

//do stuff
//and when needed
exampleReader.SkipLines(number_of_lines_to_skip);
Isti115
6

I'll add two more suggestions to the list.

If there will always be a file, and you will only be reading, I suggest this:

var lines = File.ReadLines(@"C:\Test\new.txt").Skip(5).ToArray();

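File.ReadLines (available from .NET 4, with Skip coming from System.Linq) enumerates the file lazily, so only the lines you actually consume are loaded into memory, and it doesn't block other readers from the file. Note that the ToArray() call here still loads every remaining line.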
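
To get from the skipped lines back to the sections the question asks for, one option is to join the remaining lines and split on the separator. This is only a sketch: `LazySections` and `ReadSections` are hypothetical names, and joining with `Environment.NewLine` normalizes whatever line endings the file originally used.

```csharp
using System;
using System.IO;
using System.Linq;

static class LazySections
{
    // Hypothetical helper: skips `linesToSkip` lines of the file at `path`,
    // rejoins the remaining lines and splits the text on `separator`.
    public static string[] ReadSections(string path, int linesToSkip, string separator)
    {
        // Lazily enumerated: skipped lines are read but never kept.
        var remainingLines = File.ReadLines(path).Skip(linesToSkip);

        string remainingText = string.Join(Environment.NewLine, remainingLines);
        return remainingText.Split(new[] { separator }, StringSplitOptions.None);
    }
}
```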

If your stream can come from other sources then I suggest this approach:

using System.Collections.Generic;
using System.IO;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        //it's up to you to get your stream (GetStream is a placeholder)
        var stream = GetStream();

        //Here is where you'll read your lines.
        //Any LINQ operator can be used here.
        var lines = ReadLines(stream).Skip(5).ToArray();

        //Go on and do whatever you want to do with your lines...
    }

    //Static and declared inside the class so that Main can call it.
    static IEnumerable<string> ReadLines(Stream stream)
    {
        using (var reader = new StreamReader(stream))
        {
            while (!reader.EndOfStream)
            {
                yield return reader.ReadLine();
            }
        }
    }
}

The iterator block disposes the reader automatically once you are done enumerating it. Here is an article by Jon Skeet going in depth into how that works exactly (scroll down to the "And finally..." section).

Mark Rucker
1

I'd guess it's as simple as:

    static void Main(string[] args)
    {
        var tr = new StreamReader(@"C:\new.txt");

        var SplitBy = "----------------------------------------";

        // Skip first 5 lines of the text file?
        foreach (var i in Enumerable.Range(1, 5)) tr.ReadLine();
        var fullLog = tr.ReadToEnd(); 

        String[] sections = fullLog.Split(new string[] { SplitBy }, StringSplitOptions.None);

        //String[] lines = sections.Skip(5).ToArray();

        foreach (String r in sections)
        {
            Console.WriteLine(r);
            Console.WriteLine("============================================================");
        }
    }
Bruno Brant
1

StreamReader's ReadLine and ReadToEnd actually read the bytes into memory. Even if you never process the skipped lines, they are still loaded, which affects performance on big files (10+ MB).

If you want to skip a specific number of lines, you need to know the position in the file you want to move to, which gives you two options:

  1. If you know the line length in bytes (including the line terminator), you can calculate the position and move there with Stream.Seek. This is the most efficient way to skip stream content without reading it. The issue here is that you can rarely know the line length.
var linesToSkip = 10;
using(var reader = new StreamReader(fileName) )
{
    // lineLength: bytes per line, line terminator included
    reader.BaseStream.Seek(lineLength * linesToSkip, SeekOrigin.Begin);
    var myNextLine = reader.ReadLine();
    // TODO: process the line
}
  2. If you don't know the line length, you have to read line by line and skip lines until you get to the desired line number. The issue here is that if the line number is high, you will get a performance hit.
var linesToSkip = 10;
using (var reader = new StreamReader(fileName))
{
    for (int i = 1; i <= linesToSkip; ++i)
        reader.ReadLine();

    var myNextLine = reader.ReadLine();
    // TODO: process the line
}
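
One caveat with seeking through BaseStream: it is only safe as written if the reader has not read anything yet. Once a StreamReader has been used, it holds buffered characters from the old position, and moving the underlying stream does not clear them. The sketch below, with the hypothetical names `ReaderSeek` and `ReadLineAtOffset`, calls `StreamReader.DiscardBufferedData()` after the seek to deal with that. It assumes the byte offset falls on a character boundary.

```csharp
using System.IO;

static class ReaderSeek
{
    // Hypothetical helper: repositions a reader that may already have been
    // used, then reads one line starting at `byteOffset`.
    public static string ReadLineAtOffset(StreamReader reader, long byteOffset)
    {
        reader.BaseStream.Seek(byteOffset, SeekOrigin.Begin);

        // Drop characters that were buffered from the previous position.
        reader.DiscardBufferedData();

        return reader.ReadLine();
    }
}
```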

And if you just need to skip everything that is already in the file, you can do it without reading any of the content into memory:

using(var reader = new StreamReader(fileName) )
{
   reader.BaseStream.Seek(0, SeekOrigin.End);

   // You can wait here for other processes to write into this file and then the ReadLine will provide you with that content

   var myNextLine = reader.ReadLine();
   // TODO: process the line
}
Mando
  • Thank you for this answer. I needed to skip over 3.3 billion rows and I could approximate the total number of bytes so this really saved me a lot of time. – Ali May 13 '20 at 20:59