
I'm trying to parse 10GB of .dat files into something recognizable in .NET. The column delimiter is a '~' and the EOL is a '++EOL++'. I know how to handle the delimiter but I can't find an easy way to handle the '++EOL++' when there are no actual line breaks in the file. Can this be handled with an option in FileHelpers or would I have to write something custom?

Scott
  • +1 Good question. There doesn't appear to be anything obvious in the source code that would help with your problem (I probably missed something). As a quick and dirty solution you could just do a string replace. – M.Babcock Jan 12 '12 at 18:32
  • This was always in my mind, but I wanted to make sure I'm not missing something that was built in. – Scott Jan 13 '12 at 14:43

1 Answer

No, FileHelpers does not support files with unusual end-of-line character sequences by default.

It would probably be easiest to pre-parse the file and replace the EOL sequences with ordinary line breaks. However, FileHelpers is an extensible library, so you could create your own DataStorage subclass. You would essentially have to override ExtractRecords:

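public override object[] ExtractRecords()
{
    // MyStreamReader is your own copy of InternalStreamReader with a custom ReadLine
    using (MyStreamReader reader = new MyStreamReader(fileName, base.mEncoding, true, 102400))
    {
        // the using block disposes the reader, so no explicit Close() is needed
        T[] localArray = this.ReadStream(reader, maxRecords);
        return localArray;
    }
}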

and then create a new class, MyStreamReader, which would be identical to the (unfortunately sealed) InternalStreamReader except for ReadLine, which contains the EOL-detection code:

switch (ch)
{
    case '\n':
    case '\r':

    // etc...
}

(By the way, I'm referring to the source code for FileHelpers 2.9.9. Version 2.0.0 seems to use a System.IO.StreamReader, so there you could just subclass it instead of duplicating InternalStreamReader.)
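
For the pre-parse route, here is a minimal sketch of how the '++EOL++' terminator could be handled in a streaming fashion, so that a 10GB file never has to be loaded into memory. EolSplitter, ReadRecords and Normalize are hypothetical names of my own, not part of FileHelpers. The sketch also assumes the data contains no real line breaks (as stated in the question) and that the default StreamReader encoding suits your files.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper, not part of FileHelpers.
public static class EolSplitter
{
    // Lazily yields one record at a time, split on the given terminator.
    public static IEnumerable<string> ReadRecords(TextReader reader, string terminator = "++EOL++")
    {
        var record = new StringBuilder();
        var buffer = new char[64 * 1024];
        int read;
        while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < read; i++)
            {
                record.Append(buffer[i]);
                if (EndsWith(record, terminator))
                {
                    record.Length -= terminator.Length; // drop the terminator itself
                    yield return record.ToString();
                    record.Clear();
                }
            }
        }
        if (record.Length > 0)
            yield return record.ToString(); // trailing record with no terminator
    }

    // True if the builder currently ends with the given suffix.
    private static bool EndsWith(StringBuilder sb, string suffix)
    {
        if (sb.Length < suffix.Length) return false;
        int offset = sb.Length - suffix.Length;
        for (int i = 0; i < suffix.Length; i++)
            if (sb[offset + i] != suffix[i]) return false;
        return true;
    }

    // Rewrites the input with ordinary line breaks so a line-based parser can read it.
    // Assumption: the records themselves contain no '\r' or '\n'.
    public static void Normalize(string inputPath, string outputPath)
    {
        using (var reader = new StreamReader(inputPath))
        using (var writer = new StreamWriter(outputPath))
        {
            foreach (var rec in ReadRecords(reader))
                writer.WriteLine(rec);
        }
    }
}
```

You could either call Normalize once per file and then point FileHelpers at the output, or skip the intermediate file and call rec.Split('~') on each record from ReadRecords yourself. The second option avoids writing another 10GB to disk, at the cost of doing the field mapping by hand.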

shamp00