It seems that Excel (I tested the 2013 version) parses malformed CSV files without raising any errors or warnings.

I've created my own library for parsing CSV files and made a sample CSV file to test it (the semicolon is the delimiter):

TEST_STRING;Field1
"Band "Radiohead" in the City Hall";

According to the CSV format description in RFC 4180 (https://www.rfc-editor.org/rfc/rfc4180#page-2, section 2, rule 7), my test file contains an error on line 2 ("Band "Radiohead" in the City Hall"): the inner double quotes around "Radiohead" are not escaped by preceding each of them with another double quote.
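For comparison, here is what a correctly escaped version of that line would look like. This is a minimal sketch in Python rather than C#, using only the standard `csv` module; the helper name `write_row` is made up for illustration and is not part of my library.

```python
import csv
import io


def write_row(fields, delimiter=";"):
    """Serialize one record; the csv module doubles embedded quotes (hypothetical helper)."""
    buf = io.StringIO()
    csv.writer(buf, delimiter=delimiter).writerow(fields)
    # Drop the line terminator so only the record text is returned.
    return buf.getvalue().rstrip("\r\n")


# Second line of the sample file: one quoted field followed by an empty field.
escaped = write_row(['Band "Radiohead" in the City Hall', ""])
print(escaped)  # "Band ""Radiohead"" in the City Hall";

# A strict reader accepts the escaped line and returns the original text.
round_trip = next(csv.reader([escaped], delimiter=";", strict=True))
print(round_trip)
```

With the inner quotes doubled, the line satisfies rule 7 and a strict parser reads it back unchanged.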

Both my library and the TextFieldParser class throw an exception in this case. Here is the sample code I used for TextFieldParser:

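// TextFieldParser lives in Microsoft.VisualBasic.FileIO
// (add a reference to the Microsoft.VisualBasic assembly).
using System;
using Microsoft.VisualBasic.FileIO;

using (TextFieldParser parser = new TextFieldParser(@"d:\test_2.csv"))
{
    parser.TextFieldType = FieldType.Delimited;
    parser.SetDelimiters(";");
    int lineNumber = 0;
    while (!parser.EndOfData)
    {
        // ReadFields throws MalformedLineException for a line it cannot parse.
        string[] fields = parser.ReadFields();
        foreach (string field in fields)
        {
            //TODO: Process field
            Console.WriteLine(field);
        }

        Console.WriteLine();
        Console.WriteLine("line " + (++lineNumber) + ":");
    }
}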

The exception is:

[screenshot: the exception raised by TextFieldParser]

But Excel 2013 opens the test file with this result:

[screenshot: the test file opened in Excel 2013]

This happens whether I open the test CSV file by double-clicking it in Windows Explorer (with Excel as the default program for .csv files) or import it through the wizard in Excel 2013 (Ribbon => Data => From Text).
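The same split between strict and lenient handling can be reproduced outside Excel. The sketch below uses Python (not C#) because its standard `csv` reader exposes both modes through the `strict` flag. It only illustrates that tolerant parsers exist; it is an assumption-free demo of Python's reader and says nothing about how Excel is actually implemented. The names `SAMPLE` and `parse` are made up for the example.

```python
import csv

# The two lines of the test file from the question.
SAMPLE = [
    "TEST_STRING;Field1",
    '"Band "Radiohead" in the City Hall";',
]


def parse(lines, strict):
    """Parse semicolon-delimited lines, optionally rejecting malformed quoting."""
    return list(csv.reader(lines, delimiter=";", strict=strict))


# Lenient mode: the stray quote ends quoted mode and the rest is kept literally.
lenient_rows = parse(SAMPLE, strict=False)
print(lenient_rows)

# Strict mode: the same input is rejected with csv.Error.
try:
    parse(SAMPLE, strict=True)
    strict_error = None
except csv.Error as exc:
    strict_error = exc
print(strict_error)
```

In lenient mode the reader treats the quote before `Radiohead` as the end of the quoted section and copies the remaining characters, including the later quotes, into the field as plain text.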

Is there a reason for this behavior in Excel?

Burst
  • "I've created my own lib for parsing CSV files" - why would you? See http://www.codeproject.com/Articles/9258/A-Fast-CSV-Reader – Mitch Wheat Jul 22 '14 at 11:13
  • This can be ignored. The FileHelpers library ( http://filehelpers.sourceforge.net/ ) and the TextFieldParser class (note that this class is part of Microsoft's .NET Framework) can be helpful as well. The actual question is about Excel's algorithm and why it does not process CSV the way the spec says. – Burst Jul 22 '14 at 11:37
