This question is specific to the ChoETL CSV reader.

Take this example:

"Header1","Header2","Header3"
"Value1","Val
ue2","Value3"

(Notepad++ screenshot)

  • All headers and values are quoted
  • There's a line break in "Value2"

I've been playing with ChoETL options, but I can't get it to work:

    foreach (dynamic e in new ChoCSVReader(@"test.csv")
        .WithFirstLineHeader()
        .MayContainEOLInData(true)
        .MayHaveQuotedFields()

        // been playing with these too
        //.QuoteAllFields()
        //.ConfigureHeader(c => c.IgnoreColumnsWithEmptyHeader = true)
        //.AutoIncrementDuplicateColumnNames()
        //.ConfigureHeader(c => c.QuoteAllHeaders = true)
        //.IgnoreEmptyLine()
        )
    {
        System.Console.WriteLine(e["Header1"]);
    }

This fails with:

Missing 'Header2' field value in CSV file

The error varies depending on the reader configuration.

What is the correct configuration to read this text?

Cinchoo
The One
  • Cinchoo posts here, so I'm sure you'll get some insight at some point; in the interim, if it's pressing, perhaps clone the source code and add it to your project as a reference, then you can step in and debug through it. I do this quite often with various libs and it's always enlightening. :) – Caius Jard Dec 17 '21 at 19:45
  • Check out the answer to this question. It involves sanitizing your data into the correct format by removing the line breaks in records: https://stackoverflow.com/questions/51658524/new-line-within-csv-column-causing-issue – Zserbinator Dec 18 '21 at 00:02
  • @Zserbinator line breaks are part of the data – The One Dec 18 '21 at 03:54
  • What is the default setting for `.MayHaveQuotedFields()` when you give no parameter? I mean, shouldn't that be `.MayHaveQuotedFields(true)` in your case? – BdR Dec 18 '21 at 11:58

1 Answer

This is a bug in handling one of the cases (i.e. the header having quotes, as in the csv2 text). I have applied a fix. Take the ChoETL.NETStandard.1.2.1.35-beta1 package and give it a try.

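// Unquoted headers, quoted values, line break inside the second value
string csv1 = @"Header1,Header2,Header3
""Value1"",""Val
ue2"",""Value3""";

// Quoted headers and quoted values (the case from the question)
string csv2 = @"""Header1"",""Header2"",""Header3""
""Value1"",""Val
ue2"",""Value3""";

// Unquoted headers, only one value quoted, no line break
string csv3 = @"Header1,Header2,Header3
Value1,""Value2"",Value3";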

using (var r = ChoCSVReader.LoadText(csv1)
    .WithFirstLineHeader()
    .MayContainEOLInData(true)
    .QuoteAllFields())
    r.Print();

using (var r = ChoCSVReader.LoadText(csv2)
    .WithFirstLineHeader()
    .MayContainEOLInData(true)
    .QuoteAllFields())
    r.Print();

using (var r = ChoCSVReader.LoadText(csv3)
    .WithFirstLineHeader()
    .MayContainEOLInData(true)
    .QuoteAllFields())
    r.Print();

Sample fiddle: https://dotnetfiddle.net/VubCDR
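
For the file-based loop in the question, the same three options should carry over. Below is a minimal sketch, not verified against every package version: it assumes the ChoETL.NETStandard package at 1.2.1.35-beta1 or later is referenced, and it recreates test.csv itself so it can be run on its own. The file contents and the use of `\n` as the line ending are assumptions made for the illustration; every reader call is taken from the snippets in this thread.

```csharp
using System;
using System.IO;
using ChoETL; // assumes NuGet package ChoETL.NETStandard, 1.2.1.35-beta1 or later

// Recreate the file from the question: quoted headers and values,
// with a line break inside the second value (assumed to be "\n").
File.WriteAllText("test.csv",
    "\"Header1\",\"Header2\",\"Header3\"\n" +
    "\"Value1\",\"Val\nue2\",\"Value3\"");

// Same configuration as the LoadText examples, applied to a file path.
using (var reader = new ChoCSVReader("test.csv")
    .WithFirstLineHeader()
    .MayContainEOLInData(true)
    .QuoteAllFields())
{
    foreach (dynamic e in reader)
    {
        Console.WriteLine(e["Header1"]);
        // With the fix, this field is expected to keep its embedded line break.
        Console.WriteLine(e["Header2"]);
    }
}
```

Compared with the code in the question, the only change is `.QuoteAllFields()` in place of `.MayHaveQuotedFields()`; on package versions older than the fix, the quoted-header case may still fail.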

Cinchoo