I am trying to parse a number of CSV files that have double quotes and commas within the fields. I have no control over the format of the CSVs, and instead of using "" to escape quotes they use \". The files are also extremely large, so reading them in and running a regex over them isn't a good option for me.
I would prefer to use an existing library and not write an entirely new parser. Currently I am using CsvHelper.
This is an example of the CSV data:
"id","name","notes"
"40","Continue","If the message \"Continue\" does not appear restart, and notify your instructor."
"41","Restart","If the message \"Restart\" does not appear after 10 seconds, restart manually."
The problem is that the \" sequences aren't being treated as escaped quotes, so the comma inside the notes field is read as a delimiter and the field is split into two separate fields.
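To make the expected result concrete, here is a minimal sketch of the splitting behaviour I'm after, written as a C# 9 top-level program. `SplitLine` is a hypothetical helper of my own, not part of CsvHelper, and it assumes one record per line with no embedded newlines, so it is only an illustration and not something I'd want to maintain.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper (not a CsvHelper API): splits a single record whose
// fields are double-quoted and use \" instead of "" for embedded quotes.
static List<string> SplitLine(string line)
{
    var fields = new List<string>();
    var current = new StringBuilder();
    bool inQuotes = false;

    for (int i = 0; i < line.Length; i++)
    {
        char c = line[i];
        if (inQuotes && c == '\\' && i + 1 < line.Length && line[i + 1] == '"')
        {
            current.Append('"'); // \" inside a quoted field is a literal quote
            i++;                 // skip the quote that was just consumed
        }
        else if (c == '"')
        {
            inQuotes = !inQuotes; // opening or closing quote of a field
        }
        else if (c == ',' && !inQuotes)
        {
            fields.Add(current.ToString()); // a comma outside quotes ends the field
            current.Clear();
        }
        else
        {
            current.Append(c);
        }
    }

    fields.Add(current.ToString()); // last field has no trailing comma
    return fields;
}

// Usage: the first data row from the sample above.
string sample = "\"40\",\"Continue\",\"If the message \\\"Continue\\\" does not appear restart, and notify your instructor.\"";
foreach (string field in SplitLine(sample))
{
    Console.WriteLine(field);
}
```

With this logic the row comes back as three fields, and the comma after "restart" stays inside the notes field.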
This is my current code, which doesn't work:
DataTable csvData = new DataTable();
string csvFilePath = @"C:\Users\" + csvFileName + ".csv";
try
{
    FileInfo file = new FileInfo(csvFilePath);
    using (TextReader reader = file.OpenText())
    using (CsvReader csv = new CsvReader(reader))
    {
        csv.Configuration.Delimiter = ",";
        csv.Configuration.HasHeaderRecord = true;
        csv.Configuration.IgnoreQuotes = false;
        csv.Configuration.TrimFields = true;
        csv.Configuration.WillThrowOnMissingField = false;

        string[] colFields = null;
        while (csv.Read())
        {
            // On the first record, build the DataTable columns from the header row.
            if (colFields == null)
            {
                colFields = csv.FieldHeaders;
                foreach (string column in colFields)
                {
                    DataColumn datacolumn = new DataColumn(column);
                    datacolumn.AllowDBNull = true;
                    csvData.Columns.Add(datacolumn);
                }
            }

            // Store empty fields as null.
            string[] fieldData = csv.CurrentRecord;
            for (int i = 0; i < fieldData.Length; i++)
            {
                if (fieldData[i] == "")
                {
                    fieldData[i] = null;
                }
            }
            csvData.Rows.Add(fieldData);
        }
    }
}
// (catch/finally block not shown in this excerpt)
Is there an existing library that lets you specify how to escape quotes or should I just write my own parser?