4

I am trying to read a CSV into a datatable.

The CSV maybe have hundreds of columns and only up to 20 rows.

It will look something like this:

+----------+-----------------+-------------+---------+---+
|  email1  |     email2      |   email3    | email4  | … |
+----------+-----------------+-------------+---------+---+
| ccemail1 | anotherccemail1 | 3rdccemail1 | ccemail |   |
| ccemail2 | anotherccemail2 | 3rdccemail2 |         |   |
| ccemail3 | anotherccemail3 |             |         |   |
| ccemail4 | anotherccemail4 |             |         |   |
| ccemail5 |                 |             |         |   |
| ccemail6 |                 |             |         |   |
| ccemail7 |                 |             |         |   |
| …        |                 |             |         |   |
+----------+-----------------+-------------+---------+---+

i am trying to use genericparser for this; however, i believe that it requires you to know the column names.

string strID, strName, strStatus;
using (GenericParser parser = new GenericParser())
{
    parser.SetDataSource("MyData.txt");

    parser.ColumnDelimiter = "\t".ToCharArray();
    parser.FirstRowHasHeader = true;
    parser.SkipStartingDataRows = 10;
    parser.MaxBufferSize = 4096;
    parser.MaxRows = 500;
    parser.TextQualifier = '\"';

    while (parser.Read())
    {
      strID = parser["ID"];  //as you can see this requires you to know the column names
      strName = parser["Name"];
      strStatus = parser["Status"];

      // Your code here ...
    }
}

is there a way to read this file into a datatable without know the column names?

Alex Gordon
  • 57,446
  • 287
  • 670
  • 1,062

4 Answers4

8

It's so simple!

        var adapter = new GenericParsing.GenericParserAdapter(filepath);
        DataTable dt = adapter.GetDataTable();

This will automatically do everything for you.

Alex Gordon
  • 57,446
  • 287
  • 670
  • 1,062
2

I looked at the source code, and you can access the data by column index too, like this

var firstColumn = parser[0]

Replace the 0 with the column number. The number of colums can be found using

parser.ColumnCount
Xharze
  • 2,703
  • 2
  • 17
  • 30
2

I'm not familiar with that GenericParser, i would suggest to use tools like TextFieldParser, FileHelpers or this CSV-Reader.

But this simple manual approach should work also:

IEnumerable<String> lines = File.ReadAllLines(filePath);
String header = lines.First();
var headers = header.Split(new[]{','}, StringSplitOptions.RemoveEmptyEntries);
DataTable tbl = new DataTable();
for (int i = 0; i < headers.Length; i++)
{
    tbl.Columns.Add(headers[i]);
}
var data = lines.Skip(1);
foreach(var line in data)
{
    var fields = line.Split(new[]{','}, StringSplitOptions.RemoveEmptyEntries);
    DataRow newRow = tbl.Rows.Add();
    newRow.ItemArray = fields;
}
Tim Schmelter
  • 450,073
  • 74
  • 686
  • 939
1

i used generic parser to do it. On the first run through the loop i get the columns names and then reference them to add them to a list

In my case i have pivoted the data but here is a code sample if it helps someone

        bool firstRow = true;
        List<string> columnNames = new List<string>();
        List<Tuple<string, string, string>> results = new List<Tuple<string, string, string>>();

        while (parser.Read())
        {
            if (firstRow)
            {
                for (int i = 0; i < parser.ColumnCount; i++)
                {
                    if (parser.GetColumnName(i).Contains("FY"))
                    {
                        columnNames.Add(parser.GetColumnName(i));
                        Console.Log("Column found: {0}", parser.GetColumnName(i));
                    }
                }
                firstRow = false;
            }

            foreach (var col in columnNames)
            {
                double actualCost = 0;
                bool hasValueParsed = Double.TryParse(parser[col], out actualCost);
                csvData.Add(new ProjectCost
                {
                    ProjectItem = parser["ProjectItem"],
                    ActualCosts = actualCost,
                    ColumnName = col
                });
            }
        }
jezza_bro
  • 51
  • 6
  • I would argue a generic parser is reinventing the wheel a bit. There are many pre-rolled CSV parsers that will work without knowing column names. It may be helpful to less experienced users if you define pivoting, or provide an example of what your actual input data looks like after pivoting – MyStackRunnethOver Nov 13 '18 at 19:29