5

With .Net's OleDb I try to import an Excel table in which the first row(s) can be empty. I want to keep the empty row in the DataTable to be able to map the cells to Excel-style cell names "A1, A2, ..." later. But the first line is removed, no matter what I do.

Excel file looks like:

 -   -   -
 ABC XY  ZZ
 1   2   3
 4   4   5

Where "-" is an empty cell. (I have no influence to the import format.)

Simplified code:

        string cnnStr = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=\"file.xls\";Extended Properties=\"Excel 8.0;HDR=No;IMEX=1\"";
        string mySheet = "Sheet1$";

        OleDbConnection connection = new OleDbConnection(cnnStr);
        DataSet Contents = new DataSet();
        using (OleDbDataAdapter adapter = new OleDbDataAdapter("select * from [" + mySheet + "]", connection))
        {
            adapter.Fill(Contents);
        }

        Console.WriteLine(Contents.Tables[0].Rows.Count); // prints: 3
        Console.WriteLine(Contents.Tables[0].Rows[0].ItemArray[0]); // prints: ABC

Any idea how to preserve that empty row?

ps: I found How to count empty rows when reading from Excel but couldn't reproduce it.

Community
  • 1
  • 1
djk
  • 943
  • 2
  • 9
  • 27
  • 1
    If the 2nd row is your actual header row, why do you need to keep the empty row ? Otherwise, if it's a consistent feature that you need to keep, can you not simply insert an empty row into your transformation? – Russ Clarke Apr 04 '12 at 12:01
  • As I said, I want to map cells to Excel style cell names (A1, B2, etc.). To be more specific: If the Excel contains an unwanted value or unknown header name, I want be abte to tell the user "The value in cell A2 is incorrect." Currently this works if no empty row exists. With empty rows, the numbering is shifted. – djk Apr 04 '12 at 13:25
  • I am having the same problem now. Did you manage to solve this? – Ross Brigoli May 19 '15 at 11:43
  • @RossBrigoli: No, I didn't. – djk May 20 '15 at 13:37

2 Answers2

2

The issue seems to be related to the TypeGuessRows feature of the OLEDB provider. In a nutshell, data in an Excel column can be of any type. The OLEDB provider guesses the data type by scanning the first 8 rows of the sheet to determine the Majority Type - the data type with the most number of values in the sample. Anything that is not of the Majority Type are discarded.

See this blog post for a more detailed explanation.

As well as this MS KB Article that discusses the behavior.

(Skip down to the Workaround section for the TypeGuessRows behavior)

As a test, I created a file similar to the sample you posted but formatted all of the columns as text and saved the file. Running the code you posted I was able to see 4 Rows returned, with the first Row an empty string.

You may also want to try modifying the registry to see if changing the TypeGuessRows setting to 0 (scan all data in the file to determine data type of each column) helps return the first blank row. My hunch is that this won't help though.

Tim Lentine
  • 7,782
  • 5
  • 35
  • 40
  • I am aware of this, but it seems not related: I'm not dealing with 8+ rows, the effect is the same with text/numbers only, and the registry value doesn't change anything (tested). "Formatting all columns as text" makes the first row included, because **any** format change to one of the first row cells is considered "content" (in rich text it is). Changing the background color of one cell in the first row has the same effect. Only a completely untouched unformatted empty first row (which is the case when you just start in the second row) seems to be considered unneeded by the Adapter. :( – djk Jul 23 '12 at 08:20
0

OleDbDataAdapter considers the first row as header. In order to get the first row, create a datarow from the header of the datatable. And insert at the first location.

  DataTable dt = Contents.Tables[0];

  DataRow dr = new DataRow();
  int i = 0;
  foreach (DataColumn column in dt.Columns)
  {
    dr[i] = column.ColumnName.ToString();
    i++;
  }
  dt.Rows.InsertAt(dr, 0);
Pat
  • 89
  • 1
  • 2
  • I believe this is not true. When the first row contains any value in any cell, then it is returned as part of the DataTable, e.g. the row count is 4. Also ColumnName contains in my case F1, F2, F3, ... - Note that the connStr contains HDR=No, however setting this to yes has the same effect: First line is ignored. – djk Apr 06 '12 at 11:23