10

I would like to read the contents of an Excel worksheet into a C# DataTable. The Excel worksheet could have a variable numbers of columns and rows. The first row in the Excel worksheet will always contain the column names but other rows may be blank.

All of the suggestions I have seen here in SO all assume the presence of Microsoft.ACE.OLEDB. I do not have this library installed on my system as when I try some of these solutions I get this error.

Microsoft.ACE.OLEDB.12.0' provider is not registered on the local machine.

Strange considering I have Office 2016 installed.

For this reason I was hoping to use the ClosedXML library via Nuget but I do not see any examples in their wiki of reading an Excel worksheet to a DataTable in C#.

HackSlash
  • 4,944
  • 2
  • 18
  • 44
webworm
  • 10,587
  • 33
  • 120
  • 217

2 Answers2

23

This is example is not mine. I cannot remember where I got it from as it was in my archives. However, this works for me. The only issue I ran into was with blank cells. According to a dicussion on the ClosedXML GitHUb wiki page it has something to do with Excel not tracking empty cells that are not bounded by data. I found that if I added data to the cells and then removed the same data the process worked.

public static DataTable ImportExceltoDatatable(string filePath, string sheetName)
{
  // Open the Excel file using ClosedXML.
  // Keep in mind the Excel file cannot be open when trying to read it
  using (XLWorkbook workBook = new XLWorkbook(filePath))
  {
    //Read the first Sheet from Excel file.
    IXLWorksheet workSheet = workBook.Worksheet(1);

    //Create a new DataTable.
    DataTable dt = new DataTable();

    //Loop through the Worksheet rows.
    bool firstRow = true;
    foreach (IXLRow row in workSheet.Rows())
    {
      //Use the first row to add columns to DataTable.
      if (firstRow)
      {
        foreach (IXLCell cell in row.Cells())
        {
          dt.Columns.Add(cell.Value.ToString());
        }
        firstRow = false;
      }
      else
      {
        //Add rows to DataTable.
        dt.Rows.Add();
        int i = 0;

        foreach (IXLCell cell in row.Cells(row.FirstCellUsed().Address.ColumnNumber, row.LastCellUsed().Address.ColumnNumber))
        {
          dt.Rows[dt.Rows.Count - 1][i] = cell.Value.ToString();
          i++;
        }
      }
    }

    return dt;
  }
}

Need to add

using System.Data;
using ClosedXML.Excel;

As well as the ClosedXML nuget package

For other datetime data type... this could be helpful... reference

if (cell.Address.ColumnLetter=="J") // Column with date datatype
 {
    DateTime dtime = DateTime.FromOADate(double.Parse(cell.Value.ToString()));
                     dt.Rows[dt.Rows.Count - 1][i] = dtime;
 }
 else
 {
      dt.Rows[dt.Rows.Count - 1][i] = cell.Value.ToString();
 }
singhswat
  • 832
  • 7
  • 20
webworm
  • 10,587
  • 33
  • 120
  • 217
  • How do we tackle date data type? – singhswat Jun 11 '19 at 15:27
  • @singhswat I treated everything as a string – webworm Jun 11 '19 at 16:45
  • 1
    Throws null reference exception on this line. foreach (IXLCell cell in row.Cells(row.FirstCellUsed().Address.ColumnNumber, row.LastCellUsed().Address.ColumnNumber)) – Jyotirmaya Prusty Jan 16 '20 at 05:32
  • 1
    @JyotirmayaPrusty wrap the contents of the else condition containing that for loop in a condition "if(row.FirstCellUsed() != null)". Otherwise, it fails on a completely empty row. – Brandon Barkley Aug 01 '20 at 22:53
  • If your sheet contains formulas "Value" will crash. It seems to me that using "CachedValue" gets your what you want though in all the cases I tested. That also works for straight values as the two contain the same. "InnerText" might be another option. – Brandon Barkley Aug 01 '20 at 22:54
  • If blank cells are a problem, would this not be a big issue in the real world? i.e. wouldn't this approch be academic (good in theory, bad in practice)? –  Aug 12 '21 at 18:17
  • 1
    Consider performance: (1) The way how rows added: https://stackoverflow.com/questions/18462915/adding-multiple-rows-to-datatable/18465931 (2) Working with Rows of whole sheet is expensive for large tables, Ranges are much faster: 6 cols x 1000000 rows -> 230sec vs 10sec. Maybe it might be solved in 3 steps: get headers/columns by first row; calculate range (range of columns and max rows); get range and iterate over its rows – Kiryl Jan 27 '22 at 22:45
  • The foreach was giving me problems with first column and last column info. It was better to just name a range based on (1,1,ws.Rows().Count,38) basically written as (A1:AL) and then iterate through just .Cells(). If the FirstColumn was Empty it would shift all the data to the left, when Col1 can have an empty value. – KeithL Aug 02 '22 at 20:27
19

With this code you can read the contents of an excel sheet. You can specify the name of the sheet or the number, a dataSet will be returned with the contents of the sheet.

public static DataTable GetDataFromExcel(string path, dynamic worksheet)
        {
            //Save the uploaded Excel file.


            DataTable dt = new DataTable();
            //Open the Excel file using ClosedXML.
            using (XLWorkbook workBook = new XLWorkbook(path))
            {
                //Read the first Sheet from Excel file.
                IXLWorksheet workSheet = workBook.Worksheet(worksheet);

                //Create a new DataTable.

                //Loop through the Worksheet rows.
                bool firstRow = true;
                foreach (IXLRow row in workSheet.Rows())
                {
                    //Use the first row to add columns to DataTable.
                    if (firstRow)
                    {
                        foreach (IXLCell cell in row.Cells())
                        {
                            if (!string.IsNullOrEmpty(cell.Value.ToString()))
                            {
                                dt.Columns.Add(cell.Value.ToString());
                            }
                            else
                            {
                                break;
                            }
                        }
                        firstRow = false;
                    }
                    else
                    {
                        int i = 0;
                        DataRow toInsert = dt.NewRow();
                        foreach (IXLCell cell in row.Cells(1, dt.Columns.Count))
                        {
                            try
                            {
                                toInsert[i] = cell.Value.ToString();
                            }
                            catch (Exception ex)
                            {

                            }
                            i++;
                        }
                        dt.Rows.Add(toInsert);
                    }
                }
                return dt;
            }