0

I'm working with a XLS spreadsheet that is rendered from a third-party software. If I open the file in a text editor it's obvious that it's HTML saved as an XLS file. It appears that MS used to allow this pattern and has recently stopped in Excel per a couple of articles I found:

So, now to the actual tech question. If I send the file (as a stream) directly into the ExcelDataReader using the CreateReader method it throws this error:

"Invalid file signature."

IExcelDataReader excelReader;
excelReader = ExcelReaderFactory.CreateReader(stream);

If on the other hand, I open it in Excel and do a "Save As" and save the XLS as an XLSX, then send it into my code; it imports fine.

My preference would be to not force my user to do the "Save As" step. And getting the third-party to change their format is not a realistic option (as we all know).

Any recommendations or maybe tricks to make this work with the ExcelDataReader?

Here's a screen shot of what I see if I open the XLS in Notepad++ (notice the xls extension):enter image description here

Jeremy Lattimore
  • 276
  • 3
  • 10
  • Without knowing anything about your third-party or accessing to your XLS file; we can't help you! please provide more details like a sample XLS file -HTH ;). – shA.t Sep 17 '18 at 06:59
  • I would love to add a sample file but I can't add the actual file due to the data inside of it. And... I can't seem to resave it with the same format. Excel changes the format any time that I try to resave. – Jeremy Lattimore Sep 17 '18 at 18:54
  • As I can see your file is actually a HTML file, That is as same as an original Excel file that we save it as HTML, I don't think ExcelDataReader supports [this file format!](https://github.com/ExcelDataReader/ExcelDataReader/blob/develop/README.md) ;). – shA.t Sep 18 '18 at 07:48

1 Answers1

1

My preference would be to not force my user to do the "Save As" step. And getting the third-party to change their format is not a realistic option (as we all know).

So, I can suggest you to use Microsoft.Office.Interop.Excel package to save it as a supported format before reading it by ExcelDataReader like this:

var app = new Microsoft.Office.Interop.Excel.Application();
var workbook = app.Workbooks.Open(htmlFileName);

if (File.Exists(excelFileName))
{
    File.Delete(excelFileName);
}

workbook.SaveAs(
    excelFileName,
    Microsoft.Office.Interop.Excel.XlFileFormat.xlWorkbookDefault,
    Type.Missing, Type.Missing,  
    false, false,
    Microsoft.Office.Interop.Excel.XlSaveAsAccessMode.xlNoChange,
    Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing);

workbook.Close();
app.Quit();
app = null;
workbook = null;

using (var stream = File.Open(excelFileName, FileMode.Open, FileAccess.Read))
{
    var reader = ExcelReaderFactory.CreateReader(stream);
    // ...
}
shA.t
  • 16,580
  • 5
  • 54
  • 111