I asked a similar question recently, but thanks to the people who commented on it I learned that the problem lies with Excel rather than NPOI, so I deleted that question and am rephrasing it here.

My main problem is the same as in the previous question: I need to read a downloaded .xls file using NPOI. The file I downloaded is most likely an HTML table that has been imported into an Excel document. Either that, or the document is really an .xlsx (a MIME type issue?) that has been zipped/unzipped incorrectly.

When I open the document in Excel I get a warning saying that the file might be corrupt. I press "OK" and everything works fine. So apparently the file is readable by Excel, but not by NPOI.

Does anyone know what I can do about this? Or is it a lost cause?

Johan Hjalmarsson

1 Answer

I figured it out!

It turns out the .xls file is really just an HTML table: I opened it in Notepad and saw the HTML source for a table. So all I had to do was write a parser that reads the HTML file into a DataTable and proceed from there.
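The Notepad check can also be done in code before deciding whether to hand the file to NPOI. Below is a minimal sketch, assuming the three cases mentioned in the question (binary .xls, zipped .xlsx, or plain markup); `FileSniffer` and `GuessFormat` are hypothetical names, not part of NPOI.

```csharp
using System;
using System.IO;
using System.Linq;

public static class FileSniffer
{
    // Hypothetical helper: guesses the real format from the start of the file,
    // ignoring the file extension.
    public static string GuessFormat(string fileName)
    {
        byte[] head = new byte[8];
        int read;
        using (FileStream fs = File.OpenRead(fileName))
        {
            read = fs.Read(head, 0, head.Length);
        }

        // OLE2 compound file signature, used by binary .xls workbooks.
        byte[] ole2 = { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };
        if (read == ole2.Length && head.SequenceEqual(ole2)) return "xls";

        // ZIP signature ("PK"), which is how .xlsx packages start.
        if (read >= 2 && head[0] == 0x50 && head[1] == 0x4B) return "xlsx";

        // Otherwise look at the first characters as text; StreamReader skips a BOM.
        char[] buf = new char[512];
        int n;
        using (StreamReader sr = new StreamReader(fileName))
        {
            n = sr.ReadBlock(buf, 0, buf.Length);
        }
        string start = new string(buf, 0, n).TrimStart();

        // Text that begins with '<' is probably HTML (or XML) saved as .xls.
        return start.StartsWith("<", StringComparison.Ordinal) ? "html-or-xml" : "unknown";
    }
}
```

If this returns "html-or-xml", the file is not a workbook at all and needs to be parsed as text rather than opened with NPOI.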

Here's a start (I haven't completed the parser yet):

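// Requires: using System; using System.Data; using System.IO;
private static void HTMLtoExcel(string fileName) // For now, only prints the first cell value.
{
    string text = File.ReadAllText(fileName);
    DataTable dt = new DataTable(); // Will hold the parsed rows once the parser is complete.
    const string openTag = "<td>";  // Only matches a bare <td>, not <td class="...">.
    int start = text.IndexOf(openTag, StringComparison.OrdinalIgnoreCase);
    if (start < 0)
    {
        Console.WriteLine("No <td> cell found.");
        return;
    }
    start += openTag.Length; // Skip past the opening tag.
    // Search for the closing tag only after the opening one.
    int stop = text.IndexOf("</td>", start, StringComparison.OrdinalIgnoreCase);
    if (stop < 0)
    {
        Console.WriteLine("Unterminated <td> cell.");
        return;
    }
    Console.WriteLine(text.Substring(start, stop - start));
}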
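One way the parser could be completed is sketched below. This is not the finished code from the answer, only an illustration under some assumptions: the file contains a single, simple table with no nested tables, every row (including any header row) is stored as a plain data row, and regular expressions are good enough for this particular markup. `HtmlTableReader` and `Parse` are hypothetical names.

```csharp
using System;
using System.Data;
using System.Net;
using System.Text.RegularExpressions;

public static class HtmlTableReader
{
    private const RegexOptions Options = RegexOptions.IgnoreCase | RegexOptions.Singleline;

    // Hypothetical helper: turns every <tr> into a DataRow of string cells.
    public static DataTable Parse(string html)
    {
        DataTable dt = new DataTable();
        foreach (Match row in Regex.Matches(html, @"<tr\b[^>]*>(.*?)</tr>", Options))
        {
            // Accept both <td> and <th> cells, with or without attributes.
            MatchCollection cells = Regex.Matches(
                row.Groups[1].Value, @"<t[dh]\b[^>]*>(.*?)</t[dh]>", Options);
            if (cells.Count == 0) continue;

            // Add generic columns as needed so the widest row still fits.
            while (dt.Columns.Count < cells.Count)
                dt.Columns.Add("Column" + (dt.Columns.Count + 1), typeof(string));

            DataRow dr = dt.NewRow();
            for (int i = 0; i < cells.Count; i++)
            {
                // Drop inline tags such as <b>, then decode entities like &amp;.
                string inner = Regex.Replace(cells[i].Groups[1].Value, "<[^>]+>", "");
                dr[i] = WebUtility.HtmlDecode(inner).Trim();
            }
            dt.Rows.Add(dr);
        }
        return dt;
    }
}
```

It would be called as `DataTable dt = HtmlTableReader.Parse(File.ReadAllText(fileName));`. For messier markup (nested tables, unclosed tags), a dedicated HTML parser is likely to be more robust than regular expressions.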
Johan Hjalmarsson