0

I have an Excel file (.xls), it has about 1400 rows, there are rows which are displayed in correct font (both in sheet cells and in formula box which is docked at top). However there are some rows which are displayed with right font in sheet cells but if I select it and look at the formula box, it displays the text of the selected row in a wrong font (some kind of error or unsupported font).

My excel file's content contains unicode text, my laptop is installed with the most common unicode fonts (simply my language need unicode to display properly and I always have many unicode fonts installed). I'm using Excel 2013, but I don't think the problem is something related to missing font... Maybe the formula box and the sheet cell use different fonts. A noticeable point is all the cells which have their contents displayed with wrong font in the formula box (I call these cells Group B cells) are formatted with a different font from the other cells (I call these Group A cells). I've tried changing the font of Group B cells to the font of Group A cells but their contents were displayed with wrong font (both in cells and in the formula box) then.

If every work just handles with the Excel file only, there is not big problem, however I have a project which has to import this Excel file into a database. I've tried a demo and it loads the Excel file with lots of font errors. These errors are exact what displayed in the formula box I see in the Excel window. I don't know how to fix this. The problem is the customer supposes the Excel file is OK, and it looks like that (because the cells' contents are displayed OK in cells, the errors are just in the formula box). I don't have any reason to require the customer another Excel file.

I'm using NPOI to read the Excel file, I think it reads the cell's content in the same way the formula box reads, I've tried applying the correct font for every row before reading its string but it doesn't seem to work. Could you have any idea on this, I just need some suggestion and don't hope for a complete or perfect solution (I know this is not easy)? My first ideas are trying to make the cells' contents displayed with right/correct font in the formula box in the Excel window first and then trying to do this using code with NPOI (or another lib), then import the Excel file normally, I hope there is a simpler solution which extracts the content (with correct font) right when reading the cells.

Your suggestions would help me much in solving this problem. Thanks!

===========================================================

UPDATE with sample xls file xls file

King King
  • 61,710
  • 16
  • 105
  • 130
  • sounds like what you are discovering is that Excel is not a very good medium for transferring & storing data. In any case, please provide some examples with your question, including preferably an XLS file on dropbox which contains some of the problematic data. I am guessing you have some **double-byte** characters or other special characters that are causing problems, but without being able to see, I can't really know. – David Zemens Apr 20 '13 at 13:13
  • I don't think it's fault of Excel. I've found that the cell text/value will be deferred from it's formula, all the format and style will be applied later on the cell's text/value and that's why the text displays OK in cell (with correct font) while the formula displays wrong. However I don't understand why this happens, I mean there is some sync issue between cell text and cell formula. Thanks. – King King Apr 20 '13 at 13:38
  • In additiona to trying to describe what is, or what you *think* is happening, please upload an example of your file and identify some problematic cell(s). Nobody on here is pyschic. – David Zemens Apr 20 '13 at 13:48
  • @DavidZemens Wow, if you don't be afraid of looking at text in my language (Vietnamese), that'll be fine, please see my update. – King King Apr 20 '13 at 14:06
  • well the language barrier may pose a problem since I do not read/speak/write in Vietnamese, unfortunately... can you tell me what cell(s) are giving you the problem? Perhaps I can take a look at it anyways :) – David Zemens Apr 20 '13 at 16:50
  • The sample I uploaded has all the cells with font errors, the only one cell (I edited its formula) is what I want is at the address B3. (Note that you should look at the formula box to see them in error, the text in the formula box should be exact as the text in the cells). – King King Apr 20 '13 at 16:57
  • Sorry they literally all look identical on my computer -- all the cells appear identical to what appears in the formula box. I also check in VBA whether the cells `.Text` = `.Value` and also = `.Formula`. – David Zemens Apr 20 '13 at 19:38
  • Could you give me 1 snapshot? Thanks, by the way, what version of Excel are you using? – King King Apr 20 '13 at 19:43
  • I am using Excel 2010 on Win 7 64b. I do notice that cells B1 and B3 indicate they are font "Times New Roman" and appear *correct* whereas other cells are font ".VnTime" (which is not installed font on my computer). There are two screenshots [here](http://imgur.com/w5UbbNS,Ccha6MX) which show the cell and the formula bar for the cell. – David Zemens Apr 20 '13 at 20:01
  • @David Zemens, OK I know why they are all identical. The cells with font Times New Roman also display OK in my laptop (as I said), all the others display wrong but only in the formula box, because my laptop has .Vntime installed so they display OK in cells. While your computer doesn't have .Vntime installed so they display identically in both cells and the formula box but in fact these texts are wrong displayed, that's not Vietnamese and No one can read and understand them. You can search for .Vntime font to install and see how it looks exactly as in my laptop. Thanks. – King King Apr 20 '13 at 20:23
  • can you post screenshot of what these look like on your computer? – David Zemens Apr 20 '13 at 21:06

1 Answers1

0

I am using Excel 2010 on Win 7 64b.

I do not have the .VnTime font installed but thanks to your screenshots I see what you are seeing and there appears to be some disconnect between what the cell displays (font-formatted text) and what the Formula bar displays. Somehow, this impacts the way your application is reading the cell values.

In most cases it is safe to read just the cell as a range object.

Three alternatives you may try would be to change your code to read the:

  • cell.Formula, or
  • cell.Text, or
  • cell.Value

One potential problem is that formatting may be unique to Excel, and so if the underlying character set is not supported by your application, by the database, or by the client's systems/computers, this sort of mismatch error may persists.

There may be some additional information (although pertains mostly to number & date formatting) here:

http://blogs.office.com/b/microsoft-excel/archive/2007/11/12/manipulating-and-massaging-data-in-excel.aspx

David Zemens
  • 53,033
  • 11
  • 81
  • 130
  • Thanks, I'm afraid that your solution can't be tested because I'm using only NPOI, I don't know about other lib but with NPOI, I've tried all these: Cell.StringCellValue, Cell.ToString(), Cell.RichStringCellValue but all gave out the same result. I understand that the values of the cells are encoded in a different way (it's called TCVN3) while they will only be displayed correctly in my favorite font when being encoded in Unicode. My solution is converting these values to Unicode version and it works... – King King Apr 20 '13 at 23:40
  • ... but the new problem is there are other encoding types besides TCVN3 and Unicode, I'm searching for how to know the encoding name of a cell and apply the corresponding conversion. Thanks for help. – King King Apr 20 '13 at 23:40