What I do: populate & format an Excel file using a mix of Interop and ClosedXML.
First, the file is populated via Interop, then saved, closed, then I format the cells' RichText using ClosedXML.
Unfortunately, this formatting causes Excel to view my file as "corrupt" and needs to repair it. This is the relevant part:
var workbook = new XLWorkbook(xlsPath);
var sheet = workbook.Worksheet("Error Log");
for (var rownum = 2; rownum <= 10000; rownum++)
{
var oldcell = sheet.Cell("C" + rownum);
var newcell = sheet.Cell("D" + rownum);
var oldtext = oldcell.GetFormattedString();
if(string.IsNullOrEmpty(oldtext.Trim()))
break;
XlHelper.ColorCellText(oldcell, "del", System.Drawing.Color.Red);
XlHelper.ColorCellText(newcell, "add", System.Drawing.Color.Green);
}
workbook.Save();
And the colouring method:
public static void ColorCellText(IXLCell cel, string tagName, System.Drawing.Color col)
{
var rex = new Regex("\\<g\\sid\\=[\\sa-z0-9\\.\\:\\=\\\"]+?\\>");
var txt = cel.GetFormattedString();
var mc = rex.Matches(txt);
var xlcol = XLColor.FromColor(col);
foreach (Match m in mc)
{
txt = txt.Replace(m.Value, "");
txt = txt.Replace("</g>", "");
}
var startTag = string.Format("[{0}]", tagName);
var endTag = string.Format("[/{0}]", tagName);
var crt = cel.RichText;
crt.ClearText();
while (txt.Contains(startTag) || txt.Contains(endTag))
{
var pos1 = txt.IndexOf(startTag);
if (pos1 == -1)
pos1 = 0;
var pos2 = txt.IndexOf(endTag);
if (pos2 == -1)
pos2 = txt.Length - 1;
var txtLen = pos2 - pos1 - 5;
crt.AddText(txt.Substring(0, pos1));
crt.AddText(txt.Substring(pos1 + 5, txtLen)).SetFontColor(xlcol);
txt = txt.Substring(pos2 + 6);
}
if (!string.IsNullOrEmpty(txt))
crt.AddText(txt);
}
Error in file myfile.xlsx
The following repairs were performed: _x000d__x000a__x000d__x000a_
Repaired records:
string properties of /xl/sharedStrings.xml-Part (strings)
I've been through all the xmls looking for clues. In the affected sheet, in comparison view of Productivity Tool, some blocks appear as inserted in the repaired file and deleted in the corrupt one, although nothing significant seemed changed - except for one thing: the style attribute of that cell. Here an example:
<x:c r="AA2" s="59">
<x:f>
(IFERROR(VLOOKUP(G2,Legende!$A$42:$B$45,2,FALSE),0))
</x:f>
</x:c>
I have checked the styles.xml for style 59, but there is none. In the repaired file, this style has been changed to 14, which in my styles.xml is listed as a number format.
Unfortunately, a global search/replace of these invalid style indexes did not resolve the issue. Seeing the things going on here with corrupt indexes, renamed xmls, invalid named ranges etc., I took a different route: not to use interop at all, maybe the corruption was caused by Excel in the first place and the coloring was only the last straw.
Using ClosedXml only:
Wow. Just wow. This makes it even worse. I commented out the colouring part since without that, Interop produced a readable file without errors, so that's what I expect of ClosedXml too.
This is how I open the file and address the worksheet with ClosedXml:
var wb= new XLWorkbook(xlsPath);
var errors = wb.Worksheet("Error Log");
This is how I write the values into the file:
errors.Cell(zeile, 1).SetValue(fname);
With zeile being a simple int counter.
I then dare to set a column width:
errors.Column(2).Width = 50;
errors.Column(3).Width = 50;
errors.Column(4).Width = 50;
As well as setting some values in another sheet in exactly the same fashion before saving with validation.
wb.Save(true);
wb.Dispose();
Lo and behold: The validation throws errors:
Attribute 'name' should have unique value. Its current value 'Legende duplicates with others.
Attribute 'sheetId' should have unique value. Its current value '4' duplicates with others.
A couple more errors like attribute 'top' having invalid value '11.425781'.
Excel cannot open the file directly, must repair it. My Sheet "Legende" is now empty and the first sheet instead of third, and I get an additional fourth sheet "Restored_Table1" which contains my original "Legende" contents.
What the hell is going on with this file??
New attempt: re-create the Excel template from scratch - in LibreOffice.
I now think that the issue is entirely misleading. If I use the newly created file from LibreOffice, the validation causes a System.OutOfMemory exception due to too many validation errors. Opening in Excel requires repair, gives additional sheet and so forth.
Creating in LibreOffice, then opening in Excel, saving, then using that file as template produces a much better result albeit not perfect yet. Since I copied parts over from the old Excel file into LO while creating the new file, I assume some corrupt remnant got copied over.
I cannot shake the feeling that this is the file itself after all and has nothing to do with how I edit it!
Will post updaate tomorrow.