Excel generation, unreadable content

Question

I'm using Apache POI 3.12 (SXSSF workbook) in order to generate .xlsx files. The problem is that I'm doing the generation and when I open the file I'm receiving an error message:

Excel found unreadable content in file.xlsx. Do you want to recover the contents of this workbook? If you trust the source of this workbook, click Yes.

After clicking Yes, the file opens and I'm receiving this notification

Excel completed file level validation and repair. Some parts of this workbook may have been repaired or discarded. Removed Records: Comments from /xl/comments1.xml part (Comments) Repaired Records: Comments from /xl/comments1.xml part (Comments)

After that, I unzip the excel file and check the comments1.xml. All my comments are present. All 216 of them.

The section of the code that generates the comments is the following

String comment = _propertiesHolder.getComment();
String commentAuthor = _propertiesHolder.getCommentAuthor();
if(comment != null)
{
    int colIndex = cell.getColumnIndex();
    int rowIndex = cell.getRowIndex();
    CreationHelper helper = _workbook.getCreationHelper();
    ClientAnchor anchor = helper.createClientAnchor();
    anchor.setCol1(colIndex);
    anchor.setCol2(colIndex + 1);
    anchor.setRow1(rowIndex);
    anchor.setRow2(rowIndex + 3);

    // Create the comment and set the text+author
    Comment cellComment = _drawingPatriarch.createCellComment(anchor);
    if(commentAuthor != null)
    {
        cellComment.setAuthor(commentAuthor);
        RichTextString rs = helper.createRichTextString(commentAuthor + ": " + comment);
        cellComment.setString(rs);
    }
    else 
    {
        cellComment.setString(helper.createRichTextString(comment));
    }

    cellComment.setRow(rowIndex);
    cellComment.setColumn(colIndex);

    // Assign the comment to the cell
    cell.setCellComment(cellComment);
}

Do you have any idea what could be the cause of this problem? Although no information was lost, clearly there is something wrong and I would like to fix it. The comments are retrieved from database (varchar datatype). The biggest comment is 138 characters long.

Update

Something that I forgot to mention. I've also run the same extraction using hssf implementation and no errors were present. It would be a safe assumption that the data are not the problem.

Are you doing that for a single comment, or are you adding multiple comments? — Gagravarr, Aug 12 '15 at 10:06
Why are you creating a new Drawing Patriarch each time then? As per [the warning in the javadocs](https://poi.apache.org/apidocs/org/apache/poi/ss/usermodel/Sheet.html#createDrawingPatriarch%28%29), that's liable to remove any existing ones each time you call it! — Gagravarr, Aug 12 '15 at 10:09
I haven't seen that to be honest, but I've seen the implementation. It doesn't create if there is one already present `CTDrawing ctDrawing = getCTDrawing(); if (ctDrawing != null) { return getDrawingPatriarch(); }` — Alkis Kalogeris, Aug 12 '15 at 10:11
Although I will change it to use the same. It might be the case that I've missed something in the implementation — Alkis Kalogeris, Aug 12 '15 at 10:12
Short term, your best bet would be to switch to using XSSFWorkbook, which does properly support Comments. Longer term, raise a bug in the [Apache POI bugzilla](https://bz.apache.org/bugzilla/enter_bug.cgi?product=POI), including a junit unit test that shows the problem. And for bonus marks, add the SXSSFSheet support for it - the drawing will need to be created in the temp xml and written out, not just left in the parent XSSF sheet and then ignored — Gagravarr, Aug 12 '15 at 10:26
Unfortunately the xssf is very slow for my needs (big extraction, lot's of memory needed. gc hits repeatedly). Yes I will do the long term thing — Alkis Kalogeris, Aug 12 '15 at 10:51

score 0 · Accepted Answer · answered Aug 14 '15 at 10:08

Ok I found the problem. It was with the author.

The problem is this line cellComment.setAuthor(commentAuthor);.

If for one comment we set

cellComment.setAuthor("test")

and then in another comment we set

cellComment.setAuthor("test ")

There will be an error shown when opening the file. Mind the whitespace. The solution is to trim the author string before setting it.

Excel generation, unreadable content

1 Answers1