I'm trying to add a new row to an existing table in an MS Word document, using the HWPF library from Apache POI 3.10. After the program runs, the output file is corrupted: MS Word shows a warning message when opening it, and the content looks strange and has lost its formatting.

A sample is below:

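    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.InputStream;
    import org.apache.poi.hwpf.HWPFDocument;
    import org.apache.poi.hwpf.usermodel.Range;
    import org.apache.poi.poifs.filesystem.POIFSFileSystem;

    InputStream fin = new FileInputStream(args[0]);   // source .doc file
    POIFSFileSystem fs = new POIFSFileSystem(fin);
    HWPFDocument doc = new HWPFDocument(fs);
    Range range = doc.getRange();

    range.getParagraph(269).insertAfter("TEST");      // index is specific to my document
    FileOutputStream out = new FileOutputStream("SOME PATH");
    doc.write(out);
    out.close();                                      // flush and release the output file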

Maybe something additional needs to be updated (SI or DSI, for example) because a new CharacterRun is added?

Brian Tompsett - 汤莱恩
Oljko

1 Answer

HWPF supports Word 97/2000 format DOC files only in simple variants. Tables are already problematic. If Word rejects a file that you created or modified with the library, you are probably out of luck.

I developed a custom library based on Apache's HWPF codebase some time ago for a customer. That custom library added support for many features and could write Word files reliably. Getting all of this right was a lot of work, so fixing just a few small things is not an option: you would have to spend several man-months.

Would it be an option for you to create empty rows in Word and just fill them with HWPF?

EDIT: A workaround that is likely to work:

Pre-fill the table with markers:

+--------+-------------+------------------------------------+
| Rev 1  |  2014-01-01 | Created document                   |
+--------+-------------+------------------------------------+
| Rev 2  |  2014-01-02 | Corrected flow chart               |
+--------+-------------+------------------------------------+
| $REVMRK|  $REVDATE## | $REVTEXT########################## |
+--------+-------------+------------------------------------+
| $REVMRK|  $REVDATE## | $REVTEXT########################## |
+--------+-------------+------------------------------------+
| $REVMRK|  $REVDATE## | $REVTEXT########################## |
+--------+-------------+------------------------------------+
| $REVMRK|  $REVDATE## | $REVTEXT########################## |
+--------+-------------+------------------------------------+

Make sure the markers are long enough. (Adding text in HWPF so that the addresses of paragraphs change in tables may cause trouble.)

To fill a row do this:

  1. Find the set of markers for one row
  2. Each marker must include all subsequent #-characters
  3. Prepare your text to be filled in for each marker
  4. Make sure your text only uses ASCII characters (see below)
  5. Make sure your text is not longer than the revision markers
  6. Fill the revision marks with the new content and fill the remaining #-characters with spaces

If the software does not find a new row, someone must add new row templates in Word.
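The padding part of the steps above can be sketched in plain Java. This is only an illustration under assumptions: `MarkerFill`, `fullMarker`, and `fill` are hypothetical names, and writing the result back into the document (for example with HWPF's `Range.replaceText`) is not shown and not tested here.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: builds a same-length replacement for one marker. */
final class MarkerFill {

    /** Returns the marker plus all '#' characters directly after it, or null if absent. */
    static String fullMarker(String cellText, String markerName) {
        int start = cellText.indexOf(markerName);
        if (start < 0) {
            return null; // no free template row left
        }
        int end = start + markerName.length();
        while (end < cellText.length() && cellText.charAt(end) == '#') {
            end++;
        }
        return cellText.substring(start, end);
    }

    /** Pads value with spaces to the marker's exact length; rejects non-ASCII or too-long text. */
    static String fill(String marker, String value) {
        if (!StandardCharsets.US_ASCII.newEncoder().canEncode(value)) {
            throw new IllegalArgumentException("non-ASCII text: " + value);
        }
        if (value.length() > marker.length()) {
            throw new IllegalArgumentException("text longer than marker: " + value);
        }
        StringBuilder sb = new StringBuilder(value);
        while (sb.length() < marker.length()) {
            sb.append(' '); // overwrite the leftover '#' positions with spaces
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String marker = fullMarker("$REVDATE##", "$REVDATE");
        System.out.println("[" + fill(marker, "2014-01-03") + "]");
    }
}
```

Because the replacement always has exactly the same number of characters as the marker, no text position after it moves, which is the point of the whole scheme.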

The reason for some restrictions:

no insert: A lot of things are stored with addresses into the text stream (lots of internal extra tables which contain address references over the text content). This applies to paragraph borders, character formatting, table marks, bookmarks, graphics references etc. Some things are covered by HWPF, a lot are not. If you insert text, the addresses may shift and the Word file may get corrupted.
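A toy model may make the address problem concrete. It is not Word's actual file layout: it only assumes a text buffer and one side entry that stores absolute character offsets, and all names (`OffsetShiftDemo`, `insert`, `shift`, `slice`) are made up for the illustration.

```java
/** Toy model only: a text buffer plus stored absolute character offsets. */
final class OffsetShiftDemo {

    /** Inserts s into text at position pos. */
    static String insert(String text, int pos, String s) {
        return text.substring(0, pos) + s + text.substring(pos);
    }

    /** Adjusts a stored offset for an insertion of insertedLength characters at insertPos. */
    static int shift(int offset, int insertPos, int insertedLength) {
        return offset >= insertPos ? offset + insertedLength : offset;
    }

    /** Returns the text that a stored [start, end) offset pair points at. */
    static String slice(String text, int start, int end) {
        return text.substring(start, end);
    }

    public static void main(String[] args) {
        String text = "Intro. Rev 1 created.";
        int start = text.indexOf("Rev 1");
        int end = start + "Rev 1".length();

        String edited = insert(text, 0, "TEST ");
        // Stale offsets now point at the wrong characters.
        System.out.println("stale:   [" + slice(edited, start, end) + "]");
        // Every stored offset has to be shifted to stay valid.
        System.out.println("shifted: [" + slice(edited, shift(start, 0, 5), shift(end, 0, 5)) + "]");
    }
}
```

In the toy there is one offset pair to fix. In a real document every internal table holding such positions would need the same adjustment, and any table the library does not know about stays stale.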

ASCII range: Sequences of text are stored as 1 byte per character or 2 bytes per character. When doing it right, inserting a non-ASCII character in a 1-byte range requires converting that range to a 2-byte range. This does not always work well in HWPF and it leads to address shifting (see "no insert" above).
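To see why a non-ASCII character matters, compare the byte counts of the same text in a 1-byte and a 2-byte encoding. This sketch uses ISO-8859-1 and UTF-16LE from the Java standard library as stand-ins; treating them as equivalent to the encodings in the file is an assumption made for illustration, and `TextWidthDemo` is a hypothetical name.

```java
import java.nio.charset.StandardCharsets;

/** Illustration only: storage size of text at 1 byte vs 2 bytes per character. */
final class TextWidthDemo {

    /** True if every character of s is plain ASCII. */
    static boolean isAscii(String s) {
        return StandardCharsets.US_ASCII.newEncoder().canEncode(s);
    }

    /** Size when stored with 1 byte per character (ISO-8859-1 as a stand-in). */
    static int oneByteLength(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    /** Size when stored with 2 bytes per character (UTF-16LE, BMP characters). */
    static int twoByteLength(String s) {
        return s.getBytes(StandardCharsets.UTF_16LE).length;
    }

    public static void main(String[] args) {
        String text = "Rev 3";
        System.out.println(isAscii(text) + " " + oneByteLength(text) + " " + twoByteLength(text));
    }
}
```

Converting a range from the first form to the second doubles its size in bytes, so everything stored after it moves, which leads back to the "no insert" problem.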

You may get away with fewer restrictions if your document is "simpler". For instance: no textboxes is better, no embedded drawings is better, no nested tables is better. However, the restrictions are usually so tight that you might as well use a plain text document and give it a .doc extension.

Let me know if you need more details.

Rainer Schwarze
  • Thanks for the info, Rainer. The purpose of this program is to update the revision history in an existing document; the revision history chapter is a table. I need to update this table with new information somehow. Do you have any suggestions for how I can do it? – Oljko Jul 14 '14 at 08:40
  • Hi @Rainer, nice idea. I need to check with the customer whether it is OK for him. One more question: would it be possible to solve this the following way? 1. Create an empty .doc file. 2. Create an HWPFDocument instance with the empty file as input. 3. Create Paragraph instances, copying text and properties from the source file, until the line after which the new text should be inserted is reached. 4. Create a new Paragraph instance with the new text and properties. – Oljko Jul 15 '14 at 08:39
  • 5. Continue creating new instances and copying properties until the end of the file is reached. 6. Save the HWPFDocument instance using its "write(OutputStream)" method. Do you think this is possible? Thanks for your help :) – Oljko Jul 15 '14 at 08:39
  • I have tried this solution, but the document is still corrupted after saving. I used the character.replace() method to replace the markers with new text. When I try to open the file with MS Word, I get this message: "This error message can appear for several reasons. The document may be corrupt or damaged. Use either the Recover Text converter or the Open and Repair feature. Both are available from the Open dialog." – Oljko Jul 15 '14 at 14:48