Here is code that handles the task. The selection criterion is quite simple: paragraphs with (zero-based) indices 11..20 go to the file "us.docx", and those with indices 21..30 go to "japan.docx".
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;

    import org.apache.poi.hwpf.HWPFDocument;
    import org.apache.poi.hwpf.usermodel.Paragraph;
    import org.apache.poi.hwpf.usermodel.Range;
    import org.apache.poi.xwpf.usermodel.XWPFDocument;
    import org.apache.poi.xwpf.usermodel.XWPFParagraph;

    public class SplitDocs {

        public static void main(String[] args) {
            // Empty .docx documents that will receive the selected paragraphs
            XWPFDocument us = new XWPFDocument();
            XWPFDocument japan = new XWPFDocument();

            // try-with-resources (Java 7+) closes the streams even if an exception is thrown
            try (FileInputStream in = new FileInputStream("wto.doc")) {
                HWPFDocument doc = new HWPFDocument(in);
                Range range = doc.getRange();

                for (int parIndex = 0; parIndex < range.numParagraphs(); parIndex++) {
                    Paragraph paragraph = range.getParagraph(parIndex);
                    String text = paragraph.text();
                    System.out.println("***Paragraph" + parIndex + ": " + text);

                    // Route the paragraph by its index
                    if ((parIndex >= 11) && (parIndex <= 20)) {
                        createParagraphInAnotherDocument(us, text);
                    } else if ((parIndex >= 21) && (parIndex <= 30)) {
                        createParagraphInAnotherDocument(japan, text);
                    }
                }

                try (FileOutputStream outUs = new FileOutputStream("us.docx");
                     FileOutputStream outJapan = new FileOutputStream("japan.docx")) {
                    us.write(outUs);
                    japan.write(outJapan);
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        // Appends a new paragraph containing the given text to the target .docx document
        private static void createParagraphInAnotherDocument(XWPFDocument document, String text) {
            XWPFParagraph newPar = document.createParagraph();
            newPar.createRun().setText(text, 0);
        }
    }
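One thing to watch when copying text this way: as far as I know, the text HWPF returns for a paragraph still ends with the paragraph mark ('\r'), or with the cell mark ('\u0007') inside tables, and you may not want that character inside the new run. Here is a minimal sketch of a helper that strips it. The class and method names (ParagraphText, stripTrailingMarks) are made up for illustration and are not part of POI:

```java
class ParagraphText {

    // Removes trailing marks from paragraph text: '\r' (paragraph mark),
    // '\n', and '\u0007' (table cell mark). Characters elsewhere are kept.
    static String stripTrailingMarks(String text) {
        int end = text.length();
        while (end > 0) {
            char c = text.charAt(end - 1);
            if (c == '\r' || c == '\n' || c == '\u0007') {
                end--;
            } else {
                break;
            }
        }
        return text.substring(0, end);
    }

    public static void main(String[] args) {
        // prints "[Dispute settlement]"
        System.out.println("[" + stripTrailingMarks("Dispute settlement\r") + "]");
    }
}
```

In the code above you would call it as createParagraphInAnotherDocument(us, stripTrailingMarks(text)), assuming your input really carries those marks.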
I used .docx as the output because it is much easier to add new paragraphs to a .docx than to a .doc file. The method insertAfter(ParagraphProperties props, int styleIndex) for inserting a new Paragraph into a given range is now deprecated (I use POI version 3.10), and I couldn't find an easy and logical way to create a new Paragraph object in an empty .doc file. By contrast, it's a pleasure to use the straightforward and clean XWPFParagraph newPar = document.createParagraph();.
However, this code uses .doc as an input, as required in your task. Hope this will help :)
P.S. Here we use a simple selection criterion based on paragraph indices. If you need something like a font-based criterion, as you said, you will probably want to post another question, or maybe you'll find the solution yourself. Anyway, with docx things get easier.
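If you do end up swapping the criterion, one option is to keep it separate from the copying loop. Below is a rough sketch in plain Java (8+), not tied to POI: ParagraphInfo, indexBetween, hasFont and select are all hypothetical names I made up for the example. In the real code you would fill in the font name yourself, for instance from the paragraph's character runs via CharacterRun.getFontName(); I have not tested that part.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class ParagraphSelector {

    // Hypothetical holder for the facts about one paragraph that a criterion may inspect
    static final class ParagraphInfo {
        final int index;
        final String text;
        final String fontName;

        ParagraphInfo(int index, String text, String fontName) {
            this.index = index;
            this.text = text;
            this.fontName = fontName;
        }
    }

    // Index criterion, same idea as in the answer: from..to, both inclusive
    static Predicate<ParagraphInfo> indexBetween(int from, int to) {
        return p -> p.index >= from && p.index <= to;
    }

    // Font criterion: matches paragraphs whose font name equals the given one (case-insensitive)
    static Predicate<ParagraphInfo> hasFont(String fontName) {
        return p -> fontName.equalsIgnoreCase(p.fontName);
    }

    // Returns the text of every paragraph that satisfies the criterion, in document order
    static List<String> select(List<ParagraphInfo> paragraphs, Predicate<ParagraphInfo> criterion) {
        List<String> selected = new ArrayList<>();
        for (ParagraphInfo p : paragraphs) {
            if (criterion.test(p)) {
                selected.add(p.text);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<ParagraphInfo> paragraphs = Arrays.asList(
                new ParagraphInfo(0, "Title", "Arial"),
                new ParagraphInfo(1, "US section", "Times New Roman"),
                new ParagraphInfo(2, "Japan section", "Courier New"));

        System.out.println(select(paragraphs, indexBetween(1, 2)));     // prints [US section, Japan section]
        System.out.println(select(paragraphs, hasFont("Courier New"))); // prints [Japan section]
    }
}
```

The point of the design is that the loop over paragraphs stays the same and only the Predicate changes, so index ranges and font checks can also be combined with and() / or().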