How to read images in the ms-office .doc file using Apache poi? I have tried with the following code but it is not working.
try {
POIFSFileSystem fs = new POIFSFileSystem(new FileInputStream("C:\\DATASTORE\\ImageDocument.doc"));
Document document = new Document();
OutputStream fileOutput = new FileOutputStream(new File("C:/DATASTORE/ImageDocumentPDF.pdf"));
PdfWriter.getInstance(document, fileOutput);
document.open();
HWPFDocument hdocument=new HWPFDocument(fs);
Range range=hdocument.getOverallRange();
PdfPTable createTable;
CharacterRun run;
PicturesTable picture=hdocument.getPicturesTable();
int picoffset=run.getPicOffset();
for(int i=0;i<range.numParagraphs();i++) {
run =range.getCharacterRun(i);
if(picture.hasPicture(run)) {
Picture pic=picture.extractPicture(run, true);
byte[] picturearray=pic.getContent();
com.itextpdf.text.Image image=com.itextpdf.text.Image.getInstance(picturearray);
document.add(image);
}
}
}
When i execute the above code and prints the picture offset value it displays -1 and when print picture.hasPicture(run) it returns false though the input file has an image.
Please help me to find the solution. Thank you