Hi, I'm trying to use JODConverter 3.0 to convert PDF files to HTML. The resulting HTML file contains junk characters, so the conversion is not successful. Can someone help me understand what's happening?
Here is the code snippet:
OfficeManager officeManager = new DefaultOfficeManagerConfiguration().buildOfficeManager();
officeManager.start();
OfficeDocumentConverter converter = new OfficeDocumentConverter(officeManager);
converter.convert(inputFile, outputFile);
officeManager.stop();
where inputFile ("test.pdf") and outputFile ("test.html") are File objects created with new File(...).
Sample from output file:
%PDF-1.4 %Çì�¢ 5 0 obj <</Length 6 0 R/Filter /FlateDecode>> stream
xœÅ][“#·q.[¢Ì,U’/’,˦sìÄÉ9 ÏxpÇDOVh;NUª,{“<ˆ~X.wIƼ./²þF¬#œ##—Æ
13gIFÒ#8#h4€Æ×#4°O7}Çø¦wÿÇÂéã_þÁlî>;zº‘\�#-ç#Ɇn#ôFIfÇZvsóñÑçG¾ùæ#¿
#ªZ³íó�ì˜Ô½†�#&–#µ½=Rê •ŸîöªS¦g#õ:åÉ•þ6WŒm7éÇŸ¥ÒÏ} Æ¿ý»ÜàçéçÜÇÇD#3|æ5¡Jï¤G ›dÑQË?ÿ"0e¢pø©ú‡‘Anyñù#Y9H‡#&
…ÿü��½[[ôñÝDáÖ.Šƒ�‘¸•#w3¥##w[\KãwºÛÉ?sÓÀ¬ÑÃöŸÜ#A4´�Ýœ¾###ü<=#`#
À####IÍCùA(#]Ù×#Ë÷Žþ{óh%#Q¬K#A]°þ À¶#L*##¥4¬ƒLü}þj�##á{SCê
‡¡Ã/"d½—`(# '`d»‡�0~
ó3.#ï�ÏnÔ˜=Ì›ƒ(#Õ…)Ú½½ãÆtli##l#…9Úþrq#RöN<ð(®
£ž¯ïöCÇ•„ÙïÓˆ®_A#cî#Ÿ=_ät0®;Äé•d¤Á¶äÌ#p=�Ûҗö#»epe_g,#´-éiP=ìÃb#ð¸òb2î
—Щ«(#Nõ=Úº—²‚% Ã#Ui×�AËÞ#s¶qý:Ã#xø
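
The sample above starts with %PDF-1.4, which is the PDF file header, so the output appears to contain the raw bytes of the input PDF rather than converted content. A minimal sketch for checking this programmatically is below. It assumes Java 7 or later; the class name PdfMarkerCheck, the method containsPdfMarker, and the 4096-byte scan limit are all made up for illustration and are not part of JODConverter. It only detects the symptom, it does not explain the cause.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (not a JODConverter class): reports whether the start
// of a file still contains the PDF header marker "%PDF-".
public class PdfMarkerCheck {

    private static final byte[] PDF_MARKER = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    // Scans the first `limit` bytes of `data` for the marker.
    static boolean containsPdfMarker(byte[] data, int limit) {
        int end = Math.min(data.length, limit) - PDF_MARKER.length;
        for (int i = 0; i <= end; i++) {
            boolean match = true;
            for (int j = 0; j < PDF_MARKER.length; j++) {
                if (data[i + j] != PDF_MARKER[j]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                return true;
            }
        }
        return false;
    }

    // Reads up to 4096 bytes from the file (an arbitrary limit chosen for
    // this sketch) and scans them for the marker.
    static boolean containsPdfMarker(Path file) throws IOException {
        byte[] head = new byte[4096];
        int total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while (total < head.length
                    && (n = in.read(head, total, head.length - total)) > 0) {
                total += n;
            }
        }
        return containsPdfMarker(head, total);
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the output file name used in the question.
        Path output = Paths.get(args.length > 0 ? args[0] : "test.html");
        if (containsPdfMarker(output)) {
            System.out.println(output + " still contains raw PDF data near the start");
        } else {
            System.out.println(output + " has no PDF header marker near the start");
        }
    }
}
```

Running it against test.html after the conversion would show whether the junk is the untouched PDF stream, which narrows the problem down to how the PDF is being loaded rather than how the HTML is being written.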