
To convert PDF to HTML, I am using the Aspose.PDF library, which I installed with pip3 install aspose-pdf. It works fine on my Windows machine with Python 3.8, but on a Linux machine I get this error: Proxy error(NullReferenceException): Object reference not set to an instance of an object. I have tried both CentOS and Ubuntu and get the same error on each. Source code:

import aspose.pdf as pdf

doc = pdf.Document("input.pdf")
saveOptions = pdf.HtmlSaveOptions()
doc.save("output.html", saveOptions)

System details:

  • OS: Ubuntu 20.04.3, CentOS 7
  • Python version: Python 3.8.10
  • Aspose.PDF: 23.5.0 (Python via .NET)

I hope the same source code can be made to work on a Linux machine.

  • Please check the system requirements for running the API on Linux (https://docs.aspose.com/pdf/python-net/system-requirements/#system-requirements-for-target-linux). If the issue still persists, please create a post in the official Aspose.PDF support forum (https://forum.aspose.com/c/pdf), where you will be assisted accordingly. This is Asad Ali and I work as Developer Evangelist at Aspose. – Asad Ali Jul 03 '23 at 17:04

1 Answer


You can also try Aspose.Words for Python, which also supports conversion from PDF to HTML:

import aspose.words as aw

doc = aw.Document("in.pdf")
doc.save("out.html")
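If Aspose.PDF gives the better output where it runs, one option is to try it first and fall back to Aspose.Words only when it fails. The following is a minimal sketch of that idea, not an official Aspose pattern. The helper names (convert_with_fallback, pdf_to_html_with_aspose_pdf, pdf_to_html_with_aspose_words) are hypothetical, and catching a broad Exception is an assumption, since the exact exception type behind the proxy error is not known here.

```python
from typing import Callable, Iterable, Tuple

# A converter takes (source_path, destination_path) and raises on failure.
Converter = Callable[[str, str], None]


def pdf_to_html_with_aspose_pdf(src: str, dst: str) -> None:
    # Imported lazily so this module still loads if aspose-pdf is missing.
    import aspose.pdf as pdf

    doc = pdf.Document(src)
    doc.save(dst, pdf.HtmlSaveOptions())


def pdf_to_html_with_aspose_words(src: str, dst: str) -> None:
    # Imported lazily so this module still loads if aspose-words is missing.
    import aspose.words as aw

    doc = aw.Document(src)
    doc.save(dst)


def convert_with_fallback(
    src: str, dst: str, converters: Iterable[Tuple[str, Converter]]
) -> str:
    """Try each (name, converter) pair in order; return the name that worked."""
    errors = []
    for name, convert in converters:
        try:
            convert(src, dst)
            return name
        except Exception as exc:  # assumption: the failure type is unknown
            errors.append(f"{name}: {exc!r}")
    raise RuntimeError("all converters failed: " + "; ".join(errors))


# Example call (requires the Aspose packages and an input.pdf file):
# used = convert_with_fallback(
#     "input.pdf",
#     "output.html",
#     [
#         ("aspose.pdf", pdf_to_html_with_aspose_pdf),
#         ("aspose.words", pdf_to_html_with_aspose_words),
#     ],
# )
# print("converted with", used)
```

The returned name tells you which library produced the file, which is useful because the two libraries can lay out the resulting HTML differently.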

Please see the Aspose.Words documentation for the additional requirements that apply when using Aspose.Words for Python on Linux: https://docs.aspose.com/words/python-net/system-requirements/#system-requirements-for-target-linux-platform
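To compare a Linux machine against that requirements page, a quick check of which shared libraries the dynamic linker can find may help. This is only a rough sketch using the standard library: the helper name missing_libraries is hypothetical, and the library names in the example call ("stdc++" and "gcc_s", standing in for the GCC runtime) are assumptions for illustration. Replace them with whatever the linked page actually lists.

```python
import platform
import sys
from ctypes.util import find_library
from typing import Iterable, List


def missing_libraries(names: Iterable[str]) -> List[str]:
    """Return the library names that ctypes cannot locate on this system.

    Names are given without the 'lib' prefix or '.so' suffix, e.g. 'stdc++'.
    """
    return [name for name in names if find_library(name) is None]


def describe_environment() -> str:
    """Summarize the OS and Python build, handy when reporting the issue."""
    return (
        f"{platform.system()} {platform.release()}, "
        f"Python {sys.version.split()[0]}"
    )


# Example call; the names below are illustrative assumptions only:
# print(describe_environment())
# print("missing:", missing_libraries(["stdc++", "gcc_s"]))
```

An empty result does not prove the library will load correctly, and find_library itself relies on tools such as ldconfig being available, so treat this as a first diagnostic step only.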

Alexey Noskov
  • Successful. However, the aspose-pdf output is closer to the original than the aspose-words output. – Syarif Hidayat Jul 05 '23 at 07:55
  • As you know, MS Word documents are flow documents, whereas PDF documents are fixed-page documents. When converting a PDF document to a Word document, Aspose.Words converts it to a flow document, which makes it editable in MS Word. Aspose.PDF by default uses text frames to preserve the original document layout, but this makes the resulting document hard to edit in MS Word. – Alexey Noskov Jul 05 '23 at 12:41