3

I want to create an exe that can be deployed onto other computers. The program needs to read PDFs and turn them into images, but I don't want other users to have to download dependencies.

My understanding is that pdf2image and wand both require external dependencies, so even if I convert the program to an exe, other users would still need to install those dependencies themselves.

Are there other options or workarounds?

  • Are you trying to create a program to convert PDF to image? If yes what do you mean by "not using python dependencies"? – Masoud Rahimi Jun 03 '19 at 02:55
  • That would be part of the program, yes. I don't want to use things like imagemagick or poppler, as then when I deploy the program as an exe other users would need to install those programs. – Aditya Kaushik Jun 03 '19 at 10:31
  • Creating an executable from a working Python script is a different matter from not using external dependencies in your script. I think you have to use an external package to handle the job for you, and then you can simply pack everything into a single executable that runs without any installed dependencies. Is this what you want? – Masoud Rahimi Jun 03 '19 at 10:35
  • Yes, that's what I want. If I were to download/use imagemagick, do you know how I could include that into the executable with pyinstaller/other methods? – Aditya Kaushik Jun 03 '19 at 14:30

3 Answers

1

I wasn't able to find a solution that avoids dependencies entirely; apparently one needs a PDF renderer no matter what. The most lightweight option I found is PyMuPDF (https://pymupdf.readthedocs.io/en/latest/intro.html). It is still a Python binding for a PDF renderer, MuPDF (https://www.mupdf.com/), but you can install it, including that dependency, with:

pip install PyMuPDF

No need to install poppler or imagemagick.

Then you can convert a pdf to images as follows:

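import fitz  # PyMuPDF

# your_pdf_file_stream is a placeholder for the PDF contents as bytes.
doc = fitz.open(stream=your_pdf_file_stream, filetype="pdf")
for idx, page in enumerate(doc):
    pix = page.get_pixmap(dpi=600)  # render the page at 600 DPI
    the_page_bytes = pix.pil_tobytes(format="PNG")  # needs Pillow installed
    with open("page-%s.png" % idx, "wb") as outf:
        outf.write(the_page_bytes)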

Unfortunately, MuPDF is released under a copyleft license (the GNU AGPL), so keep that in mind if you distribute the executable.
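
As a rough guide to what `dpi=600` costs: PDF page sizes are measured in points (72 per inch), so the rendered image scales with dpi/72. The sketch below is plain arithmetic and does not call PyMuPDF; the helper names and the A4 page size are illustrative assumptions, and the library's own rounding may differ by a pixel.

```python
POINTS_PER_INCH = 72  # PDF user-space units per inch


def pixel_size(width_pt, height_pt, dpi):
    """Approximate pixel dimensions of a page rendered at the given DPI."""
    scale = dpi / POINTS_PER_INCH
    return round(width_pt * scale), round(height_pt * scale)


def rgb_bytes(width_px, height_px):
    """Uncompressed size in bytes of an 8-bit RGB image without alpha."""
    return width_px * height_px * 3


if __name__ == "__main__":
    # An A4 page is roughly 595 x 842 points (assumed example size).
    w, h = pixel_size(595, 842, 600)
    print(w, "x", h, "pixels,", rgb_bytes(w, h) / 1e6, "MB uncompressed")
```

If the images are only meant for on-screen viewing, a lower DPI keeps memory use and file size down considerably.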

Emilia Apostolova
  • ? Not sure what you mean, it has a GNU license (apache is not a copyleft license anyway), and also you don't need to install any exes if you are using the python wheels, via "pip install PyMuPDF". Per the documentation: Python wheels exist for Windows (32bit and 64bit), Linux (64bit, Intel and ARM) and Mac OSX (64bit, Intel only). – Emilia Apostolova Dec 22 '21 at 19:03
  • The fact that this does not need anything besides an entry in `requirements.txt` makes this the best answer by far. No one wants to mess with installing all kinds of third-party software manually and then fiddling with search paths just to get a script going. – rem Apr 27 '22 at 07:04
0

It took me a while to get this working, but I think it is worth it. You need to follow all the steps carefully to make it work.

  1. Install pdf2image with pip install pdf2image.
  2. Get poppler windows binaries.
  3. Create a new directory like myproject.
  4. Create a script converter.py inside myproject and add below code.
  5. Create another directory inside myproject and name it poppler.
  6. Copy all files from the binary folder of the downloaded poppler into the poppler directory. Run pdfimages.exe once to check that it works.
  7. Run pyinstaller converter.py -F --add-data "./poppler/*;./poppler" --noupx
  8. Your executable is now ready. Run it like converter.exe myfile.pdf. The results are written to an output directory created in the directory you run it from.
  9. Now your standalone PDF2IMAGE converter app is ready!
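
A note on step 7: the `;` inside `--add-data` is the Windows form. As far as I know, PyInstaller has traditionally expected the platform's `os.pathsep` there, which is `:` on Linux and macOS. The sketch below only assembles the command as an argument list so the separator can be chosen per platform; `add_data_arg` and `build_command` are hypothetical helper names, not part of PyInstaller.

```python
import os


def add_data_arg(src, dest, sep=os.pathsep):
    """Format one --add-data value as SRC<sep>DEST."""
    return f"{src}{sep}{dest}"


def build_command(script, data_dir, sep=os.pathsep):
    """Return the PyInstaller invocation from step 7 as an argument list."""
    return [
        "pyinstaller", script,
        "-F",  # build a single-file executable
        "--add-data", add_data_arg(f"{data_dir}/*", data_dir, sep),
        "--noupx",
    ]


if __name__ == "__main__":
    # Prints the command for the current platform; pass the list to
    # subprocess.run(...) to actually start the build.
    print(" ".join(build_command("converter.py", "./poppler")))
```

Building the command as a list also avoids shell quoting problems if it is later handed to `subprocess.run`.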

converter.py:

import os
import sys

from pdf2image import convert_from_path


def current_path(dir_path):
    # PyInstaller one-file builds unpack bundled data under sys._MEIPASS.
    if hasattr(sys, '_MEIPASS'):
        return os.path.join(sys._MEIPASS, dir_path)
    return os.path.join(".", dir_path)


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("Pass your PDF file: \"converter.exe myfile.pdf\"")
        input()  # keep the console window open until Enter is pressed
        sys.exit(1)
    # Put the bundled poppler binaries on PATH so pdf2image can find them.
    os.environ["PATH"] += os.pathsep + current_path("poppler")

    os.makedirs("output", exist_ok=True)
    images = convert_from_path(sys.argv[-1], 500)  # render at 500 DPI
    for i, image in enumerate(images):
        image.save('./output/out{}.png'.format(i), 'PNG')
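
If the frozen executable fails on another machine, a likely cause is that the poppler files did not end up in the bundle. The following is a minimal sketch of a startup check, assuming pdf2image relies on the `pdftoppm` and `pdfinfo` executables; `bundle_dir` and `missing_tools` are hypothetical helpers and not part of pdf2image or PyInstaller.

```python
import os
import shutil
import sys


def bundle_dir(name):
    """Resolve a bundled directory: under sys._MEIPASS when frozen, else the CWD."""
    base = getattr(sys, "_MEIPASS", os.path.abspath("."))
    return os.path.join(base, name)


def missing_tools(directory, tools=("pdftoppm", "pdfinfo")):
    """Return the tools that cannot be found as executables in `directory`."""
    return [t for t in tools if shutil.which(t, path=directory) is None]


if __name__ == "__main__":
    missing = missing_tools(bundle_dir("poppler"))
    if missing:
        sys.exit("Poppler tools not found: " + ", ".join(missing))
    print("Poppler bundle looks complete.")
```

Failing early with a clear message is easier to diagnose than a pdf2image error raised in the middle of a conversion.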

PS: If you like it, you can add a GUI and expose more of pdf2image's settings.

Masoud Rahimi
  • You're amazing, thank you for your help. This is going to be part of a broader application, and I haven't yet started on this part of the app, but when I get there I will test this out. Thanks! – Aditya Kaushik Jun 03 '19 at 19:08
  • I'm running in conda environment and I'm totally lost how I should `--add-data` conda-installed poppler to the command. – user8491363 Feb 21 '21 at 14:52
  • This requires poppler, and the question explicitly asks for non-python dependencies, so I am not sure why this answer is here. – Emilia Apostolova Dec 22 '21 at 18:06
  • @EmiliaApostolova Read the comments on the question if you are not sure. – Masoud Rahimi Dec 23 '21 at 05:36
0

I ran into the same problem while trying to build an .exe with the PyQt5 and pdf2image modules using pyinstaller. If you need to add a GUI created in PyQt5, do not pass --windowed in the pyinstaller command. That cost me two days of work.