
I have created a Scrapy spider and successfully converted it to a Windows executable using PyInstaller, with a dist folder.

In order to do that, I had to make some slight changes in the Scrapy site-packages and add those packages to the Windows dist folder. It works perfectly.

How can I make this into a single EXE file that includes the modified Scrapy packages from the dist folder?

I have already tried the --onefile option in PyInstaller, but it shows a Scrapy error. Why?

Error:

    ImportError: No module named 'scrapy.spiderloader'

I am calling Scrapy from a script, and the spider details are passed in the CrawlerProcess crawl() call. How can I tell PyInstaller to pick up my updated Scrapy package from a specific location?
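
A minimal sketch of that setup (MySpider, myproject, and the category argument are placeholders for my actual spider class and its arguments):

    # run-from-a-script setup: spider details are passed to crawl()
    from scrapy.crawler import CrawlerProcess
    from scrapy.utils.project import get_project_settings

    from myproject.spiders import MySpider  # placeholder import

    process = CrawlerProcess(get_project_settings())
    process.crawl(MySpider, category='books')  # kwargs reach the spider's __init__
    process.start()                            # blocks until the crawl finishes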

Arun Augustine
  • Have you tried PyInstaller's hidden imports? For example: --hidden-import=saidmodule – RockAndRoleCoder Mar 25 '19 at 06:09
  • I have tried hidden imports for the scrapy package, both from the command line and in the .spec file. Both show the same error. – Arun Augustine Mar 25 '19 at 08:39

3 Answers


A very similar issue is discussed in Python Scrapy conversion to EXE file using PyInstaller.

Initially, I used the auto-py-to-exe package (which is actually a GUI for PyInstaller).
I added the following line to auto-py-to-exe -> advanced settings -> hidden imports:

scrapy.spiderloader,scrapy.statscollectors,scrapy.logformatter,scrapy.extensions,scrapy.extensions.corestats,scrapy.extensions.telnet,scrapy.extensions.memusage,scrapy.extensions.memdebug,scrapy.extensions.closespider,scrapy.extensions.feedexport,scrapy.extensions.logstats,scrapy.extensions.spiderstate,scrapy.extensions.throttle,scrapy.core.scheduler,scrapy.squeues,queuelib,scrapy.core.downloader,scrapy.downloadermiddlewares,scrapy.downloadermiddlewares.robotstxt,scrapy.downloadermiddlewares.httpauth,scrapy.downloadermiddlewares.downloadtimeout,scrapy.downloadermiddlewares.defaultheaders,scrapy.downloadermiddlewares.useragent,scrapy.downloadermiddlewares.retry,scrapy.downloadermiddlewares.ajaxcrawl,scrapy.downloadermiddlewares.redirect,scrapy.downloadermiddlewares.httpcompression,scrapy.downloadermiddlewares.cookies,scrapy.downloadermiddlewares.httpproxy,scrapy.downloadermiddlewares.stats,scrapy.downloadermiddlewares.httpcache,scrapy.spidermiddlewares,scrapy.spidermiddlewares.httperror,scrapy.spidermiddlewares.offsite,scrapy.spidermiddlewares.referer,scrapy.spidermiddlewares.urllength,scrapy.spidermiddlewares.depth,scrapy.pipelines,scrapy.dupefilters,scrapy.core.downloader.handlers.datauri,scrapy.core.downloader.handlers.file,scrapy.core.downloader.handlers.http,scrapy.core.downloader.handlers.s3,scrapy.core.downloader.handlers.ftp,scrapy.core.downloader.webclient,scrapy.core.downloader.contextfactory

After that, the following command appeared in the last text box (don't forget to change the path to your script):

pyinstaller -y -F --hidden-import scrapy.spiderloader --hidden-import scrapy.statscollectors --hidden-import scrapy.logformatter --hidden-import scrapy.extensions --hidden-import scrapy.extensions.corestats --hidden-import scrapy.extensions.telnet --hidden-import scrapy.extensions.memusage --hidden-import scrapy.extensions.memdebug --hidden-import scrapy.extensions.closespider --hidden-import scrapy.extensions.feedexport --hidden-import scrapy.extensions.logstats --hidden-import scrapy.extensions.spiderstate --hidden-import scrapy.extensions.throttle --hidden-import scrapy.core.scheduler --hidden-import scrapy.squeues --hidden-import queuelib --hidden-import scrapy.core.downloader --hidden-import scrapy.downloadermiddlewares --hidden-import scrapy.downloadermiddlewares.robotstxt --hidden-import scrapy.downloadermiddlewares.httpauth --hidden-import scrapy.downloadermiddlewares.downloadtimeout --hidden-import scrapy.downloadermiddlewares.defaultheaders --hidden-import scrapy.downloadermiddlewares.useragent --hidden-import scrapy.downloadermiddlewares.retry --hidden-import scrapy.downloadermiddlewares.ajaxcrawl --hidden-import scrapy.downloadermiddlewares.redirect --hidden-import scrapy.downloadermiddlewares.httpcompression --hidden-import scrapy.downloadermiddlewares.cookies --hidden-import scrapy.downloadermiddlewares.httpproxy --hidden-import scrapy.downloadermiddlewares.stats --hidden-import scrapy.downloadermiddlewares.httpcache --hidden-import scrapy.spidermiddlewares --hidden-import scrapy.spidermiddlewares.httperror --hidden-import scrapy.spidermiddlewares.offsite --hidden-import scrapy.spidermiddlewares.referer --hidden-import scrapy.spidermiddlewares.urllength --hidden-import scrapy.spidermiddlewares.depth --hidden-import scrapy.pipelines --hidden-import scrapy.dupefilters --hidden-import scrapy.core.downloader.handlers.datauri --hidden-import scrapy.core.downloader.handlers.file --hidden-import scrapy.core.downloader.handlers.http --hidden-import scrapy.core.downloader.handlers.s3 --hidden-import scrapy.core.downloader.handlers.ftp --hidden-import scrapy.core.downloader.webclient --hidden-import scrapy.core.downloader.contextfactory "C:/path/script.py"

If after this your command returns ImportError: No module named 'modulename', add the missing module to the hidden imports and repeat the process with the new, extended list.
(I repeated this procedure 48 times in order to get a working EXE file, and to collect the list of submodules above!)
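
A shortcut worth knowing: rather than discovering the submodules one failure at a time, PyInstaller's hook utilities can enumerate them for you. A minimal sketch, assuming PyInstaller and Scrapy are installed in the same environment:

    # list every scrapy submodule and print ready-to-paste --hidden-import flags
    from PyInstaller.utils.hooks import collect_submodules

    hidden = collect_submodules('scrapy')
    print(' '.join('--hidden-import ' + name for name in hidden))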

Update

On Nov 12, 2019 (about six months after I posted this answer), PyInstaller added hooks that solve these specific import errors: https://github.com/pyinstaller/pyinstaller/pull/4514/files

That pull request references the Stack Overflow question No such file or directory error using pyinstaller and scrapy.

At this stage, developers who try to make an EXE from a Scrapy application will most likely hit the issue described in Error after running .exe file originating from scrapy project, which is a completely different question.

Georgiy
  • Those are some very long command lines (1490 and 2296 characters, respectively). Not all command line processors may be able to handle them. Isn't there a better way? – Peter Mortensen Mar 26 '23 at 12:41
  • I re-reviewed this and found that the contents of my answer are now obsolete. I've updated the answer. – Georgiy Jul 05 '23 at 20:44

I fixed it by using hidden imports in the .spec file. PyInstaller doesn't detect all of Scrapy's second-level module imports on its own.

Run the usual PyInstaller command; just update the spec file with the hidden import changes below:

    hiddenimports=[
        'scrapy.spiderloader', 'scrapy.statscollectors', 'scrapy.logformatter',
        'scrapy.extensions', 'scrapy.extensions.logstats', 'scrapy.extensions.corestats',
        'scrapy.extensions.memusage', 'scrapy.extensions.feedexport', 'scrapy.extensions.memdebug',
        'scrapy.extensions.closespider', 'scrapy.extensions.throttle', 'scrapy.extensions.telnet',
        'scrapy.extensions.spiderstate', 'scrapy.core.scheduler', 'scrapy.core.downloader',
        'scrapy.downloadermiddlewares', 'scrapy.downloadermiddlewares.robotstxt',
        'scrapy.downloadermiddlewares.httpauth', 'scrapy.downloadermiddlewares.downloadtimeout',
        'scrapy.downloadermiddlewares.defaultheaders', 'scrapy.downloadermiddlewares.useragent',
        'scrapy.downloadermiddlewares.retry', 'scrapy.core.downloader.handlers.http',
        'scrapy.core.downloader.handlers.s3', 'scrapy.core.downloader.handlers.ftp',
        'scrapy.core.downloader.handlers.datauri', 'scrapy.core.downloader.handlers.file',
        'scrapy.downloadermiddlewares.ajaxcrawl', 'scrapy.core.downloader.contextfactory',
        'scrapy.downloadermiddlewares.redirect', 'scrapy.downloadermiddlewares.httpcompression',
        'scrapy.downloadermiddlewares.cookies', 'scrapy.downloadermiddlewares.httpproxy',
        'scrapy.downloadermiddlewares.stats', 'scrapy.downloadermiddlewares.httpcache',
        'scrapy.spidermiddlewares', 'scrapy.spidermiddlewares.httperror',
        'scrapy.spidermiddlewares.offsite', 'scrapy.spidermiddlewares.referer',
        'scrapy.spidermiddlewares.urllength', 'scrapy.spidermiddlewares.depth',
        'scrapy.pipelines', 'scrapy.dupefilters', 'queuelib', 'scrapy.squeues',
    ]

That fixed 45 module import issues. With --onefile, the Scrapy project runs as a single executable.
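
For reference, the hiddenimports list lives in the Analysis block of the .spec file that PyInstaller generates. A trimmed fragment, not a complete spec (myspider.py is a placeholder, and the list is abbreviated here):

    # fragment of the generated .spec file; Analysis is provided by PyInstaller
    a = Analysis(
        ['myspider.py'],                       # placeholder entry script
        hiddenimports=['scrapy.spiderloader',  # abbreviated; paste the full list above
                       'scrapy.squeues'],
    )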

Arun Augustine

Make your Scrapy spider a self-contained Python script by following the updated documentation.

Then follow the usual PyInstaller command to make the executable (make sure you run it from inside your Scrapy project):

    pyinstaller --onefile filename.py
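
Here, filename.py would be a self-contained script along these lines (a sketch; the spider name and URL are illustrative):

    # spider and entry point in one file, so PyInstaller sees everything
    import scrapy
    from scrapy.crawler import CrawlerProcess

    class QuotesSpider(scrapy.Spider):
        name = 'quotes'
        start_urls = ['https://quotes.toscrape.com']

        def parse(self, response):
            for quote in response.css('div.quote'):
                yield {'text': quote.css('span.text::text').get()}

    if __name__ == '__main__':
        process = CrawlerProcess()
        process.crawl(QuotesSpider)
        process.start()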