I have a Scrapy project (hereafter called the_application) that depends on a library (hereafter called the_library) fetched from a git repository. Every time I attempt to deploy the Scrapy project by running scrapyd-deploy --include-dependencies, the command fails with the following error:

'build/lib' does not exist -- can't clean it
'build/bdist.linux-x86_64' does not exist -- can't clean it
'build/scripts-3.10' does not exist -- can't clean it
  Running command git clone --filter=blob:none --quiet 'ssh://****@github.com/hrafnthor/my-library.git' /tmp/pip-install-no2qcx7z/my_library_19b012bf7ea747e9b2efaf39df47abac
  WARNING: Generating metadata for package my_library produced metadata for project name unknown. Fix your #egg=my_library fragments.
ERROR: Could not find a version that satisfies the requirement my_library (unavailable) (from versions: none)
ERROR: No matching distribution found for my_library (unavailable)
Traceback (most recent call last):
  File "/home/hrafn/Documents/dev/python/the_application/scraper/setup.py", line 5, in <module>
    setup(
  File "/home/hrafn/.local/lib/python3.10/site-packages/setuptools/__init__.py", line 108, in setup
    return distutils.core.setup(**attrs)
  File "/home/hrafn/.local/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 185, in setup
    return run_commands(dist)
  File "/home/hrafn/.local/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
    dist.run_commands()
  File "/home/hrafn/.local/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
    self.run_command(cmd)
  File "/home/hrafn/.local/lib/python3.10/site-packages/setuptools/dist.py", line 1221, in run_command
    super().run_command(command)
  File "/home/hrafn/.local/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
    cmd_obj.run()
  File "/home/hrafn/.local/lib/python3.10/site-packages/uberegg.py", line 35, in run
    self._install(self.requirements, self.bdist_dir)
  File "/home/hrafn/.local/lib/python3.10/site-packages/uberegg.py", line 45, in _install
    return subprocess.check_call(
  File "/usr/lib/python3.10/subprocess.py", line 369, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', '-m', 'pip', 'install', '-U', '-t', 'build/bdist.linux-x86_64/egg', '-r', 'requirements.txt']' returned non-zero exit status 1.

From the stack trace it is clear that there is an issue with how the the_library dependency is defined. However, I am not certain how to solve it.

the_library itself uses a pyproject.toml file, with the following contents:

[project]
name = "the_library"
version = "0.0.1"
description = ""
readme = "README.md"
requires-python = ">=3.7"
dependencies = []

[tool.setuptools]
package-dir = {"" = "src"}

[tool.setuptools.packages.find]
where = ["src"]

[build-system]
requires = ["setuptools", "wheel"]
build-backend = "setuptools.build_meta"

the_application is a straightforward Scrapy project generated with scrapy startproject the_application, and it has a requirements.txt file with the following relevant content:

# omitting other dependencies

the_library @ git+ssh://git@github.com/hrafnthor/the_library.git@main
    # via -r requirements.in

# omitting other dependencies

The requirements.txt is generated via pip-compile requirements.in, where the contents of requirements.in look like this:

git+ssh://git@github.com/hrafnthor/the_library.git@main
scrapy

I have tried appending #egg=the_library to the dependency definition inside requirements.in, without success. The issue always seems to be that the version information is missing when eggifying.
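
That is, the dependency line in requirements.in became:

git+ssh://git@github.com/hrafnthor/the_library.git@main#egg=the_library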

Update 1:

I looked further into this, starting with what the uberegg library is doing. I'm not sure exactly why it is a dependency of scrapyd, as it just seems to wrap setuptools along with some file content iteration and logging. It seems like an unnecessary complication.

Anyway, the issue arises from line 45 in the uberegg.py script, where the call to subprocess.check_call() raises the error.

Strangely, though, no such issue is raised if the offending command referenced in the stack trace, python3 -m pip install -U -t build/bdist.linux-x86_64/egg -r requirements.txt, is run directly.

So from all this it would seem the issue isn't directly related to scrapyd, unless it is related to the continued use of eggs (I thought those were deprecated more than a decade ago?).
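
For anyone wanting to reproduce the comparison, the call uberegg makes can be isolated in a small script like this (a sketch based on the traceback above; the interpreter path and working directory are assumptions, and it should be run from the directory containing requirements.txt):

import subprocess

# Reproduce the exact command from the traceback:
# /usr/bin/python3 -m pip install -U -t build/bdist.linux-x86_64/egg -r requirements.txt
subprocess.check_call(
    [
        "/usr/bin/python3", "-m", "pip", "install",
        "-U",
        "-t", "build/bdist.linux-x86_64/egg",
        "-r", "requirements.txt",
    ]
)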

  • setuptools deprecated the use of eggs. Not sure if that is the issue here but seems like it's time for scrapyd to get an update. – Alexander Mar 29 '23 at 23:04
  • @Alexander I was starting to worry that the issue wasn't on my end. – Hrafn Mar 30 '23 at 08:15
  • Turns out that this was entirely my own fault: the library name as defined in pyproject.toml differed from the name used in requirements.txt. Somehow pip was still able to install the dependency and use it when running the scraper locally, but packaging via `scrapyd-deploy` caused it to fail. – Hrafn Mar 30 '23 at 22:13
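
For illustration, using the anonymized names from this question (so treat the exact identifiers as a sketch): the library's pyproject.toml declared

name = "the_library"

while the requirement referred to the project under a different name, along the lines of

my_library @ git+ssh://git@github.com/hrafnthor/the_library.git@main

Making the requirement name match the name declared in the [project] table resolves the mismatch that the "Fix your #egg=... fragments" warning points at:

the_library @ git+ssh://git@github.com/hrafnthor/the_library.git@main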
