2

I'm trying to use scrapy to deploy my crawler project to a scrapyd instance, but running the command returns the following error:

Server response (200): {"status": "error", "message": "AttributeError: 'NoneType' object has no attribute 'module_name'"}

Here's my setup.py to build the python egg submitted during deploy:

from setuptools import setup, find_packages

setup(
    name = 'mycrawler',
    version = '0.1',
    packages = find_packages(),
    install_requires = [
        'scrapy',
        'PyMongo',
        'simplejson',
        'queue'
    ]
)

My scrapy.cfg:

[settings]
default = mycrawler.settings

[deploy:scrapyd_home_vm]
url = http://192.168.1.2:6800/
project = mycrawler

[deploy:scrapyd_local_vm]
url = http://192.168.38.131:6800/
project = mycrawler

I get the feeling that this has to do with the way the egg is being built, but I'm not sure. I know that Python throws an error like this when you access an attribute on something that should be an object but is actually None. I also don't have anything with a "module_name" attribute, or anything that tries to reference it, in my own code. Running the crawler from scrapy locally works just fine, but deploying the egg does not.

The.Anti.9

3 Answers

9

A rather late answer but I came across this same issue and found the solution.

The problem for me could be found by looking at the traceback emanating from scrapyd itself:

Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/home/vagrant/venv/lib/python2.7/site-packages/scrapyd/runner.py", line 39, in <module>
    main()
  File "/home/vagrant/venv/lib/python2.7/site-packages/scrapyd/runner.py", line 34, in main
    with project_environment(project):
  File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/home/vagrant/venv/lib/python2.7/site-packages/scrapyd/runner.py", line 22, in project_environment
    activate_egg(eggpath)
  File "/home/vagrant/venv/local/lib/python2.7/site-packages/scrapyd/eggutils.py", line 13, in activate_egg
    settings_module = d.get_entry_info('scrapy', 'settings').module_name

AttributeError: 'NoneType' object has no attribute 'module_name'

As you can see, it is trying to load the Scrapy project's settings module. The module_name lookup fails because d.get_entry_info returns None, which happens when the egg declares no 'settings' entry point in the 'scrapy' group.

The solution is to check the setup.py used to generate the egg and make sure the call to setup contains the following lines:

packages=find_packages(),
entry_points={'scrapy': ['settings = scraper.settings']},

Here, scraper.settings is the Python module path to the Scrapy project's settings file. Change it to match your project layout and you should be fine.

If not, the key is to check the output from scrapyd (enabling debug lets you see it in the server response) to find the cause.

Darian Moody
  • Wouldn't that require bundling the whole project with scrapyd each time you want to deploy? If deploying scrapy from another location that's not the projects root folder it won't find the project, i.e. when deploying an egg to it. You would have to copy the whole folder with all the project files. – Computer's Guy Sep 21 '19 at 11:09
0

Are you using the scrapyd-client package? If so, you don't even need a setup.py. I came across that AttributeError because I already had a setup.py, so I deleted it.

-1

This is a coding error, probably in your mycrawler module:

AttributeError: 'NoneType' object has no attribute 'module_name'

This means you are trying to access the attribute module_name on an object that was returned by some function or method, but the return value was None rather than an object (probably the way the function or method indicates that an error occurred).

Check your code for places where you reference module_name on a returned value.

Or it could be that scrapy requires that one of the objects you define and pass to it has to have the module_name attribute defined and you've forgotten to do so.

Finally, it could be a bug in scrapy.

But it is very unlikely to be a problem with setuptools.

isedev
  • I'm aware of what actually causes the error from a Python perspective; however, it's not in my code. Nowhere do I reference any attribute 'module_name'. And also, as stated at the end of the question, I have run it using just the scrapy command locally and it works just fine. – The.Anti.9 Jan 31 '13 at 20:56
  • fair enough... my bad for assuming you hadn't understood the error message. – isedev Jan 31 '13 at 20:59
  • For reference, just checked setuptools source... there's no reference in there to 'module_name', so at least my last statement should hold true :) – isedev Jan 31 '13 at 21:01