Questions tagged [scrapy-settings]

9 questions
2
votes
3 answers

Scrapy: How to access the custom, CLI passed settings from the __init__() method of a spider class?

I need to access custom settings passed on the CLI with -s SETTING_NAME="SETTING_VAL" from the __init__() method of the spider class. get_project_settings() gives me access only to the static settings. The docs explain how you can access…
Nikolay Shindarov
2
votes
0 answers

Connect Scrapy crawler with S3

My crawler downloads a Request.body from a URL, which I save to a local file. Now I would like to connect to my AWS S3. I read the documentation but face two issues: 1. the config as well as the credential files are not of a dict type? my file is…
Freddy
1
vote
2 answers

Unable to deploy Scrapy to scrapyd server

I am trying to deploy my Scrapy project, which is connected to a Django project, to scrapyd, but when I ran scrapyd-deploy JD -p JDSpider, it failed with: No module named GradutionProject. It seems scrapyd cannot detect "GradutionProject.settings" in…
Zheyuuu
1
vote
2 answers

How to mark a scrape that failed because of a 503 as an error in Scrapy?

I get status 503 when I crawl. The request is retried, but then it gets ignored. I want it to be marked as an error, not ignored. How can I do that? I would prefer to set it in settings.py so it applies to all of my spiders. handle_httpstatus_list seems like it will…
Aminah Nuraini
1
vote
1 answer

Scrapy - How to get duplicate request referer

When I turn on DUPEFILTER_DEBUG, I get: 2016-09-21 01:48:29 [scrapy] DEBUG: Filtered duplicate request: <GET http://www.example.org/example.html> The problem is, I need to know the duplicate request's referrer to debug the code. How can I debug the…
Aminah Nuraini
0
votes
1 answer

AttributeError: module 'OpenSSL.SSL' has no attribute 'TLS_METHOD'

When trying to import scrapy in my Jupyter Notebooks via Anaconda (Windows), I get this error, which I haven't been able to solve. I'm working with Python 3. What I have done so far: pip install Scrapy, pip install pyopenssl, then import scrapy, and I get the…
0
votes
1 answer

How can I create a log file with the spider name in settings.py dynamically?

I have 20 different spiders that run on a schedule. At the end of the day, when I check the log file, it has over 15,000 lines. My current log settings in settings.py: from datetime import datetime now = datetime.today() now_time =…
Murat Demir
0
votes
1 answer

scrapy - error in settings that does not allow me to do anything with scrapy

I accidentally changed something in my Scrapy settings (I was trying to debug a spider by creating a runner.py file), and now I can't do anything with Scrapy. This is the error I get after running any Scrapy-related command in the command…
sophocles
0
votes
2 answers

How to create JOBDIR settings in Scrapy Spider dynamically?

I want to create the JOBDIR setting from the Spider __init__, or dynamically when I call that spider. I want to create a different JOBDIR for different spiders, like FEED_URI in the example below: class QtsSpider(scrapy.Spider): name = 'qts' …
Laxman