Questions tagged [pyspider]

A powerful Python-based spider (web crawler) system

Used for

  • Write scripts in Python with a powerful API
  • Powerful WebUI with script editor, task monitor, project manager and result viewer
  • MySQL, MongoDB, or SQLite as the database backend
  • JavaScript pages supported!
  • Task priority, retry, periodic crawling, and recrawl by age or by marks in the index page (such as update time)
  • Distributed architecture
38 questions
1
vote
0 answers

Spider program Python AttributeError: Object has no attribute

I just started learning Python, and I want to write a spider program to fetch some jokes from the web. When I run the program, it raises: AttributeError: 'Spider_Model' object has no attribute 'pages'. The solutions I've found online do not work. Here is…
卫俊杰
  • 11
  • 1
1
vote
1 answer

Extract text from 200k domains with scrapy

My problem is: I want to extract all valuable text from a domain, for example www.example.com. So I go to the website, visit all the links up to a maximum depth of 2, and write the text to a CSV file. I wrote a module in Scrapy which solves this problem…
sacherus
  • 1,614
  • 2
  • 20
  • 27
0
votes
0 answers

Unable to run linear regression

I am running a linear regression, but I get an error that I can't fix. Please help me with this error. Thank you so much. import pandas as pd import matplotlib.pyplot as plt from sklearn import linear_model data =…
0
votes
0 answers

Why do I fail to submit data to a textarea with Python requests.post()?

I want to use requests.post to automatically query domain name attribution on this website, but the return value is always empty. I guess it is because the POST method failed to transfer the data to the textarea. url =…
0
votes
0 answers

Why can't I run pyspider on macOS? My environment is Python 3.8

(demo_pyspider_1) x###:demo_pyspider_1 x###$ pyspider all Traceback (most recent call last): pa"/Volumes/D/anaconda3/anaconda3/envs/demo_pyspider_1/lib/python3.8/multiprocessing/reduction.py", line 60, in dump ForkingPickler(file,…
0
votes
0 answers

Why is the pyspider module failing with "'collections' has no attribute 'MutableMapping'"?

After installing pyspider in PyCharm, I enter "pyspider all" on the command line and an error is reported. C: C:\Users\admin\AppData\Local\Programs\Python\Python310\lib\site-packages\pyspider\libs\utils.py:196: FutureWarning: timeout is not supported on your…
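For context on the error in the title: the abstract base classes moved to collections.abc in Python 3.3, and the old aliases such as collections.MutableMapping were removed in Python 3.10, which matches the Python310 path in the excerpt. The sketch below is not taken from pyspider; CaseInsensitiveDict is a hypothetical class used only to show the import location that works on current Python versions.

```python
import collections.abc


# Hypothetical example class, not part of pyspider.
# It subclasses collections.abc.MutableMapping, the location that
# works on Python 3.3+ (collections.MutableMapping is gone in 3.10+).
class CaseInsensitiveDict(collections.abc.MutableMapping):
    def __init__(self, data=None):
        # Maps lowercased key -> (original key, value).
        self._store = {}
        if data:
            self.update(data)

    def __setitem__(self, key, value):
        self._store[key.lower()] = (key, value)

    def __getitem__(self, key):
        return self._store[key.lower()][1]

    def __delitem__(self, key):
        del self._store[key.lower()]

    def __iter__(self):
        # Yield keys in their original casing.
        return (original for original, _ in self._store.values())

    def __len__(self):
        return len(self._store)


headers = CaseInsensitiveDict({"Content-Type": "text/html"})
print(headers["content-type"])  # prints "text/html"
```

Code written against the old name would need its references changed from collections.MutableMapping to collections.abc.MutableMapping; whether that is the only change an older package needs is not something this sketch can confirm.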
0
votes
0 answers

Scrapy Python XPath and CSS + Splash: returns an empty list

I want to scrape the names of products on this web page, but I get an empty list. I used Splash for the dynamic web page, but the result is the same. Can someone tell me what to do? Is there any other solution? This is the web page that I want to scrape:…
Devpy
  • 1
  • 4
0
votes
1 answer

How to index the full catalogue of Netflix, Hotstar and other OTT platforms

I am working on a project wherein I need to catalog all the movie and TV show titles from major OTT platforms such as Netflix, Hotstar, Hulu, and such. The metadata collected would be title name, genre, release date, and available on. Further, any…
Atul
  • 5
  • 1
  • 3
0
votes
1 answer

Why am I getting this error while installing: pip install pyobjc-framework-Quartz

Why am I getting this error? ERROR: Command errored out with exit status 1: command: 'C:\Users\kodal\anaconda3\python.exe' -c 'import sys, setuptools, tokenize; sys.argv[0] =…
0
votes
1 answer

How to set the number of simultaneous requests in pyspider

I'm trying to use the pyspider crawler to scan my site. I would like one request to be made every 2 seconds, but currently 3 requests are made at the same time, and I could not find the setting to change this parameter. I found in…
Overflow992
  • 31
  • 2
  • 5
0
votes
1 answer

Pyspider console: phantomjs not found, continue running without it

I am trying to start a scraping project with Pyspider. I installed the required libraries: Pyspider, PhantomJS, Tornado, WsgiDAV (the required version 2.4), and jsmin. After installation I got this error: File…
0
votes
0 answers

Does PySpider have a click or set-value API for scraping?

While scraping, we often need to set some value on the target site and then click search to get more results.
0
votes
1 answer

Trouble writing Scrapy selector

Very new to Python. I'm trying to explore the possibility of importing a long-developed project from another language, and a buddy swears that Python is my answer. I have the IDE up and running, Scrapy working properly and properly kicking the 'name' and…
Ben
  • 3
  • 2
0
votes
1 answer

Why does using the BeautifulSoup find_all method result in an error (list index out of range)?

The HTML looks like this:
...
  • title1 subtitle1
  • title2 subtitle2
WUSO01
  • 23
  • 8
0
votes
0 answers

I am trying to run scrapy crawl and getting this error "ModuleNotFoundError: No module named 'win32api'"

I am trying to run the scrapy crawl command in Python 3.6 and I get this error: ModuleNotFoundError: No module named 'win32api'. I tried to use pip install win32api. It says "Could not find a version that satisfies the requirement win32api (from…