Questions tagged [pyspider]

A powerful Python-based spider (web crawler) system

Used for

Write scripts in Python with a powerful API
Powerful WebUI with a script editor, task monitor, project manager, and result viewer
MySQL, MongoDB, or SQLite as the database backend
JavaScript pages supported
Task priority, retry, periodic crawling, and recrawl by age or by marks on the index page (such as update time)
Distributed architecture
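To make the "task priority" and "recrawl by age" items above more concrete, here is a minimal standard-library sketch of the idea. It is not pyspider's scheduler or API: the class `RecrawlScheduler`, its methods, and the example URLs are all hypothetical names chosen for illustration, and the sketch assumes a dropped task is simply re-added later by the caller.

```python
import heapq
import time


class RecrawlScheduler:
    """Toy scheduler (hypothetical, not pyspider code).

    Higher priority is served first; a URL fetched less than `age`
    seconds ago is treated as still fresh and is dropped.
    """

    def __init__(self):
        self._queue = []         # heap of (-priority, sequence, url, age)
        self._last_fetched = {}  # url -> timestamp of the last fetch
        self._sequence = 0       # tie-breaker: equal priorities stay FIFO

    def add(self, url, priority=0, age=0):
        """Queue a URL; `age` is how long a fetched copy stays valid (seconds)."""
        heapq.heappush(self._queue, (-priority, self._sequence, url, age))
        self._sequence += 1

    def mark_fetched(self, url, now=None):
        """Record that the URL was fetched at time `now`."""
        self._last_fetched[url] = time.time() if now is None else now

    def next_due(self, now=None):
        """Pop and return the next URL that needs crawling, or None."""
        now = time.time() if now is None else now
        while self._queue:
            _, _, url, age = heapq.heappop(self._queue)
            last = self._last_fetched.get(url)
            if last is None or now - last >= age:
                return url
            # Otherwise the stored copy is still fresh, so the task is dropped.
        return None


scheduler = RecrawlScheduler()
scheduler.add("http://example.com/index", priority=1, age=60)
scheduler.add("http://example.com/detail", priority=5, age=60)
first = scheduler.next_due(now=0)  # the higher-priority detail page
```

Passing `now` explicitly keeps the example deterministic; in real use it would default to the current time.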

38 questions

votes

1 answer

Why is this code only downloading one page's data?

I have tried many times, but it does not work: import requests from lxml import html, etree from selenium import webdriver import time, json #how many page do you want to scan page_numnotint = input("how many page do you want to scan") page_num =…

python python-3.x pyspider

asked May 07 '17 at 12:41

周义翔

votes

1 answer

Pyspider installation for Python 3.5/Win 64: "Failed building wheel for lxml"

I'm trying to install pyspider and always get "Failed building wheel for lxml...". It looks like lxml is not installed properly, and I've tried to download lxml-3.6.1-cp35-cp35m-win_amd64.whl from…

python lxml python-wheel pyspider

asked Aug 05 '16 at 07:32

Li Yin

votes

1 answer

pyspider : No module named 'wsgidav'

I am using Python 3.5.2 on Windows 10. I installed pyspider and ran pyspider all, and got the following errors. What should I do?

python-3.x pyspider wsgidav

asked Jun 29 '16 at 10:11

zwl1619

votes

1 answer

I want to store the output of a Python pyspider script to CSV or JSON

Here is the code I wrote: import json from pyspider.libs.base_handler import * f = open("demo.txt","w") class Handler(BaseHandler): crawl_config = { } @every(minutes=0,seconds = 0) def on_start(self): self.crawl('Any URL',…

python json csv pyspider

asked Jun 28 '16 at 07:02

Piyush

votes

0 answers

How do Scrapy and pyspider send requests to a web server?

I am learning two crawler frameworks, Scrapy and pyspider, and I am curious about how they send requests to a web server. Do they use the Python module requests, or the built-in module urllib? Any advice is helpful. Thank you.

python scrapy pyspider

asked May 17 '16 at 03:41

gaoxinge

-1

votes

1 answer

How to hide continuous hits (refreshes) to a website

I have developed Python (Requests) and Java code to scrape data from a website. It works by continuously refreshing the site for new data. But the website recently identified my scraper as an automated service and my account has been locked…

web-scraping python-requests scrapy pyspider

asked Jun 11 '18 at 23:16

sam mathew

-2

votes

1 answer

Why does the XPath output keep changing?

The problem I am facing is weird and is wasting a lot of time. In theory this should return the link: next_page = response.xpath('//ul[@class="pagination justify-content-center"]/li[6]/a/@href').get() but the output I got is…

python-3.x xpath scrapy web-crawler pyspider

asked Feb 20 '23 at 18:05

user21195292

-6

votes

2 answers

Imported but unused in python

import numpy as np import matplotlib.pyplot as plt import pandas as pd. The console is showing some warnings. Can anyone help me with this?

python pyspider

asked Aug 08 '19 at 09:45

nag6631
