Highest Voted 'scrapinghub' Questions

2

votes

1 answer

ScrapingHub and remote database

I'm creating a spider with Scrapy, and I want to read its start_urls from a MySQL database. Is it possible to connect Scrapy Cloud to a remote database?

mysql scrapy scrapinghub

asked Jul 20 '15 at 12:25

gueyebaba

51
4

2

votes

1 answer

Add settings in scrapinghub spider

I'm trying to enable MongoDB in my spider on the Scrapinghub platform. To do this, I have to enable the extension via the "EXTENSIONS" setting in the UI. But when I run the spider, I get the following error: ValueError: Some paths in…

mongodb scrapy scrapinghub

asked Jun 23 '15 at 14:18

user3295878

831
1
6
19

2

votes

1 answer

Delete spiders from scrapinghub

I am a new user of Scrapinghub. I have searched on Google and read the Scrapinghub docs, but I could not find any information about removing spiders from a project. Is it possible, and if so, how? I do not want to replace a spider, I want to…

web-crawler scrapy scrapinghub

asked May 04 '15 at 10:01

Inês Martins

530
2
10
23

2

votes

1 answer

Achieving Next page through javascript in scrapy python with splash?

My intention is to follow the Next link defined as "href="javascript:submitAction_win0(document.win0,'HRS_APPL_WRK_HRS_LST_NEXT')". As an example, I am taking [this url][1]. As you can see, this URL has a Next link at the end of the page, so if…

javascript python scrapy scrapinghub

asked Nov 20 '14 at 07:55

user4273328

1

vote

1 answer

Logitech Gaming Software LUA script not working on GHub

I’ve always used this Lua script in Logitech Gaming Software with a G502 mouse. I had to replace my old mouse and bought the newer “G502x”, which is not recognized by LGS, so I had to install GHub to make the mouse usable, but the script is…

logitech-gaming-software scrapinghub lua-scripting-library

asked Aug 29 '23 at 02:55

Loriner De Syrtis

11
1

1

vote

1 answer

Webscraping yml files from Github

I am trying to scrape certain open source files from GitHub, but I'm having an issue with their new format. This is an example link: https://github.com/xavierLowmiller/xcodegen-action/blob/main/action.yml that leads to a YML file. I am trying to…

web-scraping beautifulsoup scrapinghub

asked Jul 13 '23 at 23:03

Artemis

15
3

1

vote

0 answers

Web scraping using Octoparse

I have been trying to use Octoparse to scrape data from a particular webpage. It has a total of 361 pages and 10 data rows on each page (total of 3610 data points). However, what I get is only 3260 data points. Normally the process works fine and…

web-scraping web-crawler scrapinghub ironwebscraper

asked Nov 03 '22 at 17:37

Anthony Nguyen

27
5

1

vote

1 answer

Crawlera/Zyte proxy authentication using C# and Selenium

I've tried a number of ways of using Zyte (formerly Crawlera) proxies with Selenium. They provide 1- API key (username) 2- Proxy url/port. No password is needed. What I have tried... ChromeOptions options = new ChromeOptions(); var proxy =…

c# selenium selenium-chromedriver scrapinghub

asked Feb 19 '21 at 01:55

MattHodson

736
7
22

1

vote

1 answer

Not able to scrape image URLs using beautiful soup and python

So basically I am using the below code to scrape the image urls of the credit cards from the respective links in the explore_more_url variable. from urllib.request import urlopen from bs4 import BeautifulSoup import json, requests, re from selenium…

python web-scraping beautifulsoup python-requests scrapinghub

asked Feb 17 '21 at 13:14

user15215612

1

vote

1 answer

How can I scrape the image using Beautiful Soup and python

I am trying to scrape the image link from the link below, but I am not able to. Link: https://www.online.citibank.co.in/credit-card/rewards/citi-rewards-credit-card?eOfferCode=INCCCCTWAFCTRELM I have used the code below: x = '…

python web-scraping beautifulsoup python-requests scrapinghub

asked Feb 11 '21 at 10:30

Ali Baba

85
11

1

vote

2 answers

Trying to scrape image urls but not able to get it using beautiful soup and python

I am scraping this link :…

python web-scraping beautifulsoup python-requests scrapinghub

asked Feb 11 '21 at 04:44

Ali Baba

85
11

1

vote

0 answers

I am using scrapy to scrape data from Yelp. I cannot see any error but data is not getting scraped from the StartURLs mentioned in the spider

The code for items.py and the other files is given below, and the logs are included at the end. I am not getting any error, but according to the logs Scrapy has not scraped any pages. ``` import scrapy class YelpItem(scrapy.Item): #…

web-scraping scrapy scrapy-pipeline scrapinghub

asked Oct 10 '20 at 06:21

sneha s

11
1

1

vote

1 answer

How to iterate through a list of Beautiful Soup tag elements and get a particular text if found else an empty string?

Case1:

Derattizzazione Disinfestazione Punteruolo Rosso - Quark Srl

python-3.x web-scraping beautifulsoup scrapy scrapinghub

asked Jun 27 '20 at 19:03

dashkandhar

83
1
7

1

vote

0 answers

504 Timeout Exception when using scrapy-splash with crawlera

I tried scrapy-splash with http://www.google.com and followed all the prerequisite steps given in the following GitHub repo https://github.com/scrapy-plugins/scrapy-splash and I was able to render the Google page. However, when I tried the same…

python scrapy scrapy-splash scrapinghub crawlera

asked May 26 '20 at 09:36

Shashikiran Neelakantaiah

77
6

1

vote

1 answer

ScrapingHub Deploy Fails

I am trying to deploy to ScrapingHub and here is the error I am getting... Deploy log last 30 lines: File "/app/python/lib/python3.8/site-packages/scrapy/cmdline.py", line 142, in execute cmd.crawler_process = CrawlerProcess(settings) File…

scrapy scrapinghub

asked May 11 '20 at 13:22

johncsmith427

83
8

