Questions tagged [splash-js-render]

Splash JS is a JavaScript rendering service. It’s a lightweight web browser with an HTTP API, implemented in Python using Twisted and Qt. It is often used as an alternative to Selenium.

https://splash.readthedocs.io/en/stable/

Splash - A JavaScript rendering service

Splash is a JavaScript rendering service. It’s a lightweight web browser with an HTTP API, implemented in Python using Twisted and Qt. The (Twisted) Qt reactor is used to make the server fully asynchronous, which lets it take advantage of WebKit concurrency via the Qt main loop. Some of Splash's features:

process multiple webpages in parallel;
get HTML results and/or take screenshots;
turn off images or use Adblock Plus rules to make rendering faster;
execute custom JavaScript in page context;
write Lua browsing scripts;
develop Splash Lua scripts in Splash-Jupyter Notebooks;
get detailed rendering info in HAR format.
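
As a rough illustration of the HTTP API mentioned above, the sketch below builds a request URL for Splash's `render.html` endpoint with image loading turned off. It assumes a Splash instance on its default port 8050 on localhost; the helper name `build_render_url` and the chosen defaults are hypothetical, and the code only constructs the URL without contacting a server.

```python
from urllib.parse import urlencode

# Assumption: a Splash instance is listening on localhost, default port 8050.
SPLASH_BASE = "http://localhost:8050"


def build_render_url(page_url, wait=0.5, load_images=False, base=SPLASH_BASE):
    """Build a GET URL for Splash's render.html endpoint (hypothetical helper)."""
    params = {
        "url": page_url,                    # page to render
        "wait": wait,                       # seconds to wait after the page loads
        "images": 1 if load_images else 0,  # 0 skips image downloads
    }
    return f"{base}/render.html?{urlencode(params)}"


if __name__ == "__main__":
    # Only prints the URL; fetching it requires a running Splash instance.
    print(build_render_url("http://quotes.toscrape.com/js/"))
```

Fetching the resulting URL with any HTTP client should return the rendered HTML, provided a Splash instance is actually running at the assumed address.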

138 questions

votes

1 answer

Scrapy does not fetch markup on response.css

I've built a simple scrapy spider running on scrapinghub: class ExtractionSpider(scrapy.Spider): name = "extraction" allowed_domains = ['domain'] start_urls = ['http://somedomainstart'] user_agent = "Mozilla/5.0 (Windows NT 10.0;…

asked Aug 27 '19 at 15:37

qubits

1,227
3
20
50

votes

0 answers

FileNotFoundError: [Errno 2] after pushing splash to heroku

I'm trying to deploy the latest scrapinghub/splash. I am using git-bash on Windows 10. I forked the repo to https://github.com/kc1/splash/blob/master and I have been trying to follow "Using docker, scrapy splash on Heroku" to modify the Dockerfile. After…

linux docker heroku splash-js-render

asked May 31 '19 at 14:05

user1592380

34,265
92
284
515

votes

2 answers

Storing responses as files using Scrapy Splash

I'm creating my first Scrapy project with Splash, working with the test data from http://quotes.toscrape.com/js/. I want to store the quotes of each page as a separate file on disk (in the code below I first try to store the entire page). I have the…

python web-scraping scrapy scrapy-splash splash-js-render

asked Oct 14 '20 at 10:02

Adam

6,041
36
120
208

votes

2 answers

Getting a response body with scrapy splash

I'm working with scrapy 1.6 and splash 3.2. I have: import scrapy import random from scrapy_splash import SplashRequest from scrapy.utils.response import open_in_browser from scrapy.linkextractors import LinkExtractor USER_AGENT = 'Mozilla/5.0…

python scrapy scrapy-splash splash-js-render

asked Jun 24 '19 at 21:21

user1592380

34,265
92
284
515

votes

1 answer

Click Button in Scrapy-Splash

I am writing a scrapy-splash program and I need to click on the display button on the webpage, as seen in the image below, in order to display the data, for 10th edition, so I can scrape it. I have the code I tried below but it does not work. The…

python scrapy scrapy-splash splash-js-render

asked Jun 21 '19 at 15:22

Tim

votes

1 answer

Scrapy with Splash doesn't wait for website to load

I am trying to render and scrape an interactive website by invoking Splash through the Python script, basically following this tutorial: import scrapy from scrapy_splash import SplashRequest class MySpider(scrapy.Spider): start_urls =…

python scrapy scrapy-splash splash-js-render

asked Aug 11 '18 at 19:35

Zed

5,683
11
49
81

votes

0 answers

Scrapy + Splash returns a lot of 504 Time Out errors

I have followed Splash's FAQ for production setups and my system currently looks like this: 1 Scrapy container with 6 concurrent requests; 1 HAProxy container that load-balances to the Splash containers; 2 Splash containers with 3 slots each. I use…

amazon-web-services scrapy scrapy-splash splash-js-render

asked Jul 03 '18 at 10:38

Marcus Lind

10,374
7
58
112

votes

2 answers

Scrapy Splash click button doesn't work

What I'm trying to do: On avito.ru (a Russian real estate site), a person's phone number is hidden until you click on it. I want to collect the phone number using Scrapy+Splash. Example URL:…

python scrapy splash-js-render

asked Mar 14 '18 at 11:19

alexanderlukanin13

4,577
26
29

votes

0 answers

Having content security policy issue with scrapy and splash

What I am doing is: (1) Google for some LinkedIn-specific links; (2) log in to linkedin.com (successful); (3) revisit the home page (it fails here); (4) extract some of the desired info from the links I googled in the first step. My Scrapy bot fails at step 3. So my questions…

scrapy scrapy-splash splash-js-render

asked Dec 29 '17 at 06:35

sakhunzai

13,900
23
98
159

votes

1 answer

How to get cookie generated from a Scrapy Splash request?

So I have made a Scrapy Splash request like this: def start_requests(self): lua_script = ''' function main(splash) local url = splash.args.url assert(splash:go(url)) assert(splash:wait(0.5)) return { cookies =…

python lua scrapy scrapy-splash splash-js-render

asked Oct 24 '17 at 00:31

Aminah Nuraini

18,120
8
90
108

votes

1 answer

Scrapy Splash is always returning the same page

For each of several Disqus users, whose profile URLs are known in advance, I want to scrape their names and the usernames of their followers. I'm using Scrapy and Splash to do so. However, when I'm parsing the responses, it seems that it is always…

python web-scraping scrapy scrapy-splash splash-js-render

asked Aug 07 '17 at 21:20

Milos

votes

2 answers

Execute inline JavaScript in Scrapy response

I am trying to log into a website with Scrapy, but the response received is an HTML document containing only inline JavaScript. The JS redirects to the page I want to scrape data from. But Scrapy does not execute the JS and therefore doesn't route…

javascript python scrapy scrapy-splash splash-js-render

asked Jun 22 '17 at 10:10

Craig

votes

3 answers

How to scrape AJAX based websites by using Scrapy and Splash?

I want to make a general scraper which can crawl and scrape all data from any type of website, including AJAX websites. I have extensively searched the internet but could not find any proper link that explains how Scrapy and Splash together…

javascript ajax scrapy scrapy-splash splash-js-render

asked Jun 08 '17 at 12:43

Rohan

votes

1 answer

Proxy servers with Scrapy-Splash

I am trying to get proxy servers to work on my local splash instance. I have read several documents, but have not found any workable examples. It was brought to my attention that this https://github.com/scrapy-plugins/scrapy-splash/issues/107 was…

python web-scraping scrapy scrapy-splash splash-js-render

asked Mar 29 '17 at 10:01

eusid

votes

1 answer

Scrapy-Splash with Tor

I have succeeded in running Scrapy with Tor using this link: http://pkmishra.github.io/blog/2013/03/18/how-to-run-scrapy-with-TOR-and-multiple-browser-agents-part-1-mac/ But I couldn't run Splash with Tor. In Scrapy-settings.py I directed to polipo for…

scrapy tor scrapy-splash splash-js-render polipo

asked Feb 16 '17 at 07:46

Remzi Meric Ceylan
