Questions tagged [splash-js-render]

Splash is a JavaScript rendering service. It’s a lightweight web browser with an HTTP API, implemented in Python using Twisted and Qt. It is often used as an alternative to Selenium.

https://splash.readthedocs.io/en/stable/

Splash - A JavaScript rendering service

Splash is a JavaScript rendering service. It’s a lightweight web browser with an HTTP API, implemented in Python using Twisted and Qt. The (Twisted) Qt reactor is used to make the server fully asynchronous, which lets it take advantage of WebKit concurrency via the Qt main loop. Some of Splash’s features:

  • process multiple webpages in parallel;
  • get HTML results and/or take screenshots;
  • turn OFF images or use Adblock Plus rules to make rendering faster;
  • execute custom JavaScript in page context;
  • write Lua browsing scripts;
  • develop Splash Lua scripts in Splash-Jupyter Notebooks;
  • get detailed rendering info in HAR format.
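As a rough illustration of the HTTP API mentioned above, the sketch below only builds a request URL for one of Splash's rendering endpoints (render.html, render.png, render.har); it does not contact a server. The helper name splash_render_url is hypothetical, and SPLASH_URL assumes a local instance on port 8050, as in the settings quoted in one of the questions on this page. The url, wait and images arguments are render endpoint parameters.

```python
from urllib.parse import urlencode

# Assumption: a Splash instance is listening locally on port 8050.
SPLASH_URL = "http://localhost:8050"


def splash_render_url(page_url, endpoint="render.html", wait=0.5, images=True):
    """Build a GET URL for a Splash render endpoint (hypothetical helper).

    page_url: the page Splash should load and render.
    endpoint: e.g. 'render.html', 'render.png' or 'render.har'.
    wait:     seconds to wait after the page loads before returning.
    images:   False turns image loading off to make rendering faster.
    """
    params = {"url": page_url, "wait": wait, "images": int(images)}
    return f"{SPLASH_URL}/{endpoint}?{urlencode(params)}"


if __name__ == "__main__":
    # Rendered HTML with images turned off.
    print(splash_render_url("https://example.com", images=False))
    # Rendering info in HAR format for the same page.
    print(splash_render_url("https://example.com", endpoint="render.har"))
```

Fetching the resulting URL with any HTTP client should return the rendered result, provided a Splash server is actually running at that address.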
138 questions
2
votes
2 answers

Scrapy + Splash = Connection Refused

I installed Splash using this link. I followed all the installation steps, but Splash doesn't work. My settings.py file: BOT_NAME = 'Teste' SPIDER_MODULES = ['Test.spiders'] NEWSPIDER_MODULE = 'Test.spiders' DOWNLOADER_MIDDLEWARES = { …
Ricardo
  • 83
  • 1
  • 8
2
votes
0 answers

How to catch splash:on_response retry error and item?

I am using scrapy, splash, and scrapy_splash to scrape a catalog website. The website uses a form POST to open a new item details page. Sometimes the item detail page displays a default error page (not related to HTTP status) in Splash however if…
Charles Green
  • 413
  • 3
  • 15
1
vote
0 answers

Site does not render correctly in Splash

I'm trying to click on one of the country tabs in this archived site. The mouse_click() method seems to be working, but the content is not updated. I've tried clicking many things on this site and nothing works. Please help. My…
1
vote
0 answers

Get dynamically loaded images with Splash

I have been working with scrapy + splash trying to scrape images from different websites. The thing is that some pages load the images dynamically and I can't get them fully loaded and the 'src' attribute is not there. I started using splash from…
1
vote
0 answers

Change URL while staying signed in with Splash

Hi everyone, I managed to sign in by embedding a Lua script within my Scrapy code. However, when I redirect the scraper, it does not stay signed in. How can I do that? Here is my Scrapy code: class ExampleSpider(scrapy.Spider): name = 'example' …
Federico
  • 11
  • 2
1
vote
1 answer

Missing items when scraping javascript rendered page using scrapy and splash

I am trying to scrape the following website for basic real estate listing information: https://www.propertyfinder.ae/en/search?c=2&fu=0&l=50&ob=nd&page=1&rp=y Parts of the website are dynamically loaded from a back end API when the page is scrolled…
1
vote
1 answer

Unable to click button in Lua Script with Splash

This problem is similar in nature to this question, yet my problem still persists after having attempted the provided solution. I want my Lua script to react to a modal popup in the case that it appears by closing it. However, the final screenshot,…
Luca Guarro
  • 1,085
  • 1
  • 11
  • 25
1
vote
0 answers

Scrapy-Splash not rendering this site

JavaScript is not rendering at https://iwilltravelagain.com/latin-america-caribbean/?page=1 Can you tell me what the problem could be? My settings: SPLASH_URL = 'http://localhost:8050' DOWNLOADER_MIDDLEWARES = { …
1
vote
0 answers

Scrape web page with tabs using splashR

I wish to scrape a web page that uses tabs. I didn't have much luck with rvest so I am trying splashR. Splash is a headless browser designed specifically for web scraping. As mentioned in this introduction, you will need access to a Splash…
ixodid
  • 2,180
  • 1
  • 19
  • 46
1
vote
2 answers

Access DOM of google.com using Lua script in Splash

I am trying to run a Lua script in Splash to perform a Google search and take the screenshot of search results. When I try to select the Google search box using xpath or css selector in my Lua script I get this error: { "error": 400, "type":…
Hades
  • 11
  • 1
1
vote
1 answer

How do I scrape from this website using scrapy and splash?

I'm a newbie and I'm trying to scrape the href link of each place listed in this website. Then I want to go into each link and scrape data but I'm not even able to get the href links from this code. However, I'm able to use the same xpath selector…
1
vote
1 answer

Why won't Splash render this webpage?

I'm quite new to Splash, and though I was able to get Splash set up on my Ubuntu 18 (via Splash/Docker), it gives me different results for this page: https://www.overstock.com/Home-Garden/Area-Rugs/31446/subcat.html Normally it's rendered like so: But…
1
vote
2 answers

Inside Splash, how to use src attribute to append to a url

------------ORIGINAL QUESTION------------------ In my Splash Script, I am trying to use "splash:go" on a new url that is based on the "src" attribute of an "img" tag. How can I access this "src" relative url and join it to a start_url? For example,…
ta_duke
  • 49
  • 1
  • 8
1
vote
1 answer

How to return both png and html in scrapy-splash?

If I had an HTML and a PNG returned from a scrapy-splash request, how do I use that HTML to scrape an element while also using the PNG to save an image? Do I write response.html and response.png?
ta_duke
  • 49
  • 1
  • 8
1
vote
0 answers

Can I use two endpoints of Splash 'render.html' and 'execute' from the same script so I don't have to make requests twice?

I think that in the script below, Splash fetches the same URL from the remote server twice. In the first case, it returns the rendered HTML since the endpoint is 'render.html', and for the other request, we execute a Lua script to click a button and then send request…