Questions tagged [python-requests-html]

Requests-HTML is a Python HTTP library built around the requests API, adding support for parsing HTML (with optional headless-browser support to render JavaScript).

Official web site

Requests-HTML is a web scraping library written in Python and released under the MIT license.

This library intends to make parsing HTML (e.g. scraping the web) simple, building on top of the requests library for the HTTP layer. It supports XPath and CSS selectors, User-Agent spoofing, and optional headless-browser support to execute JavaScript on the page.

535 questions
3
votes
2 answers

requests-html and infinite scrolling

I'm checking out a Python library, requests-html. It looks interesting and makes scraping easy and clear. However, I'm not sure how to render a page with infinite scrolling. From the documentation I understand that I should render a page with a special attribute…
Maciek
  • 33
  • 1
  • 8
3
votes
1 answer

Requests-html package does not render properly for fast.com

I am developing a web-scraping app using Python 3.7. I am using requests-html for parsing data. So far, I have tried the following code, which uses the render function (since speed data on fast.com is loaded through JavaScript). from…
3
votes
1 answer

How to find all Elements of a specific Type with the new Requests-HTML library

I want to find all elements of a specific type in an HTML document. In Beautiful Soup this works with the following code: soup = BeautifulSoup(html_text, 'html.parser') urls_previous = soup.find_all('h2', {'class': 'b_algo'}) But how can I make the same search with the…
3
votes
1 answer

Can't extract the result as expected when using requests_html

I can't extract the correct result using requests_html: >>> from requests_html import HTMLSession >>> session = HTMLSession() >>> r = session.get('https://www.amazon.com/dp/B07569DYGN') >>>…
Lordran
  • 649
  • 8
  • 15
3
votes
2 answers

Cannot find CSS class using Requests-HTML

After following this tutorial on finding a CSS class and copying the text on a website, I tried to implement this into a small text code but sadly it didn't work. I followed the tutorial exactly on the same website and did get the headline of the…
Braincain007
  • 51
  • 1
  • 6
3
votes
2 answers

Python append to array and for loop for it

I am trying to add some links to a list and then loop over them (to visit each one). My code: import requests from requests_html import HTMLSession import sys links = [] link = "http://tvil.me" pagedata = HTMLSession().get(link) info =…
kaki
  • 103
  • 2
  • 9
3
votes
1 answer

Python requests-html with Tor

The requirement is to scrape anonymously or change IP after a certain number of calls. I use the https://github.com/kennethreitz/requests-html module to parse the HTML, but I get the error below: socks.SOCKS5Error: 0x01: General SOCKS server…
2
votes
3 answers

Python University Names and Abbreviations and Weblink

I want to prepare a dataframe of universities, their abbreviations, and website links. My code: abb_url = 'https://en.wikipedia.org/wiki/List_of_colloquial_names_for_universities_and_colleges_in_the_United_States' abb_html =…
Mainland
  • 4,110
  • 3
  • 25
  • 56
2
votes
1 answer

Python Requests HTML - Result of found item doesn't return content

Hi, I'm scraping a webpage with my script. The problem is that only one item (title) is found correctly; other items only return HTML when grabbed like this: [] My…
2
votes
2 answers

How to get desktop version of the site?

I am using the requests library to parse the website but it returns the mobile version of the site. How can I get the HTML page of the desktop version? import requests sess = requests.Session() sess.get("https://google.com/")
sasha
  • 135
  • 11
2
votes
0 answers

Using requests_html with arender() function gives RuntimeError: 'Event loop is closed'

I'm trying to scrape Amazon asynchronously with the requests-html package. See the following code: import datetime from requests_html import AsyncHTMLSession import asyncio asins = ['B09GB6GSMM', 'B09GB7BCQH', 'B075D23KXX'] async def…
2
votes
1 answer

Unable to find tag when data scraping

I am new to Python and I've been working on a program that alerts you when a new item is uploaded to jp.mercari.com (a shopping site). I have the alert part of the program working, but it operates based on the number of items that come up on the…
2
votes
1 answer

How to Pass in an OAUTH Token Correctly

I'm trying to request information from an API. The way I'm passing in the OAUTH Token is wrong, I assume. import requests import json URL = "https://api.direct.yandex.com/json/v5/keywords" token = "/* Access Token */" PARAMS = { …
2
votes
1 answer

Web Scraping - Cloudflare Issues

I am trying to scrape https://www.carsireland.ie/search#q?%20scraper%20python=&toggle%5Bpoa%5D=false&page=1 (I had built a scraper but then they did a total overhaul of their website). The new website has a new format and has Cloudflare to provide…
2
votes
1 answer

Limiting number of concurrent AsyncIO tasks using Semaphore not working

Objective: I am trying to scrape multiple URLs simultaneously. I don't want to make too many requests at the same time so I am using this solution to limit it. Problem: Requests are being made for ALL tasks instead of for a limited number at a…