Questions tagged [python-requests-html]

Requests-HTML is a Python HTTP library built around the requests API, adding support for parsing HTML (with optional headless-browser support to render JavaScript).

Official web site

Requests-HTML is a web scraping library written in Python under the MIT license.

This library aims to make parsing HTML (e.g. scraping the web) simple, building on top of the python-requests library for the HTTP layer. It supports XPath and CSS selectors, User-Agent spoofing, and optional headless-browser support to execute JavaScript on the page.

535 questions

2 answers

requests-html and infinite scrolling

I'm checking out a Python library: requests-html. It looks interesting, with easy and clear scraping. However, I'm not sure how to render a page with infinite scrolling. From their documentation I understand that I should render a page with special attribute…

python-3.x python-requests-html

asked Jun 13 '19 at 17:48

Maciek

1 answer

Requests-html package does not render properly for fast.com

I am developing a web-scraping app using Python 3.7. I am using requests-html for parsing data. Up until now, I have tried the following code, which tries to use the render function (since speed data on fast.com is loaded through JavaScript). from…

python-3.x python-requests-html

asked Feb 02 '19 at 22:14

rohan_aggarwal

1 answer

How to find all Elements of a specific Type with the new Requests-HTML library

I want to find all specific fields in an HTML document. In Beautiful Soup everything works with this code: soup = BeautifulSoup(html_text, 'html.parser') urls_previous = soup.find_all('h2', {'class': 'b_algo'}) but how can I make the same search with the…

python python-3.x beautifulsoup screen-scraping python-requests-html

asked Oct 25 '18 at 18:04

FoldFence

1 answer

Can't extract the result as expected when using requests_html

I can't extract the correct result using requests_html: >>> from requests_html import HTMLSession >>> session = HTMLSession() >>> r = session.get('https://www.amazon.com/dp/B07569DYGN') >>>…

python python-3.x pyquery python-requests-html

asked Oct 08 '18 at 09:41

Lordran

2 answers

Cannot find css class using Request HTML

After following this tutorial on finding a CSS class and copying the text on a website, I tried to implement this in a small text code, but sadly it didn't work. I followed the tutorial exactly on the same website and did get the headline of the…

python css python-3.x python-requests-html

asked Aug 02 '18 at 22:37

Braincain007

2 answers

Python append to array and for loop for it

I am trying to insert some links into an array and then loop over them (to visit them). My code: import requests from requests_html import HTMLSession import sys links = [] link = "http://tvil.me" pagedata = HTMLSession().get(link) info =…

python python-3.x python-requests python-requests-html

asked May 03 '18 at 12:10

kaki

1 answer

Python requests-html with Tor

The requirement is to scrape anonymously or change IP after a certain number of calls. I use the https://github.com/kennethreitz/requests-html module to parse the HTML, but I get the error below: socks.SOCKS5Error: 0x01: General SOCKS server…

python python-3.x python-requests python-requests-html

asked Apr 20 '18 at 18:57

leaf

3 answers

Python University Names and Abbrevations and Weblink

I want to prepare a dataframe of universities, their abbreviations and website links. My code: abb_url = 'https://en.wikipedia.org/wiki/List_of_colloquial_names_for_universities_and_colleges_in_the_United_States' abb_html =…

python pandas dataframe url python-requests-html

asked Sep 09 '22 at 00:17

Mainland

1 answer

Python Requests HTML - Result of found item doesn't return content

Hi, I'm scraping a webpage with my script. The problem is that only one item (title) can be found correctly; other items only return HTML when grabbed like that: [] My…

python html python-3.x web-scraping python-requests-html

asked Jul 05 '22 at 14:05

dsadeq32423

2 answers

How to get desktop version of the site?

I am using the requests library to parse the website but it returns the mobile version of the site. How can I get the HTML page of the desktop version? import requests sess = requests.Session() sess.get("https://google.com/")

python python-requests python-requests-html

asked Apr 19 '22 at 08:08

sasha

0 answers

Using requests_html with arender() function gives RuntimeError: 'Event loop is closed'

I'm trying to scrape Amazon asynchronously with the requests-html package. See the following code: import datetime from requests_html import AsyncHTMLSession import asyncio asins = ['B09GB6GSMM', 'B09GB7BCQH', 'B075D23KXX'] async def…

python-3.x asynchronous web-scraping python-requests-html

asked Apr 05 '22 at 15:35

nosta

1 answer

Unable to find tag when data scraping

I am new to Python and I've been working on a program that alerts you when a new item is uploaded to jp.mercari.com (a shopping site). I have the alert part of the program working, but it operates based on the number of items that come up on the…

python html beautifulsoup html-parsing python-requests-html

asked Feb 27 '22 at 09:34

Leng1

1 answer

How to Pass in an OAUTH Token Correctly

I'm trying to request information from an API. The way I'm passing in the OAUTH Token is wrong, I assume. import requests import json URL = "https://api.direct.yandex.com/json/v5/keywords" token = "/* Access Token */" PARAMS = { …

oauth-2.0 oauth python-requests python-requests-html yandex-api

asked Feb 01 '22 at 09:04

mvcast77

1 answer

Web Scraping - Cloudflare Issues

I am trying to scrape https://www.carsireland.ie/search#q?%20scraper%20python=&toggle%5Bpoa%5D=false&page=1 (I had built a scraper but then they did a total overhaul of their website). The new website has a new format and has Cloudflare to provide…

python web-scraping python-requests python-requests-html

asked Dec 27 '21 at 02:49

MrSwan

1 answer

Limiting number of concurrent AsyncIO tasks using Semaphore not working

Objective: I am trying to scrape multiple URLs simultaneously. I don't want to make too many requests at the same time so I am using this solution to limit it. Problem: Requests are being made for ALL tasks instead of for a limited number at a…

python web-scraping python-asyncio python-requests-html

asked Dec 23 '21 at 03:17

José Guedes
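
For the Semaphore question above, a minimal sketch of how limiting concurrent asyncio tasks is usually expected to behave. This is not the asker's code: `fetch`, `crawl`, `MAX_CONCURRENT` and the example URLs are hypothetical names chosen for illustration, and `asyncio.sleep` stands in for a real HTTP request so the sketch runs without network access or requests-html.

```python
import asyncio

MAX_CONCURRENT = 2  # hypothetical limit, not taken from the question


async def fetch(url, semaphore, stats):
    # Acquire the semaphore inside the task body, so that at most
    # `limit` tasks are past this point at any one time.
    async with semaphore:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(0.01)  # stand-in for a real HTTP request
        stats["active"] -= 1
        return url


async def crawl(urls, limit=MAX_CONCURRENT):
    # Create the semaphore inside the running event loop.
    semaphore = asyncio.Semaphore(limit)
    stats = {"active": 0, "peak": 0}
    # gather() schedules every task at once; the semaphore is what
    # throttles how many of them do work concurrently.
    results = await asyncio.gather(*(fetch(u, semaphore, stats) for u in urls))
    return results, stats["peak"]


if __name__ == "__main__":
    urls = [f"https://example.com/page/{i}" for i in range(6)]
    results, peak = asyncio.run(crawl(urls))
    print(len(results), "pages fetched; peak concurrency:", peak)
```

All tasks are still created up front; only the section guarded by `async with semaphore` is limited, so any request issued outside that block would not be throttled.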
