Questions tagged [python-requests-html]

Requests-HTML is a Python HTTP library built around the requests API, adding support for parsing HTML (with optional headless-browser support to render JavaScript).


Requests-HTML is a web scraping library written in Python and released under the MIT license.

This library intends to make parsing HTML (e.g. scraping the web) simple, building on top of the Requests library for the HTTP layer. It supports XPath and CSS selectors, User-Agent spoofing, and optional headless-browser support to execute JavaScript on the page.

535 questions
4
votes
0 answers

flask: RuntimeError: There is no current event loop in thread

I am making a simple flask app that returns results from an external website. The user enters data into my site. This data is used to load another site. Data is extracted and returned in a list. This program works independently, but will not work as…
Alex
  • 51
  • 5
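
A likely cause of this error is that web frameworks usually handle requests in worker threads, while asyncio only provides a default event loop for the main thread. The following is a minimal sketch using only the standard library, not the asker's code: the worker function names and the `results` dict are hypothetical, and it assumes the failing code looks up the current event loop from inside a thread.

```python
import asyncio
import threading

results = {}

def worker_without_loop():
    # A plain worker thread has no event loop set, so this lookup raises.
    try:
        asyncio.get_event_loop()
        results["without"] = "ok"
    except RuntimeError as exc:
        results["without"] = f"RuntimeError: {exc}"

def worker_with_loop():
    # Create a loop for this thread and register it before doing async work.
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    try:
        results["with"] = loop.run_until_complete(asyncio.sleep(0, result="ok"))
    finally:
        asyncio.set_event_loop(None)
        loop.close()

for target in (worker_without_loop, worker_with_loop):
    thread = threading.Thread(target=target)
    thread.start()
    thread.join()

print(results)
```

Whether creating a loop per thread is enough for a headless-browser render inside a Flask view depends on the library and server setup, so treat this only as an illustration of where the error comes from.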
4
votes
1 answer

Python Requests run JS file from GET

Goal: To log in to this website (https://www.reliant.com) using python requests etc. (I know this could be done with selenium or PhantomJS or something, but would prefer not to.) Problem: During the login process there are a couple of redirects where…
4
votes
0 answers

Python requests-html render() doesn't render javascript elements

I am attempting to scrape a website which, as well as requiring a login, renders its core data with JavaScript and XHR requests. I am using the requests-html library, however the render() function appears to have no effect on the webpage. Here is my…
4
votes
2 answers

Parse element's tail with requests-html

I want to parse an HTML document like this with requests-html 0.9.0: from requests_html import HTML html = HTML(html='important data and some rubbish') data = html.find('.data', first=True) print(data.html) #…
Norrius
  • 7,558
  • 5
  • 40
  • 49
3
votes
1 answer

Getting the price of the game from EGS

I'm trying to get the price of the game from the epic games store, but I get a 403 error import requests from bs4 import BeautifulSoup url = "https://store.epicgames.com/ru/p/cities-skylines" response = requests.get(url) if response.status_code…
3
votes
1 answer

Full text returned in requests-html not just first

I am trying to scrape specific elements of the internship page below using requests-html. I specify first=True, but when I print the text out it prints everything on the page starting with the element I selected instead of returning just that…
Kyle Roark
  • 31
  • 1
3
votes
0 answers

How to set height and width dimensions when sending a request with Requests-HTML

I am using requests-html to get data on this website: https://rl.insider.gg/en/pc. Whenever I send a request to the site and attempt to render the Javascript, I keep getting redirected to the mobile site. I did some investigating and found out that…
3
votes
1 answer

pass ip address in python requests instead of url link

I want to pass an IP address to Python requests instead of a URL, e.g. instead of requests.get(url="https://www.google.com") use requests.get(url="172.217.168.228"). So basically pass the IP instead of the URL. How can I do this? I guess I should pass…
Ali
  • 922
  • 1
  • 9
  • 24
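
A bare IP address is not a valid URL because it has no scheme. One common approach, sketched here with the standard library only, is to put the IP into the URL and send the original hostname in the `Host` header. The helper name `url_for_ip` is hypothetical, and the sketch only builds the URL and headers; it makes no network call.

```python
from urllib.parse import urlsplit, urlunsplit

def url_for_ip(url, ip):
    """Return (ip_url, headers): the URL pointed at `ip`, plus a Host header
    naming the original host."""
    parts = urlsplit(url)
    if not parts.scheme or not parts.hostname:
        raise ValueError("url must include a scheme and host, e.g. https://example.com/")
    # Preserve an explicit port if the original URL had one.
    suffix = "" if parts.port is None else f":{parts.port}"
    ip_url = urlunsplit(
        (parts.scheme, f"{ip}{suffix}", parts.path, parts.query, parts.fragment)
    )
    return ip_url, {"Host": f"{parts.hostname}{suffix}"}

ip_url, headers = url_for_ip("https://www.google.com/search?q=x", "172.217.168.228")
print(ip_url)
print(headers)
```

The result could then be passed as `requests.get(ip_url, headers=headers)`. Note that over HTTPS, certificate verification will typically fail in this setup because the certificate is issued for the hostname, not the IP.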
3
votes
1 answer

Find a string in python beautifulsoup that shares the same class as other string

I'm trying to scrape data from a bitcoin transaction. I want to parse through the source and get the amount being sent. I'm currently using the BeautifulSoup library to achieve this. The class that contains the amount being sent is being used by…
3
votes
0 answers

requests_html library error=Cannot use HTMLSession within an existing event loop. Use AsyncHTMLSession instead

I was trying to use the requests_html library in my Jupyter notebook when I ran into the error: Cannot use HTMLSession within an existing event loop. Use AsyncHTMLSession instead. I searched for solutions and found this one: import…
Ismael
  • 99
  • 1
  • 8
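
Jupyter runs its own asyncio event loop, which is presumably why the synchronous session refuses to start there. The sketch below uses only the standard library to show how code can detect that it is already inside a running loop; `loop_is_running` and `inside` are hypothetical names, and this is not the library's actual check.

```python
import asyncio

def loop_is_running():
    # get_running_loop() raises RuntimeError when no loop is running
    # in the current thread.
    try:
        asyncio.get_running_loop()
        return True
    except RuntimeError:
        return False

async def inside():
    # Called from a coroutine, so a loop is running at this point.
    return loop_is_running()

outside_result = loop_is_running()      # plain script: no loop running
inside_result = asyncio.run(inside())   # within a coroutine: loop running

print(outside_result, inside_result)
```

In a notebook cell the first call would report a running loop as well, which is the situation the error message describes; there the async variant can be awaited directly.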
3
votes
1 answer

error: AttributeError: 'coroutine' object has no attribute 'newPage' when doing youtube webscraping

I'm trying to scrape YouTube to get information about a video, but it raises an error that seems to come from render() in requests_html. Code below: from requests_html import AsyncHTMLSession import…
3
votes
1 answer

How to save multiple function arguments into one variable

I want to save multiple function arguments into one variable so that when I want to change something I will have to change it once and not four times. this is my code def request_function(self): if self.request_type == "get": return…
Saba
  • 416
  • 3
  • 14
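
One way to define shared arguments once is to keep them in a dict and unpack it with `**` at each call site. This is a minimal sketch, not the asker's code: `send` is a hypothetical stand-in for the real HTTP call, and the URL, headers, and timeout values are made up for illustration.

```python
def send(method, url, headers=None, timeout=None):
    """Hypothetical stand-in for an HTTP call; it just echoes its arguments."""
    return {"method": method, "url": url, "headers": headers, "timeout": timeout}

# Define the shared keyword arguments in one place...
common = {
    "url": "https://example.com/api",
    "headers": {"Accept": "application/json"},
    "timeout": 10,
}

def request_function(request_type):
    # ...and unpack them with ** wherever they are needed.
    if request_type in {"get", "post", "put", "delete"}:
        return send(request_type.upper(), **common)
    raise ValueError(f"unsupported request type: {request_type}")

result = request_function("get")
print(result)
```

Changing a value in `common` then affects every branch without editing each call.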
3
votes
0 answers

Certificate expired while running requests-html

I have been trying to use requests-html in a venv environment (python 3.7.0 - MacOS 10.15.1), however I am dealing with some certificate issue (I'm not behind any proxy/firewall): The main call is : from requests_html import HTMLSession sessao =…
3
votes
0 answers

Submit (POST) form with the python requests-html library fails

I'm using python 3.7 and the requests-html library. I have tried to send a get request in a session to a site with a form. First I use the response to get the CAPTCHA image and download it, and then send a POST request in the same session including…
Allon
  • 55
  • 4
3
votes
1 answer

I cannot seem to handle blank results from regex (re.search) in Python; I either get duplicates or no results

I am trying to pull a list of individuals from https://www.ourcommons.ca/Parliamentarians/en/members?view=List. Once I have the list I go through each member's link and try to find their email address. Some of the members don't have email as a result…
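
`re.search` returns `None` when nothing matches, so each result needs an explicit check before `.group()` is called. The sketch below records `None` for entries without an email instead of reusing a previous value; the pattern, the `find_email` helper, and the sample page texts are all hypothetical and deliberately simplified.

```python
import re

# Simplified email pattern for illustration; real addresses can be more varied.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

def find_email(text):
    # Return the first email found, or None when the page lists none.
    match = EMAIL_RE.search(text)
    return match.group(0) if match else None

# Made-up page contents keyed by member name.
pages = {
    "Member A": "Contact: member.a@example.com",
    "Member B": "No contact details listed.",
}

emails = {name: find_email(text) for name, text in pages.items()}
print(emails)
```

Computing the value per member inside the loop (rather than keeping a variable from the previous iteration) avoids both duplicates and silently dropped rows.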