I'm using grequests to scrape websites faster. However, I also need to log in to the website.

Before, using plain requests, I could do the following, where headers holds my User-Agent:

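import requests

with requests.Session() as s:
    s.headers.update(headers)      # headers: dict holding the User-Agent
    s.post(loginURL, files=data)   # log in; the session keeps the cookies
    s.get(scrapeURL)               # later requests reuse the login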

Using grequests I've only been able to pass headers by doing:

rs = (grequests.get(u, headers=headers) for u in urls)
response = grequests.map(rs)

Is there any way to do a POST at the same time so I can log in? The login URL is different from the URL(s) I'm scraping.

Rafael

2 Answers

First log in with the session, then pass it explicitly to each grequests request, like this:

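# session is a requests.Session that has already logged in
reqs = []  # not named "requests", which would shadow the requests module
for url in urls:
    req = grequests.AsyncRequest(
        method='GET',
        url=url,
        session=session,  # reuse the logged-in session and its cookies
    )
    reqs.append(req)
responses = grequests.map(reqs)  # send all the requests concurrently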
wim

You can pass in the Session object the same way you pass the headers:

with requests.Session() as s:
    s.headers.update(headers)
    s.post(loginURL, files=data)  # log in once with the session
    s.get(scrapeURL)

    # every request below reuses the logged-in session
    rs = (grequests.get(u, headers=headers, session=s) for u in urls)
    response = grequests.map(rs)
Padraic Cunningham
    Perfect!!. I'm sure this is a copy and paste error on your part (as these typos were in my question too) but it should be `requests.Session()` and there's a period at the very end – Rafael Oct 11 '16 at 22:56
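
Whichever approach you use, grequests.map returns the responses in the same order as the requests, with None in place of any request that failed outright, so it is worth checking the results before parsing them. A minimal sketch of that check, assuming urls is a list (not a generator) and using a hypothetical helper name, split_results, that is not part of grequests:

```python
def split_results(urls, responses):
    """Pair each URL with its response and separate successes from failures.

    `responses` is expected to be the list returned by grequests.map:
    one entry per URL, in order, with None where the request failed.
    """
    ok = {}      # url -> response for requests that succeeded
    failed = []  # urls with no response or an HTTP error status
    for url, resp in zip(urls, responses):
        if resp is None or resp.status_code >= 400:
            failed.append(url)
        else:
            ok[url] = resp
    return ok, failed


# Intended use with the code from the answers (not executed here):
#   responses = grequests.map(rs)
#   ok, failed = split_results(urls, responses)
```

Treating any status of 400 or above as a failure is an assumption. A site that redirects logged-out sessions to its login page would still return 200, so in that case you would also want to compare resp.url with the login URL.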