
I would like to use PyQuery to extract information from a site that requires authentication.

I can access the site "manually" and then see the resulting cookie in Firefox's Tools > Web Developer > Storage Inspector.

Can I somehow use this cookie in conjunction with PyQuery?

This should (hopefully) save me from reverse-engineering the authentication process (which apparently redirects to Shibboleth). I am imagining exporting the cookie from Firefox and then passing it in when initializing PyQuery.

zx485
rookie099
  • Why don't you use something like [requests](http://docs.python-requests.org/en/master/) to get the data from a URL (including the headers etc.)? – Laur Ivan Jan 09 '18 at 13:43
  • @LaurIvan Thx, the link explains how to pass cookies to requests. If you want to write up your comment as answer, I shall accept it. – rookie099 Jan 09 '18 at 15:36

1 Answer


From pyQuery's documentation:

pyquery allows you to make jquery queries on xml documents

IMHO, this is not the right tool if you need to add cookies (e.g. a session ID) to the request. Instead, you'd need to use something like requests. Reproducing the sample from the requests documentation:

import requests

url = 'http://httpbin.org/cookies'
# Cookies are passed as a plain dict of name/value pairs
cookies = dict(cookies_are='working')

r = requests.get(url, cookies=cookies)
print(r.text)

# output (whitespace may differ): '{"cookies": {"cookies_are": "working"}}'

You can combine requests and PyQuery to process forms and fully automate the process without going through Firefox.
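As a rough sketch of that combination (not a tested recipe for any particular site): the helper names `parse_cookie_header` and `fetch_document` are made up for illustration, and the sketch assumes you have copied the cookie from Firefox as a `name=value; other=value` string. The page is fetched with requests, then the HTML is handed to PyQuery.

```python
from http.cookies import SimpleCookie


def parse_cookie_header(header: str) -> dict:
    """Turn 'name=value; other=value2' into a dict usable by requests."""
    jar = SimpleCookie()
    jar.load(header)
    return {name: morsel.value for name, morsel in jar.items()}


def fetch_document(url: str, cookie_header: str):
    """Fetch url with the given cookies and return a PyQuery document."""
    # Third-party imports are kept local so the parser above has no dependencies.
    import requests
    from pyquery import PyQuery

    response = requests.get(
        url, cookies=parse_cookie_header(cookie_header), timeout=10
    )
    # Fail loudly on 4xx/5xx; note that an expired session may still
    # return 200 with a login page, which this does not detect.
    response.raise_for_status()
    return PyQuery(response.text)
```

Something like `fetch_document(url, "sessionid=abc123")` should then return a document you can query with the usual selectors, e.g. `doc("title").text()`. The cookie name and value here are placeholders; use whatever the Storage Inspector shows for your site. Session cookies expire, so expect to re-export them from time to time.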

Depending on the actual problem, you might consider something like Selenium.

Laur Ivan