
From this question, the last responder seems to think it is possible to use Python to open a webpage, let me sign in manually and click through a series of menus, and then have Python parse the page once I reach the one I want. The website has an unusual sign-in procedure, so using requests and passing a username and password will not be sufficient.

However, it seems from this question that it is not possible.

So the question is: is it possible? If so, do you know of any example code?

Jarrod
  • Who said you need selenium? https://github.com/jmcarp/robobrowser – OneCricketeer Mar 12 '17 at 04:22
  • What's so weird about the login procedure that requests can't handle? I've seen some doozies and, so long as you trap the conversation between the client and server, requests can do anything. – Carlos Mar 12 '17 at 04:30
  • @cricket_007 Thanks, I'll try this. – Jarrod Mar 12 '17 at 04:38
  • @cricket_007 No one really told me, just from looking around it seemed like I kept running into questions and articles using selenium. I just assumed that was what most people used. – Jarrod Mar 12 '17 at 04:52
  • @Carlos, I've not had any luck due to the use of an external token to get a code to log in with. Maybe I can figure out a way for it to stop, let me enter the code, and then let it do its thing afterwards. You're probably right, I can probably explore the requests option a bit more. – Jarrod Mar 12 '17 at 04:58
  • You'd probably need some way to pass the auth token to selenium. Does it need to be selenium? Have you tried the requests module? You can probably authenticate with requests and then parse the page. I'd personally use Chrome dev tools to figure out the requests and the parameters they pass. – reticentroot Mar 12 '17 at 05:44

1 Answer


The way to approach this problem is to log in normally with the browser's developer tools open and look at what the login request actually sends.

When logging in to Bandcamp, the XHR request that is sent looks like this:

[Image: Bandcamp login XHR request]

From that response you can see that an identity cookie is being sent. That's probably how they identify that you are logged in, so once that cookie is set you should be authorized to view logged-in pages.
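If you want to pull such a cookie out of a raw response header yourself, here is a minimal sketch using only the standard library. The header value below is made up for illustration; the real one is whatever your developer tools show under Set-Cookie.

```python
from http.cookies import SimpleCookie


def parse_set_cookie(header_value):
    """Return {cookie_name: value} for a single Set-Cookie header value."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    return {name: morsel.value for name, morsel in cookie.items()}


# Hypothetical header value, not copied from any real site.
example_header = "identity=abc123; Path=/; HttpOnly"
print(parse_set_cookie(example_header))
```

Attributes such as Path and HttpOnly are kept on the cookie object but left out of the returned dict, since only the name and value are needed to replay the cookie later.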

So in your program you could log in using requests, keep the cookie the server sends back, and then send that cookie along with every further request.

Of course, login procedures and authorization mechanisms differ from site to site, but that's the general gist of it.
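Here is a rough sketch of that idea with requests. The URL and the form field names are placeholders I made up; replace them with whatever you see in the developer tools. A requests.Session stores cookies for you, so you rarely need to handle them by hand. The second helper covers the case where you sign in manually in the browser and simply copy the cookie value over.

```python
import requests

# Hypothetical endpoint and field names; take the real ones from developer tools.
LOGIN_URL = "https://example.com/login"


def log_in(session, username, password):
    """POST the login form; the session keeps any cookies the server sets."""
    response = session.post(
        LOGIN_URL,
        data={"username": username, "password": password},
    )
    response.raise_for_status()
    return session.cookies.get_dict()


def session_from_cookie(name, value, domain):
    """Build a session around a cookie copied from the browser after a manual login."""
    session = requests.Session()
    session.cookies.set(name, value, domain=domain)
    return session
```

With either helper, later calls such as session.get(some_protected_url) send the stored cookie automatically, as long as the URL matches the cookie's domain.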

So when do you actually need selenium? You need it if much of the page is rendered by JavaScript. requests only fetches the HTML the server returns and does not execute scripts, so if the menus and similar elements are rendered with JavaScript, you will never see that information using requests.
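If you do end up needing a real browser, the manual sign-in workflow from the question can be sketched roughly like this. It assumes selenium is installed and Chrome with a matching driver is available. The link extraction is only a stand-in for whatever parsing you actually want, and both function names are my own.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag that has one."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html):
    """Return all link targets found in an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


def fetch_after_manual_login(start_url):
    """Open a browser, wait for a manual sign-in, return the rendered HTML."""
    # Imported here so extract_links() can be used without selenium installed.
    from selenium import webdriver

    driver = webdriver.Chrome()  # assumption: Chrome and its driver are set up
    try:
        driver.get(start_url)
        # The script pauses here while you sign in and click through the menus.
        input("Sign in, navigate to the target page, then press Enter... ")
        return driver.page_source  # the DOM as the browser currently has it
    finally:
        driver.quit()
```

You would call fetch_after_manual_login with the site's start page and pass the returned HTML to extract_links or to any other parser you prefer.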

Jonathan