
I'm very new to Python, and I'm trying to use BeautifulSoup to scrape a webpage that requires a login.

So far I have

import mechanize
import cookielib
import requests
from bs4 import BeautifulSoup

# Browser
br = mechanize.Browser()

# Cookie Jar
cj = cookielib.LWPCookieJar()
br.set_cookiejar(cj)

br.open('URL')

#login form
br.select_form(nr=2)
br['email'] = 'EMAIL'
br['pass'] = 'PASS'
br.submit()

soup = BeautifulSoup(br.response().read(), "lxml")
with open("output1.html", "w") as file:
    file.write(str(soup))

(With "URL" "EMAIL" and "PASS" being the website, my email and password.)

However, the page I get in output1.html is still the logged-out page rather than what you would see after logging in. How can I make the script log in with these details and return what's on the page after login?

Cheers for any help!

M. Ram

1 Answer


Let me suggest another way to obtain the desired page. It may be a little easier to troubleshoot.

  1. First, log in manually with your browser's Developer Tools open on the Network tab. After you submit your login credentials, a POST request will appear in the list. Open that request; on the right side you will find the "form data" section, which shows the field names and values the site expects.


  2. Use this code to send the login data and get the response:


from bs4 import BeautifulSoup
import requests

session = requests.Session()

url = "your url"

req = session.get(url)
soup = BeautifulSoup(req.text, "lxml")

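# You can collect some useful data here (e.g. a CSRF code or another token)

# Fill in the form data here, using the field names shown under "form data"
params = {'login': 'your login',
          'password': 'your password'}

# Send the form data with the POST request; the session keeps the login cookies
req = session.post(url, data=params)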

I hope this code will be helpful.
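If the "form data" you see in the Network tab contains extra hidden values (such as a CSRF token), they usually have to be sent back together with the credentials. Below is a minimal sketch of that idea, not a definitive implementation. The helper names `hidden_fields` and `log_in` are made up for illustration, and the field names in the usage note are taken from the question; the real ones are whatever your site's form data shows. It uses the standard-library HTML parser so it runs without extra packages, and `session` is assumed to behave like a `requests.Session`.

```python
from html.parser import HTMLParser


class HiddenInputCollector(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attrs = dict(attrs)
        # Attribute values can be None when written without "=value"
        if (attrs.get("type") or "").lower() == "hidden" and attrs.get("name"):
            self.fields[attrs["name"]] = attrs.get("value") or ""


def hidden_fields(html):
    """Return a dict of all hidden input fields found in the HTML text."""
    parser = HiddenInputCollector()
    parser.feed(html)
    return parser.fields


def log_in(session, login_url, credentials):
    """Fetch the login page, then post its hidden fields plus the credentials.

    `session` is assumed to offer get(url) and post(url, data=...) the way
    requests.Session does; `credentials` maps form field names to values.
    """
    page = session.get(login_url)
    payload = hidden_fields(page.text)   # e.g. a CSRF token, if the form has one
    payload.update(credentials)          # credentials override any same-named field
    return session.post(login_url, data=payload)
```

With requests this could be called as `log_in(requests.Session(), url, {'email': 'EMAIL', 'pass': 'PASS'})`. Keep a reference to the session and reuse it for the pages you want to scrape, since it carries the login cookies. If the login page contains several forms, this sketch collects hidden fields from all of them, so compare the result with the form data shown in the browser.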

Roman Mindlin