
I am working on a project to scrape library information from this website: http://www.americanlibrarydirectory.com.

On Friday, after much frustration, I wrote the following code, which finally worked.

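import cookielib  # Python 2 standard library
import mechanize
from bs4 import BeautifulSoup  # assuming BeautifulSoup 4

def scrape_alpha():
    # Keep session cookies so the login persists across requests
    cj = cookielib.CookieJar()
    br = mechanize.Browser()
    br.set_cookiejar(cj)
    # Load the login page, fill in the login form and submit it
    br.open("http://www.americanlibrarydirectory.com/Login.asp")
    br.select_form(name="FORM1")
    br.form['USERNAME'] = 'myemailaddress'
    br.form['PASSWORD'] = 'mypasscode'
    br.submit()
    print(br.response().read())
    # Fetch the browse page for the letter "A" and parse it
    alpha_url = "http://www.americanlibrarydirectory.com/browse.asp?Query=A"
    r = br.open(alpha_url).read()
    soup = BeautifulSoup(r)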

I came back to the project today and, although the code worked last week, it no longer works, and I don't have the faintest idea how to start figuring out what is wrong. It doesn't give me any error messages; it simply does not log in, and I remain on the log-in page.

If I log in manually (in a browser, not in code), that works, so I don't believe the issue is that my email/password are incorrect or that my account has expired. Does anyone have advice on what I should try?

Amie
  • Is it legal/accepted to scrape their website? Maybe they block your bot user agent now. – Pyfisch Jun 06 '16 at 19:05
  • I don't know. I thought of that and read some of their privacy/cookies info (http://www.infotoday.com/privacy.shtml) but haven't found (here or anywhere else) anything saying scraping is illegal. How would I know for sure if they are blocking me? – Amie Jun 06 '16 at 19:19
  • Clearing the cookies for that site might help. – Ola Jun 06 '16 at 19:23
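
Since the script fails silently, a first step might be to make it say explicitly whether the login went through. The following is only a rough sketch in plain Python (no mechanize needed to run it) and has not been tested against this site. The helper name `looks_logged_in`, the `/Login.asp` path check, and the `name="FORM1"` marker are assumptions taken from the code in the question; the real page markup may differ.

```python
from urllib.parse import urlparse


def looks_logged_in(final_url, body,
                    login_path="/Login.asp",
                    form_marker='name="FORM1"'):
    """Heuristic check: did we get past the login page?

    final_url   -- URL the browser ended up on after submitting the form
    body        -- HTML of that page, as text
    login_path  -- assumed path of the login page (from the question)
    form_marker -- assumed snippet that only appears in the login form
    """
    # Still on the login URL? Compare paths case-insensitively.
    path = urlparse(final_url).path.lower()
    still_on_login_url = (path == login_path.lower())

    # Does the page still contain the login form?
    still_has_form = form_marker.lower() in body.lower()

    return not (still_on_login_url or still_has_form)


if __name__ == "__main__":
    # Hypothetical inputs, only to show how the helper behaves.
    print(looks_logged_in(
        "http://www.americanlibrarydirectory.com/Login.asp",
        '<form name="FORM1">...</form>'))
    print(looks_logged_in(
        "http://www.americanlibrarydirectory.com/browse.asp?Query=A",
        "<html>listing</html>"))
```

In the script from the question, this could be called right after `br.submit()` with `br.geturl()` and the text of `br.response().read()`. If it returns `False`, the login request itself was not accepted, and the page body printed at that point is the place to look for a message from the site.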

0 Answers