Suppose there is a password-protected website that I want to access to scrape some info from it and put it into a spreadsheet. For example, it could be my personal credit card account page and I would be scraping info about the latest transactions.

A variation of this would be if the site allowed downloading the transaction info as a CSV file, in which case I would want to download that file.

If I want to write such a scraper in Python, what packages should I use for the task? Does it depend on how a specific website is implemented, i.e. might I need one tool to scrape one site and another tool to scrape another?

Thank you

I Z

1 Answer

I actually did something very similar to this, but in Node. Are you set on doing this in Python?

If you want to stick to Python, take a look at these modules:

BeautifulSoup

requests

Someone wrote a really awesome module combining the above two modules:

Robobrowser
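To give a rough idea of the `requests` approach for the CSV variation, here is a minimal sketch, not a definitive implementation. It assumes the site logs you in with a plain form POST and keeps you logged in with cookies. The function name `fetch_transactions`, the form field names `username` and `password`, and the URLs are all hypothetical; you would need to inspect the real login form to find the actual ones. Sites that rely on CSRF tokens or JavaScript will need more than this.

```python
import csv
import io


def fetch_transactions(session, login_url, csv_url, username, password):
    """Log in with a form POST, then download and parse a CSV export.

    `session` is expected to behave like `requests.Session()`, so cookies
    set by the login response are sent again on the follow-up GET.
    """
    # Hypothetical form field names; check the real login form for the actual ones.
    credentials = {"username": username, "password": password}
    login_response = session.post(login_url, data=credentials)
    login_response.raise_for_status()  # stop early on an HTTP 4xx/5xx status

    # The same session carries the login cookies to the export URL.
    csv_response = session.get(csv_url)
    csv_response.raise_for_status()

    # Parse the CSV text into one dict per row, keyed by the header line.
    return list(csv.DictReader(io.StringIO(csv_response.text)))
```

You would call it with `requests.Session()` as the first argument. If the site only shows the transactions as an HTML table, the same session can fetch the page and you can hand `response.text` to BeautifulSoup instead of the `csv` module.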

If you would like to venture down the Node route, take a look at this:

nightmarejs

idjaw
`nightmarejs` sounds... promising :-) For now I want to stick with Python, so I'll definitely check your links. Might try Node later. Thx – I Z Sep 21 '15 at 01:49