I currently have a small task that involves crawling data from an internal website, but I don't know where to start.
It's an internal lab-booking website; you first need to enter a username and password to get access.
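For the login step, one common approach (assuming the site uses a plain form POST rather than SSO) is to log in once with a `requests.Session`, so the session cookie is reused for every later request. The base URL and form field names below are hypothetical placeholders:

```python
import requests

# Hypothetical base URL and login path -- replace with your site's real ones.
BASE_URL = "http://booking.internal.example"
LOGIN_URL = BASE_URL + "/login"

def make_session(username: str, password: str) -> requests.Session:
    """Log in once; the returned session carries the auth cookie afterwards."""
    session = requests.Session()
    resp = session.post(
        LOGIN_URL,
        # The field names "username"/"password" are guesses -- inspect the
        # login form's HTML (or the dev-tools Network tab) for the real ones.
        data={"username": username, "password": password},
    )
    resp.raise_for_status()  # fail loudly if the login request itself errored
    return session
```

If the site uses Windows/NTLM or other single sign-on instead of a form POST, `requests` alone won't be enough, so that's worth checking in the browser's dev tools first.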
On the booking page, after filtering, I get the booking information for lab A over 7 days. That means 7 separate tables, each with columns 0, 15, 30, 45 (minutes) and rows 7:00, 8:00, ..., 18:00 (hours). When you click a cell, a new window appears with text boxes containing information about the lab and its status (Free/Reserved). If the status is "Reserved", it also shows who booked the slot and until when. If the status is "Free", it shows a form for filling in your own booking information, but I don't think we need to care about that. My goal is to end up with a CSV file whose columns are days and whose rows are times; for reserved slots the cell says who booked it and until when, and the cell can be null if the slot is free.
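One thing that may make this easier than it looks: the window that opens when you click a cell is almost certainly loaded from its own URL (you can see it in the dev-tools Network tab when you click), so you can request that URL directly for each slot instead of simulating clicks. Pulling the values out of the popup's text boxes could then look like the sketch below, assuming the popup is plain HTML with `<input ... value="...">` fields; all field names here are made up:

```python
import re

def parse_popup(html: str) -> dict:
    """Extract name/value pairs from the popup's <input> text boxes.
    A regex is tolerable for one fixed internal page; for anything
    messier, use a real HTML parser such as BeautifulSoup instead."""
    return dict(re.findall(r'<input[^>]*name="([^"]+)"[^>]*value="([^"]*)"', html))

# Example popup snippet with hypothetical field names:
sample = '''
<input type="text" name="status" value="Reserved">
<input type="text" name="booked_by" value="A. Nguyen">
<input type="text" name="until" value="10:45">
'''
info = parse_popup(sample)
# info["status"] -> "Reserved", info["booked_by"] -> "A. Nguyen"
```

If the popup turns out to be rendered by JavaScript rather than served as HTML, a browser-driving tool like Selenium would be needed instead of plain HTTP requests.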
This is our company's shared internal booking website, but our lab has its own usage rules, so I need to check whether anyone violates them, starting by collecting the data automatically. I have written crawlers for other websites in Python before, but none of them had this format, so I'm a bit lost.
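Once the per-slot data is collected, shaping it into the CSV described above (columns = days, rows = times, empty cells for free slots) only needs the standard `csv` module. A minimal sketch, assuming the reserved slots were accumulated into a dict keyed by (day, time):

```python
import csv

def write_schedule(bookings: dict, days: list, path: str) -> None:
    """bookings maps (day, time) -> 'who / until when' for reserved slots;
    free slots are simply absent from the dict and come out as empty cells.
    days is the ordered list of day labels used as CSV columns."""
    # Every 15-minute slot from 7:00 through 18:45 (hours 7..18, as on the page).
    times = [f"{h}:{m:02d}" for h in range(7, 19) for m in (0, 15, 30, 45)]
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["time"] + days)
        for t in times:
            writer.writerow([t] + [bookings.get((d, t), "") for d in days])
```

With that in place, the rule check itself can run over the finished CSV (or over the `bookings` dict directly) as a separate step.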