Scrapy inside javascript

Question

I'm trying to scrape data from http://apps.cu-citizenaccess.org/

It looks like the site loads its data from JSON, so I used code similar to that recommended in "Scrapy, scrapping data inside a javascript".

My current code (using Python 3) is

jsonresponse = json.loads(response.body_as_unicode())
print(jsonresponse["val.restname"])

I was wondering whether it's an error in technique, or whether I should be doing something else entirely?

Wouldn't it be _much_ simpler to just make a web request to the JSON file at http://apps.cu-citizenaccess.org/restaurants/api/restaurants/?format=json&rest_closed=False ? — Benjamin Gruenbaum, Apr 11 '15 at 23:21
I'm glad I could help, in general in the future it's best not to post links to actual sites but to create a minimal code sample. Consider posting an answer to your own question explaining how you solved the issue. Welcome to the site. — Benjamin Gruenbaum, Apr 11 '15 at 23:31
Yes I will post a comprehensive summary. Do you mind letting me know how you found the link you posted? — wwl, Apr 11 '15 at 23:47
Using the network tab in the chrome developer tools http://discover-devtools.codeschool.com/ — Benjamin Gruenbaum, Apr 11 '15 at 23:49

score 1 · Answer 1 · answered Apr 12 '15 at 00:50

The simplest way is to access the actual JSON file at http://apps.cu-citizenaccess.org/restaurants/api/restaurants/?format=json&rest_closed=False

This can be located by using the network tab in Google Chrome's developer tools.

By default the endpoint may return only 20 entries, so add the parameter "limit=1000" to the URL. Then add "offset=1000" as well to retrieve the remaining entries.
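As a rough sketch of the two requests described above, the URLs can be built with the standard library. The helper names build_page_url and fetch_page are made up for illustration, and the shape of the JSON that comes back is not shown here, so inspect it before relying on any particular key.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://apps.cu-citizenaccess.org/restaurants/api/restaurants/"


def build_page_url(limit, offset=0):
    """Return the API URL for one page of results (hypothetical helper)."""
    params = {
        "format": "json",
        "rest_closed": "False",
        "limit": limit,
        "offset": offset,
    }
    return BASE_URL + "?" + urlencode(params)


def fetch_page(limit, offset=0):
    """Download one page and parse it as JSON. Makes a network request."""
    with urlopen(build_page_url(limit, offset)) as response:
        return json.loads(response.read().decode("utf-8"))


# The two pages mentioned above: the first 1000 entries, then the rest.
print(build_page_url(1000))
print(build_page_url(1000, offset=1000))
```

Calling fetch_page(1000) and then fetch_page(1000, offset=1000) would download both pages, assuming the site is still reachable and still honours these parameters.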

Then use a JSON to CSV converter to get both pages into CSV format if needed. Both CSV files can easily be merged with a program such as Excel.
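If you would rather skip the online converter and Excel, Python's csv module can write both pages into a single file. This is only a sketch: it assumes each page has already been reduced to a list of flat dictionaries, and the sample records and field names below are invented stand-ins, not real data from the site.

```python
import csv
import io


def write_csv(pages, out):
    """Write the records from every page into one CSV stream."""
    records = [record for page in pages for record in page]
    # Collect column names in first-seen order so no field is dropped.
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(records)
    return len(records)


# Made-up sample data standing in for two downloaded pages.
page_one = [{"restname": "Example Cafe", "city": "Urbana"}]
page_two = [{"restname": "Sample Diner", "city": "Champaign"}]

buffer = io.StringIO()
count = write_csv([page_one, page_two], buffer)
print(buffer.getvalue())
```

To write to disk instead of an in-memory buffer, pass a file opened with open("restaurants.csv", "w", newline="") as the out argument.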
