DISCLAIMER: I'm just learning by doing; I have no bad intentions.
So, I would like to fetch the list of the applications listed on this website: http://roaringapps.com/apps
I've done similar things in the past, but with simpler websites; this time I'm having problems getting my hands on the data behind this webpage.
The scrolling from page to page is blazing fast so, to understand how the webpage works, I've fired up a packet sniffer and analyzed the traffic. I've noticed that, after the initial loading, no traffic is exchanged between the server and my client, even if I scroll over 2500 records in the browser. How is that possible?
Anyhow, my understanding is that the website loads the data from a stream of some sort and renders it via JavaScript. Am I correct?
So I fired up Chromium DevTools, looked at the "Network" tab, and saw that a WebSocket request is made to the following address: wss://s-usc1c-nss-123.firebaseio.com
At this point, after googling a bit, I tried to query the very same server, using the "v=5&ns=roaringapps" query I saw in the DevTools window:
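import json
from websocket import create_connection  # websocket-client package
# Connect to the endpoint the browser uses and send the query seen in DevTools.
ws = create_connection('wss://s-usc1c-nss-123.firebaseio.com')
ws.send('v=5&ns=roaringapps')
# Print the first message the server sends back, decoded from JSON.
print(json.loads(ws.recv()))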
And got this reply:
{u't': u'c', u'd': {u't': u'h', u'd': {u'h': u's-usc1c-nss-123.firebaseio.com', u's': u'JUL5t1nC2SXfGaIjwecB6G13j1OsmMVv', u'ts': 1476799051047L, u'v': u'5'}}}
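To make the nesting in that reply easier to read, here is a small sketch that unpacks it. It is written for Python 3 (so the u prefixes and the L suffix are dropped), the helper name describe_reply is my own, and reading 'ts' as milliseconds since the Unix epoch is only a guess based on its size:

```python
from datetime import datetime, timezone

# The reply copied from above, as a plain Python 3 literal.
reply = {
    "t": "c",
    "d": {
        "t": "h",
        "d": {
            "h": "s-usc1c-nss-123.firebaseio.com",
            "s": "JUL5t1nC2SXfGaIjwecB6G13j1OsmMVv",
            "ts": 1476799051047,
            "v": "5",
        },
    },
}


def describe_reply(message):
    """Flatten the nested reply into readable keys (hypothetical helper)."""
    inner = message["d"]
    fields = inner["d"]
    return {
        "outer_type": message["t"],
        "inner_type": inner["t"],
        "host": fields["h"],
        "session": fields["s"],
        "version": fields["v"],
        # Assumption: 'ts' is milliseconds since the Unix epoch.
        "timestamp": datetime.fromtimestamp(fields["ts"] / 1000, tz=timezone.utc),
    }


if __name__ == "__main__":
    for key, value in describe_reply(reply).items():
        print(key, "=", value)
```

Nothing in there looks like application data to me: it seems to contain only the host name, the version I sent, and what might be a session id and a server timestamp.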
I was expecting to see a JSON response with the raw data about the applications and so on. What am I doing wrong?
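One thing I have not tried yet: in DevTools the "v=5&ns=roaringapps" part appeared as a query, so perhaps it belongs in the URL of the WebSocket request rather than in a message. A minimal Python 3 sketch of building such a URL follows; build_ws_url is a made-up helper, and the default path "/" is an assumption (the real path would have to be copied from the DevTools request):

```python
from urllib.parse import urlencode, urlunsplit


def build_ws_url(host, params, path="/"):
    """Return a wss:// URL carrying params as a query string.

    The default path is a placeholder, not something I have verified.
    """
    query = urlencode(params)
    # urlunsplit takes (scheme, netloc, path, query, fragment).
    return urlunsplit(("wss", host, path, query, ""))


url = build_ws_url(
    "s-usc1c-nss-123.firebaseio.com",
    [("v", "5"), ("ns", "roaringapps")],
)
print(url)
```

The resulting string could then be passed to create_connection in place of the bare host URL. I don't know whether the server would answer any differently.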
Thanks a lot!
UPDATE
Actually, I just found out that the website uses JSON to load its data. I probably wasn't seeing it in repeated requests because of caching; disabling the cache in Chromium did the trick.