I want to visualize all feeds from the user 'airqualityegg' on a world map. To do this, I wrote the following Python script (if you try it yourself, fix the indentation in your text editor):

import json  
import urllib
import csv

list=[]

for page in range(7):
   url = 'https://api.xively.com/v2/feeds?user=airqualityegg&per_page=100&page='+str(page)
   rawData=urllib.urlopen(url)

   #Loads the data in json format
   dataJson = json.load(rawData)
   print dataJson['totalResults']
   print dataJson['itemsPerPage']

   for entry in dataJson['results']:

      try:
          list2=[]
          list2.append(entry['id'])
          list2.append(entry['creator'])
          list2.append(entry['status'])
          list2.append(entry['location']['lat'])
          list2.append(entry['location']['lon'])
          list2.append(entry['created'])
          list.append(list2)

      except:

          print 'failed to scrape a row'

def escribir():
   abrir = open('all_users2_andy.csv', 'w')
   wr = csv.writer(abrir, quoting=csv.QUOTE_ALL)
   headers = ['id','creator', 'status','lat', 'lon', 'created']
   wr.writerow (headers)

   for item in list:
       row=[item[0], item[1], item[2], item[3], item[4], item[5]]
       wr.writerow(row)
   abrir.close()

escribir()

I request 7 pages because this user has posted 684 feeds in total (as you can see by opening 'https://api.xively.com/v2/feeds?user=airqualityegg' directly in the browser).

The CSV file produced by this script contains duplicated rows, which might be explained by the fact that the order of results varies every time a page is requested. Thus, the same row can be included in the results of different calls. For this reason I get fewer unique results than I should.

Do you know why the results included in different pages might not be unique?

Thanks, María

1 Answer

You can try passing order=created_at (see docs).

The problem arises because the default is order=updated_at, so the ordering can shift between requests as feeds are updated, and the same feed may show up on more than one page.
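As a rough sketch of that suggestion, the snippet below builds each page URL with order=created_at and, as an extra safeguard, drops repeated feeds by their 'id'. It is written for Python 3 (the script in the question is Python 2), and the helper names build_page_url and dedupe_by_id are hypothetical, not part of any Xively library. The parameter names are taken from the question and from the answer above.

```python
from urllib.parse import urlencode

BASE_URL = 'https://api.xively.com/v2/feeds'


def build_page_url(page, user='airqualityegg', per_page=100, order='created_at'):
    """Return the feeds URL for one page, requesting a fixed sort order."""
    # A list of pairs keeps the query parameters in a predictable order.
    params = [('user', user), ('per_page', per_page),
              ('page', page), ('order', order)]
    return BASE_URL + '?' + urlencode(params)


def dedupe_by_id(entries):
    """Keep the first occurrence of each feed id, preserving input order."""
    seen = set()
    unique = []
    for entry in entries:
        if entry['id'] not in seen:
            seen.add(entry['id'])
            unique.append(entry)
    return unique


print(build_page_url(1))
```

Each URL can then be fetched and parsed as in the question, and the combined 'results' lists passed through dedupe_by_id before writing the CSV.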

You should also consider using the official Python library.

errordeveloper