I'm trying to scrape data from Foursquare.
The result of this code:

import requests

lat = 60.172667
lng = 24.932009
radius = 500
LIMIT = 5

headers = {
    "Accept": "application/json",
    "Authorization": "APIKEY"
}

url = "https://api.foursquare.com/v3/places/nearby?ll={},{}&radius={}&limit={}"
url = url.format(lat, lng, radius, LIMIT)

results = requests.request("GET", url, headers=headers)

is something like:

{'results': [{'fsq_id': '4adcdb23f964a520dc6021e3',
   'categories': [{'id': 10004,
     'name': 'Art Gallery',
     'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/arts_entertainment/artgallery_',
      'suffix': '.png'}}],
   'chains': [],
   'distance': 81,
   'geocodes': {'main': {'latitude': 60.172127213487165,
     'longitude': 24.93101423081712}},
   'location': {'address': 'Nervanderinkatu 3',

Then I tried to build a function:

    def getNearbyVenues(names, latitudes, longitudes, radius=500):
        headers = {
            "Accept": "application/json",
            "Authorization": "f#######Z3CJo="
        }

        URL = "https://api.foursquare.com/v3/places/search?ll={},{}&radius={}&limit={}"

        venues_list = []
        for name, lat, lng in zip(names, latitudes, longitudes):
            print(name)

            url = URL.format(lat, lng, radius, LIMIT)
            results = requests.request("GET", url, headers=headers).json()["response"]['groups'][0]['items']
            venues_list.append([(
                name,
                lat,
                lng,
                v['venue']['name'],
                v['venue']['location']['lat'],
                v['venue']['location']['lng'],
                v['venue']['categories'][0]['name']) for v in results])


            nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
            nearby_venues.columns = ['Neighborhood',
                                     'Neighborhood Latitude',
                                     'Neighborhood Longitude',
                                     'Venue',
                                     'Venue Latitude',
                                     'Venue Longitude',
                                     'Venue Category']
        return(nearby_venues)

and I'm getting `KeyError: 'response'`. I tried removing `["response"]['groups'][0]['items']`, but then I started getting a `string indices must be integers` error.

martineau
Kaokor
  • I guess you mean `results`, not `response`? But then there is missing key `items` (at least in the sample) – buran Jan 29 '22 at 15:52
  • @buran this is the whole file: https://www.dropbox.com/s/z5x9rfobyc9l5jn/json.txt?dl=0 . It turns out that Foursquare changed the format of the JSON, so I think it doesn't have `response` anymore. I don't know how to interpret the JSON file; can you help? I'm trying to do what this guy did in section 2.5: https://github.com/bloonsinthesky/Data-Science-Portfolio/blob/main/The%20Battle%20of%20the%20Neighborhoods/The%20Battle%20of%20the%20Neighborhoods.ipynb – Kaokor Jan 29 '22 at 16:44
  • You can navigate based on keys; as buran noted, you have `results`. For example, to get `geocodes` from the file you shared (with `import ast`), I would read it with `with open(file_j, 'r') as f: file_as_dict = ast.literal_eval(f.read().strip())` and then take `file_as_dict['results'][0]['geocodes']`, which gives me `{'main': {'latitude': 60.24277, 'longitude': 24.92642}}`. So I go results --> first array instance (0) --> geocodes. – simpleApp Jan 29 '22 at 17:17
  • @simpleApp thank you for your response. I've been trying to do that but couldn't figure it out. Can you help me with exactly what I should write? This is my first project: https://stackoverflow.com/questions/70907969/how-to-interpret-this-json-file – Kaokor Jan 29 '22 at 17:26
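
The navigation described in the comments above, and the two errors from the question, can be reproduced without calling the API. The following is a minimal sketch, assuming a hand-written `payload` dictionary that copies a few fields from the sample output in the question; it is not a complete Foursquare response, and the variable names are only illustrative.

```python
# Hypothetical, truncated payload modelled on the sample output in the question.
payload = {
    "results": [
        {
            "fsq_id": "4adcdb23f964a520dc6021e3",
            "categories": [{"id": 10004, "name": "Art Gallery"}],
            "distance": 81,
            "geocodes": {"main": {"latitude": 60.172127213487165,
                                  "longitude": 24.93101423081712}},
        }
    ]
}

# 1) The old access path fails: there is no top-level "response" key.
try:
    payload["response"]["groups"][0]["items"]
except KeyError as exc:
    missing_key = exc.args[0]

# 2) Iterating over the dict itself yields its keys, which are strings,
#    so indexing each item with a string raises a TypeError.
top_level_items = [item for item in payload]
try:
    [item["venue"] for item in payload]
except TypeError as exc:
    type_error_message = str(exc)

# 3) Navigating by key: results --> first element --> nested fields.
first_venue = payload["results"][0]
first_category = first_venue["categories"][0]["name"]
first_latitude = first_venue["geocodes"]["main"]["latitude"]

print(missing_key, top_level_items, first_category, first_latitude)
```

Step 2 is the likely source of the `string indices must be integers` error: once the old path is removed, the list comprehension loops over the top-level dictionary instead of the list stored under `results`.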

1 Answer

Here is one way to achieve this. I think the main confusion was how to parse the JSON.

import pandas as pd
import requests
import json
# define the function
def getNearbyVenues(names, latitudes, longitudes, radius=500,limit=5):

    URL= "https://api.foursquare.com/v3/places/nearby?ll={},{}&radius={}&limit={}"
    headers = {
        "Accept": "application/json",
        "Authorization": "your API key here"
    }

    df_list=[]
    
    for name, lat, lng in zip(names, latitudes, longitudes):
        url = URL.format(lat, lng, radius, limit)  # use the limit argument, not a global LIMIT
        results = requests.request("GET", url, headers=headers).json()
        
        for each_result in results['results']: # filter the result based on JSON identification
            result={}
            result['Neighborhood']=name
            result['Neighborhood Latitude']=lat
            result['Neighborhood Longitude']=lng
            result['Name']=each_result['name']
            result['Venue Latitude']=each_result['geocodes']['main']['latitude']
            result['Venue Longitude']=each_result['geocodes']['main']['longitude']
            result['Locality']=each_result['location']['locality']
            result['Category_Names']=[each_name['name'] for each_name in each_result['categories']]
            df_list.append(result.copy())
    return pd.DataFrame(df_list) # return dataframe

Then call the function like this:

names=['test_locations']
latitudes = [60.172667]
longitudes = [24.932009]
radius = 500
LIMIT = 5
# call the function
df_result=getNearbyVenues(names,latitudes,longitudes,radius=radius,limit=LIMIT)

simpleApp
  • Wow, I was about to give up on this project. I still get this error, but I feel it's a small one: `KeyError Traceback (most recent call last) ~\AppData\Local\Temp/ipykernel_9720/2861578039.py in 2 LIMIT = 5 3 # Saving to new dataframe ----> 4 df_venues = getNearbyVenues(names=df['Neighborhood'], 5 latitudes=df['Latitude'], 6 longitudes=df['Longitude'] ` – Kaokor Jan 29 '22 at 19:24
  • `~\AppData\Local\Temp/ipykernel_9720/3284967568.py in getNearbyVenues(names, latitudes, longitudes, radius, limit) 21 result['Venue Latitude']=each_result['geocodes']['main']['latitude'] 22 result['Venue Longitude']=each_result['geocodes']['main']['longitude'] ---> 23 result['Locality']=each_result['location']['locality'] 24 result['Category_Names']=[each_name['name'] for each_name in each_result['categories']] 25 df_list.append(result.copy()) KeyError: 'locality'` This is the rest of the traceback. – Kaokor Jan 29 '22 at 19:25
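
The `KeyError: 'locality'` reported above suggests that some venues come back without a `locality` entry under `location`. One possible way to tolerate that is to read optional fields with `dict.get`, which returns `None` instead of raising. The sketch below is only an illustration: `venue_to_row` is a hypothetical helper, and the two sample venues (including the names and the `Helsinki` value) are made-up test data, not real API output.

```python
def venue_to_row(venue):
    """Flatten one venue dict into a row, tolerating missing optional keys.

    Hypothetical helper: fields that may be absent are read with .get(),
    so a venue without 'locality' yields None instead of a KeyError.
    """
    location = venue.get("location", {})
    main_geocode = venue.get("geocodes", {}).get("main", {})
    return {
        "Name": venue.get("name"),
        "Venue Latitude": main_geocode.get("latitude"),
        "Venue Longitude": main_geocode.get("longitude"),
        "Locality": location.get("locality"),
        "Category_Names": [c["name"] for c in venue.get("categories", [])],
    }


# Made-up sample data: the second venue has no 'locality' key.
sample_venues = [
    {
        "name": "Sample Gallery",
        "categories": [{"id": 10004, "name": "Art Gallery"}],
        "geocodes": {"main": {"latitude": 60.172127, "longitude": 24.931014}},
        "location": {"address": "Nervanderinkatu 3", "locality": "Helsinki"},
    },
    {
        "name": "Sample Venue Without Locality",
        "categories": [],
        "geocodes": {"main": {"latitude": 60.24277, "longitude": 24.92642}},
        "location": {"address": "Nervanderinkatu 3"},
    },
]

rows = [venue_to_row(v) for v in sample_venues]
print(rows[1]["Locality"])  # None, because the key is absent
```

The resulting list of dictionaries can be passed to `pd.DataFrame(rows)` as in the answer; missing values then show up as empty cells rather than stopping the loop.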