4

The block of code below works, but I suspect it is not optimal. My understanding of JSON is limited, and I can't figure out a more efficient method.

The data returned from steam_game_db looks like this:

{
    "applist": {
        "apps": [
            {
                "appid": 5,
                "name": "Dedicated Server"
            },
            {
                "appid": 7,
                "name": "Steam Client"
            },
            {
                "appid": 8,
                "name": "winui2"
            },
            {
                "appid": 10,
                "name": "Counter-Strike"
            }
        ]
    }
}

and my Python code so far is

i = 0
x = 570

req_name_from_id = requests.get(steam_game_db)
j = req_name_from_id.json()

while j["applist"]["apps"][i]["appid"] != x:
    i+=1
returned_game = j["applist"]["apps"][i]["name"]
print(returned_game)

Instead of looping through the entire app list, is there a smarter way to search for it? Ideally, the elements in the data structure with 'appid' and 'name' would be numbered the same as their corresponding 'appid'.

i.e. appid 570 in the list is Dota2. However, element 570 in the data structure is appid 5069, Red Faction.

Also what type of data structure is this? Perhaps it has limited my searching ability for this answer already. (I.e. seems like a dictionary of 'appid' and 'element' to me for each element?)

EDIT: Changed to a for loop as suggested

# returned_id string for appid from another query

req_name_from_id = requests.get(steam_game_db)
j_2 = req_name_from_id.json()

for app in j_2["applist"]["apps"]:
    if app["appid"] == int(returned_id):
        returned_game = app["name"]

print(returned_game)
Purdy
  • 83
  • 2
  • 7
  • 2
    This is a hash structure, but the quickest would be putting everything into a dictionary, a lookup is then instant. – user1767754 Jan 20 '18 at 09:31
  • 1
    @user1767754 if they're only looking up one thing, turning the whole list into a dictionary would be slower on average than just iterating through for the thing they want, although the subsequent lookup would be fast. – jonrsharpe Jan 20 '18 at 09:36
  • 2
    You should think about the *intent* of your code; `for app in j["applist"]["apps"]:` would be far clearer than messing about with `i`, for example, without the risk of an `IndexError` if the ID (currently named `x`, which is also unhelpful - why not `app_id`?) isn't found. – jonrsharpe Jan 20 '18 at 09:39
  • @jonrsharpe the x was unhelpful my bad, I changed it to a bland variable name as to not cause confusion (as I'm converting my string from somewhere else to an int and it may have looked a bit random/redundant). I have changed it to a for instead as you suggested and it does seem clearer. Will the for loop continue after finding the required value? As I assumed the while loop once it hits the right X value will not continue thus using less resources? Also any idea why I wouldn't actually need the `j` in front of the `j["applist"]["apps"]` and it still works? – Purdy Jan 20 '18 at 10:18
  • 1
    You can `return` (if in a function) or `break` to end the loop when appropriate. – jonrsharpe Jan 20 '18 at 10:25
  • @jonrsharpe of course, thank you – Purdy Jan 20 '18 at 10:27
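
The early-exit advice in the comments above could look something like the sketch below. The function name `find_app_name` and the sample list are made up for illustration; the list only mirrors the shape of `j["applist"]["apps"]` from the question. Returning from inside the loop stops the scan at the first match, and a missing ID gives `None` instead of an `IndexError`.

```python
def find_app_name(apps, app_id):
    """Return the name for app_id, or None if no entry matches."""
    for app in apps:
        if app["appid"] == app_id:
            return app["name"]  # stop scanning at the first match
    return None  # the ID was not in the list


# Hypothetical sample data in the same shape as j["applist"]["apps"].
sample_apps = [
    {"appid": 5, "name": "Dedicated Server"},
    {"appid": 7, "name": "Steam Client"},
    {"appid": 10, "name": "Counter-Strike"},
]

print(find_app_name(sample_apps, 10))
print(find_app_name(sample_apps, 570))
```

Outside a function, a `break` right after the assignment in the `for` loop has the same effect.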

2 Answers

6

The most convenient way to access things by a key (like the app ID here) is to use a dictionary.

You pay a little extra performance cost up-front to fill the dictionary, but after that pulling out values by ID is basically free.

However, it's a trade-off. If you only want to do a single look-up during the life-time of your Python program, then paying that extra performance cost to build the dictionary won't be beneficial, compared to a simple loop like you already did. But if you want to do multiple look-ups, it will be beneficial.

# build dictionary
app_by_id = {}
for app in j["applist"]["apps"]:
    app_by_id[app["appid"]] = app["name"]

# use it (the keys are integers, exactly as they appear in the JSON)
print(app_by_id[570])
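
As a rough sketch of the same idea, the dictionary can also be built with a comprehension. The sample `j` below is a made-up stand-in for the downloaded JSON, and `returned_id` is assumed to arrive as a string, as in the question's edit. Because the `appid` values in the JSON are integers, a string ID has to be converted with `int()` before the look-up.

```python
# Hypothetical stand-in for the parsed JSON from the question.
j = {"applist": {"apps": [
    {"appid": 5, "name": "Dedicated Server"},
    {"appid": 10, "name": "Counter-Strike"},
]}}

# Same mapping as the loop version: integer appid -> name.
app_by_id = {app["appid"]: app["name"] for app in j["applist"]["apps"]}

returned_id = "10"  # assumed to come from another query as a string
name = app_by_id.get(int(returned_id))  # convert before looking up
missing = app_by_id.get(570)            # .get() returns None for unknown IDs

print(name)
print(missing)
```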

Also think about caching the JSON file on disk. This will save time during your program's startup.
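
A minimal sketch of such a cache, assuming the file's last write time is used to decide when to re-download. The file name, the six-hour refresh interval, and the helper names `cache_is_fresh` and `load_app_names` are all hypothetical. The `fetch` parameter is only there so the download step can be swapped out; by default it falls back to `requests`.

```python
import json
import os
import time

CACHE_PATH = "steam_apps.json"  # hypothetical cache file name
MAX_AGE_SECONDS = 6 * 60 * 60   # hypothetical refresh interval (6 hours)


def cache_is_fresh(path, max_age, now=None):
    """True if path exists and was last written less than max_age seconds ago."""
    if not os.path.exists(path):
        return False
    now = time.time() if now is None else now
    return now - os.path.getmtime(path) < max_age


def load_app_names(url, path=CACHE_PATH, max_age=MAX_AGE_SECONDS, fetch=None):
    """Return {appid: name}, downloading only when the cached file is stale."""
    if cache_is_fresh(path, max_age):
        with open(path, encoding="utf-8") as handle:
            data = json.load(handle)
    else:
        if fetch is None:
            import requests  # third-party; only needed for a real download
            fetch = lambda u: requests.get(u).json()
        data = fetch(url)
        # Rewrite the cache so its write time marks the last refresh.
        with open(path, "w", encoding="utf-8") as handle:
            json.dump(data, handle)
    return {app["appid"]: app["name"] for app in data["applist"]["apps"]}
```

Calling `load_app_names(steam_game_db)` once per refresh interval and keeping the returned dictionary around would avoid both the download and the rebuild on every look-up.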

Tomalak
  • 332,285
  • 67
  • 532
  • 628
  • Hi Tomalak, I will test this out. I do expect to reuse this, the program is constantly checking a list of steam ids for what they are playing, however valve will update the api occasionally as new games are released, so perhaps querying from it is the best option, or perhaps periodically? – Purdy Jan 20 '18 at 10:15
  • 1
    Well, it takes a while to get the file over the wire. Loading it from disk will be faster. You remember in a variable when you last refreshed it and re-download it when it is getting old. Unfortunately the Steam server does not seem to support [conditional HTTP](https://developer.mozilla.org/en-US/docs/Web/HTTP/Conditional_requests) for this file, so you have to keep track of its age manually. – Tomalak Jan 20 '18 at 10:23
  • Thanks, I will certainly give that a go if only to learn something new! I suppose refreshing it even once every few hours will be more efficient given this loops per 5 seconds – Purdy Jan 20 '18 at 10:26
  • Absolutely. And you only need to re-build the dictionary after the HTTP download, too. – Tomalak Jan 20 '18 at 10:27
  • P.S. instead of a variable, you can also simply look at the file's last write time to figure out if it is old enough to justify a re-download. – Tomalak Jan 20 '18 at 10:33
2

It's better to have the JSON file on disk: you can load it directly into a dictionary and start building up your lookup table. As an example, I've tried to maintain your logic while using the dict for lookups. Don't forget to specify the encoding when opening the JSON file, since it has special characters in it.

Setup:

import json

apps = {}
# Open the file once, with an explicit encoding, inside a with block
with open('bigJson.json', encoding="utf-8") as handle:
    dictdump = json.loads(handle.read())

    # setdefault keeps the first name seen for each appid
    for item in dictdump['applist']['apps']:
        apps.setdefault(item['appid'], item['name'])

Usage 1: That's the way you have used it

for appid in range(0, 570):
    if appid in apps:
        print(appid, apps[appid].encode("utf-8"))

Usage 2: That's how you can query a key. Using get instead of [] will prevent a KeyError exception if the appid isn't recorded.

print(apps.get(570, 0))
user1767754
  • 23,311
  • 18
  • 141
  • 164
  • Thanks for this, I'll have to play with it a little as my knowledge is quite limited. (Particularly encoding it) – Purdy Jan 20 '18 at 10:23
  • 1
    It's easier to let the JSON library handle the file loading ([docs](https://docs.python.org/3/library/json.html#json.load)). `data = json.load('bigJson.json')` – Tomalak Jan 20 '18 at 10:25
  • But `json.load` expects a `file` like object? – user1767754 Jan 20 '18 at 10:37
  • That's weird, I'm on `3.6.3` and when I do `data = json.load('bigJson.json')` I get `str obj has no attribute read` – user1767754 Jan 20 '18 at 10:51