(ijson) Getting item with any prefix

Question

I have a JSON file like this:

{
    "europe": [
        "germany",
        "france",
        ...
    ],
    "america": [
        "usa",
        "canada",
        ...
    ]
}

I want to get all items under every prefix (every top-level key), like this:

germany
france
usa
canada

I currently use this:

import ijson

with open('file.json', 'r', encoding='utf-8') as f:
    for obj in ijson.items(f, "item"):
        print(obj)

I tried a regular expression that accepts any string in front of item, but it does not work. I think there is a really easy solution that I just don't see. I also looked in the ijson documentation, but didn't find a solution there either.

Maybe you can help me.

Greetings

score 0 · Answer 1 · answered Feb 04 '20 at 12:34


Do I understand correctly that you simply want a list of all the countries without the continents?

import json

with open('file.json', 'r', encoding='utf-8') as f:
    # Flatten the per-continent lists into a single list of countries
    countries = [country for group in json.load(f).values() for country in group]
print(countries)

answered Feb 04 '20 at 12:34

Felix Kleine Bösing


Exactly. But I would like to use an iterative JSON parser like ijson, since the JSON file is pretty big. So is there a possibility to use my ijson approach? – asdfyxcvqwer Feb 04 '20 at 12:38

score 0 · Answer 2 · answered Feb 20 '20 at 06:25

There's currently no way of doing this with ijson.items(), as it doesn't support wildcards or depth specifications. The closest you can get with no fuss (with ijson 2.6) is:

import ijson

with open('file.json', 'rb') as f:
    for continent, countries in ijson.kvitems(f, ''):
        for country in countries:
            print(country)

If the individual lists of countries are themselves too big to be held in memory, you'd have to resort to a more manual approach based on ijson.parse(), keeping track of the "depth" of your path.
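
A rough sketch of what that manual approach could look like, assuming the file has the shape shown in the question (a top-level object whose values are arrays of scalars). The helper names iter_items_in_top_level_arrays and stream_countries are made up for illustration and are not part of ijson; the only ijson call used is ijson.parse(), which yields (prefix, event, value) tuples.

```python
def iter_items_in_top_level_arrays(events):
    """Yield scalars that sit directly inside an array that is itself
    a value of the top-level object, e.g. {"europe": ["germany", ...]}.

    `events` is any iterable of (prefix, event, value) tuples in the
    form produced by ijson.parse().
    """
    stack = []  # open containers, outermost first
    for _prefix, event, value in events:
        if event == 'start_map':
            stack.append('map')
        elif event == 'start_array':
            stack.append('array')
        elif event in ('end_map', 'end_array'):
            stack.pop()
        elif event != 'map_key' and stack == ['map', 'array']:
            # A scalar event (string, number, boolean, null) at depth 2
            yield value


def stream_countries(path):
    """Hypothetical wrapper: stream the countries from a file on disk."""
    import ijson  # third-party dependency, only needed here

    with open(path, 'rb') as f:
        yield from iter_items_in_top_level_arrays(ijson.parse(f))
```

Usage would then be something like: for country in stream_countries('file.json'): print(country). Only one event is held at a time, so neither the whole file nor a whole continent's list has to fit in memory. Tracking a stack of container types rather than parsing the prefix string avoids surprises when a key itself contains a dot.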
