I have a json line file, where each line has some structure which I am trying to (mostly) flatten, thus:
    import json
    from pandas import json_normalize

    with open("/home/igor/data/feed.jsonl") as json_file:
        thelist2 = []
        for line in json_file:
            thelist2.append(json_normalize(json.loads(line)))
followed by pd.concat(thelist2).
The semantics of the above are correct, but it is horrifically slow, while the same loop without the json_normalize call
is quite speedy (but does the wrong thing). Is there a way to normalize the DataFrame after the fact, or some other, speedier scheme?
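For reference, one variant I am considering (a sketch, not yet benchmarked on the real feed): parse every line into a plain dict first, then call json_normalize once on the whole list, instead of building one tiny DataFrame per line and concatenating thousands of them. The function name load_flat below is my own placeholder.

    import json
    import pandas as pd

    def load_flat(path):
        """Read a JSON-lines file and flatten it into one wide DataFrame."""
        with open(path) as json_file:
            records = [json.loads(line) for line in json_file]
        # A single json_normalize call over the list of dicts avoids the
        # per-line DataFrame construction and the final pd.concat entirely.
        return pd.json_normalize(records)

My understanding is that pd.json_normalize accepts a list of dicts directly and flattens nested keys with dotted column names, so this should produce the same columns as the per-line version, but I would welcome confirmation or a better approach.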