
When I try to read my JSON files with this code:

for index, js in enumerate(json_files):
    with open(os.path.join(path_to_json, js)) as json_file:
        json_text = json.load(json_file)
        t_id = json_text["id"]
        created_at = json_text["created_at"]
        text = json_text["text"]
        user_name = json_text["user"]["name"]
        location = json_text["user"]["location"]
        jsons_data.loc[index] = [t_id,created_at,text,user_name,location]

I get this error:

TypeError: string indices must be integers

This is what my JSON file contains:

"{\"created_at\":\"Wed Nov 07 06:01:26 +0000 2018\",\"id\":1060049570195853312,\"id_str\":\"1060049570195853312\",\"text\":\"RT @maulinaantika: Tempe Khot News:\\nDiduga pertemuan kontrak politik antara Polri & timses jokowi tahun 2014\\n\\nDalam foto tersebut terlihat\\u2026\",\"source\":\"\\u003ca href=\\\"https:\\/\\/mobile.twitter.com\\\" rel=\\\"nofollow\\\"\\u003eTwitter Lite\\u003c\\/a\\u003e\",\"truncated\"

When I try it like this:

with open('tm.json', 'r') as f:
    for line in f:
        text = line.encode("utf-8")
        json_text = json.loads(text)

print(json_text)

I get this result:

{"created_at":"Sat Dec 08 12:58:14 +0000 2018","id":1071388484609413120,...

Can someone guide me on how to solve this problem?

1 Answer


The simplest explanation for the error, given your code, is that this line:

json_text = json.load(json_file)

gives you a string, which you then try to index like a dictionary:

t_id = json_text["id"]
created_at = json_text["created_at"]
text = json_text["text"]
user_name = json_text["user"]["name"]
location = json_text["user"]["location"]
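
To see why this fails, here is a minimal sketch that reproduces the error. It assumes the file holds JSON that was serialized twice, which the quoted and backslash-escaped sample in the question suggests; the `tweet` dict is made-up illustration data, not the real file.

```python
import json

# Hypothetical sample data standing in for one tweet.
tweet = {"id": 1, "text": "hello", "user": {"name": "a", "location": "b"}}

# Serializing twice produces a quoted, backslash-escaped string,
# similar in shape to the file content shown in the question.
double_encoded = json.dumps(json.dumps(tweet))

# A single decode only removes the outer layer, so the result is a str.
decoded_once = json.loads(double_encoded)
print(type(decoded_once))  # <class 'str'>

try:
    decoded_once["id"]  # indexing a str with a str key
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)
```

Newer Python versions append extra detail to the message, but it is the same TypeError.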

You can wrap the lookups in try: ... except TypeError: ... to catch this and print the name of the JSON file that is the culprit. Then you can fix your JSON data:

for index, js in enumerate(json_files):
    with open(os.path.join(path_to_json, js)) as json_file:
        json_text = json.load(json_file)
        try:
            t_id = json_text["id"]
            created_at = json_text["created_at"]
            text = json_text["text"]
            user_name = json_text["user"]["name"]
            location = json_text["user"]["location"]
            jsons_data.loc[index] = [t_id,created_at,text,user_name,location]
        except TypeError:
            # json.load() did not return a dict, so report the offending file
            print("Bad json - not a dict: ", os.path.join(path_to_json, js))
            print("Json was deserialized into a : ", type(json_text) )
            break # exit the for loop, fix your data, repeat until it works
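
If the files really are double-encoded, as the sample in the question suggests, one possible workaround is to decode a second time whenever the first pass yields a string. This is only a sketch under that assumption; `load_json_object` is a hypothetical helper name, and the sample data is made up.

```python
import json

def load_json_object(raw_text):
    """Parse raw_text, decoding once more if the first pass yields a str."""
    data = json.loads(raw_text)
    if isinstance(data, str):
        # The payload was itself a JSON-encoded string: decode the inner document.
        data = json.loads(data)
    return data

# Hypothetical sample: a tweet-like document that was serialized twice.
inner = json.dumps({"id": 1, "user": {"name": "a", "location": None}})
tweet = load_json_object(json.dumps(inner))
print(tweet["user"]["name"])  # a
```

In the loop above this would mean calling `load_json_object(json_file.read())` instead of `json.load(json_file)`. Note that a file whose content is legitimately a plain JSON string that is not itself JSON would raise `json.JSONDecodeError` on the second decode, so fixing whatever writes the files is the cleaner long-term option.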


Patrick Artner