I have JSON output generated by the doccano annotation tool, and I want to convert it into another format. Both formats are shown below.

Doccano JSON format (one object per line):

{"id": 2, "data": "My name is Nithin Reddy and i'm working as a Data Scientist.", "label": [[3, 8, "Misc"], [11, 23, "Person"], [32, 39, "Activity"], [45, 59, "Designation"]]}
{"id": 3, "data": "I live in Hyderabad.", "label": [[2, 6, "Misc"], [10, 19, "Location"]]}
{"id": 4, "data": "I'm pusring my master's from Bits Pilani.", "label": [[15, 24, "Education"], [29, 40, "Organization"]]}

Required format (a list of Python tuples):

("My name is Nithin Reddy and i'm working as a Data Scientist.", {"entities": [(3, 8, "Misc"), (11, 23, "Person"), (32, 39, "Activity"), (45, 59, "Designation")]}),
("I live in Hyderabad.", {"entities": [(2, 6, "Misc"), (10, 19, "Location")]}),
("I'm pusring my master's from Bits Pilani.", {"entities": [(15, 24, "Education"), (29, 40, "Organization")]})

I tried the code below, but it does not work:

import json

with open('data.json') as f:
    data = json.load(f)

new_data = []
for i in data:
    new_data.append((i['data'], {"entities": i['label']}))

with open('data_new.json', 'w') as f:
    json.dump(new_data, f)

Can anyone help me with Python code that converts the JSON to the required format?
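
For reference, here is a minimal sketch of one way the conversion could be done. It assumes the export really is one JSON object per line, as in the sample above; in that case `json.load` on the whole file would fail with an "Extra data" error, so each line is parsed separately with `json.loads`. The function name `doccano_to_tuples` is hypothetical and used only for illustration.

```python
import json


def doccano_to_tuples(lines):
    """Convert doccano JSON Lines records into (text, {"entities": [...]}) tuples.

    `lines` is any iterable of strings, for example an open file object.
    (Hypothetical helper, not part of doccano or any library.)
    """
    converted = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)  # parse one JSON object per line
        # JSON arrays load as lists; turn each [start, end, label] into a tuple
        entities = [tuple(span) for span in record["label"]]
        converted.append((record["data"], {"entities": entities}))
    return converted


# Small in-memory sample taken from the question, so the sketch runs without a file
sample = [
    '{"id": 3, "data": "I live in Hyderabad.", "label": [[2, 6, "Misc"], [10, 19, "Location"]]}',
]
train_data = doccano_to_tuples(sample)
print(train_data)
# [('I live in Hyderabad.', {'entities': [(2, 6, 'Misc'), (10, 19, 'Location')]})]
```

With a real export, the same function could be called as `with open('data.json', encoding='utf-8') as f: train_data = doccano_to_tuples(f)`. Note that JSON has no tuple type, so writing `train_data` back out with `json.dump` would turn the tuples into lists again; the tuple form only survives if the data stays in Python.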

Nithin Reddy

0 Answers