-2

I have the data below:

{
  "employeealias": "101613177", 
  "firstname": "Lion", 
  "lastname": "King", 
  "date": "2022-04-21", 
  "type": "Thoughtful Intake", 
  "subject": "Email: From You Success Coach"
}

{
  "employeealias": "101613177", 
  "firstname": "Lion",
  "lastname": "King",
  "date": "2022-04-21",
  "type": null,
  "subject": "Call- CDL options & career assessment"
}

I need to create a dictionary like the below:

(image of the desired output, not available)

halfer
NK7983

2 Answers

1

You have to create a new dictionary that holds a list, then use a for-loop to check whether an item with the same employeealias, firstname, and lastname already exists. If it does, append the remaining information to that item's activity sublist. If it doesn't, create a new item with employeealias, firstname, lastname, and the other information.

data = [
{"employeealias":"101613177","firstname":"Lion","lastname":"King","date":"2022-04-21","type":"Thoughtful Intake","subject":"Email: From You Success Coach"},
{"employeealias":"101613177","firstname":"Lion","lastname":"King","date":"2022-04-21","type":None,"subject":"Call- CDL options & career assessment"},
]

result = {'interactions': []}

for row in data:
    found = False
    for item in result['interactions']:
        if (row["employeealias"] == item["employeealias"]
           and row["firstname"] == item["firstname"]
           and row["lastname"] == item["lastname"]):
            item["activity"].append({
               "date": row["date"],
               "subject": row["subject"],
               "type": row["type"],
            })
            found = True
            break
        
    if not found:
        result['interactions'].append({
            "employeealias": row["employeealias"],
            "firstname": row["firstname"],
            "lastname": row["lastname"],
            "activity": [{
                "date": row["date"],
                "subject": row["subject"],
                "type": row["type"],
            }]
        })
            
print(result)            
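The loop above rescans `result['interactions']` for every row. As an alternative, here is a minimal sketch of the same grouping done in one pass with a dictionary keyed by the tuple `(employeealias, firstname, lastname)`. The function name `group_interactions` and the helper dictionary `by_employee` are names chosen for this illustration, not part of the question, and the sketch assumes every row carries all six fields.

```python
data = [
    {"employeealias": "101613177", "firstname": "Lion", "lastname": "King",
     "date": "2022-04-21", "type": "Thoughtful Intake",
     "subject": "Email: From You Success Coach"},
    {"employeealias": "101613177", "firstname": "Lion", "lastname": "King",
     "date": "2022-04-21", "type": None,
     "subject": "Call- CDL options & career assessment"},
]


def group_interactions(rows):
    """Group rows by (employeealias, firstname, lastname) in a single pass."""
    by_employee = {}  # key tuple -> item; dicts keep insertion order
    for row in rows:
        key = (row["employeealias"], row["firstname"], row["lastname"])
        item = by_employee.get(key)
        if item is None:
            # First time this employee is seen: create the item.
            item = {
                "employeealias": row["employeealias"],
                "firstname": row["firstname"],
                "lastname": row["lastname"],
                "activity": [],
            }
            by_employee[key] = item
        # Every row contributes one activity entry.
        item["activity"].append({
            "date": row["date"],
            "subject": row["subject"],
            "type": row["type"],
        })
    return {"interactions": list(by_employee.values())}


result = group_interactions(data)
print(result)
```

The output has the same shape as the nested-loop version; only the lookup changes, from a list scan to a dictionary access.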

EDIT:

You are reading the lines as plain text, but you have to convert each line to a dictionary using the json module:

import json

data = [] 

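with open("/Users/Downloads/amazon_activity_feed_0005_part_00.json") as a_file:
    for line in a_file:
        line = line.strip()
        if not line:  # skip blank lines, which json.loads cannot parse
            continue
        dictionary = json.loads(line)  # convert the text into a dict
        data.append(dictionary)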

print(data)
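To illustrate what `json.loads` does to each line, here is a small sketch that runs without the real file. It assumes the file holds one JSON object per line; `read_json_lines` and the in-memory `sample` are hypothetical stand-ins introduced for this example.

```python
import io
import json


def read_json_lines(lines):
    """Parse an iterable of text lines, one JSON object per line."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line:  # skip blank lines
            continue
        rows.append(json.loads(line))  # str -> dict
    return rows


# In-memory stand-in for the file, so the sketch is self-contained.
sample = io.StringIO(
    '{"employeealias": "101613177", "type": "Thoughtful Intake"}\n'
    '\n'
    '{"employeealias": "101613177", "type": null}\n'
)

rows = read_json_lines(sample)
print(type(rows[0]))            # each element is a dict, not a str
print(rows[1]["type"] is None)  # JSON null becomes Python None
```

Because each element is now a dictionary, `row["employeealias"]` works and the `TypeError: string indices must be integers` no longer occurs.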
furas
  • thank you so much @furas. you made my day today. appreciate your quick response. – NK7983 May 04 '22 at 03:36
  • I have the data in a file and tried building the data list the way you did, but it gives an error at "employeealias": row["employeealias"]: TypeError: string indices must be integers. My code: `data = [] with open("/Users/Downloads/amazon_activity_feed_0005_part_00.json","rt") as a_file: for line in a_file: data.append(line) print(data)` – NK7983 May 04 '22 at 04:30
  • You are reading the lines as plain text, but you have to convert each line to a dictionary using the `json` module: `dictionary = json.loads(line)`, then `data.append(dictionary)`. I added an example to the answer. – furas May 04 '22 at 11:27
  • perfection again. !!! Wish , one day I can program like you! Thank you so much @furas. – NK7983 May 04 '22 at 12:13
  • When the data in the file grows, will this hurt performance? Should I re-engineer this in Spark or pandas? Would they help? – NK7983 May 04 '22 at 12:16
  • You would have to measure the time on bigger data to see whether you really need to change it. – furas May 04 '22 at 12:35
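One way to follow the advice about measuring is sketched below. It times the nested-loop grouping against a dictionary-keyed variant on synthetic rows using `time.perf_counter`. Everything here is an assumption made for illustration: the helper names (`make_rows`, `new_item`, `activity`, `group_with_scan`, `group_with_dict`), the synthetic data, and the sizes. Timings depend on the machine, so none are quoted.

```python
import time


def make_rows(n_employees, rows_per_employee):
    """Build synthetic rows (made-up data, only for timing)."""
    rows = []
    for i in range(rows_per_employee):
        for emp in range(n_employees):
            rows.append({
                "employeealias": str(emp), "firstname": "First",
                "lastname": "Last", "date": "2022-04-21",
                "type": None, "subject": f"subject {i}",
            })
    return rows


def new_item(row):
    return {"employeealias": row["employeealias"],
            "firstname": row["firstname"],
            "lastname": row["lastname"], "activity": []}


def activity(row):
    return {"date": row["date"], "subject": row["subject"],
            "type": row["type"]}


def group_with_scan(rows):
    """Nested-loop grouping: scans the list of items for every row."""
    result = {"interactions": []}
    for row in rows:
        for item in result["interactions"]:
            if (row["employeealias"] == item["employeealias"]
                    and row["firstname"] == item["firstname"]
                    and row["lastname"] == item["lastname"]):
                break
        else:  # no match found: create a new item
            item = new_item(row)
            result["interactions"].append(item)
        item["activity"].append(activity(row))
    return result


def group_with_dict(rows):
    """Same grouping with a dict lookup instead of a list scan."""
    by_key = {}
    for row in rows:
        key = (row["employeealias"], row["firstname"], row["lastname"])
        if key not in by_key:
            by_key[key] = new_item(row)
        by_key[key]["activity"].append(activity(row))
    return {"interactions": list(by_key.values())}


rows = make_rows(n_employees=500, rows_per_employee=4)
for func in (group_with_scan, group_with_dict):
    start = time.perf_counter()
    out = func(rows)
    elapsed = time.perf_counter() - start
    print(f"{func.__name__}: {elapsed:.4f} s, "
          f"{len(out['interactions'])} employees")
```

Rerunning with larger values of `n_employees` shows how each approach scales on your own data sizes, which is the information needed before deciding whether a move to pandas or Spark is worthwhile.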
0

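You can create a nested dictionary in Python like this (keys must be quoted strings, each key is followed by a colon, and the inner dictionary needs a key of its own, here "details"): student = {"name": "Suman", "age": 20, "gender": "male", "details": {"class": 11, "roll_no": 12}}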
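As a brief sketch of how values in such a nested dictionary are read and updated: the key names `details`, `roll_no`, and `section` are assumptions chosen for this example, since the inner dictionary must be stored under some key.

```python
student = {
    "name": "Suman",
    "age": 20,
    "gender": "male",
    "details": {"class": 11, "roll_no": 12},  # inner dict under its own key
}

# Chain the keys to reach a nested value.
print(student["details"]["roll_no"])  # 12

# Nested values are updated the same way.
student["details"]["section"] = "A"
print(student["details"])
```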
