I have the following pandas DataFrame:

id  mobile
1   9998887776
2   8887776665
1   7776665554
2   6665554443
3   5554443332

I want to group by id and get the following result:

id   mobile
1    [{"9998887776": {"status": "verified"}},{"7776665554": {"status": "verified"}}]
2    [{"8887776665": {"status": "verified"}},{"6665554443": {"status": "verified"}}]
3    [{"5554443332": {"status": "verified"}}]

I know the to_json method won't help here and that I have to write a UDF, but I am new to this and a bit stuck.

j '

1 Answer

Use a list comprehension with GroupBy.apply to build a custom list of dictionaries for each group:

# f receives one group's mobile numbers and wraps each in {number: {"status": "verified"}}
f = lambda x: [{y: {"status": "verified"}} for y in x]
df = df.groupby('id')['mobile'].apply(f).reset_index()
print(df)
   id                                             mobile
0   1  [{9998887776: {'status': 'verified'}}, {777666...
1   2  [{8887776665: {'status': 'verified'}}, {666555...
2   3             [{5554443332: {'status': 'verified'}}]
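
To see what `GroupBy.apply` does with `f`, here is a minimal sketch that calls `f` on each group by hand. The sample DataFrame is rebuilt from the question, and the name `per_group` is only illustrative. Note that the dictionary keys stay integers here, as in the output above.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "id": [1, 2, 1, 2, 3],
    "mobile": [9998887776, 8887776665, 7776665554, 6665554443, 5554443332],
})

f = lambda x: [{y: {"status": "verified"}} for y in x]

# Iterating over the grouped column yields (group key, Series of that group's values).
# f is called once per group, which is what apply does internally.
per_group = {key: f(group) for key, group in df.groupby("id")["mobile"]}

for key, records in per_group.items():
    print(key, records)
```

Each `x` passed to `f` is therefore a Series holding the mobile numbers of a single id, and the list comprehension turns every number `y` into one small dictionary.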

If you need the lists as JSON strings:

import json

# Start from the original df; json.dumps serializes each group's list to a JSON string
f = lambda x: json.dumps([{y: {"status": "verified"}} for y in x])
df = df.groupby('id')['mobile'].apply(f).reset_index()
print(df)
   id                                             mobile
0   1  [{"9998887776": {"status": "verified"}}, {"777...
1   2  [{"8887776665": {"status": "verified"}}, {"666...
2   3           [{"5554443332": {"status": "verified"}}]
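
The same idea can be sketched as a named function, which may be easier to read and reuse than a lambda. The function name `to_status_records`, its `status` parameter, and the `str()` cast are assumptions added for illustration; the cast makes the keys strings in the plain-list version too, matching the expected output in the question.

```python
import json
import pandas as pd

def to_status_records(mobiles, status="verified"):
    """Turn one group's mobile numbers into [{number: {"status": status}}, ...]."""
    # str(m) makes the key a string even before JSON serialization
    return [{str(m): {"status": status}} for m in mobiles]

# Sample data rebuilt from the question
df = pd.DataFrame({
    "id": [1, 2, 1, 2, 3],
    "mobile": [9998887776, 8887776665, 7776665554, 6665554443, 5554443332],
})

# One row per id, each holding a Python list of dictionaries
result = df.groupby("id")["mobile"].apply(to_status_records).reset_index()

# Optional second step: serialize each list to a JSON string
result_json = result.assign(mobile=result["mobile"].map(json.dumps))

print(result)
print(result_json)
```

Keeping the grouping and the JSON serialization as two steps leaves the original `df` untouched, so either form can be produced without rebuilding the data.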
jezrael
  • Thanks. This is exactly what I was looking for. Can you explain the `f` statement here? I really need to learn this for future situations. – j ' Nov 26 '19 at 08:24
  • @j' - It is a lambda function, the same as `df = df.groupby('id')['mobile'].apply(lambda x: [{y: {"status": "verified"}} for y in x]).reset_index()`; I chose this format for nicer output. It is also possible to write `f` as a regular function instead of a lambda, like `def f(x): return [{y: {"status": "verified"}} for y in x]` – jezrael Nov 26 '19 at 08:26