Group data, count unique values and append this value to row

Question

I am trying to find the unique number of items in each 'group' of ID's. So in the code below I am trying to find the unique number of demographics (A, B, C) for each value of id_match (101, 201, 26).

tst = pd.DataFrame({'demographic' : ['A', 'B', 'B', 'A', 'C', 'C'],
            'id_match' : ['101', '101', '201', '201', '26', '26']})

tst['num_unq'] = tst.groupby('demographic')['id_match'].nunique()

Expected output

  demographic  id_match  num_unq
1     A          101        2
2     B          101        2
3     B          201        2
4     A          201        2
5     C          26         1
6     C          26         1

However instead of the expected output i simply get a columns of NaN's. Does anyone know why this happens and also an alternative method?

Thanks J

`groupby` + `transform` with `'nunique'`. – jpp Jan 16 '19 at 10:43 — jpp, Jan 16 '19 at 10:43

score 1 · Accepted Answer · answered Jan 16 '19 at 10:42

Use transform:

tst = pd.DataFrame({'demographic' : ['A', 'B', 'B', 'A', 'C', 'C'],
            'id_match' : ['101', '101', '201', '201', '26', '26']})

tst['num_unq'] = tst.groupby('demographic')['id_match'].transform('nunique')

print(tst)

Output

  demographic id_match  num_unq
0           A      101        2
1           B      101        2
2           B      201        2
3           A      201        2
4           C       26        1
5           C       26        1

Thanks, i though it was a simple one! I will except the answer after minimum time is up. — JDraper, Jan 16 '19 at 10:46

Group data, count unique values and append this value to row

1 Answers1