I am trying to take a pandas DataFrame and perform a pivot-like operation on a single column. I want to take multiple rows (grouped by some identification columns) and convert that single column into dummy indicator variables. I know of pd.get_dummies(), but I want to condense the multiple rows for each group into a single row.
An example is below:
import pandas as pd
import numpy as np
# starting data
d = {'ID': [1, 1, 1, 2, 2, 3, 3, 3],
     'name': ['bob', 'bob', 'bob', 'shelby', 'shelby', 'jordan', 'jordan', 'jordan'],
     'type': ['type1', 'type2', 'type4', 'type1', 'type6', 'type5', 'type8', 'type2']}
df: pd.DataFrame = pd.DataFrame(data=d)
print(df.head(9))
   ID    name   type
0   1     bob  type1
1   1     bob  type2
2   1     bob  type4
3   2  shelby  type1
4   2  shelby  type6
5   3  jordan  type5
6   3  jordan  type8
7   3  jordan  type2
I would like the end result to look like:
   ID    name  type1  type2  type4  type5  type6  type8
0   1     bob      1      1      1      0      0      0
1   2  shelby      1      0      0      0      1      0
2   3  jordan      0      1      0      1      0      1
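
For reference, here is a minimal sketch of one way the desired shape might be reached. It assumes that ID and name together identify a group and that a 1 should mean "at least one row in the group had that type". The names dummies and result are only illustrative, and this is one possible approach rather than the definitive one.

```python
import pandas as pd

# Same starting data as in the example above
d = {'ID': [1, 1, 1, 2, 2, 3, 3, 3],
     'name': ['bob', 'bob', 'bob', 'shelby', 'shelby', 'jordan', 'jordan', 'jordan'],
     'type': ['type1', 'type2', 'type4', 'type1', 'type6', 'type5', 'type8', 'type2']}
df = pd.DataFrame(data=d)

# One indicator column per distinct value of 'type';
# dtype=int requests 0/1 integers instead of booleans
dummies = pd.get_dummies(df['type'], dtype=int)

# Put the identification columns next to the indicators, then collapse each
# (ID, name) group to one row; max() gives 1 if any row in the group had that type
result = (pd.concat([df[['ID', 'name']], dummies], axis=1)
            .groupby(['ID', 'name'], as_index=False, sort=False)
            .max())

print(result)
```

Taking max() per group keeps the values as 0/1 indicators even if a type repeats within a group; if counts were wanted instead, sum() could be swapped in.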