2

I have a dataframe that looks like the one below

ID  |  Value
1   |  A
1   |  B
1   |  C
2   |  B
2   |  C

I want to create a symmetric matrix of based on Value:

    A   B  C
 A  1   1  1
 B  1   2  2
 C  1   2  2

This basically indicates how many people have both values (v1,v2). I am currently using for loops to scan the dataframe for every combination but was wondering if there was an easier way to do it using pandas.

Nivi
  • 1,067
  • 4
  • 15
  • 28

1 Answers1

4

Use merge with cross join by ID column with crosstab and DataFrame.rename_axis for remove index and columns names:

df = pd.merge(df, df, on='ID')

df = pd.crosstab(df['Value_x'], df['Value_y']).rename_axis(None).rename_axis(None, axis=1)
print (df)
   A  B  C
A  1  1  1
B  1  2  2
C  1  2  2
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252