I have a Pandas dataframe df
with two columns, A
and B
such that all values in B
would be different for a value in A
, but two different values in A
can have duplicate values in B
.
A B
ABC1 XYZ XYZ 123
ABC2 XYZ XYZ 123
ABC1 XYZ XYZ 135
ABC3 XYZ PQR 123
ABC2 XYZ PQR 123
ABC3 XYZ XYZ 135
ABC4 XYZ XYZ 135
ABC2 XYZ PQR 987
ABC4 XYZ PQR 123
ABC5 PQR PQR 567
How can I create a heatmap from this dataframe that shows me the number of common B
values between any two A
values? I presume I'll have to create a pivot that might look something like this -
ABC1 ABC2 ABC3 ABC4 ABC5
ABC1 2 1 ...
ABC2 1 3
ABC3 0 1 ...
ABC4 0 0 ... ...
ABC5 0 0
Any help on how this can be done efficiently?