I have a large DataFrame (shape 100,000 × 192). I have already calculated the Pearson correlation coefficient for each pair of attributes. Now I am looking for a way to calculate Pearson correlations for groups of attributes. What I mean is that, for now, I have
if A then B
and I want to calculate
if (A AND B) then C
if (A AND B AND C) then (D AND E)
For example:
DataFrame 1
 |A B C
0|0 0 1
1|1 0 0
2|0 1 0
3|1 1 1
Here, columns A and C do not seem to be strongly correlated, and neither do A and B or B and C. But when you consider A and B together, there is a clear relationship with C (if A = B then C = 1, otherwise C = 0). I hope this helps clarify what I mean.
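To make the example reproducible, here is a minimal sketch using pandas. The variable name `a_eq_b` is only an illustrative helper I made up for this toy case; it is not part of my real data, and I am not suggesting this is how the general problem should be solved.

```python
import pandas as pd

# The toy DataFrame from the example above.
df = pd.DataFrame({
    "A": [0, 1, 0, 1],
    "B": [0, 0, 1, 1],
    "C": [1, 0, 0, 1],
})

# Pairwise Pearson correlations between single columns.
pairwise = df.corr(method="pearson")

# Hypothetical combined feature: 1 where A equals B, otherwise 0.
a_eq_b = (df["A"] == df["B"]).astype(int)

# Pearson correlation between the combined feature and C.
combined_corr = a_eq_b.corr(df["C"])

print(pairwise)
print(combined_corr)
```

In this toy case, the off-diagonal entries of `pairwise` are all zero, while the combined feature correlates perfectly with C. That is the kind of relationship I would like to detect across all 192 columns.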
Is there a function or library that already does this, or will I have to write a lot of iterations using pandas' df.corr() function on my DataFrame?