
I have a large DataFrame (shape 100,000 × 192). I have already calculated the Pearson correlation coefficient for each pair of attributes. Now I am looking for a way to calculate Pearson correlations for groups of attributes. What I mean is that, for now, I have

if A then B 

and I want to calculate

if (A AND B) then C 
if (A AND B AND C) then (D AND E)

For example

   DataFrame 1 
   A  B  C
 0|0  0  1 
 1|1  0  0 
 2|0  1  0 
 3|1  1  1

Here, columns A and C do not seem to be strongly correlated, and neither do A and B or B and C. But when you take A and B together, you get a correlation with C (if A = B then C = 1, otherwise C = 0). I hope this helps clarify what I mean.
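
To make the example concrete, here is a minimal sketch that rebuilds DataFrame 1 and checks both claims. The combined feature `a_eq_b` is a hypothetical name introduced only for this illustration; it encodes "A = B" as 1 and "A ≠ B" as 0.

```python
import pandas as pd

# DataFrame 1 from the example above.
df = pd.DataFrame({"A": [0, 1, 0, 1], "B": [0, 0, 1, 1], "C": [1, 0, 0, 1]})

# Pairwise Pearson correlations: every off-diagonal entry is 0 for this data.
pairwise = df.corr()
print(pairwise)

# Hypothetical combined feature: 1 where A equals B, 0 otherwise.
a_eq_b = (df["A"] == df["B"]).astype(int)

# The combined feature matches C row for row, so the correlation is 1.
combined_corr = a_eq_b.corr(df["C"])
print(combined_corr)
```

So a 1-to-1 correlation matrix cannot see this relationship, while a single feature built from A and B together can.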

Is there a function or library that already does this, or will I have to write a lot of iterations over pandas' df.corr() on my DataFrame?
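
For reference, the brute-force iteration I have in mind would look roughly like the sketch below. It assumes 0/1 columns and reads "A AND B" as the row-wise logical AND; `group_correlations` and `group_size` are hypothetical names, not an existing pandas API. It is only a starting point: with 192 columns there are 18,336 pairs alone, and the count grows quickly with the group size.

```python
from itertools import combinations

import pandas as pd


def group_correlations(df, group_size=2):
    """Correlate the AND of each column group with every remaining column."""
    rows = []
    for group in combinations(df.columns, group_size):
        cols = list(group)
        # Row-wise AND of 0/1 columns: 1 only where every column in the group is 1.
        combined = df[cols].all(axis=1).astype(int)
        if combined.nunique() < 2:
            continue  # a constant feature has no defined Pearson correlation
        # Pearson correlation of the combined feature with each other column.
        corr = df.drop(columns=cols).corrwith(combined)
        for target, value in corr.items():
            rows.append((group, target, value))
    return pd.DataFrame(rows, columns=["group", "target", "corr"])


# Usage on the small example frame.
example = pd.DataFrame({"A": [0, 1, 0, 1], "B": [0, 0, 1, 1], "C": [1, 0, 0, 1]})
result = group_correlations(example, group_size=2)
print(result)
```

Note that a plain AND only partly captures the "A = B" pattern from the example, so the way columns are combined would probably need to vary as well.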

Mayeul sgc
  • What do your logical expressions have to do with "grouped" Pearson correlations? Can you give a concrete example of what you are talking about and show what you expect to see? Read [MCVE](http://stackoverflow.com/help/mcve) – piRSquared Feb 23 '17 at 07:05
  • I am trying to explain what I mean by grouped correlation. For now I have correlations between attributes, but only 1 to 1. I want to know the coefficient for 2 to 1, 2 to 2, etc., across all the possibilities given by my 192 attributes. Is that clearer? I am going to add an example – Mayeul sgc Feb 23 '17 at 07:09
  • No, it's not. Did you read the document I linked? – piRSquared Feb 23 '17 at 07:10
  • I edited the question to add an example – Mayeul sgc Feb 23 '17 at 07:23

0 Answers