Amendment:
I have a pandas DataFrame with 5 columns: Col1, Col2, Col3, Col4 and Col5. I need to maximize Pearson's correlation coefficient between (Col2, Col3), (Col2, Col4) and (Col2, Col5), taking the values in Col1 into account.
The modified values of Col2 are obtained with the following formula:
df['Col1']=np.power((df['Col1']),B)
df['Col2']=df['Col2']*df['Col1']
where B is the variable to tune (a single value) in order to get the maximum Pearson's correlation coefficient between (new values of Col2, Col3), (new values of Col2, Col4) and (new values of Col2, Col5).
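To make the formula concrete, here is a minimal sketch of it written as a function of B. The helper name `correlations_for_B` is hypothetical, and the sketch assumes the column names given above; it works on temporary Series rather than overwriting Col1 and Col2, so it can be called repeatedly with different values of B.

```python
import numpy as np
import pandas as pd

def correlations_for_B(df, B):
    """Pearson's r between the modified Col2 and each of Col3, Col4, Col5.

    Hypothetical helper: applies Col2 * Col1**B without mutating df.
    """
    # Col1 raised to the power B (cast to float so fractional B works)
    weight = np.power(df['Col1'].astype(float), B)
    # New values of Col2
    new_col2 = df['Col2'] * weight
    # Series.corr computes Pearson's correlation by default
    return pd.Series({c: new_col2.corr(df[c]) for c in ['Col3', 'Col4', 'Col5']})
```

Calling `correlations_for_B(df, 0)` should reproduce the original correlations, since every value raised to the power 0 is 1 and Col2 is left unchanged.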
Update:
The table above contains the 5 columns mentioned earlier; the correlation coefficients between (Col2, Col3), (Col2, Col4) and (Col2, Col5) are shown below the table.
I need to change the values of Col2 using the two equations mentioned above, where the value being varied is B.
So the question is: how do I find the best value of B, one that gives new correlation coefficients greater than or equal to their original counterparts?
Update 2:
Col1,Col2,Col3,Col4,Col5
2,0.051361397,2618,1453,1099
4,0.053507779,306,153,150
2,0.041236151,39,54,34
6,0.094526419,2755,2209,1947
4,0.079773397,2313,1261,1022
4,0.083891415,3528,2502,2029
6,0.090737243,3594,2781,2508
2,0.069552772,370,234,246
2,0.052401789,690,402,280
2,0.039930675,1218,846,631
4,0.065952096,1706,523,453
2,0.053064126,314,197,123
6,0.076847486,4019,1675,1452
2,0.044881545,604,402,356
2,0.073102611,2214,1263,1050
0,0.046998526,938,648,572
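
For illustration, one way to explore the question on the sample data above is a simple grid search over B. This is only a sketch under several assumptions that are not part of the question: the helper name `corrs` is hypothetical, B is scanned over the range 0 to 3 (negative B is excluded because Col1 contains a 0, and 0 raised to a negative power is infinite), and "best" is taken to mean the highest mean of the three correlations among the B values where no correlation drops below its original value.

```python
import io
import numpy as np
import pandas as pd

# Sample data from the question, inlined so the sketch runs on its own.
csv_text = """Col1,Col2,Col3,Col4,Col5
2,0.051361397,2618,1453,1099
4,0.053507779,306,153,150
2,0.041236151,39,54,34
6,0.094526419,2755,2209,1947
4,0.079773397,2313,1261,1022
4,0.083891415,3528,2502,2029
6,0.090737243,3594,2781,2508
2,0.069552772,370,234,246
2,0.052401789,690,402,280
2,0.039930675,1218,846,631
4,0.065952096,1706,523,453
2,0.053064126,314,197,123
6,0.076847486,4019,1675,1452
2,0.044881545,604,402,356
2,0.073102611,2214,1263,1050
0,0.046998526,938,648,572
"""
df = pd.read_csv(io.StringIO(csv_text))
targets = ['Col3', 'Col4', 'Col5']

def corrs(B):
    """Pearson's r between Col2 * Col1**B and each target column (hypothetical helper)."""
    new_col2 = df['Col2'] * np.power(df['Col1'].astype(float), B)
    return np.array([new_col2.corr(df[c]) for c in targets])

# B = 0 leaves Col2 unchanged (x**0 == 1, and NumPy also returns 1 for 0**0),
# so it gives the original correlations.
baseline = corrs(0.0)

# Assumed search range for B; adjust as needed.
grid = np.linspace(0.0, 3.0, 301)
table = np.array([corrs(B) for B in grid])  # shape (len(grid), 3)

# Keep only the B values where every correlation is at least its original value.
feasible = (table >= baseline - 1e-12).all(axis=1)

# Assumed objective: mean of the three correlations, among feasible B only.
mean_scores = np.where(feasible, table.mean(axis=1), -np.inf)
best_idx = int(np.argmax(mean_scores))
best_B = grid[best_idx]

print("original correlations:", baseline)
print("best B on the grid:", best_B)
print("correlations at best B:", table[best_idx])
```

A grid search is used here because it is easy to inspect and makes no smoothness assumptions; a numerical optimizer could replace it once the objective is settled. If the three correlations should be weighted differently, or only one of them matters, the line computing `mean_scores` is the part to change.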