Pandas profiling correlation warning seems to be wrong

Asked Jun 10 '21 at 11:34

Active Jun 10 '21 at 11:34

Viewed 148 times

I'm working on this dataset:

https://www.kaggle.com/ronitf/heart-disease-uci?select=heart.csv

I'm looking at the pandas profiling report, and it warns that the age column has HIGH CORRELATION with the thalach column.

I checked all three correlation types that pandas supports between those two columns:

import pandas as pd
df = pd.read_csv("heart.csv")  # heart.csv downloaded from the Kaggle link above

print("pearson = ", df['age'].corr(df['thalach'], method='pearson'))
print("spearman = ", df['age'].corr(df['thalach'], method='spearman'))
print("kendall = ", df['age'].corr(df['thalach'], method='kendall'))

And I'm getting:

pearson =  -0.39852193812106734
spearman =  -0.3980524371044455
kendall =  -0.28000884141748783

All three coefficients are fairly low in magnitude (about 0.4 at most), which I would not call a high correlation.

What am I missing? Could pandas profiling be wrong here?
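
To make the comparison concrete, here is a minimal sketch of the check I have in mind. It assumes the warning is raised when the absolute value of a coefficient exceeds some configured threshold. The helper name `exceeds_threshold` and the value 0.9 are assumptions for illustration only, not values taken from pandas profiling. The coefficients are the ones printed above.

```python
# Coefficients copied from the output above.
coefficients = {
    "pearson": -0.39852193812106734,
    "spearman": -0.3980524371044455,
    "kendall": -0.28000884141748783,
}


def exceeds_threshold(coeffs, threshold):
    """Return the names of coefficients whose absolute value is above threshold."""
    return [name for name, value in coeffs.items() if abs(value) > threshold]


# 0.9 is a hypothetical threshold, not necessarily what pandas profiling uses.
flagged = exceeds_threshold(coefficients, threshold=0.9)
print(flagged)  # prints []
```

None of the three coefficients would be flagged under such a threshold. So if the warning is based on one of these measures, the configured threshold would have to be below roughly 0.4. Otherwise the warning must come from a different measure.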

Boom

What are the threshold values in the `.../site_packages/pandas_profiling/config_default.yaml` file? (around line 80) – Mustafa Aydın Jun 10 '21 at 12:08
Is there a way to print this value from Jupyter? – Boom Jun 10 '21 at 13:20
Not sure; you can find this file in the installed packages in your system, did that work out? – Mustafa Aydın Jun 10 '21 at 13:21
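
One possible way to locate and print that file from a notebook is sketched below. It uses only the standard library. The helper name `find_packaged_file` is hypothetical, and the file name `config_default.yaml` is taken from the comment above; whether the file exists, and where, may depend on the installed pandas profiling version.

```python
import importlib.util
from pathlib import Path


def find_packaged_file(package_name, filename):
    """Return the first path named `filename` inside an installed package, or None."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or not spec.submodule_search_locations:
        return None  # package not installed, or not a regular package
    for location in spec.submodule_search_locations:
        for path in Path(location).rglob(filename):
            return path
    return None


# File name as given in the comment above; it may differ between versions.
config_path = find_packaged_file("pandas_profiling", "config_default.yaml")
if config_path is None:
    print("config_default.yaml not found (package missing or layout differs)")
else:
    print(config_path)
    print(config_path.read_text())
```

This prints the whole file, so the threshold entries mentioned above can be read directly in the cell output.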

0 Answers