
I'm working on this dataset:

https://www.kaggle.com/ronitf/heart-disease-uci?select=heart.csv

I'm looking at the pandas profiling report, and it flags the age column as having HIGH CORRELATION with the thalach column.

I checked all three correlation coefficients for those two columns:

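import pandas as pd
df = pd.read_csv("heart.csv")  # assumes heart.csv from the link above is in the working directory

print("pearson = ", df['age'].corr(df['thalach'], method='pearson'))
print("spearman = ", df['age'].corr(df['thalach'], method='spearman'))
print("kendall = ", df['age'].corr(df['thalach'], method='kendall'))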

And I'm getting:

pearson =  -0.39852193812106734
spearman =  -0.3980524371044455
kendall =  -0.28000884141748783

All three coefficients show a correlation that is far from high (at most about 0.4 in absolute value).
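
To make explicit what each of the three numbers measures, here is a minimal pure-Python sketch. It is an illustration under assumptions, not pandas' implementation: the helper names (`pearson`, `average_ranks`, `spearman`, `kendall`) are hypothetical, Kendall is written as the tau-b variant, and the sample values are made up rather than taken from the Kaggle file.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Linear correlation: co-deviation scaled by both spreads."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Monotonic correlation: Pearson computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall(x, y):
    """Kendall tau-b: concordant minus discordant pairs, adjusted for ties."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (a1, b1), (a2, b2) in combinations(list(zip(x, y)), 2):
        dx, dy = a1 - a2, b1 - b2
        if dx == 0 and dy == 0:
            continue  # tied in both variables: excluded from both factors
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied_in_x = concordant + discordant + tied_y_only
    untied_in_y = concordant + discordant + tied_x_only
    return (concordant - discordant) / sqrt(untied_in_x * untied_in_y)


# Made-up sample values (NOT the Kaggle data), reusing the column names
# from the question purely for illustration.
age = [29, 41, 45, 52, 58, 63, 67, 71]
thalach = [202, 172, 185, 168, 131, 150, 125, 160]

for name, func in [("pearson", pearson), ("spearman", spearman), ("kendall", kendall)]:
    print(f"{name} = {func(age, thalach)}")
```

Pearson looks at linear association, Spearman at monotonic association through ranks, and Kendall at the balance of concordant and discordant pairs, which is why Kendall's value is typically smaller in magnitude than the other two on the same data.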

What am I missing? Could pandas profiling be wrong?

Boom
