I am trying to calculate the Pearson correlation coefficient for all columns in my dataframe, but when I make a heatmap I get NaN values in rows with zeroes in them. Any suggestions on how to fix it? Here are the code and a screenshot of the output:

# Calculate the correlation coefficients
corr = dfno.corr(method='pearson')
# Plot it in the next line
corr.round(2).style.background_gradient(cmap='coolwarm')

Pearson Heatmap

Jswojcik

1 Answer

NaN appears when at least one of your columns is constant. A constant column has a standard deviation of 0, so Pearson's formula divides by 0 and every correlation involving that column comes out as NaN. Depending on your application, I think the easiest way to deal with this is to replace the NaNs with 0 in your heatmap output.

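# fillna returns a new DataFrame, so chain it into the styling call
corr.fillna(0).round(2).style.background_gradient(cmap='coolwarm')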
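To see the effect in isolation, here is a minimal sketch that reproduces the NaNs and compares two ways of handling them. The small `dfno` built here is a made-up stand-in for the dataframe in the question, and dropping constant columns before calling `corr` is an alternative I am suggesting, not something from the original post.

```python
import pandas as pd

# Hypothetical stand-in for the question's `dfno`: two varying columns
# and one column that is all zeroes (constant, so its std is 0).
dfno = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.5],
    "zeros": [0.0, 0.0, 0.0, 0.0],
})

corr = dfno.corr(method="pearson")

# Columns of the correlation matrix that are NaN in every row
# correspond to the constant columns of the input.
nan_cols = corr.columns[corr.isna().all()].tolist()

# Option 1: keep the constant columns and show their correlations as 0.
corr_filled = corr.fillna(0)

# Option 2 (alternative): drop constant columns before correlating,
# so the heatmap only contains columns that actually vary.
varying = dfno.loc[:, dfno.nunique() > 1]
corr_varying = varying.corr(method="pearson")

print(nan_cols)
print(corr_filled.round(2))
print(corr_varying.round(2))
```

Note that `fillna(0)` also sets the diagonal entry of a constant column to 0 rather than 1, which may or may not matter for how you read the heatmap. Dropping the constant columns avoids that, at the cost of hiding them from the plot.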
Ehsan