
I'm using the kdeplot function from the seaborn package in Python. I have a dataset called BH and an array called MT that holds the weight of each value. Both variables are NumPy arrays.

import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

BH = np.array([141.19618068420274, 191.2406412961248, 346.01168490938585, 230.14257295050672, 185.01589850153252, 245.67488131757796, 175.6108949133985, 325.03739349020094, 379.41413332517686, 105.59295515652147])

MT = np.array([0.015004641668689452, 0.011144290004860507, 0.007974195145875648, 0.019437031417186952, 0.0036642005992589023, 0.0036642005992589023, 0.0036642005992589023, 0.0036642005992589023, 0.0023554322266426775, 0.0023554322266426775])

sns.kdeplot(x=BH, weights=MT, bw_adjust=0.2, log_scale=True, common_grid=True, bw_method='silverman')
plt.hist(BH, weights=MT, bins=np.logspace(2, 7.5, num=50), log=True, histtype='step', density=True)
plt.show()

Output (plot image not included):

As you can see, the solid blue line is the KDE plot and the orange line is the histogram. The problem is that the KDE is about 10 times larger than the histogram values, which clearly changes my results. Do you know how to fix this? I'm new to seaborn.

Thanks,

  • The histogram needs `density=True` to be comparable with a kdeplot. You might want to add some test data to create a reproducible example and make it easier to test what's going on. – JohanC Dec 17 '22 at 00:11
  • Thanks, I added `density=True` to `plt.hist()` and also added test data, but somehow the problem is still there. – Matías Liempi Dec 17 '22 at 00:26
  • The `log=True` parameter in `plt.hist` logs the y axis; the `log_scale=True` parameter in `sns.kdeplot` logs the `x` axis. – mwaskom Dec 17 '22 at 00:35
  • Thanks, all the comments were useful. In the end I had to use another fit method (purely statistical, not related to the code), but it is solved now. – Matías Liempi Dec 24 '22 at 21:00
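
The point raised in the comments can be illustrated with a minimal sketch, assuming that `sns.kdeplot(..., log_scale=True)` estimates the density of log10(x), so the curve integrates to 1 over log10(x) rather than over x. Under that assumption, a comparable histogram has to be binned in log10 space as well, with a linear y axis. The helper names `weighted_kde_log10` and `plot_comparison`, the grid limits, and the simplified Silverman-style bandwidth are illustrative choices, not seaborn's internals, so the numbers are not expected to match seaborn's curve exactly.

```python
import numpy as np

BH = np.array([141.19618068420274, 191.2406412961248, 346.01168490938585,
               230.14257295050672, 185.01589850153252, 245.67488131757796,
               175.6108949133985, 325.03739349020094, 379.41413332517686,
               105.59295515652147])
MT = np.array([0.015004641668689452, 0.011144290004860507, 0.007974195145875648,
               0.019437031417186952, 0.0036642005992589023, 0.0036642005992589023,
               0.0036642005992589023, 0.0036642005992589023, 0.0023554322266426775,
               0.0023554322266426775])


def weighted_kde_log10(values, weights, grid, bw_adjust=1.0):
    """Weighted Gaussian KDE of log10(values), evaluated on `grid` (log10 units).

    Hypothetical helper: uses a simplified Silverman-style bandwidth,
    which is an assumption and not a copy of seaborn's or SciPy's code.
    """
    logs = np.log10(values)
    w = weights / weights.sum()                 # normalize weights to sum to 1
    mean = np.sum(w * logs)
    std = np.sqrt(np.sum(w * (logs - mean) ** 2))
    n_eff = 1.0 / np.sum(w ** 2)                # effective sample size
    bw = bw_adjust * std * (0.75 * n_eff) ** (-0.2)
    z = (grid[:, None] - logs[None, :]) / bw
    kernels = np.exp(-0.5 * z ** 2) / (bw * np.sqrt(2.0 * np.pi))
    return kernels @ w                          # weighted sum of Gaussian kernels


# Evaluate both estimates in log10 space so they share the same units.
grid = np.linspace(1.5, 3.0, 600)               # log10(x) from ~32 to 1000
kde = weighted_kde_log10(BH, MT, grid, bw_adjust=0.2)

edges = np.linspace(1.5, 3.0, 31)               # equal-width bins in log10(x)
hist, _ = np.histogram(np.log10(BH), bins=edges, weights=MT, density=True)

# Both areas are measured over log10(x) and should be close to 1.
kde_area = np.sum(0.5 * (kde[1:] + kde[:-1]) * np.diff(grid))
hist_area = np.sum(hist * np.diff(edges))


def plot_comparison():
    """Draw both estimates on a log-scaled x axis with a linear y axis."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(10 ** grid, kde, label="weighted KDE of log10(BH)")
    ax.stairs(hist, 10 ** edges, label="weighted histogram of log10(BH)")
    ax.set_xscale("log")
    ax.set_xlabel("BH")
    ax.set_ylabel("density per unit log10(BH)")
    ax.legend()
    plt.show()
```

Calling `plot_comparison()` draws the two curves on the same axes. Because both are densities per unit log10(BH), their heights are directly comparable, which is not the case when the histogram is normalized over linear x and drawn with `log=True`.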
