I am trying to work out how to calculate skewness and kurtosis correctly in pandas. pandas returns values for skew() and kurtosis(), but they differ noticeably from the values scipy.stats gives for the same data. Which one should I trust, pandas or scipy.stats?
Here is my code:
import numpy as np
import scipy.stats as stats
import pandas as pd
np.random.seed(100)
x = np.random.normal(size=(20))
kurtosis_scipy = stats.kurtosis(x)
kurtosis_pandas = pd.DataFrame(x).kurtosis()[0]
print(kurtosis_scipy, kurtosis_pandas)
# -0.5270409758168872
# -0.31467107631025604
skew_scipy = stats.skew(x)
skew_pandas = pd.DataFrame(x).skew()[0]
print(skew_scipy, skew_pandas)
# -0.41070929017558555
# -0.44478877631598901
Versions:
import scipy  # "import scipy.stats as stats" does not bind the name "scipy"
print(np.__version__, pd.__version__, scipy.__version__)
1.11.0 0.20.0 0.19.0
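
One thing that may explain the gap: scipy.stats.skew and scipy.stats.kurtosis default to bias=True (the biased, population-style estimators), while pandas applies a small-sample bias correction. The sketch below assumes that is the only difference and checks whether passing bias=False to scipy reproduces the pandas numbers for the same sample. The variable names with the _unbiased suffix are mine, added only for illustration.

```python
import numpy as np
import pandas as pd
import scipy.stats as stats

np.random.seed(100)
x = np.random.normal(size=20)

# scipy with bias correction turned on (the default is bias=True, i.e. no correction)
kurtosis_scipy_unbiased = stats.kurtosis(x, bias=False)
skew_scipy_unbiased = stats.skew(x, bias=False)

# pandas estimators on the same sample
kurtosis_pandas = pd.Series(x).kurtosis()
skew_pandas = pd.Series(x).skew()

# If the bias setting is the cause, both comparisons should report agreement
print(np.isclose(kurtosis_scipy_unbiased, kurtosis_pandas))
print(np.isclose(skew_scipy_unbiased, skew_pandas))
```

If both checks agree, neither library is wrong: they estimate the same quantities with different conventions, and the choice comes down to whether a bias-corrected sample estimate is wanted.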