Questions tagged [scipy.stats]
297 questions
0
votes
0 answers
Fitting a lognormal distribution to negative values with scipy
I have a 40-year time series of surge levels in the ocean to which I'm trying to fit a lognormal distribution using scipy.stats. However, as far as I know (and have read), a lognormal distribution cannot have negative values by definition. The scipy…

jchristiaanse
- 103
- 12
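For the surge-level question above, one possible approach, sketched under assumptions: `scipy.stats.lognorm` has a `loc` parameter, so leaving it free fits a shifted (three-parameter) lognormal whose support is x > loc and can therefore cover negative values. The `surge` array below is synthetic, made up for illustration; it is not the asker's data.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for surge levels (hypothetical data):
# a lognormal shifted left by 2, so many values are negative.
surge = stats.lognorm.rvs(s=0.5, loc=-2.0, scale=1.0, size=2000, random_state=0)

# With loc left free, fit() estimates the shape, the shift, and the scale.
shape, loc, scale = stats.lognorm.fit(surge)
print("shape:", shape, "loc:", loc, "scale:", scale)
```

Whether a shifted lognormal is physically sensible for surge data is a separate modelling question that this sketch does not settle.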
0
votes
0 answers
Time consumption of SciPy's bootstrap as a function of the number of resamples
I have a large dataset, with on the order of 2^15 entries, and I calculate the confidence interval of the mean of the entries with scipy.stats.bootstrap. For a dataset this size, this takes about 6 seconds on my laptop. I have a lot of datasets, so…

Georg
- 113
- 4
0
votes
0 answers
How does the scipy.stats.shapiro calculate a test statistic
I know that scipy.stats.shapiro is used to test for normality. I also know that the calculation of a test statistic always involves the mean m0 or variance of the population (as explained here). My question then:
What type of test statistic (t,…

Nemo
- 1,124
- 2
- 16
- 39
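As a pointer for the question above: `scipy.stats.shapiro` implements the Shapiro-Wilk test, whose statistic is usually written W rather than t or z. W lies between 0 and 1, and values close to 1 are consistent with normality. A minimal sketch on a made-up sample:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=200)  # hypothetical sample

res = stats.shapiro(x)
# res.statistic is the W statistic; res.pvalue is the associated p-value.
print("W:", res.statistic, "p-value:", res.pvalue)
```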
0
votes
0 answers
p-value from scipy.stats.ttest_1samp
I am running the one sample t-test using the following python code:
import scipy.stats
import numpy as np
mu, sigma = 0.67, 0.11
s = np.random.normal(mu, sigma, 10000)
scipy.stats.ttest_1samp(s, popmean=0.60, alternative='greater')
which…

Giulia B.
- 31
- 1
- 3
0
votes
1 answer
Fitting & scaling a probability density function correctly to a histogram with a logarithmic x-axis?
I am trying to fit a gilbrat PDF to a dataset (that I have in form of a list). I want to show the data in a histogram with a logarithmic x-scale and add the fitted curve. However, the curve seems too flat compared to the histogram, like in this…

Conni
- 25
- 4
0
votes
0 answers
Can't run student t_test using combinations
I've been trying to run a Student's t-test on a DataFrame, but I always end up with an error like
raise KeyError(key)
KeyError: 'patid'
This is my DataFrame:
df = pd.DataFrame.from_records(data=[
dict(id=1, rd=True, drk=True, hn=True,…
0
votes
0 answers
Fast binning of geographical data with negative values
I'm trying to bin some geolocated data using scipy.stats.binned_statistic_2d, but it seems the data cannot contain any negative values. Is there a way to do this accurately and fast?
import numpy as np
ilats = np.linspace(90,-90, 4000)
ilons =…

Shejo284
- 4,541
- 6
- 32
- 44
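A minimal sketch related to the question above, on made-up coordinates: `scipy.stats.binned_statistic_2d` accepts negative values, but the bin edges must increase monotonically. That may matter here, since the excerpt builds `ilats` from 90 down to -90. All array names below are hypothetical.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 10_000
lons = rng.uniform(-180, 180, size=n)  # hypothetical longitudes
lats = rng.uniform(-90, 90, size=n)    # hypothetical latitudes
vals = rng.normal(size=n)              # hypothetical measurements

# Bin edges in increasing order: 10-degree cells.
lon_edges = np.linspace(-180, 180, 37)
lat_edges = np.linspace(-90, 90, 19)

mean_grid = stats.binned_statistic_2d(
    lons, lats, vals, statistic="mean", bins=[lon_edges, lat_edges]
).statistic
counts = stats.binned_statistic_2d(
    lons, lats, vals, statistic="count", bins=[lon_edges, lat_edges]
).statistic
print(mean_grid.shape, counts.sum())
```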
0
votes
0 answers
Drawing sample and calculating sample probability from multivariate normal distribution using scipy.stats.multivariate_normal
I would like to do something that is likely very simple, but is giving me difficulty. Trying to draw N samples from a multivariate normal distribution and calculate the probability of each of those randomly drawn samples. Here I attempt to use…

BeginnersMindTruly
- 659
- 8
- 30
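A short sketch of what the question above describes, with a made-up mean and covariance: a frozen `scipy.stats.multivariate_normal` object can draw samples with `rvs` and evaluate them with `pdf`. Note that `pdf` returns a probability density, not a probability, so the values can exceed 1 for tight covariances.

```python
import numpy as np
from scipy import stats

mean = np.array([0.0, 1.0])                # hypothetical parameters
cov = np.array([[1.0, 0.3], [0.3, 2.0]])

mvn = stats.multivariate_normal(mean=mean, cov=cov)
samples = mvn.rvs(size=5, random_state=0)  # shape (5, 2)
density = mvn.pdf(samples)                 # one density value per sample
log_density = mvn.logpdf(samples)          # numerically safer for products
print(samples)
print(density)
```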
0
votes
1 answer
Check result of chi square test on pandas columns data
I wrote the test according to an approach I found. While searching Stack Overflow I saw another approach (can be seen here) which was a little more complicated, and it made me wonder if I chose the right one.
I'm looking for ways to check if my…

Ziv
- 109
- 10
0
votes
0 answers
How to fit data with log-normal distribution using norm.fit() in Scipy
I am trying to use scipy.stats norm.fit() with some modifications to fit data with a log-normal distribution. I want to verify the result by fitting the data with scipy.stats lognorm.fit(). The results come out similar, but it…

Peter__C
- 1
- 2
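The comparison described above can be sketched as follows, on synthetic data. The assumption is that the location is pinned at zero (`floc=0`); then `lognorm`'s shape corresponds to the standard deviation of log(data) and its scale to exp of the mean of log(data). If `loc` is left free instead, the two fits need not agree.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
data = rng.lognormal(mean=1.0, sigma=0.4, size=5000)  # hypothetical data

# Route 1: fit a normal distribution to the log of the data.
mu, sigma = stats.norm.fit(np.log(data))

# Route 2: fit a lognormal directly, holding loc at 0.
s, loc, scale = stats.lognorm.fit(data, floc=0)

print("norm on log(data):", mu, sigma)
print("lognorm:          ", np.log(scale), s)
```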
0
votes
1 answer
scipy.stats.multivariate_normal.pdf: "The input matrix must be symmetric positive semidefinite."
So I have the following code below.
L = np.array([1,2,3])
M = np.array([1,2,3])
Q = np.random.uniform(0,10,size=(3,3))
S = Q.T*Q
print(sp.stats.multivariate_normal.pdf(L,M,S))
Clearly S is a symmetric positive semidefinite matrix. I can prove it…

Kookie
- 328
- 4
- 14
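One thing worth checking in the code above: for NumPy arrays, `*` multiplies elementwise, so `Q.T*Q` is symmetric but can have negative eigenvalues. The matrix product `Q.T @ Q` is positive semidefinite by construction. A minimal sketch comparing the two, using a seeded generator in place of `np.random.uniform`:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
Q = rng.uniform(0, 10, size=(3, 3))

S_elementwise = Q.T * Q  # elementwise product: symmetric, not necessarily PSD
S_matmul = Q.T @ Q       # matrix product: symmetric positive semidefinite

print("eigenvalues of Q.T * Q:", np.linalg.eigvalsh(S_elementwise))
print("eigenvalues of Q.T @ Q:", np.linalg.eigvalsh(S_matmul))

x = np.array([1.0, 2.0, 3.0])
m = np.array([1.0, 2.0, 3.0])
pdf_value = stats.multivariate_normal.pdf(x, mean=m, cov=S_matmul)
print(pdf_value)
```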
0
votes
2 answers
Excel vs. Sci Kit Learn Linear Regression or scipy.stats Provide Different Slopes, Intercepts, R2 Values
I cannot figure out why I get different values for slope, intercept, and R2 from Excel vs. scikit-learn (or scipy.stats!). This is a very simple linear regression, literally six "x" values and six "y" values. I use Excel all the time for…

theZeigs
- 3
- 1
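A quick cross-check that may help with a discrepancy like the one above: fit the same six points with two independent routines and compare. The numbers below are made up, not the asker's data. If the two agree with each other but not with another tool, swapped x and y columns or a forced zero intercept are possible causes worth ruling out.

```python
import numpy as np
from scipy import stats

# Hypothetical six-point dataset.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2])

res = stats.linregress(x, y)
r_squared = res.rvalue ** 2  # linregress reports r, not R^2

slope_np, intercept_np = np.polyfit(x, y, 1)  # independent least-squares fit

print("linregress:", res.slope, res.intercept, r_squared)
print("polyfit:   ", slope_np, intercept_np)
```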
0
votes
0 answers
Python: How to discretize continuous probability distributions for Kullback-Leibler Divergence
I want to find out the minimum number of samples needed to fit a probability distribution more or less correctly (in my case, the generalized extreme value distribution from scipy.stats).
In order to evaluate the matched function, I want to compute…

Anton Reinecke
- 1
- 1
0
votes
0 answers
Using kstest from scipy.stats within python - it seems I'm calling the cdf() wrong?
I've been using kstest to try to see if a distribution fits my data, going through the discrete distributions from this link. I've managed to get to logser (logarithmic discrete random variable), but I can't figure out how to make this work.
I've…
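A sketch of two ways to pass the `logser` CDF to `kstest`, assuming a known shape parameter `p` (the value 0.6 is made up). One caveat: the Kolmogorov-Smirnov test is derived for continuous distributions, so p-values obtained for a discrete distribution such as `logser` should be read as approximate at best.

```python
import numpy as np
from scipy import stats

p = 0.6  # hypothetical shape parameter
data = stats.logser.rvs(p, size=500, random_state=0)

# Option 1: distribution name plus its shape parameters via args.
res_name = stats.kstest(data, "logser", args=(p,))

# Option 2: the cdf method of a frozen distribution as a callable.
res_callable = stats.kstest(data, stats.logser(p).cdf)

print(res_name.statistic, res_name.pvalue)
print(res_callable.statistic, res_callable.pvalue)
```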
0
votes
0 answers
How to compute the slope of random set using the bootstrap method?
Somehow, scipy.stats.bootstrap is not working in my Jupyter notebook. Therefore, I decided to write a sample function for bootstrap estimation.
Here is what I did.
def bootstrap(x, Nboot, statfun):
    '''Bootstrap code'''
    x = np.array(x)
…

Adnan
- 51
- 6
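Since the asker's function is cut off above, here is an independent sketch of a paired (case-resampling) bootstrap for a regression slope using only NumPy. The function name `bootstrap_slope`, its parameters, and the synthetic data are all hypothetical; this is not a reconstruction of the asker's code.

```python
import numpy as np

def bootstrap_slope(x, y, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap for the least-squares slope of y on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    n = x.size
    slopes = np.empty(n_boot)
    for i in range(n_boot):
        # Resample (x, y) pairs with replacement, keeping pairs together.
        idx = rng.integers(0, n, size=n)
        slopes[i] = np.polyfit(x[idx], y[idx], 1)[0]
    low, high = np.quantile(slopes, [alpha / 2, 1 - alpha / 2])
    return slopes.mean(), (low, high)

# Synthetic data with a true slope of 2.
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=200)
y = 2.0 * x + 1.0 + rng.normal(size=200)

estimate, (low, high) = bootstrap_slope(x, y)
print(estimate, low, high)
```

Resampling pairs rather than residuals is a design choice here: it makes no assumption that the noise has constant variance.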