Questions tagged [scipy.stats]

297 questions
0
votes
0 answers

Plot probability density function in Python 3d surface plot

What is the best way to plot a two-dimensional (bivariate) probability distribution (scipy.stats.norm) in 3D, like the example surface plots below? I don't think seaborn comes with these out of the box, but thinking first in terms of pyplot or…
develarist
  • 1,224
  • 1
  • 13
  • 34
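
A minimal sketch of one way to draw such a surface. Since scipy.stats.norm is univariate, this sketch swaps in scipy.stats.multivariate_normal for the joint density and uses Matplotlib's plot_surface; the mean, covariance, and grid range are illustrative assumptions, not values from the question.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
from scipy.stats import multivariate_normal

# Illustrative parameters (assumptions, not taken from the question).
mean = np.array([0.0, 0.0])
cov = np.array([[1.0, 0.5],
                [0.5, 1.5]])

# Evaluate the joint density on a regular grid.
x = np.linspace(-4, 4, 81)
y = np.linspace(-4, 4, 81)
X, Y = np.meshgrid(x, y)
Z = multivariate_normal(mean, cov).pdf(np.dstack((X, Y)))

# Draw the density as a 3D surface.
fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.plot_surface(X, Y, Z, cmap="viridis")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_zlabel("density")
```

Calling fig.savefig(...) or switching to an interactive backend would display the result.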
0
votes
1 answer

Overflow error subclassing a distribution using scipy.stats.rv_continuous

In the documentation page of rv_continuous we can find a 'custom' gaussian being subclassed as follows. from scipy.stats import rv_continuous import numpy as np class gaussian_gen(rv_continuous): "Gaussian distribution" def _pdf(self, x): …
JTFreitas
  • 3
  • 1
0
votes
1 answer

scipy.stats.lognorm.expect returning an odd result

I am trying to get E(x) for a lognormal random variable using scipy.stats.lognorm.expect. Using the fit() method, the shape, loc and scale parameters are shape=0.9577226550971423, loc=-1.1217451814333423, scale=0.744230342110942. The expected value…
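
A hedged sketch for cross-checking the expectation, assuming the three fitted values follow SciPy's (shape, loc, scale) parametrization of lognorm. It compares the closed form loc + scale * exp(shape**2 / 2) with lognorm.expect and lognorm.mean; the variable names are illustrative.

```python
import numpy as np
from scipy.stats import lognorm

# Fitted parameters quoted in the question.
shape = 0.9577226550971423
loc = -1.1217451814333423
scale = 0.744230342110942

# Closed form for the shifted lognormal: E[X] = loc + scale * exp(shape**2 / 2).
closed_form = loc + scale * np.exp(shape**2 / 2)

# The shape parameter goes in `args`; loc and scale are keyword arguments.
numeric = lognorm.expect(args=(shape,), loc=loc, scale=scale)

# The built-in mean uses the same parametrization.
builtin = lognorm.mean(shape, loc=loc, scale=scale)

print(closed_form, numeric, builtin)
```

If the three numbers agree, a surprising result more likely comes from how the arguments were passed than from the integration itself.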
0
votes
1 answer

How to extract the distance and transport matrices from Scipy's wasserstein_distance?

The scipy.stats.wasserstein_distance function only returns the minimum distance (the solution) between two input distributions, p and q. But that distance is the result of the product of a distance matrix and an optimal transport matrix that must…
develarist
  • 1,224
  • 1
  • 13
  • 34
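
A sketch under stated assumptions: wasserstein_distance returns only the scalar distance, so for small discrete 1-D distributions the distance matrix and a transport plan can be recovered by solving the transport linear program directly with scipy.optimize.linprog. The function name transport_plan and the sample points and weights are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog
from scipy.stats import wasserstein_distance

def transport_plan(x, y, p, q):
    """Return (distance matrix, transport matrix, optimal cost) for 1-D points."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    p, q = np.asarray(p, float), np.asarray(q, float)
    n, m = len(x), len(y)
    D = np.abs(x[:, None] - y[None, :])            # pairwise ground distances
    # T is flattened row-major, so entry (i, j) sits at index i * m + j.
    A_rows = np.kron(np.eye(n), np.ones((1, m)))   # sum_j T[i, j] = p[i]
    A_cols = np.kron(np.ones((1, n)), np.eye(m))   # sum_i T[i, j] = q[j]
    res = linprog(D.ravel(),
                  A_eq=np.vstack([A_rows, A_cols]),
                  b_eq=np.concatenate([p, q]),
                  bounds=(0, None),
                  method="highs")
    return D, res.x.reshape(n, m), res.fun

# Illustrative inputs; the weights must each sum to 1.
x, y = [0.0, 1.0, 3.0], [5.0, 6.0, 8.0]
p, q = [0.5, 0.4, 0.1], [0.2, 0.3, 0.5]
D, T, cost = transport_plan(x, y, p, q)
print(cost, wasserstein_distance(x, y, p, q))
```

The optimal plan is not always unique, so T is one valid transport matrix rather than the only one.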
0
votes
0 answers

How do I get the standard ANOVA table which has SSR, SSE, SSTO in Python?

I am trying to get an extra/sequence SS output with the following code: reg2 = smf.ols(formula='Y ~ X3 + X2', data = df).fit() anova_results2 = sm.stats.anova_lm(reg2, typ=1) (anova_results2) I would like it to look like the standard ANOVA output…
0
votes
0 answers

Why are randomly-generated distributions from numpy.random and scipy.stats so different that their probabilities don't sum properly?

When generating random numbers using the numpy.random package and the scipy.stats package, why does the histogram (total probabilities) generated by the former package have such large values, with a maximum near 4, whereas the latter's…
develarist
  • 1,224
  • 1
  • 13
  • 34
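
A small sketch of one likely explanation, assuming the histogram was drawn with density normalization: a density histogram integrates to 1 rather than summing to 1, so bar heights can exceed 1 when the bins are narrow. The scale of 0.1 is an assumption chosen because a normal density with that scale peaks near 3.99.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
sigma = 0.1  # assumed small scale; the N(0, 0.1) density peaks near 3.99
samples = rng.normal(loc=0.0, scale=sigma, size=100_000)

# density=True returns density values: they integrate (not sum) to 1.
density, edges = np.histogram(samples, bins=50, density=True)
widths = np.diff(edges)

print(density.max())                         # can exceed 1 when bins are narrow
print(np.sum(density * widths))              # total probability over all bins
print(norm.pdf(0.0, loc=0.0, scale=sigma))   # theoretical peak of the density
```

Multiplying each bar height by its bin width recovers probabilities that sum to 1.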
0
votes
1 answer

Wilcoxon rank sum test between two data frames in python

I am trying to perform a Wilcoxon rank-sum test between two data frames. I would like to perform the test only between the rows. For example, the test should only be done between row 1 in df1 (A, 1, 2, 3) and df2 (A, 10, 12, 13), row 2 in…
Monica
  • 51
  • 5
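
A minimal sketch of a row-wise comparison with scipy.stats.ranksums, assuming both frames share a label column that identifies matching rows. Row A uses the values from the question; the column names and row B are made up for illustration.

```python
import pandas as pd
from scipy.stats import ranksums

# Toy frames (assumed layout: a label column plus numeric value columns).
df1 = pd.DataFrame({"id": ["A", "B"], "v1": [1, 4], "v2": [2, 5], "v3": [3, 6]}).set_index("id")
df2 = pd.DataFrame({"id": ["A", "B"], "v1": [10, 4], "v2": [12, 6], "v3": [13, 5]}).set_index("id")

# Run one rank-sum test per label present in both frames.
rows = []
for label in df1.index.intersection(df2.index):
    stat, pvalue = ranksums(df1.loc[label], df2.loc[label])
    rows.append({"id": label, "statistic": stat, "pvalue": pvalue})

results = pd.DataFrame(rows).set_index("id")
print(results)
```

With only three values per row the test has very little power, so the p-values here are illustrative only.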
0
votes
1 answer

Can I use hypothesis testing on train and test data?

I was wondering if I could use hypothesis testing on training and testing data after splitting my dataset. My objective is to check whether both data samples are well balanced and well distributed, and so will provide a nice environment for…
jaymzleutz
  • 155
  • 2
  • 10
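
One possible check, sketched with synthetic data: a two-sample Kolmogorov-Smirnov test (scipy.stats.ks_2samp) on a single numeric feature from each split. The feature, the 80/20 split, and the choice of test are assumptions for illustration, not a recommendation from the question.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)
feature = rng.normal(size=1000)   # synthetic stand-in for one numeric column

# A random 80/20 split, similar to what a typical train/test split produces.
idx = rng.permutation(feature.size)
train, test = feature[idx[:800]], feature[idx[800:]]

# Null hypothesis: both samples are drawn from the same distribution.
result = ks_2samp(train, test)
print(result.statistic, result.pvalue)
```

A large p-value only means the test found no evidence of a difference; it does not prove the two splits are identically distributed.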
0
votes
1 answer

Fit gamma distribution with fixed mean with scipy?

scipy.stats.rv_continuous.fit allows you to fix parameters when fitting a distribution, but it's dependent on scipy's choice of parametrization. For the gamma distribution it uses the k, theta (shape, scale) parametrization, so it would be easy to…
Jacob
  • 558
  • 2
  • 7
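
A sketch of one workaround, assuming loc is held at 0: since the gamma mean is k * theta, fixing the mean ties the scale to theta = mean / k, which leaves a one-dimensional likelihood in k. The helper name fit_gamma_fixed_mean and the search bounds are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import gamma

def fit_gamma_fixed_mean(data, mean):
    """Maximum-likelihood shape k with the scale tied to theta = mean / k."""
    data = np.asarray(data, dtype=float)

    def neg_log_likelihood(log_k):
        k = np.exp(log_k)  # optimise log(k) so that k stays positive
        return -gamma.logpdf(data, a=k, loc=0, scale=mean / k).sum()

    # Assumed search range: k between exp(-5) and exp(5).
    res = minimize_scalar(neg_log_likelihood, bounds=(-5, 5), method="bounded")
    k = np.exp(res.x)
    return k, mean / k

# Synthetic data with true shape 3 and scale 2, so the true mean is 6.
rng = np.random.default_rng(1)
data = rng.gamma(shape=3.0, scale=2.0, size=5000)
k, theta = fit_gamma_fixed_mean(data, mean=6.0)
print(k, theta, k * theta)
```

By construction the fitted k * theta equals the requested mean, whatever value of k the optimizer selects.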
0
votes
1 answer

Scipy.stats error "Too many values to unpack" when attempting linregress on .csv dataset

I'm trying to fit a line to my experimental data. When I run the code I usually use, I get the error: Traceback (most recent call last): File "/home/h/oscillator1.py", line 21, in <module> slope, intercept, r_value = scipy.stats.linregress(data) ValueError:…
lain
  • 5
  • 1
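
A short sketch of the likely cause: linregress returns five values (slope, intercept, rvalue, pvalue, stderr), so unpacking into three names raises this ValueError. The arrays below are synthetic stand-ins for the CSV columns.

```python
import numpy as np
from scipy.stats import linregress

# Synthetic stand-in for the CSV columns (assumption: one x and one y column).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.0])

# Unpack all five returned values...
slope, intercept, r_value, p_value, std_err = linregress(x, y)

# ...or keep the result object and read only the fields that are needed.
result = linregress(x, y)
print(result.slope, result.intercept, result.rvalue)
```

Attribute access does not depend on how many fields the result carries, which makes it the more robust of the two forms.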
0
votes
0 answers

Is it possible to do running correlation with one fixed series in Python column-wise instead of rows?

I asked a question here two months ago (Is it possible to do running correlation with one fixed series in Python?) where I received great help from a fellow user. My goal is to do running correlation with one fixed series in Pandas. This can be…
Vichtor
  • 197
  • 2
  • 10
0
votes
2 answers

Does fitting a Weibull distribution to data using scipy.stats perform poorly?

I am working on fitting a Weibull distribution to some integer data and estimating the relevant shape, scale, and location parameters. However, I noticed poor performance of the scipy.stats library while doing so. So, I took a different direction and checked the…
begumgenc
  • 393
  • 1
  • 2
  • 12
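
A hedged sketch of a common adjustment, not necessarily the fix for this question: when a two-parameter Weibull is intended, passing floc=0 to weibull_min.fit holds the location at zero, which often stabilizes the shape and scale estimates. The true parameters and sample size are illustrative assumptions.

```python
import numpy as np
from scipy.stats import weibull_min

rng = np.random.default_rng(7)
true_shape, true_scale = 1.5, 2.0   # illustrative values, not from the question
data = weibull_min.rvs(true_shape, loc=0, scale=true_scale,
                       size=5000, random_state=rng)

# Constrained fit: hold the location at zero (two-parameter Weibull).
c_fixed, loc_fixed, scale_fixed = weibull_min.fit(data, floc=0)
print(c_fixed, loc_fixed, scale_fixed)
```

Whether fixing the location is appropriate depends on the data; integer-valued data in particular may not be well described by a continuous Weibull model.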
0
votes
1 answer

Multivariate random variables with scipy.stats rvs() function

The scipy.stats suite of statistical distributions (scipy.stats.norm, scipy.stats.uniform, scipy.stats.t, etc.) all produce univariate data series using their own .rvs() function, and only one has a multivariate rendition: multivariate_normal, which…
develarist
  • 1,224
  • 1
  • 13
  • 34
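
For reference, a minimal sketch of drawing correlated samples with multivariate_normal.rvs; the mean vector and covariance matrix are illustrative assumptions. scipy.stats also ships a few other multivariate distributions, such as dirichlet and wishart.

```python
import numpy as np
from scipy.stats import multivariate_normal

mean = [0.0, 1.0]                 # illustrative values
cov = [[1.0, 0.8],
       [0.8, 2.0]]

# Each row of `samples` is one draw of the 2-dimensional random vector.
samples = multivariate_normal.rvs(mean=mean, cov=cov, size=1000, random_state=0)
print(samples.shape)
print(np.cov(samples, rowvar=False))   # sample covariance, close to `cov`
```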
0
votes
0 answers

How does scipy.stats.distributions.fit work?

I was trying to approximate some hard distribution with a normal one, but scipy.stats.norm.fit gives some awkward results: the fitted approximation is very far from the original. Then I tried setting the parameters manually, and it looks far better.…
Nourless
  • 729
  • 1
  • 5
  • 18
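
A small sketch of what norm.fit computes: the maximum-likelihood estimates, which for the normal distribution are the sample mean and the population (ddof=0) standard deviation. The exponential sample is an assumed stand-in for a "hard" non-normal distribution.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
data = rng.exponential(scale=2.0, size=10_000)   # deliberately non-normal sample

# Maximum-likelihood fit of a normal distribution to the sample.
loc, scale = norm.fit(data)

print(loc, data.mean())    # the fitted loc matches the sample mean
print(scale, data.std())   # the fitted scale matches the ddof=0 standard deviation
```

The fit matches the first two moments of the data, so for skewed or multimodal data the fitted curve can look far from the histogram even though the fit is working as designed.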
0
votes
1 answer

Should I encode my ordinal variables before calculating Spearman's Rank Correlation (scipy)?

I am using scipy.stats.spearmanr to calculate Spearman's Rank Correlation of 2 ordinal variables. I wasn't sure whether to encode them or not. I tried it both ways and the function seems to spit out results regardless. So I am not sure which way to…
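
A hedged sketch of one way to handle this: map each category to an integer that respects its intended order before calling spearmanr. Because the coefficient depends only on ranks, any order-preserving encoding gives the same result, whereas leaving labels as strings would rank them alphabetically. The categories and data below are hypothetical.

```python
from scipy.stats import spearmanr

# Hypothetical ordinal scales; the category order is an assumption.
size_order = {"small": 0, "medium": 1, "large": 2}
rating_order = {"poor": 0, "fair": 1, "good": 2, "excellent": 3}

sizes = ["small", "large", "medium", "medium", "large", "small"]
ratings = ["poor", "excellent", "fair", "good", "good", "fair"]

# Encode each label by its position in the intended order.
x = [size_order[s] for s in sizes]
y = [rating_order[r] for r in ratings]
rho, pvalue = spearmanr(x, y)

# Any order-preserving recoding leaves the coefficient unchanged.
x_alt = [10 ** v for v in x]
rho_alt, _ = spearmanr(x_alt, y)
print(rho, rho_alt)
```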