Questions tagged [statistics]

Consider whether your question would be better asked at https://stats.stackexchange.com. Statistics is the mathematical study of using probability to infer characteristics of a population from a limited number of samples or observations.

Statistics is the scientific study of the collection, analysis, interpretation, presentation, and organization of data. Numerous programming languages provide support for implementing statistical techniques.

Consider whether your question would be better asked at Cross Validated, a Stack Exchange site for probability, statistics, data analysis, data mining, experimental design, and machine learning. Stack Overflow questions on statistics should be about implementation and programming problems, not theoretical discussions of statistics or research design. This tag should therefore never be used alone, but always in combination with a specific programming language tag (for example r, python, spss, sas, or matlab).

16319 questions

12 answers

How do I determine the standard deviation (stddev) of a set of values?

I need to know if a number compared to a set of numbers is outside of 1 stddev from the mean, etc.

c# math statistics numerical

asked May 22 '09 at 00:26

dead and bloated
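
The question above is tagged c#, but the check it describes is language-agnostic. As a minimal sketch in Python (the helper name `is_outside_k_sd` and the sample data are made up for illustration), one way to test whether a value lies more than one standard deviation from the mean is:

```python
import statistics

def is_outside_k_sd(value, data, k=1.0):
    """Return True if value is more than k sample standard deviations from the mean of data."""
    mean = statistics.mean(data)
    # stdev() uses the n - 1 (sample) denominator; pstdev() is the population version.
    sd = statistics.stdev(data)
    return abs(value - mean) > k * sd

# Illustrative data only (mean is 5, sample SD is about 2.14).
values = [2, 4, 4, 4, 5, 5, 7, 9]
print(is_outside_k_sd(8, values))  # 8 is 3 away from the mean
print(is_outside_k_sd(6, values))  # 6 is 1 away from the mean
```

Whether the sample or the population standard deviation is appropriate depends on whether the set is a sample or the whole population, so the choice of `stdev` here is an assumption.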

6 answers

Calculate the Cumulative Distribution Function (CDF) in Python

How can I calculate the Cumulative Distribution Function (CDF) in Python? I want to calculate it from an array of points I have (a discrete distribution), not from the continuous distributions that, for example, scipy has.

python numpy machine-learning statistics scipy

asked Jul 16 '14 at 18:36

wizbcn

1,064
1
12
19
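
For a CDF built from an array of observed points, the usual object is the empirical CDF: F(x) is the fraction of points less than or equal to x. The following is a standard-library sketch only (the name `make_ecdf` and the sample points are hypothetical), not the approach from any particular answer:

```python
from bisect import bisect_right

def make_ecdf(samples):
    """Return a function F where F(x) is the fraction of samples <= x."""
    ordered = sorted(samples)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one sample")

    def ecdf(x):
        # bisect_right counts how many sorted values are <= x.
        return bisect_right(ordered, x) / n

    return ecdf

# Illustrative data only.
points = [3, 1, 2, 2, 5]
F = make_ecdf(points)
print(F(0), F(2), F(5))
```

The same idea carries over to NumPy arrays by sorting once and counting values at or below each query point.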

3 answers

Multivariate time series modelling in R

I want to fit some sort of multivariate time series model using R. Here is a sample of my data: u cci bci cpi gdp dum1 dum2 dum3 dx 16.50 14.00 53.00 45.70 80.63 0 0 1 6.39 17.45 16.00 64.00 …

r statistics time-series

asked Nov 11 '09 at 10:19

Karl

5,573
8
50
73

8 answers

How to generate distributions given mean, SD, skew and kurtosis in R?

Is it possible to generate distributions in R for which the Mean, SD, skew and kurtosis are known? So far it appears the best route would be to create random numbers and transform them accordingly. If there is a package tailored to generating…

r statistics skew frequency-distribution

asked Jan 26 '11 at 16:56

Aaron B

2 answers

Why does scikit-learn say F1 score is ill-defined with FN bigger than 0?

I run a Python program that calls sklearn.metrics's methods to calculate precision and F1 score. Here is the output when there is no predicted sample: /xxx/py2-scikit-learn/0.15.2-comp6/lib/python2.6/site-packages/sklearn/metrics/metrics.py:1771:…

python machine-learning statistics scikit-learn

asked Jan 13 '16 at 02:56

Tim
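
One way to see where the "ill-defined" case in the question above comes from: when nothing is predicted as positive, TP + FP is zero, so precision is 0/0, and an F1 computed as the harmonic mean of precision and recall inherits that problem even if FN is greater than 0. The sketch below is a plain-Python illustration, not scikit-learn's implementation; the function name and the labels are hypothetical.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, f1), using None where a ratio would be 0/0."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    # Precision is undefined when nothing was predicted positive (tp + fp == 0).
    precision = tp / (tp + fp) if (tp + fp) else None
    # Recall is undefined when there are no true positives in the data (tp + fn == 0).
    recall = tp / (tp + fn) if (tp + fn) else None

    if precision is None or recall is None:
        f1 = None  # harmonic mean of an undefined quantity
    elif precision + recall == 0:
        f1 = 0.0   # both defined and both zero; 0.0 is the usual convention
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# No sample is predicted positive, although two true positives exist (FN = 2).
print(precision_recall_f1([1, 1, 0, 0], [0, 0, 0, 0]))
```

Libraries typically replace the undefined value with a fixed number and emit a warning, which is the kind of message shown in the question.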

2 answers

Python pandas returns empty correlation matrix

I am running Python 2.7.6, pandas 0.13.1. I am unable to compute a correlation matrix from a DataFrame, and I'm not sure why. Here is my example DataFrame (foo): A B C 2011-10-12 0.006204908…

python pandas dataframe statistics correlation

asked Mar 18 '14 at 13:44

Max

1,670
1
12
17

6 answers

Function to calculate R2 (R-squared) in R

I have a dataframe with observed and modelled data, and I would like to calculate the R2 value. I expected there to be a function I could call for this, but can't locate one. I know I can write my own and apply it, but am I missing something…

r function statistics

asked Dec 01 '16 at 02:05

Esme_

1,360
3
18
30
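
The question above asks about R, but the quantity itself is easy to state independently of language. Assuming R2 here means the coefficient of determination, 1 - SS_res / SS_tot (some people instead mean the squared correlation between observed and modelled values, which can differ), a minimal Python sketch with a hypothetical `r_squared` helper looks like this:

```python
def r_squared(observed, modelled):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(modelled) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: squared errors of the model.
    ss_res = sum((o - m) ** 2 for o, m in zip(observed, modelled))
    # Total sum of squares: squared deviations of the observations from their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined for constant observations")
    return 1 - ss_res / ss_tot

# Illustrative data only.
obs = [1.0, 2.0, 3.0, 4.0]
mod = [1.1, 1.9, 3.2, 3.8]
print(r_squared(obs, mod))
```

With this definition the value can be negative when the model fits worse than simply predicting the mean of the observations.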

5 answers

Meaning of X = X[:, 1] in Python

I am studying this snippet of Python code. What does X = X[:, 1] mean in the last line? def linreg(X,Y): # Running the linear regression X = sm.add_constant(X) model = regression.linear_model.OLS(Y, X).fit() a = model.params[0] b…

python statistics

asked Nov 03 '15 at 05:05

Taewan

1,167
4
15
25

3 answers

predict.lm() in a loop. warning: prediction from a rank-deficient fit may be misleading

This R code throws a warning # Fit regression model to each cluster y <- list() length(y) <- k vars <- list() length(vars) <- k f <- list() length(f) <- k for (i in 1:k) { vars[[i]] <- names(corc[[i]][corc[[i]]!= "1"]) f[[i]] <-…

r statistics linear-regression lm

asked Oct 25 '14 at 01:56

Mahsa

3 answers

R Random Forests Variable Importance

I am trying to use the random forests package for classification in R. The Variable Importance Measures listed are: mean raw importance score of variable x for class 0 mean raw importance score of variable x for class…

r statistics data-mining random-forest

asked Apr 10 '09 at 02:18

thirsty93

2,602
6
26
26

4 answers

Computing cross-correlation function?

In R, I am using ccf or acf to compute the pair-wise cross-correlation function so that I can find out which shift gives me the maximum value. From the looks of it, R gives me a normalized sequence of values. Is there something similar in Python's…

python r statistics numpy scipy

asked Aug 09 '11 at 04:46

Legend

113,822
119
272
400
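
As a rough illustration of the task in the question above (finding the shift with the largest normalized cross-correlation), here is a standard-library Python sketch. The function names are hypothetical, the lag convention (pairing x[t + lag] with y[t]) is an assumption, and the output has not been checked against R's ccf.

```python
from math import sqrt

def cross_correlation(x, y, max_lag):
    """Return {lag: r}, where r correlates x[t + lag] with y[t], normalized to [-1, 1]."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    # Normalizing by the full-series variances keeps every value within [-1, 1].
    norm = sqrt(sum(v * v for v in dx) * sum(v * v for v in dy))
    if norm == 0:
        raise ValueError("cross-correlation is undefined for a constant series")
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for t in range(n):
            s = t + lag
            if 0 <= s < n:  # only overlapping positions contribute
                total += dx[s] * dy[t]
        result[lag] = total / norm
    return result

def best_lag(x, y, max_lag):
    """Return the lag with the largest cross-correlation."""
    ccf = cross_correlation(x, y, max_lag)
    return max(ccf, key=ccf.get)

# Illustrative data only: y is x delayed by one step.
x = [0, 0, 1, 0, 0, 0]
y = [0, 0, 0, 1, 0, 0]
print(best_lag(x, y, 2))
```

Under this convention a negative best lag means that x leads y. For long series an FFT-based or library routine would be faster than this quadratic loop.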

5 answers

plotting a histogram on a Log scale with Matplotlib

I have a Pandas DataFrame that has the following values in a Series x = [2, 1, 76, 140, 286, 267, 60, 271, 5, 13, 9, 76, 77, 6, 2, 27, 22, 1, 12, 7, 19, 81, 11, 173, 13, 7, 16, 19, 23, 197, 167, 1] I was instructed to plot two histograms in a…

python pandas numpy matplotlib statistics

asked Dec 16 '17 at 21:41

Tommy

5 answers

How do I do an F-test in Python?

How do I do an F-test to check if the variance is equivalent in two vectors in Python? For example if I have a = [1,2,1,2,1,2,1,2,1,2] b = [1,3,-1,2,1,5,-1,6,-1,2] is there something similar to scipy.stats.ttest_ind(a, b) I found sp.stats.f(a,…

python statistics

asked Feb 01 '14 at 04:39

DrewH

1,657
3
14
10

2 answers

Python p-value from t-statistic

I have some t-values and degrees of freedom and want to find the p-values from them (it's two-tailed). In the real world I would use a t-test table in the back of a Statistics textbook; how do I do the equivalent in Python? e.g. t-lookup(5, 7) =…

python scipy statistics

asked Jul 09 '13 at 23:19

Andrew Latham

5,982
14
47
87

5 answers

How to get GitHub Clone stats?

There used to be a "Clones" sub-tab in the "Stats & Graphs" tab of GitHub (for example https://github.com/TeamMentor/TeamMentor-Documentation/graphs/impact) but that is gone. Is there another way to get these stats? It would be great if we could get…

github statistics

asked Apr 07 '12 at 17:23

Dinis Cruz

4,161
2
31
49
