Questions tagged [statistics]

Consider whether your question would be better asked at https://stats.stackexchange.com. Statistics is the mathematical study of using probability to infer characteristics of a population from a limited number of samples or observations.

Statistics is the scientific study of the collection, analysis, interpretation, presentation, and organization of data. Numerous programming languages provide support for implementing statistical techniques.

Consider whether your question would be better asked at Cross Validated, a Stack Exchange site for probability, statistics, data analysis, data mining, experimental design, and machine learning. Stack Overflow questions on statistics should be about implementation and programming problems, not about theoretical discussions of statistics or research design. Therefore, this tag should never be used alone but always in combination with a tag for a specific programming language.

16319 questions
4
votes
2 answers

Dynamic Programming: Number of ways to get at least N bubble sort swaps?

Let's say I have an array of elements for which a total ordering exists. The bubble sort distance is the number of swaps that it would take to sort the array if I were using a bubble sort. What is an efficient (will probably involve dynamic…
dsimcha
  • 67,514
  • 53
  • 213
  • 334
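
The number of swaps a bubble sort performs equals the number of inversions in the array, so one reading of this question is "how many permutations of n distinct elements have at least k inversions?". The sketch below assumes that reading and assumes all elements are distinct; the function names are illustrative, not taken from the question.

```python
def inversion_counts(n):
    """counts[j] = number of permutations of n distinct items with exactly j inversions."""
    counts = [1]  # the empty permutation has zero inversions
    for size in range(1, n + 1):
        max_inv = size * (size - 1) // 2
        # Prefix sums let us add a window of up to `size` previous entries in O(1).
        prefix = [0]
        for c in counts:
            prefix.append(prefix[-1] + c)
        new = [0] * (max_inv + 1)
        for j in range(max_inv + 1):
            # Inserting the new largest item creates between 0 and size-1 inversions.
            hi = min(j, len(counts) - 1)
            lo = max(j - (size - 1), 0)
            if lo <= hi:
                new[j] = prefix[hi + 1] - prefix[lo]
        counts = new
    return counts


def at_least_swaps(n, k):
    """Number of permutations of n distinct items needing at least k bubble sort swaps."""
    return sum(inversion_counts(n)[max(k, 0):])
```

The table has n(n-1)/2 + 1 entries per size, so the whole computation takes O(n^3) time with the prefix-sum window.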
4
votes
4 answers

R: how can I create a table with mean and sd according to experimental group alongside p-values?

I know how I can do all that for individual variables but I need to report this information for a large number of variables and would like to know if there is an efficient way to do this.
Mike
  • 455
  • 2
  • 7
  • 9
4
votes
2 answers

API for NCAA 2011/12 men's basketball team and player stats

I am looking for an existing API or a method to obtain NCAA Men's Basketball player and team stats. I have failed to find anything that is easy to use or up-to-date. Any suggestions out there?
sholmes
  • 195
  • 2
  • 8
4
votes
2 answers

Ideas for GPU implementation of Hoeffding's "D" (Dependence) coefficient?

I am trying to come up with a very fast algorithm for calculating this very interesting statistic that takes full advantage of the capabilities of a powerful GPU. Ideally I will do this in Matlab using Jacket, but other ideas in CUDA code or OpenCL…
Doodles
  • 195
  • 1
  • 2
  • 7
4
votes
2 answers

inverse of a cdf

I would like to compute the inverse cumulative distribution function (inverse cdf) of a given pdf. The pdf is directly given as a histogram, i.e., a vector of N equally spaced components. My current approach is to do: cdf = cumsum(pdf); K = 3; %// some…
nbonneel
  • 3,286
  • 4
  • 29
  • 39
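
One common way to invert a CDF built from a histogram is to take the cumulative sum, locate the bin where it first reaches the target probability, and interpolate linearly inside that bin. The sketch below assumes bin i covers the interval [i, i+1) in bin units, which the question does not state; `inverse_cdf` is an illustrative name, and the caller would rescale the result to the real axis.

```python
from bisect import bisect_left
from itertools import accumulate


def inverse_cdf(pdf, u):
    """Return the fractional bin position x with CDF(x) = u, for 0 <= u <= 1."""
    total = sum(pdf)
    cdf = [c / total for c in accumulate(pdf)]  # normalised cumulative sum
    # First bin whose cumulative mass reaches u (clamped against rounding error).
    i = min(bisect_left(cdf, u), len(cdf) - 1)
    prev = cdf[i - 1] if i > 0 else 0.0
    mass = cdf[i] - prev
    # Linear interpolation inside the bin; an empty bin contributes no offset.
    frac = (u - prev) / mass if mass > 0 else 0.0
    return i + frac
```

For example, `inverse_cdf([1, 2, 1], 0.5)` lands in the middle of the second bin, at position 1.5.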
4
votes
1 answer

Getting MQ queue statistics in Java

From my application I need to query some Websphere MQ per-queue statistics (last message get/put time, number of en/dequeued messages, current queue depth, number of connected clients). I managed to get the queue depth via PCFAgent, but I'm kind of…
CAFxX
  • 28,060
  • 6
  • 41
  • 66
4
votes
2 answers

Open-source / free statistical engine for .NET / C# projects?

I have an ASP.NET C# web application that requires some statistical functions. So far I've handwritten them myself; however, as the statistical side expands, I'd rather reuse an open-source or free statistical engine as a library. I looked at R and…
DeepSpace101
  • 13,110
  • 9
  • 77
  • 127
4
votes
1 answer

How to use the GSL implementation of the Pearson correlation coefficient?

I have two vectors of floats, x and y, and I want to compute the Pearson correlation coefficient. As I have to do it on a lot of data (for instance 10 million different vectors x and 20 thousand different vectors y), I am using C++, and more…
tflutre
  • 3,354
  • 9
  • 39
  • 53
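
For reference when checking results from a library, the Pearson coefficient is the covariance of x and y divided by the product of their standard deviations. The sketch below is a plain two-pass version in Python, not the GSL implementation; `pearson` is an illustrative name.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of centred products; the 1/(n-1) factors cancel in the ratio.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return sxy / math.sqrt(sxx * syy)
```

When the same x is correlated against many y vectors, the mean and centred sum of squares of x can be computed once and reused.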
4
votes
3 answers

Calculate interquartile mean from Ruby array?

I have this array: [288.563044, 329.835918, 578.622569, 712.359026, 866.614253, 890.066321, 1049.78037, 1070.29897, 2185.443662, 2492.245562, 4398.300227, 13953.264379] How do I calculate the interquartile mean from this? That Wikipedia link…
Shpigford
  • 24,748
  • 58
  • 163
  • 252
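
The interquartile mean is the mean of the middle 50% of the sorted values: the lowest and highest quarters are discarded. The sketch below is in Python rather than Ruby and uses the usual fractional weighting for the boundary values when the length is not a multiple of four; `interquartile_mean` is an illustrative name.

```python
import math


def interquartile_mean(values):
    """Mean of the middle half of the data, weighting boundary values fractionally."""
    data = sorted(values)
    n = len(data)
    if n == 0:
        raise ValueError("need at least one value")
    if n == 1:
        return float(data[0])
    q = n / 4              # number of observations trimmed from each end
    k = math.floor(q)      # observations dropped entirely from each end
    w = 1 - (q - k)        # weight of the two boundary observations
    inner = data[k + 1 : n - k - 1]
    total = w * (data[k] + data[n - k - 1]) + sum(inner)
    return total / (n / 2)


sample = [288.563044, 329.835918, 578.622569, 712.359026, 866.614253,
          890.066321, 1049.78037, 1070.29897, 2185.443662, 2492.245562,
          4398.300227, 13953.264379]
iqm = interquartile_mean(sample)  # mean of the six middle values
```

With the twelve values from the question, three are dropped from each end and the remaining six are averaged, giving approximately 1129.0938.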
4
votes
3 answers

JavaScript equivalent for the inverse normal function, e.g. Excel's NORMSINV() or NORMINV()?

I'm trying to convert something from my Excel spreadsheets into JavaScript and came across the NORMSINV() macro in my spreadsheets. The NormSInv() is nicely documented at http://office.microsoft.com/en-us/excel-help/normsinv-HP005209195.aspx.…
DeepSpace101
  • 13,110
  • 9
  • 77
  • 127
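
As a reference for validating a port, Python's standard library exposes the same quantile function through `statistics.NormalDist.inv_cdf` (Python 3.8+). The sketch below is in Python, not JavaScript; the wrapper names `norm_s_inv` and `norm_inv` are illustrative and only mirror the Excel function names.

```python
from statistics import NormalDist


def norm_s_inv(p):
    """Quantile of the standard normal distribution, like Excel's NORMSINV(p)."""
    return NormalDist().inv_cdf(p)


def norm_inv(p, mean, sd):
    """Quantile of a normal distribution with the given mean and standard deviation."""
    return NormalDist(mu=mean, sigma=sd).inv_cdf(p)
```

`inv_cdf` raises `statistics.StatisticsError` when p is not strictly between 0 and 1, so a port should decide how to treat those inputs.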
4
votes
6 answers

SQL Top 10 Sales Every Month

Greetings all. I have a table in a SQL Server 2008 Express database, let's call it tbl_Merchant, similar to the following: Merchant | Sales | Month Comp.1 100 1 Comp.2 230 1 Comp.3 120 1 Comp.1 200 2 Comp.2 130 2 Comp.3 240…
DragonZelda
  • 168
  • 1
  • 2
  • 12
4
votes
1 answer

R: Levelplot gives extraneous whitespace when range of row.values and column.values is small

When using levelplot (from the lattice package) in R, I noticed that there is extra whitespace around the edges of the graph if the range of column.values and row.values is small (e.g., the range is less than 1). This problem disappears if the range of…
solvingPuzzles
  • 8,541
  • 16
  • 69
  • 112
4
votes
3 answers

What to do when KMeans returns fewer than K clusters?

I've implemented K-Means in Java and have a bit of a head scratcher. I select my initial centroids by choosing a random value in each dimension within the range of values of the data points. I've run into cases where this results in one or more of…
bab
  • 183
  • 2
  • 8
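
One common remedy for an empty cluster is to reseed it with the data point that is farthest from the centroid it is currently assigned to. The sketch below shows a single centroid-update step with that rule; it is in Python rather than Java, it is one of several possible strategies rather than the asker's code, and it assumes there are at least as many points as clusters. The names `sq_dist` and `update_centroids` are illustrative.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def update_centroids(points, centroids):
    """One k-means update step that reseeds empty clusters instead of dropping them."""
    # Assign every point to the index of its nearest centroid.
    assigned = [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
                for p in points]
    used = set()  # points already taken as reseeds in this step
    new_centroids = []
    for j in range(len(centroids)):
        members = [p for p, a in zip(points, assigned) if a == j]
        if members:
            # Normal case: the centroid moves to the mean of its members.
            new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            continue
        # Empty cluster: reseed with the point farthest from its own centroid.
        candidates = [i for i in range(len(points)) if i not in used]
        far = max(candidates, key=lambda i: sq_dist(points[i], centroids[assigned[i]]))
        used.add(far)
        new_centroids.append(tuple(points[far]))
    return new_centroids
```

The reseeded point still counts toward its old cluster's mean in this step; the next assignment pass moves it to the new centroid.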
4
votes
1 answer

Accurate way of measuring overhead in kernel space

I recently implemented a security mechanism for Linux which hooks into system calls. Now I have to measure the overhead caused by it. The project requires comparing the execution time of typical Linux apps with and without the mechanism. By typical…
Łukasz Sowa
  • 1,287
  • 2
  • 11
  • 14
4
votes
1 answer

How to perform Hartley's test in R

I can find zero information on this. So if you have a web link or just know how to do it in R please let me know. Here is the one-way anova example from some stats textbook: summary(av1) Df Sum Sq Mean Sq F value Pr(>F) station …
tora0515
  • 2,479
  • 12
  • 33
  • 40
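
Hartley's test statistic, F_max, is the ratio of the largest group variance to the smallest, and it is normally applied to groups of equal size. The sketch below computes only the statistic, in Python rather than R; the critical value still has to come from an F_max table, and `hartley_fmax` is an illustrative name.

```python
from statistics import variance


def hartley_fmax(groups):
    """Ratio of the largest to the smallest sample variance across groups."""
    if len(groups) < 2:
        raise ValueError("need at least two groups")
    if len({len(g) for g in groups}) != 1:
        raise ValueError("Hartley's test assumes equal group sizes")
    variances = [variance(g) for g in groups]  # sample variances (n - 1 denominator)
    if min(variances) == 0:
        raise ValueError("a group with zero variance makes the ratio undefined")
    return max(variances) / min(variances)
```

A value close to 1 suggests similar variances across groups; larger values are compared against the tabulated critical value for the number of groups and the per-group degrees of freedom.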