Questions tagged [hypothesis-test]

Functions used to choose between competing hypotheses about one or more probability distributions. For statistical questions, please use stats.stackexchange.com.

Common hypothesis tests include the one-sample and paired t-tests for means, the z-test (which approximates the t-test for large samples), the F-test for differences in variance, the chi-square test for independence, and Fisher's exact test for differences in proportion.

Please note that this tag is unrelated to software testing, which has its own tag.

349 questions
5 votes, 2 answers

Statistical tests: how do (perception; actual results; and next) interact?

What is the interaction between perception, outcome, and outlook? I've brought them into categorical variables to [potentially] simplify things. import pandas as pd import numpy as np high, size = 100, 20 df = pd.DataFrame({'perception':…
A T
  • 13,008
  • 21
  • 97
  • 158
5 votes, 1 answer

p-value from fisher.test() does not match phyper()

Fisher's exact test is related to the hypergeometric distribution, and I would expect these two commands to return identical p-values. Can anyone explain what I'm doing wrong that they do not match? #data (variable names chosen to match…
R-Peys
  • 123
  • 1
  • 9
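The link between Fisher's exact test and the hypergeometric distribution mentioned in this question can be sketched in plain Python. The question's own code is truncated, so this is only an illustration under assumptions: the 2x2 table `[[3, 1], [1, 3]]` and the function names are hypothetical. Two things commonly make such comparisons disagree: an upper tail must include the observed table (in R, `phyper(q, ..., lower.tail = FALSE)` returns P(X > q), not P(X >= q)), and `fisher.test()` is two-sided by default.

```python
from math import comb

def hypergeom_pmf(k, M, n, N):
    """P(X = k) when drawing N items without replacement from M items, n of them successes."""
    return comb(n, k) * comb(M - n, N - k) / comb(M, N)

def fisher_exact_2x2(a, b, c, d):
    """Return (one-sided 'greater' p-value, two-sided p-value) for the table [[a, b], [c, d]]."""
    M = a + b + c + d          # grand total
    n = a + b                  # first-row total
    N = a + c                  # first-column total
    lo, hi = max(0, N - (M - n)), min(n, N)   # support of the top-left cell
    pmf = {k: hypergeom_pmf(k, M, n, N) for k in range(lo, hi + 1)}
    # Upper tail that includes the observed table: P(X >= a).
    p_greater = sum(p for k, p in pmf.items() if k >= a)
    # Two-sided: add up every table that is no more probable than the observed one.
    p_obs = pmf[a]
    p_two_sided = sum(p for p in pmf.values() if p <= p_obs * (1 + 1e-7))
    return p_greater, p_two_sided

# Hypothetical example table, not taken from the question.
p_greater, p_two_sided = fisher_exact_2x2(3, 1, 1, 3)
print(p_greater, p_two_sided)
```

The small relative tolerance in the two-sided sum guards against floating-point ties between equally probable tables.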
5 votes, 3 answers

Is there an Anderson-Darling implementation for python that returns p-value?

I want to find the distribution that best fits some data. This would typically be some sort of measurement data, for instance force or torque. Ideally I want to run Anderson-Darling with multiple distributions and select the distribution with the…
5 votes, 1 answer

Peacock test implementation

I would like to compare two 2D distributions statistically. Thus I would like to use the Peacock test (a 2D analogue of the Kolmogorov-Smirnov test). There is an R package called Peacock.test which claims to implement it. But the documentation is…
MassCorr
  • 349
  • 1
  • 8
5 votes, 1 answer

Poorly implemented two-sample Kolmogorov-Smirnov test (kstest2) in Matlab?

Am I missing something obvious, or is Matlab's kstest2 giving very poor p-values? By very poor I mean that I suspect it is even wrongly implemented. The help page of kstest2 states that the function calculates an asymptotic p-value,…
rozsasarpi
  • 1,621
  • 20
  • 34
5 votes, 1 answer

Multiple T-test in R

I have 94 variables (sample + proteins + group) and 172 observations in a matrix as: Sample Protein1 Protein2 ... Protein92 Group 1 1.53 3.325 ... 5.63 0 2 2.32 3.451 ... 6.32 0 . . . 103 3.24 …
PrincessJellyfish
  • 149
  • 1
  • 1
  • 9
4 votes, 1 answer

R: Adding constant in linear combination, glht()

So I am trying to replicate a Stata function I saw in Principles of Econometrics, by Hill, Griffiths and Lim. The function I want to replicate looks like this in Stata: lincom _cons + b_1 * [arbitrary value] - c This is to the null hypothesis H0 :…
im2wddrf
  • 551
  • 2
  • 5
  • 19
4 votes, 0 answers

T-test with confidence other than 0.95 using HypothesisTests.jl

When I call the following example I receive a pretty report, but with confidence equal to 95% specifically: julia> OneSampleTTest([1,2,3], 2) One sample t-test ----------------- Population details: parameter of interest: Mean value under…
Luke
  • 1,369
  • 1
  • 13
  • 37
4 votes, 1 answer

Why does SciPy return `nan` for a t-test with samples with 0 variance?

I am using SciPy in Python and the following returns a nan value for whatever reason: >>>stats.ttest_ind([1, 1], [1, 1]) Ttest_indResult(statistic=nan, pvalue=nan) >>>stats.ttest_ind([1, 1], [1, 1, 1]) Ttest_indResult(statistic=nan,…
under_the_sea_salad
  • 1,754
  • 3
  • 22
  • 42
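A hand-computed pooled t statistic shows where a `nan` like the one in this question can come from. This is a minimal sketch of the textbook formula, not SciPy's actual implementation; the function name `pooled_t_statistic` is hypothetical. With identical constant samples both the mean difference and the standard error are zero, so the statistic is 0/0, which IEEE 754 floating-point arithmetic defines as `nan`.

```python
import math

def pooled_t_statistic(x, y):
    """Student's two-sample t statistic with pooled variance (textbook formula)."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    ssx = sum((v - mx) ** 2 for v in x)       # sum of squared deviations, sample x
    ssy = sum((v - my) ** 2 for v in y)       # sum of squared deviations, sample y
    pooled_var = (ssx + ssy) / (nx + ny - 2)
    se = math.sqrt(pooled_var * (1 / nx + 1 / ny))
    diff = mx - my
    if se == 0.0:
        # Plain Python would raise ZeroDivisionError here. IEEE 754 arithmetic
        # gives nan for 0/0 and a signed infinity for nonzero/0, so mimic that.
        return math.nan if diff == 0.0 else math.copysign(math.inf, diff)
    return diff / se

print(pooled_t_statistic([1, 1], [1, 1, 1]))        # zero variance, zero difference
print(pooled_t_statistic([1, 2, 3], [2, 3, 4]))     # ordinary case
```

With zero variance in both samples the data carry no information about sampling variability, so an undefined statistic is arguably the honest answer.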
4 votes, 1 answer

Multivariate K-S test in R

So we can run a K-S test to assess if we have a difference in the distribution of two datasets, as outlined here. So let's take the following data set.seed(123) N <- 1000 var1 <- runif(N, min=0, max=0.5) var2 <- runif(N, min=0.3, max=0.7) var3 <-…
lukeg
  • 1,327
  • 3
  • 10
  • 27
3 votes, 1 answer

pydantic custom hypothesis build

Problem in a nutshell: I am having issues with the hypothesis build strategy and custom pydantic data types (no values are returned when invoking the build strategy on my custom data type). Problem in more detail: Given the following pydantic custom…
Josmoor98
  • 1,721
  • 10
  • 27
3 votes, 1 answer

Proportion Test: Z-test vs bootstrap/permutation - different results

I'm learning hypothesis testing, and going through the following example: The CEO of a large electric utility claims that 80 percent of his 1,000,000 customers are very satisfied with the service they receive. To test this claim, the local newspaper…
jafjaf
  • 33
  • 5
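The comparison this question describes, a z-test for a proportion against a resampling approach, can be sketched as follows. Only the null value of 80 percent comes from the excerpt; the sample (73 satisfied customers out of 100) is a hypothetical stand-in because the excerpt is truncated, and the function names are illustrative. The second function is a Monte Carlo simulation under the null hypothesis, which is one of several resampling schemes and not necessarily the one the asker used.

```python
import math
import random

def z_test_proportion(successes, n, p0):
    """Lower-tail z-test of H0: p = p0 against H1: p < p0 (normal approximation)."""
    p_hat = successes / n
    se = math.sqrt(p0 * (1 - p0) / n)                  # standard error under H0
    z = (p_hat - p0) / se
    p_value = 0.5 * (1 + math.erf(z / math.sqrt(2)))   # standard normal CDF at z
    return z, p_value

def simulated_p_value(successes, n, p0, reps=5_000, seed=0):
    """Share of samples simulated under H0 with a count at or below the observed one."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        simulated = sum(rng.random() < p0 for _ in range(n))
        if simulated <= successes:
            hits += 1
    return hits / reps

# Hypothetical sample: 73 of 100 surveyed customers are very satisfied.
z, p_z = z_test_proportion(73, 100, 0.80)
p_sim = simulated_p_value(73, 100, 0.80)
print(z, p_z, p_sim)
```

The two p-values need not agree closely: the z-test treats a discrete count as continuous and applies no continuity correction, while the simulation works with the count itself.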
3 votes, 3 answers

Jarque Bera Test with NA's

I want to perform a Jarque-Bera Test with the tseries package on a data.frame with about 200 columns but it doesn't work with NA values. My data.frame looks like this: d1 <- structure(list(Time=structure(17942:17947, class="Date"), …
Pogi93
  • 63
  • 6
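One way to handle missing values in a Jarque-Bera test is to drop them per column before computing the statistic. The sketch below shows the idea in plain Python, not with the tseries package from the question; the function name and the example columns are hypothetical. It uses the standard statistic JB = n/6 * (S^2 + (K - 3)^2 / 4) and the asymptotic chi-square distribution with 2 degrees of freedom, whose survival function is exp(-x/2).

```python
import math

def jarque_bera(values):
    """Jarque-Bera statistic and asymptotic p-value, ignoring None and NaN entries."""
    x = [v for v in values if v is not None and not math.isnan(v)]
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n     # second central moment
    m3 = sum((v - mean) ** 3 for v in x) / n     # third central moment
    m4 = sum((v - mean) ** 4 for v in x) / n     # fourth central moment
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    p_value = math.exp(-jb / 2)                  # chi-square (2 df) upper tail
    return jb, p_value

# Hypothetical columns containing missing values.
columns = {
    "A": [1.0, 2.0, 3.0, 4.0, 5.0, float("nan"), None],
    "B": [0.2, None, 0.9, 0.4, 3.1, 0.3, 0.5],
}
results = {name: jarque_bera(col) for name, col in columns.items()}
print(results)
```

Dropping values column by column means each test may use a different sample size, which is worth reporting alongside the p-values.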
3 votes, 2 answers

Keep multiple values of chisq.test in summarised tibble

I have grouped data I'm performing a chi-squared test on and would like to return a summary table that includes multiple values from the htest object. For example (from a previous question), library(dplyr) set.seed(1) foo <- data.frame( …
merv
  • 67,214
  • 13
  • 180
  • 245
3 votes, 1 answer

Check whether two coefficients in a regression differ in Python statsmodels

In R, the car::linearHypothesis function can be used to test the hypothesis that two coefficients are equal (that their difference differs significantly from zero). Here's an example from its documentation: linearHypothesis(mod.duncan, "income =…
Max Ghenis
  • 14,783
  • 16
  • 84
  • 132