T-test for multiple classes (>2)

Question

I have read the following sentence:

Functional MRI data are high dimensional compared to the number of samples (usually 50000 voxels for 1000 samples). In this setting, machine learning algorithm can perform poorly. However, a simple statistical test can help reducing the number of voxels.

The Student’s t-test (scipy.stats.ttest_ind) performs a simple statistical test that determines if two distributions are statistically different. It can be used to compare voxel timeseries in two different conditions (when houses or faces are shown in our case). If the timeserie distribution is similar in the two conditions, then the voxel is not very interesting to discriminate the condition.

This test returns p-values that represents probabilities that the two timeseries are drawn from the same distribution. The lower is the p-value, the more discriminative is the voxel.

From: http://nilearn.github.io/building_blocks/manipulating_mr_images.html

Can this t-test also be applied to 4 classes (conditions), and if so, how?

Is there a Matlab implementation of this available?

I guess the question should be restated as whether a t-test can be applied to n samples to test whether they all come from a single distribution. I don't know if there is a test for doing that. As ABC said, I think the direct, faster, but inelegant choice would be to run the test on all (n choose 2) pairs, and then set a binomial test threshold for deciding how many incorrect tests should be accepted under the 'YES' hypothesis... – Brethlosze, May 26 '15 at 00:12
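The pairwise approach described in this comment could be sketched as follows. This is only an illustration on made-up data for a single voxel: the `samples` dictionary and the class means are hypothetical, and a Bonferroni correction is swapped in for the binomial threshold the comment mentions, because it is simpler to state.

```python
from itertools import combinations

import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(1)

# Hypothetical single-voxel data: 4 conditions, 250 samples each,
# with condition means 0, 1, 2 and 3 (assumed for illustration only).
samples = {c: rng.normal(loc=float(c), size=250) for c in range(4)}

# All (n choose 2) pairs of conditions: 6 pairs for 4 conditions.
pairs = list(combinations(samples, 2))

# Bonferroni correction: split the 5% error budget across all pairwise tests.
alpha = 0.05 / len(pairs)

results = {}
for a, b in pairs:
    t_stat, p_value = ttest_ind(samples[a], samples[b])
    results[(a, b)] = (p_value, p_value < alpha)

for pair, (p_value, rejected) in results.items():
    print(pair, f"p={p_value:.3g}", "different" if rejected else "not significant")
```

The number of tests grows quadratically with the number of conditions, which is one reason a single omnibus test is usually preferred as a first step.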

Ryan J. Smith · Accepted Answer · 2015-05-26T03:17:19.457

You need to perform an ANOVA (Analysis of Variance) test for each of the voxels.

From the above linked Wikipedia page:

In its simplest form, ANOVA provides a statistical test of whether or not the means of several groups are equal, and therefore generalizes the t-test to more than two groups

The question asks you to identify voxels whose signal changes significantly depending on the condition, which is exactly what ANOVA tests for.

This can be implemented in MATLAB using anova1 (documentation).
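For readers working in Python, as in the nilearn text quoted in the question, a rough equivalent is scipy.stats.f_oneway. The sketch below runs on synthetic data: the array `X`, the `labels` vector, the sizes, and the choice of voxel 3 as the informative one are all assumptions made for illustration.

```python
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(0)
n_per_class, n_voxels, n_classes = 250, 20, 4

# Synthetic stand-in for fMRI data: X has shape (n_samples, n_voxels),
# labels holds the condition (0..3) of each sample.
labels = np.repeat(np.arange(n_classes), n_per_class)
X = rng.normal(size=(labels.size, n_voxels))

# Make one voxel (index 3) depend on the condition; the rest are pure noise.
X[:, 3] += 1.0 * labels

# One array per condition, each of shape (n_per_class, n_voxels).
groups = [X[labels == c] for c in range(n_classes)]

# f_oneway works along axis 0 by default, so this runs one ANOVA per voxel
# and returns arrays of length n_voxels.
F, p = f_oneway(*groups)

best = int(np.argmin(p))
print("most discriminative voxel:", best)
```

As with the t-test, a lower p-value marks a more discriminative voxel, so the p-values can be thresholded or ranked to select voxels. With 50000 voxels, some correction for multiple comparisons is advisable before interpreting individual p-values.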

score 0 · Answer 2 · answered May 26 '15 at 00:03

T-tests compare only 2 distributions. I would actually suggest you perform a z-test instead, since you have a large number of points. The standard critical z-score to compare against is 1.96: a |z| above it means the difference is significant at the 5% level for a two-sided test (2.5% for a one-sided test).

I'm not sure if it's available somewhere in the interwebs, but I'm willing to bet it is. If not, it's really easy to implement and shouldn't take you long to do by hand, especially in MATLAB.

answered May 26 '15 at 00:03

ABC


The z-test cannot be applied when the parameters of the normal distribution (in particular its variance) are unknown... for that reason the t-test is used instead! – Brethlosze May 26 '15 at 00:07
Well, above a certain number of input examples it doesn't matter whether you use a t-test or a z-test. Unless what is happening is that they are comparing each example against the other examples (which was not obvious to me from the text?). To me it sounded like within a set of input data (say the example provided, 1000 examples) there are 4 classes, so the data is labelled. The idea is to see if the data within each of these classes (say 250 per class) come from different distributions. I guess clarification would be needed. – ABC May 26 '15 at 00:40
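The point made in these two comments can be checked numerically by comparing the two-sided 5% critical values of the normal and Student t distributions. The sketch below assumes two groups of 250 samples each, giving 498 degrees of freedom for a pooled two-sample t-test; the smaller values of `df` are included only for contrast.

```python
from scipy.stats import norm, t

# Two-sided 5% critical value of the standard normal (the familiar 1.96).
z_crit = norm.ppf(0.975)

# Matching critical values of Student's t for several degrees of freedom.
# 498 = 250 + 250 - 2, i.e. two groups of 250 samples (assumed sizes).
t_crits = {df: t.ppf(0.975, df) for df in (4, 30, 498)}

print(f"z critical value: {z_crit:.4f}")
for df, t_crit in t_crits.items():
    print(f"t critical value (df={df}): {t_crit:.4f}, excess over z: {t_crit - z_crit:.4f}")
```

For small samples the t critical value is noticeably larger than 1.96, while at a few hundred samples per group the two are nearly identical, which is why the choice between the tests matters little at that size.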

Brethlosze · Answer 3 · 2015-05-26T16:57:17.670

This is precisely the focus of Hotelling's T² test, which is the multivariate version of Student's t-test. In this case, every sample is treated as a point inside a single multivariate sample.

Check here for the theoretical explanation.

Here, p is the number of samples taken (in this case, 4), and n (the degrees of freedom) is the size of each sample (in this case, the length of the time series). The p parameter plays a role similar to that of the n degrees of freedom in Student's t-test.

Its MATLAB implementation is here.

Cheers...
