I am fairly new to programming, so forgive me if I give too little information. I have a data frame that looks something like this:

Diagnosis Value Brainregion
NC 2 region_a
NC 3 region_b
BD 4 region_a
BD 5 region_b

I would like to perform a permutation test between same brain regions of different diagnoses (to clarify: mean value of region_a in BD vs mean value of region_a in NC, mean value of region_b in BD vs mean value of region_b in NC and so on).

I would like code that does this in one step for every region.

I tried adapting the method described in "Multiple groups tests via permutation", but I can't get it to work as intended.

Can someone please help me?

P.S. I have another version of the same data frame in wide format, in case it is more useful:

Diagnosis Region_a Region_b
NC 2 3
BD 4 5
rinnegab
  • What is your expected output? – Maël Sep 04 '22 at 08:34
  • I expect the code to perform n permutation t-tests (where n is the number of regions) to get n p-values. To be clearer, the code should compare the mean value of region_a for diagnosis=scz with the mean value of region_a for diagnosis=nc, looping over each region – rinnegab Sep 04 '22 at 10:27

1 Answer

Use by() to split the data by region and run t.test() on each subset.

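# Split dat by region and run t.test() (Welch's t-test by default) on each subset
by(dat, dat$region, \(x) {
  tt <- with(x, t.test(value ~ diagnosis))
  # One row per region: region label, t statistic, p-value, groups compared
  data.frame(region=el(as.character(x$region)), tt[c('statistic', 'p.value')],
             hypothesis=toString(unique(x$diagnosis)))
}) |> do.call(what=rbind)  # stack the per-region rows into one data frame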
#   region statistic    p.value hypothesis
# a      a  1.628979 0.23082956     NC, BD
# b      b -2.813154 0.05840455     NC, BD
# c      c  1.030808 0.36206117     NC, BD

Data:

dat <- structure(list(diagnosis = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L), levels = c("NC", 
"BD"), class = "factor"), region = structure(c(1L, 1L, 2L, 2L, 
3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L), levels = c("a", 
"b", "c"), class = "factor"), value = c(0.914806043496355, 0.937075413297862, 
0.286139534786344, 0.830447626067325, 0.641745518893003, 0.519095949130133, 
0.736588314641267, 0.13466659723781, 0.656992290401831, 0.705064784036949, 
0.45774177624844, 0.719112251652405, 0.934672247152776, 0.255428824340925, 
0.462292822543532, 0.940014522755519, 0.978226428385824, 0.117487361654639
)), out.attrs = list(dim = structure(2:3, names = c("diagnosis", 
"region")), dimnames = list(diagnosis = c("diagnosis=NC", "diagnosis=BD"
), region = c("region=a", "region=b", "region=c"))), row.names = c(NA, 
-18L), class = "data.frame")
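
Note that t.test() with its defaults is Welch's t-test, which is parametric and not a permutation test. If a permutation test is required, the same by() split can be combined with a label-shuffling test. What follows is only a minimal base R sketch under stated assumptions, not a definitive implementation: perm_test() is a hypothetical helper (not from any package), the test statistic is assumed to be the difference in group means, n_perm = 9999 is an arbitrary choice, and toy is simulated data with the same column layout as dat.

```r
# Hypothetical helper: two-sided permutation test on the difference in means.
perm_test <- function(value, group, n_perm = 9999) {
  group <- droplevels(as.factor(group))
  stopifnot(nlevels(group) == 2)
  # Test statistic: mean of the first level minus mean of the second level
  mean_diff <- function(g) {
    m <- tapply(value, g, mean)
    unname(m[1] - m[2])
  }
  observed <- mean_diff(group)
  # Null distribution: recompute the statistic after shuffling the group labels
  permuted <- replicate(n_perm, mean_diff(sample(group)))
  # Two-sided p-value; the +1 counts the observed labelling as one permutation
  p <- (sum(abs(permuted) >= abs(observed)) + 1) / (n_perm + 1)
  data.frame(mean_diff = observed, p.value = p)
}

# Simulated illustration data (assumption): 6 values per diagnosis and region
set.seed(42)  # arbitrary seed, only for reproducibility
toy <- expand.grid(diagnosis = c("NC", "BD"), region = c("a", "b", "c"), rep = 1:6)
toy$value <- runif(nrow(toy))

# One permutation test per region, stacked into a single data frame
res <- by(toy, toy$region, \(x) {
  data.frame(region = as.character(x$region[1]),
             perm_test(x$value, x$diagnosis))
}) |> do.call(what = rbind)
res
```

Replacing toy with dat should apply the same procedure to the example data above. Because the labels are shuffled at random, the p-values vary slightly between runs unless a seed is set.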
jay.sf
  • Thank you, I guess this solved it for me! One last quick question for clarification: is this performing a non-parametric permutation t-test? I need it to be a non-parametric test because I can't assume that the variable follows a Gaussian distribution. P.S.: sorry if I'm being unclear, I may need to revise my statistics knowledge – rinnegab Sep 04 '22 at 21:00
  • Hello, and thank you again for your answer, I found it extremely helpful. I have another problem with the same package; sorry to ask, but could you please take a look? Thank you – rinnegab Apr 19 '23 at 13:17