R. lapply multinomial test to list of dataframes

Question

I have a data frame A, which I split into a list of 100 data frames, each with 3 rows (in my real data, each data frame has 500 rows). Here I show A with 2 elements of the list (rows 1-3 and rows 4-6):

A <- data.frame(n = c(0, 1, 2, 0, 1, 2),
                prob = c(0.4, 0.5, 0.1, 0.4, 0.5, 0.1),
                count = c(24878, 33605, 12100 , 25899, 34777, 13765))

# This is the list:
nest <- split(A, rep(1:2, each = 3))

I want to apply the multinomial test to each of these data frames and extract the p-value of each test. So far I have done this:

library(EMT)

fun <- function(x){
  multinomial.test(x$count,
                   prob=x$prob,
                   useChisq = FALSE, MonteCarlo = TRUE,
                   ntrial = 100, # n of withdrawals accomplished
                   atOnce=100)
}

lapply(nest, fun)

However, I get:

 "Error in multinomial.test(x$counts_set, prob = x$norm_genome, useChisq = F,  : 
   Observations have to be stored in a vector, e.g.  'observed <- c(5,2,1)'"

Does anyone have a smarter way of doing this?

I don't get an error when I run your code. – Nairolf Nov 10 '18 at 05:31

score 1 · Accepted Answer · answered Nov 10 '18 at 05:10

split() names the elements of its result 1, 2, and so on. That's why x$count in fun cannot access it. To make it simpler, you can combine the split elements with the list function and then use lapply:

library(EMT)

n <- c(0,1,2,0,1,2)
prob <- c(0.4, 0.5, 0.1, 0.4, 0.5, 0.1)
count <- c(24878, 33605, 12100 , 25899, 34777, 13765)
A <- cbind.data.frame(n, prob, count)

nest = split(A,rep(1:2,each=3))

fun <- function(x){
  multinomial.test(x$count,
                   prob=x$prob,
                   useChisq = FALSE, MonteCarlo = TRUE,
                   ntrial = 100, # n of withdrawals accomplished
                   atOnce=100)
}

# Create a list of the split elements
new_list <- list(nest$`1`, nest$`2`)

lapply(new_list, fun)
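
Since the question asks for the p-value of each test, the list of results can be reduced to a numeric vector. The following is a minimal sketch, assuming the object returned by multinomial.test has a p.value element (the dplyr answer relies on the same element). The names p_value_of and pvals are hypothetical and chosen only for illustration. Here vapply is applied directly to the output of split, and the names 1 and 2 carry over to the result.

```r
library(EMT)

A <- data.frame(n = c(0, 1, 2, 0, 1, 2),
                prob = c(0.4, 0.5, 0.1, 0.4, 0.5, 0.1),
                count = c(24878, 33605, 12100, 25899, 34777, 13765))

nest <- split(A, rep(1:2, each = 3))

# Hypothetical helper: run the test on one data frame and return only the p-value
p_value_of <- function(x) {
  # Fail early if the expected columns are missing or not numeric
  stopifnot(is.numeric(x$count), is.numeric(x$prob))
  res <- multinomial.test(x$count,
                          prob = x$prob,
                          useChisq = FALSE, MonteCarlo = TRUE,
                          ntrial = 100,
                          atOnce = 100)
  res$p.value  # assumption: the result carries a p.value element
}

# One p-value per list element, returned as a named numeric vector
pvals <- vapply(nest, p_value_of, numeric(1))
pvals
```

The stopifnot check is only a guard: a misspelled column name makes x$count evaluate to NULL, so the check stops with a clear message before the test is called.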

score 1 · Answer 2 · answered Nov 10 '18 at 06:58

A solution with dplyr.

A = data.frame(n = c(0,1,2,0,1,2),
               prob = c(0.4, 0.5, 0.1, 0.4, 0.5, 0.1),
               count = c(43, 42, 9, 74, 82, 9))

library(EMT)
library(dplyr)
nest <- A %>%
  mutate(pattern = rep(1:2,each=3)) %>%
  group_by(pattern) %>%
  dplyr::summarize(mn_pvals = multinomial.test(count, prob)$p.value)
nest
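
The grouping index rep(1:2, each = 3) is hard-coded for the toy data. For the real data described in the question (100 blocks of 500 rows), the index could be derived from the number of rows instead. This is a rough sketch in base R; rows_per_block, block_id and blocks are hypothetical names, and it assumes the rows of each block are contiguous and that the row count is an exact multiple of the block size.

```r
A <- data.frame(n = c(0, 1, 2, 0, 1, 2),
                prob = c(0.4, 0.5, 0.1, 0.4, 0.5, 0.1),
                count = c(43, 42, 9, 74, 82, 9))

rows_per_block <- 3  # would be 500 for the real data in the question

# Assumption: nrow(A) is an exact multiple of the block size
stopifnot(nrow(A) %% rows_per_block == 0)

# 1, 1, 1, 2, 2, 2, ... one id per block of consecutive rows
block_id <- rep(seq_len(nrow(A) / rows_per_block), each = rows_per_block)

# Same structure as split(A, rep(1:2, each = 3)), but for any number of blocks
blocks <- split(A, block_id)
length(blocks)
```

The same block_id vector can be used as the pattern column in the dplyr pipeline in place of rep(1:2, each = 3).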

paoloeusebi


Hi @paoloeusebi, your solution also works great, but Vishesh posted a solution first. Thanks very much – Lucas Nov 10 '18 at 22:19
The most important things are that you have two solutions and I have learned something new. – paoloeusebi Nov 10 '18 at 22:21
Glad to hear that! – Lucas Nov 12 '18 at 18:12
