Here is a solution using base R. You can probably reuse some of the ideas in this answer to write something more concise. Let me know if this works for you!
# Create dataframe
df <- data.frame(Species = c("cat", "dog", "bird"),
year_2016 = c(14, 16, 10),
year_2017 = c(8, 12, 5),
stringsAsFactors = FALSE)
# Create columns to later convert to a matrix
df$absent <- 0
df$present <- df$year_2016 - df$year_2017
# Transpose the dataframe so that each species becomes a column for lapply
df_t <- t(df)
colnames(df_t) <- as.vector(df_t[1,])
df_t <- df_t[-1,]
class(df_t) <- "numeric"
# Use lapply to create matrices
matrix_list <- lapply(seq_len(ncol(df_t)), function(x) matrix(as.vector(df_t[, x]), 2, 2, byrow = TRUE))
names(matrix_list) <- colnames(df_t)
matrix_list
$cat
     [,1] [,2]
[1,]   14    8
[2,]    0    6

$dog
     [,1] [,2]
[1,]   16   12
[2,]    0    4

$bird
     [,1] [,2]
[1,]   10    5
[2,]    0    5
# Run fisher.test on every matrix and keep the results for later
tests <- lapply(matrix_list, fisher.test)
tests
$cat
Fisher's Exact Test for Count Data
data: X[[i]]
p-value = 0.01594
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
1.516139 Inf
sample estimates:
odds ratio
Inf
$dog
Fisher's Exact Test for Count Data
data: X[[i]]
p-value = 0.1012
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
0.7200866 Inf
sample estimates:
odds ratio
Inf
$bird
Fisher's Exact Test for Count Data
data: X[[i]]
p-value = 0.03251
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
1.195396 Inf
sample estimates:
odds ratio
Inf
And then, if you want the p-values, you could get them in a vector using sapply:
sapply(tests, "[[", "p.value")
       cat        dog       bird
0.01594203 0.10122358 0.03250774
EDIT: this is probably a slight improvement, and it is a little more concise. I can check how it scales with microbenchmark later today if you are concerned with performance (or you have a large number of tests to run). Also, remember to adjust those p-values for multiple testing, given all those tests ;). Also, @tmfmnk posted a great tidyverse solution if you prefer tidyverse over base.
# Create columns to later convert to a matrix
df$absent <- 0
df$present <- df$year_2016 - df$year_2017
df_t <- t(df[-1]) # transpose dataframe excluding column of species
# Use lapply to create the list of matrices
matrix_list <- lapply(seq_len(ncol(df_t)), function(x) matrix(as.vector(df_t[, x]), 2, 2, byrow = TRUE))
names(matrix_list) <- df$Species
# Running the fisher's test on every matrix
# in the list and extracting the p-values
tests <- lapply(matrix_list, fisher.test)
sapply(tests, "[[", "p.value")
       cat        dog       bird
0.01594203 0.10122358 0.03250774
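As a minimal sketch of the reminder above about correcting p-values when you run many tests, base R's p.adjust can be applied to the vector. The p-values here are copied from the output above so the snippet runs on its own; the choice of the Holm and Benjamini-Hochberg methods is my assumption for illustration, so pick whichever correction suits your analysis.

```r
# p-values copied from the sapply output above
p_values <- c(cat = 0.01594203, dog = 0.10122358, bird = 0.03250774)

# Holm's step-down correction (controls the family-wise error rate)
p_holm <- p.adjust(p_values, method = "holm")

# Benjamini-Hochberg correction (controls the false discovery rate)
p_bh <- p.adjust(p_values, method = "BH")

# Side-by-side comparison of raw and adjusted values
print(cbind(raw = p_values, holm = p_holm, BH = p_bh))
```

In your own session you could pass sapply(tests, "[[", "p.value") directly to p.adjust instead of retyping the values.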
Last EDIT: I was able to run them through microbenchmark and wanted to post the results for anyone who comes across this post in the future:
Unit: milliseconds
          expr    min     lq   mean median     uq    max neval
 tidyverse_sol 12.506 13.497 15.130 14.560 15.827 26.205   100
      base_sol  1.120  1.162  1.339  1.225  1.296  5.712   100
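If you want to reproduce a timing like this yourself, here is a rough sketch of how the base R half could be wrapped and benchmarked. The wrapper function base_sol is a hypothetical name chosen to match the label in the table, and this is not necessarily the exact code that produced the numbers above. It assumes the microbenchmark package is installed, and it leaves out the tidyverse solution because that code is not part of this answer.

```r
# Example data, same as at the top of the answer
df <- data.frame(Species = c("cat", "dog", "bird"),
                 year_2016 = c(14, 16, 10),
                 year_2017 = c(8, 12, 5),
                 stringsAsFactors = FALSE)

# Hypothetical wrapper around the concise base R solution
base_sol <- function(df) {
  df$absent <- 0
  df$present <- df$year_2016 - df$year_2017
  df_t <- t(df[-1])  # numeric matrix, one column per species
  matrix_list <- lapply(seq_len(ncol(df_t)),
                        function(x) matrix(as.vector(df_t[, x]), 2, 2, byrow = TRUE))
  names(matrix_list) <- df$Species
  sapply(lapply(matrix_list, fisher.test), "[[", "p.value")
}

# Only run the benchmark when microbenchmark is available
if (requireNamespace("microbenchmark", quietly = TRUE)) {
  print(microbenchmark::microbenchmark(base_sol = base_sol(df), times = 100))
}
```

To compare against another approach, wrap it in its own function and add it as a second named argument to microbenchmark(). Timings will differ by machine and package versions.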