I'm having trouble accurately defining my permutation design/hierarchy in the "permute" package in R.
Given a hypothetical set of plots in which I've recorded species occurrences, I'd like to shuffle species within plots while preserving both the number of species in each plot and the total number of occurrences of each species across all plots.
Ultimately, I'm trying to build a null distribution that is constrained both at the plot level (number of species per plot) and at the species-pool level (total observations of each species).
# build dataset representing the presence/absence of 10 species (columns)
# in 100 plots (rows)
set.seed(123)
dat <- matrix(
  sample(c(0, 1), size = 100 * 10, replace = TRUE, prob = c(0.75, 0.25)),
  nrow = 100,
  ncol = 10) # let this matrix represent the observed data
rowSums(dat) # represents the number of species present in each plot
colSums(dat) # represents the overall number of observations of each species
# proportion of occurrences of each species in the entire species pool
relative_abund <- colSums(dat) / sum(dat)
# use "permute" package to shuffle species in plots
# while maintaining the total number of species in each plot
# and the relative abundance of all species in the species pool
library(permute)
# single permutation of "plot # 1" maintaining number of species per plot
# (shuffle() returns a permutation of the indices 1..n)
dat[1, shuffle(ncol(dat))]
# single permutation maintaining total observations of "species # 1"
dat[shuffle(nrow(dat)), 1]
# goal: use a permutation design/control to produce a shuffled matrix
# ("permuted_dat", not yet defined) such that both of these hold:
# all(rowSums(permuted_dat) == rowSums(dat))
# all(colSums(permuted_dat) == colSums(dat)) # at least approximately
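
To make the target concrete, here is a minimal base-R sketch of one way to get a matrix with both margins fixed. It is not a "permute" design: it swaps in a simple checkerboard-swap procedure instead, and the function name `swap_permute` and its argument `n_attempts` are hypothetical names made up for this illustration. It assumes a strict 0/1 matrix and that 10,000 swap attempts are enough to mix the data, which has not been checked here.

```r
# Hypothetical helper (not part of "permute"): repeatedly pick two plots and
# two species; if the 2 x 2 submatrix is a "checkerboard" (1 0 / 0 1 or
# 0 1 / 1 0), flip it. Each flip leaves every row sum and column sum unchanged.
swap_permute <- function(m, n_attempts = 10000) {
  for (i in seq_len(n_attempts)) {
    r <- sample(nrow(m), 2) # two distinct plots (rows)
    k <- sample(ncol(m), 2) # two distinct species (columns)
    sub <- m[r, k]
    is_checkerboard <- sub[1, 1] == sub[2, 2] &&
      sub[1, 2] == sub[2, 1] &&
      sub[1, 1] != sub[1, 2]
    if (is_checkerboard) {
      m[r, k] <- 1 - sub # swap the 0s and 1s in the submatrix
    }
  }
  m
}

# same simulated presence/absence data as above
set.seed(123)
dat <- matrix(
  sample(c(0, 1), size = 100 * 10, replace = TRUE, prob = c(0.75, 0.25)),
  nrow = 100,
  ncol = 10)

permuted_dat <- swap_permute(dat)

all(rowSums(permuted_dat) == rowSums(dat)) # TRUE by construction
all(colSums(permuted_dat) == colSums(dat)) # TRUE by construction
```

Calling `swap_permute(dat)` repeatedly would give one null matrix per call. The "vegan" package also provides fixed-margin null models through `nullmodel()` and `simulate()` (for example the "quasiswap" method), which may be a better-tested route than this sketch.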