I am trying to create a null expectation in R. The dataset has four species and associated sample values of x.real and y.real. I want to build the null expectation by shuffling only x.real within each species, estimating the slope of y.real on the shuffled x.real, and repeating this 1000 times, say. Then I want to calculate the mean of the slope for each species.
Here is my attempt. I get this error:
Error in filter(., sp == "A") : object '*tmp*' not found
Any suggestions on how to correct this to get the desired output (shown after the code)?
set.seed(111)
library(truncnorm)
library(dplyr)   # %>%, filter(), summarise(), mutate(), bind_rows()
library(tibble)  # tibble(), add_row()
x.real <- rtruncnorm(n = 288, a = 0, b = 10, mean = 5, sd = 2)
y.real <- rnorm(288, 0, 4)
sp <- rep(c("A", "B", "C", "D"), each = 72)
df <- data.frame(x.real, y.real, sp)
output <- tibble(y.intercept = numeric(),
                 slope = numeric(),
                 sp = character(),
                 set = numeric())
set.seed(42)
for (i in 1:1000) {
  samp1 <- df %>% filter(sp == 'A') %>% mod1 <- lm(y.real ~ sample(x.real, length(x.real), replace = TRUE)) %>% summarise(y.intercept = mean(mod1$coefficients[1]), slope = mean(mod1$coefficients[2])) %>% mutate(set = i)
  samp2 <- df %>% filter(sp == 'B') %>% mod2 <- lm(y.real ~ sample(x.real, length(x.real), replace = TRUE)) %>% summarise(y.intercept = mean(mod2$coefficients[1]), slope = mean(mod2$coefficients[2])) %>% mutate(set = i)
  samp3 <- df %>% filter(sp == 'C') %>% mod3 <- lm(y.real ~ sample(x.real, length(x.real), replace = TRUE)) %>% summarise(y.intercept = mean(mod3$coefficients[1]), slope = mean(mod3$coefficients[2])) %>% mutate(set = i)
  samp4 <- df %>% filter(sp == 'D') %>% mod4 <- lm(y.real ~ sample(x.real, length(x.real), replace = TRUE)) %>% summarise(y.intercept = mean(mod4$coefficients[1]), slope = mean(mod4$coefficients[2])) %>% mutate(set = i)
  output %>% add_row(bind_rows(samp1, samp2, samp3, samp4)) -> output
}
Desired output:
y.intercept  slope  sp
          2      2  A
          3      1  B
          1     -4  C
          2     -1  D
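
One possible way to get output of this shape is sketched below. The error most likely comes from placing `mod1 <- lm(...)` in the middle of a pipe: R reads that as an assignment to the pipe expression on its left, not as a pipeline step. The sketch avoids the problem by fitting the model inside a small function. Several things here are assumptions rather than part of the original post: the helper name `null_fit`, the variable `n_iter`, the use of base R only, and a clamped `rnorm()` standing in for `rtruncnorm()` so the example runs without extra packages. It also calls `sample()` without replacement, since `replace = TRUE` draws a bootstrap resample rather than a shuffle.

```r
# Sketch: permutation null for the slope, fitted separately per species.
set.seed(111)

# Stand-in data with the same columns as df in the question.
# Assumption: clamped rnorm() replaces rtruncnorm() to avoid a package dependency.
x.real <- pmin(pmax(rnorm(288, mean = 5, sd = 2), 0), 10)
y.real <- rnorm(288, 0, 4)
sp     <- rep(c("A", "B", "C", "D"), each = 72)
df     <- data.frame(x.real, y.real, sp)

# Hypothetical helper: shuffle x within one species and refit the model.
null_fit <- function(d) {
  x.shuffled <- sample(d$x.real)       # permutation: same values, random order
  coef(lm(d$y.real ~ x.shuffled))      # c(intercept, slope)
}

n_iter <- 1000
set.seed(42)

# One row per species per iteration.
output <- do.call(rbind, lapply(seq_len(n_iter), function(i) {
  do.call(rbind, lapply(split(df, df$sp), function(d) {
    cf <- null_fit(d)
    data.frame(y.intercept = cf[[1]], slope = cf[[2]],
               sp = d$sp[1], set = i)
  }))
}))

# Mean intercept and slope per species across all iterations.
null_means <- aggregate(cbind(y.intercept, slope) ~ sp, data = output, FUN = mean)
print(null_means)
```

Because x.real is shuffled independently of y.real, the mean null slope for each species should sit close to zero. To compare against the observed slopes, fit `lm(y.real ~ x.real)` per species on the unshuffled data and see where each observed slope falls in the `output$slope` distribution for that species.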