
This question is a continuation of an earlier post, but with a different data set and more specifics added.

I am trying to bootstrap the proportional occurrence of diet items for 7 individuals and calculate the sd() of those proportions using the data below.

data <- structure(list(IndID = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 
6L, 6L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L), .Label = c("P01", 
"P02", "P03", "P04", "P05", "P06", "P07"), class = "factor"), 
    PreyGen = structure(c(1L, 1L, 1L, 1L, 6L, 5L, 4L, 5L, 4L, 
    4L, 4L, 4L, 4L, 5L, 5L, 4L, 5L, 4L, 5L, 5L, 5L, 4L, 4L, 4L, 
    4L, 4L, 4L, 4L, 5L, 5L, 4L, 5L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
    4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 3L, 5L, 4L, 
    4L, 4L, 4L, 4L, 4L, 4L, 4L, 2L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
    5L, 2L, 4L, 3L, 4L, 4L, 4L, 3L, 4L, 4L, 3L, 4L, 4L, 4L, 4L, 
    4L, 4L, 4L, 4L, 1L, 4L, 1L, 5L, 4L, 5L, 4L, 4L, 4L, 5L, 4L, 
    4L), .Label = c("Beaver", "Bobcat", "Coyote", "Deer", "Elk", 
    "Raccoon"), class = "factor")), .Names = c("IndID", "PreyGen"
), class = "data.frame", row.names = c(NA, -100L))

A summary of the data looks like this:

> summary(data)
 IndID       PreyGen  
 P01: 6   Beaver : 6  
 P02:23   Bobcat : 2  
 P03:12   Coyote : 4  
 P04:20   Deer   :71  
 P05:21   Elk    :16  
 P06: 7   Raccoon: 1  
 P07:11  

There are 7 different individuals (IndID) of the same species and 6 prey species (PreyGen). Each individual ate a different number of prey in different proportions (this is the main difference from the earlier post).

My goal is to bootstrap the proportional occurrence of each diet item for each individual. The loop below generates five diets for each individual that were sampled with replacement. The data are stored as a list of the individuals, each of which contains a list of the sample diets.

EDIT added set.seed() and full output for P01

set.seed(1)
IndTotboot <- list()
for (i in unique(data$IndID)) {
    prey <- data$PreyGen[data$IndID == i]  # prey items eaten by individual i
    BootIndDiet <- list()
    for (j in 1:5) {
        # resample this individual's diet with replacement, then convert counts to proportions
        BootIndDiet[[j]] <- prop.table(table(sample(prey, length(prey), replace = TRUE)))
    }
    IndTotboot[[i]] <- BootIndDiet
}

The bootstrapped diets are specific to each individual (i) in proportion and sample size. The five bootstrapped samples for P01 are shown below as an example.

> IndTotboot[[1]]
[[1]]

   Beaver    Bobcat    Coyote      Deer       Elk   Raccoon 
0.6666667 0.0000000 0.0000000 0.0000000 0.3333333 0.0000000 

[[2]]

   Beaver    Bobcat    Coyote      Deer       Elk   Raccoon 
0.8333333 0.0000000 0.0000000 0.0000000 0.1666667 0.0000000 

[[3]]

   Beaver    Bobcat    Coyote      Deer       Elk   Raccoon 
0.3333333 0.0000000 0.0000000 0.0000000 0.1666667 0.5000000 

[[4]]

   Beaver    Bobcat    Coyote      Deer       Elk   Raccoon 
0.6666667 0.0000000 0.0000000 0.0000000 0.1666667 0.1666667 

[[5]]

   Beaver    Bobcat    Coyote      Deer       Elk   Raccoon 
0.8333333 0.0000000 0.0000000 0.0000000 0.1666667 0.0000000 

I am trying to calculate the sd() of the proportional occurrence of each prey species for each individual. Equivalently, for each individual (P01 - P07) I want the sd() of the proportional occurrence of each prey species across the 5 diets.

While my loop produces the correct results, I am wondering how to calculate the sd() of the bootstrapped diets for each prey species when the data are contained in nested lists.

I have only included 5 samples (bootstraps) for each individual here, but hope to generate 10000.
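For a larger number of bootstraps, one possible way to avoid the nested list altogether is replicate(), which can collect the proportions into one matrix per individual. The following is only a sketch under assumptions: `toy`, `n_boot`, `boot_props` and `boot_sd` are hypothetical names, and `toy` is a small made-up data frame standing in for `data`.

```r
# Hypothetical toy data standing in for the question's `data` object
set.seed(1)
toy <- data.frame(
  IndID   = factor(rep(c("P01", "P02"), times = c(6, 10))),
  PreyGen = factor(c("Beaver", "Beaver", "Beaver", "Beaver", "Raccoon", "Elk",
                     rep(c("Deer", "Elk"), times = c(7, 3))))
)

n_boot <- 1000  # assumed number of bootstrap replicates; raise as needed

# For each individual, replicate() returns a prey-by-replicate matrix of proportions
boot_props <- lapply(split(toy$PreyGen, toy$IndID), function(prey) {
  replicate(n_boot, prop.table(table(sample(prey, length(prey), replace = TRUE))))
})

# Row-wise sd gives one value per prey species; sapply() binds individuals as columns
boot_sd <- sapply(boot_props, function(m) apply(m, 1, sd))
boot_sd
```

Because split() keeps all factor levels, every replicate has the same prey categories, so prey never eaten by an individual simply get an sd of 0.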

EDIT As per the suggestion by @MrFlick

An example output would look like this

           P01 P02 P03 P04 P05 P06 P07
    Beaver  A
    Bobcat  B
    Coyote  C
     Deer   D
      Elk   E
    Raccoon F

Where "A" is the sd of the proportions of Beaver eaten by P01 across all five samples. Using the output for P01 from above, "A" = sd(c(0.6666667, 0.8333333, 0.3333333, 0.6666667, 0.8333333)). Moving down, "B" would represent the sd of the proportions of Bobcat eaten by P01 across all five samples, and so on for each prey species and individual.
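As a quick check of what "A" means, the five Beaver proportions printed for P01 can be passed to sd() as a single vector. The name `beaver_p01` is only illustrative.

```r
# Beaver proportions for P01 across the five bootstrap samples shown above
beaver_p01 <- c(0.6666667, 0.8333333, 0.3333333, 0.6666667, 0.8333333)

# sd() takes one numeric vector and uses the n - 1 denominator
A <- sd(beaver_p01)
round(A, 7)  # approximately 0.2041241
```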

Thanks in advance.

B. Davis
  • (-1) This question states the same goal as the linked question. If those answers didn't work for you, you can offer a bounty. – Rich Scriven Sep 01 '14 at 18:43
  • @Richard Scriven, the answers worked given the lack of specificity in my previous post. In addition, the data I included in the previous post did not reflect my real data. While the goal is the same, the data to work with are different. IMO the linked question can be removed, but figured the masters behind SO would make the call. – B. Davis Sep 01 '14 at 18:47
  • You should also include your desired output. It would also help to use `set.seed()` when using stochastic functions like `sample` so that the results are reproducible. When you say you want the "sd of the proportional occurrences", does this mean you just want the `sd` of all the values for each animal for each individual? So for P01, you would get an sd for all the beaver proportions, then an sd for all the bobcat proportions, etc.? What is the "shape" of the result you desire? – MrFlick Sep 01 '14 at 19:21
  • See edit per @MrFlick comment – B. Davis Sep 01 '14 at 21:06

1 Answer


So here's how I might approach it. First, I define an *apply-friendly version of rbind to make things a bit cleaner:

rbindlist <- function(x) do.call(rbind, x)

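Then I apply this to each individual's list of samples, which gives one matrix of proportions per individual (one row per bootstrap sample, one column per prey species). Finally, I use the base apply function over the columns of each matrix to calculate the sd for each prey species: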

sapply(lapply(IndTotboot, rbindlist), apply, 2, sd)

This returns

              P01        P02        P03        P04        P05       P06        P07
Beaver  0.2041241 0.00000000 0.00000000 0.00000000 0.00000000 0.1749636 0.04979296
Bobcat  0.0000000 0.00000000 0.00000000 0.00000000 0.05429407 0.0000000 0.00000000
Coyote  0.0000000 0.00000000 0.00000000 0.02236068 0.03984095 0.0000000 0.00000000
Deer    0.0000000 0.09425862 0.09128709 0.05477226 0.07968191 0.1749636 0.11853095
Elk     0.0745356 0.09425862 0.09128709 0.04472136 0.03984095 0.0000000 0.08131156
Raccoon 0.2173067 0.00000000 0.00000000 0.00000000 0.00000000 0.0000000 0.00000000

as desired.
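To make the two steps easier to inspect, the one-liner can be split up. This is a minimal sketch on a made-up nested list: `nested`, `mats` and `sds` are hypothetical names, and plain named vectors stand in for the tables stored in IndTotboot.

```r
# Hypothetical nested list shaped like IndTotboot:
# individuals -> bootstrap samples -> named proportions
nested <- list(
  P01 = list(c(Beaver = 0.5, Deer = 0.5), c(Beaver = 1.0, Deer = 0.0)),
  P02 = list(c(Beaver = 0.0, Deer = 1.0), c(Beaver = 0.0, Deer = 1.0))
)

rbindlist <- function(x) do.call(rbind, x)

# Step 1: one sample-by-prey matrix per individual
mats <- lapply(nested, rbindlist)

# Step 2: column-wise sd within each matrix; sapply() binds the results
# into a prey-by-individual matrix
sds <- sapply(mats, apply, 2, sd)
sds
```

Looking at `mats$P01` before the second step is a simple way to confirm that rows are samples and columns are prey species.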

MrFlick
  • Thanks @MrFlick! Very helpful. Any more specifics on your "*apply-friendly" function...? Thanks again. – B. Davis Sep 02 '14 at 02:56
  • It's just apply-friendly in the sense that the apply functions pass the data as the first parameter and `do.call` needs the data as the second parameter. So the helper function just pushes the data back a parameter slot. – MrFlick Sep 02 '14 at 03:29
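The argument-order point can be illustrated with a small sketch. The list `samples` and the names `via_helper` and `via_named` are hypothetical. It relies on do.call() having the signature do.call(what, args, ...), so naming `what` should leave each list element to match `args`.

```r
# Hypothetical list of named proportion vectors, standing in for one individual's samples
samples <- list(c(Beaver = 0.5, Deer = 0.5), c(Beaver = 1, Deer = 0))

rbindlist <- function(x) do.call(rbind, x)

# lapply() calls FUN(element, ...), so the helper receives the data in its first slot
via_helper <- lapply(list(P01 = samples), rbindlist)

# Naming `what` leaves the element to match do.call()'s second argument, `args`
via_named <- lapply(list(P01 = samples), do.call, what = rbind)

identical(via_helper, via_named)
```

Either form should work; the helper is arguably easier to read inside a longer chain of apply calls.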