I have a sample data.frame, "events", which has multiple prey captures occurring on a single dive. In the Capture column, the word "handling" marks a capture, and I have used it to tally up the number of captures per dive.

However, in some instances I have multiple prey types in a single dive. How can I work out the number of prey captures by species (i.e. how many fish.a and how many fish.b were caught in a single dive)?

Any advice would be appreciated.

events <- data.frame(
  Prey_present = c("fish.a", "fish.a", "", "fish.b", "fish.b", "fish.b"),
  Capture = c("", "", "handling", "", "", "handling"),
  Dive_id = c("dive.1", "dive.1", "dive.1", "dive.1", "dive.1", "dive.1")
)

temp <- tapply(events$Capture, events$Dive_id, function(x) rle(x == "handling"))
ncaptures <- data.frame(id = names(temp),
                        tally = unlist(lapply(temp, function(x) sum(x$values))))
final <- ncaptures[order(ncaptures$id), ]

My final output (which I will bind to my bigger data.frame) should be something like:

final <- data.frame(fish.a = 1,
                    fish.b = 1,
                    Dive_id = "dive.1")
Grace Sutton
  • You can use the aggregate function to find the total number of prey captured in a single dive. – Hunaidkhan Nov 19 '18 at 05:02
  • @Hunaidkhan yes I've tried to aggregate by dive but it is not reliable as the numbers I would get from the Prey_present column would not reflect the actual number of things being caught. – Grace Sutton Nov 19 '18 at 05:11
  • Can you provide a screenshot or dput() of the expected output? I will help you out with the code. – Hunaidkhan Nov 19 '18 at 05:13
  • table(events1$Dive_id,events1$Prey_present) this will work – Hunaidkhan Nov 19 '18 at 05:17
  • @Hunaidkhan your suggestion gives a count of the column "Prey_present". What I'm trying to work out is a count of handling for each factor in Prey_present in a single id/dive. – Grace Sutton Nov 19 '18 at 05:36

2 Answers

Get rid of the Capture column and use the dplyr library to aggregate

library(dplyr)

capture_tally <- events %>% group_by(Dive_id, Prey_present) %>% 
    summarise(Count_of_Captures = n())

This groups by Dive_id and Prey_present, then uses summarise() to count the rows in each dive and prey type combination.

You can name the Count_of_Captures column whatever you want.

EDIT: Here's the output of the above code.

  Dive_id Prey_present Count_of_Captures
  <fctr>  <fctr>                   <int>
1 dive.1                               1
2 dive.1  fish.a                       2
3 dive.1  fish.b                       3

EDIT: ok, try this.

library(dplyr)
library(tidyr)

events %>%
  group_by(Dive_id, Prey_present) %>%
  filter(Capture != "") %>%    # keep only rows where a capture (handling) was recorded
  summarise(Count = n()) %>%   # count the rows for each prey type (long format)
  spread(Prey_present, Count)  # tidyr::spread() converts the data from long to wide format

I'm guessing that whenever the Capture column is blank, no fish has been captured, and that you're counting only the rows where it says handling. I might have misunderstood you again, so I apologize.
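Another possible reading of the question is that each "handling" row should be credited to the most recent non-blank Prey_present value in the same dive. Here is a minimal base R sketch under that assumption. The helper fill_forward and the column Prey_filled are hypothetical names introduced for illustration, and stringsAsFactors = FALSE is set so the columns are plain character vectors.

```r
events <- data.frame(
  Prey_present = c("fish.a", "fish.a", "", "fish.b", "fish.b", "fish.b"),
  Capture = c("", "", "handling", "", "", "handling"),
  Dive_id = rep("dive.1", 6),
  stringsAsFactors = FALSE
)

# Hypothetical helper: carry the last non-blank label forward.
# Positions before the first non-blank label become NA.
fill_forward <- function(x) {
  idx <- cumsum(x != "")               # how many non-blank labels seen so far
  c(NA_character_, x[x != ""])[idx + 1]
}

# Assumption: a capture belongs to the most recent prey seen in the same dive.
events$Prey_filled <- ave(events$Prey_present, events$Dive_id, FUN = fill_forward)

# Keep only capture rows that could be assigned a prey type.
captures <- events[events$Capture == "handling" & !is.na(events$Prey_filled), ]

# Cross-tabulate dives against prey types, then reshape to one row per dive.
tally <- table(captures$Dive_id, captures$Prey_filled)
final <- as.data.frame(unclass(tally))
final$Dive_id <- rownames(final)
rownames(final) <- NULL
final
```

On the sample data this gives a single row for dive.1 with fish.a = 1 and fish.b = 1. A handling row that comes before any prey label in its dive is dropped rather than guessed at.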

Wally Ali
  • Can you include your output as a dput? When I run your suggested code, I still get the same result (i.e. the sum of the prey present column) – Grace Sutton Nov 19 '18 at 09:27
  • This still does not answer my question. The preycapture column indicates something is captured, not the prey present column. So 2 things are captured. Fisha once and Fishb once. See output that I am looking for in my question. Thanks for giving it a go! – Grace Sutton Nov 19 '18 at 22:54
library(dplyr)
new1 <- events %>%
  group_by(Dive_id, Prey_present) %>%
  summarise(Capture = NROW(Capture))

This will give you the required output.

Hunaidkhan
  • Still not giving the right thing. The code still produces a count of prey_present column. The output of dput(new1): structure(list(Dive_id = structure(c(1L, 1L, 1L), .Label = "dive.1", class = "factor"), Prey_present = structure(1:3, .Label = c("", "fish.a", "fish.b" ), class = "factor"), Capture = 1:3), row.names = c(NA, -3L ), class = c("grouped_df", "tbl_df", "tbl", "data.frame"), vars = list( Dive_id), drop = TRUE, .Names = c("Dive_id", "Prey_present", "Capture")) – Grace Sutton Nov 19 '18 at 05:55