
I have a dataset with the following columns: species ID (as a factor), counts, site, visit and year. A subset is available here [Google Drive]

I want to create a 4D array with the dimensions species, site, visit and year, with the counts as cell values. I am using the following code:

y <- tapply(counts, list(species, site, visit, year), sum)

Some sites were not visited on every visit or in every year, so those cells come out as NA, which is fine. My problem is with sites that were visited on a given visit and year but where a species was not seen. The original dataset only contains rows for species that were actually seen (with a few exceptions), so the code assigns NA to those cells as well, whereas I want a 0 there.

Does anyone have any advice on how to add these 0-value cells when the site was visited but the species was not seen, while keeping NAs for when sites were not visited?
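To make the problem concrete, here is a minimal sketch with made-up toy data (the values and the site names s1 and s2 are hypothetical, not taken from the real dataset). Site s2 is visited, but only species A is recorded there:

```r
# Hypothetical toy data: only rows where a species was actually seen.
df <- data.frame(
  species = factor(c("A", "B", "A")),
  site    = c("s1", "s1", "s2"),
  visit   = c(1, 1, 1),
  year    = c(2015, 2015, 2015),
  counts  = c(2, 5, 1)
)

# Same call as above, evaluated inside the data frame.
y <- with(df, tapply(counts, list(species, site, visit, year), sum))

# Site s2 was visited (visit 1, 2015) but species B has no row there,
# so this cell is NA although a 0 is wanted.
y["B", "s2", "1", "2015"]
```

The array cannot tell "visited but not seen" apart from "not visited", because both are simply missing rows in the input.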

Many thanks in advance.

YMC

1 Answer


Assuming your data is in a data.frame named df:

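library(reshape2)

# Cast to wide format: one row per site/visit/year present in the data,
# one column per species; species missing on that visit are filled with 0.
tmp <- dcast(df, site + visit + year ~ species, value.var = 'counts', fill = 0)
# Melt back to long format, which now contains explicit zero rows.
df <- melt(tmp, id.vars = c('site', 'visit', 'year'), variable.name = 'species', value.name = 'counts')
# Site/visit/year combinations absent from the data remain NA in the array.
y <- tapply(df$counts, list(df$species, df$site, df$visit, df$year), sum)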
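For comparison, the same idea can be sketched in base R without reshape2. This is a different technique from the cast/melt round trip above: it builds a mask of visited site/visit/year combinations and zeroes only the NA cells under that mask. The toy data and the names visited and mask are made up for illustration, and the sketch assumes that every visit that took place has at least one row in the data.

```r
# Hypothetical toy data: only rows where a species was actually seen.
df <- data.frame(
  species = factor(c("A", "B", "A", "B")),
  site    = c("s1", "s1", "s2", "s1"),
  visit   = c(1, 1, 1, 2),
  year    = c(2015, 2015, 2015, 2016),
  counts  = c(2, 5, 1, 3)
)

# 4D array of summed counts; every combination without a row is NA.
y <- with(df, tapply(counts, list(species, site, visit, year), sum))

# 3D logical array: TRUE where a site/visit/year has at least one record.
visited <- with(df, tapply(counts, list(site, visit, year), length))
visited <- !is.na(visited)

# Repeat the visit mask once per species and move the species
# dimension to the front so the mask has the same shape as y.
mask <- aperm(array(visited, dim = c(dim(visited), dim(y)[1])), c(4, 1, 2, 3))

# Set 0 only where the site was visited but the species has no count.
y[is.na(y) & mask] <- 0
```

In this toy case y["B", "s2", "1", "2015"] becomes 0 because s2 was visited then, while y["A", "s2", "2", "2016"] stays NA because no record exists for that visit.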
danas.zuokas
  • Thank you, danas! It works. I only added the function sum to the dcast call, so that observations are summed when the value is not zero: tmp <- dcast(df, site + visit + year ~ species, sum, value.var = "counts", fill = 0) – YMC Mar 30 '16 at 14:05