Edit: Packages used are plyr and vegan. R is the most up-to-date version.
My base data is this:
X1 = c('Archea01', 'Bacteria01', 'Bacteria02')
Sample1 = c(0.2,NA,NA)
Sample2 = c(0, 0.001, NA)
Sample3 = c(0.04, NA, NA)
df = data.frame(X1,Sample1,Sample2,Sample3)
df
          X1 Sample1 Sample2 Sample3
1   Archea01     0.2   0.000    0.04
2 Bacteria01      NA   0.001      NA
3 Bacteria02      NA      NA      NA
The data were deliberately made with NAs to reflect the real data.
My goal is to sum the frequency of bacterial/archaeal occurrence in each sample, which would ideally create this type of data frame:
Sample1 Sample2 Sample3
23 11 12
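To make the goal concrete, here is a minimal base-R sketch of one possible reading of it. It assumes that a taxon "occurs" in a sample when its value is non-NA and greater than zero; that definition is an assumption, not something fixed above. On the three-row toy data it gives 1 for every sample, not the 23/11/12 from the real data.

```r
# Toy data from the question
df <- data.frame(
  X1 = c("Archea01", "Bacteria01", "Bacteria02"),
  Sample1 = c(0.2, NA, NA),
  Sample2 = c(0, 0.001, NA),
  Sample3 = c(0.04, NA, NA)
)

# Assumption: an "occurrence" is a non-NA value greater than zero
samples <- df[, -1]                                # drop the taxon-name column
occurrences <- colSums(samples > 0, na.rm = TRUE)  # NA comparisons are ignored

# One-row data frame with one column per sample
result <- as.data.frame(t(occurrences))
result
```

If zeros should also count as occurrences, `colSums(!is.na(samples))` would be the equivalent count.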
I have managed to create a list of frequencies:
dfFreq <- apply(df, 2, count)
Although this looks good, it's not quite what I want:
head(dfFreq)[2]
$Sample2
x freq
1 0.000 23
2 0.001 5
3 <NA> 50
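One detail that may matter here: `apply()` converts a data frame to a matrix before looping, and because `X1` is text, every column ends up as character. The sketch below is a base-R alternative (not the plyr `count` call used above) that tabulates only the sample columns; the name `freq_list` is made up for illustration.

```r
df <- data.frame(
  X1 = c("Archea01", "Bacteria01", "Bacteria02"),
  Sample1 = c(0.2, NA, NA),
  Sample2 = c(0, 0.001, NA),
  Sample3 = c(0.04, NA, NA)
)

# apply() works on as.matrix(df); the text column makes the whole matrix character
is.character(as.matrix(df))

# Tabulate each sample column on its own, keeping NA as a separate category.
# freq_list is a hypothetical name; each element has columns x and freq.
freq_list <- lapply(df[-1], function(col) {
  as.data.frame(table(x = col, useNA = "ifany"), responseName = "freq")
})
freq_list$Sample2
```

Using `lapply()` on `df[-1]` keeps the name column out of the result, so every list element has the same meaning.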
The next logical step would be to convert the list into a dataframe and sum frequency (or vice versa), but my code has not worked. I have tried:
df.data <- ldply (dfFreq, data.frame)
dfSUM <- apply(dfFreq, 2, sum)
Trying to sum the list simply hasn't worked (unsurprisingly). As for converting it into a data frame, I have looked all over Stack Overflow and seen many answers suggesting the above or lapply, but the data frame created by the suggested code is:
x freq
Archea01 1
Bacteria01 1
etc etc
Which is not what I want.
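For option A, a possible sketch: `apply()` needs an object with dimensions, which a plain list does not have, so `vapply()` (or `sapply()`) is the usual way to reduce each list element to one number. The list `freq_list` and the helper `sum_occurrences` are hypothetical stand-ins shaped like `dfFreq`, and the rule that only non-NA, non-zero values count as occurrences is an assumption.

```r
# Hypothetical stand-in for dfFreq: Sample1 is derived from the toy data,
# Sample2 uses the counts printed in the question
freq_list <- list(
  Sample1 = data.frame(x = c("0.2", NA), freq = c(1, 2)),
  Sample2 = data.frame(x = c("0.000", "0.001", NA), freq = c(23, 5, 50))
)

# Assumption: only non-NA values greater than zero count as occurrences
sum_occurrences <- function(d) {
  value <- as.numeric(as.character(d$x))  # x may be character or factor
  keep <- !is.na(value) & value > 0
  sum(d$freq[keep])
}

# One number per list element, returned as a named numeric vector
totals <- vapply(freq_list, sum_occurrences, numeric(1))

# One-row data frame with one column per sample
as.data.frame(t(totals))
```

An element built from a text column such as `X1` would need to be dropped first, since its `x` values cannot be read as numbers.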
Any thoughts about how to either A) sum frequency and then convert into a data frame like the one I want, or B) convert the list into a sensible data frame whose frequency column can be summed? I think A is the only way I can get to the point I want, but any thoughts about this would be greatly appreciated.
Edit 2.0: Ryan Morton suggested the following code:
require(dplyr)
dfBound <- rbind(dfFreq)
Which has resulted in this data frame:
X1 Sample1
dfFreq list(x = 1:1885, freq = c(1, 1, 1) list(x = c(1, 2, 3)
Although this certainly seems closer to the solution, I notice that each list either follows the format of X1 or the format of Sample1 (x = c(1, 2, 3), etc.), which suggests that something went wrong while binding the lists.
Any ideas why this may not be working, and what solution there might be for summing the frequencies found within the list?
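A possible explanation, sketched in base R: `rbind(dfFreq)` receives a single argument (the list itself), so it returns a one-row matrix whose cells are the list elements. Stacking the pieces row-wise usually goes through `do.call()`. The list `freq_list` below is a hypothetical stand-in for `dfFreq`, built from the toy data.

```r
# Hypothetical stand-in for dfFreq, derived from the toy data
freq_list <- list(
  Sample1 = data.frame(x = c("0.2", NA), freq = c(1, 2)),
  Sample2 = data.frame(x = c("0", "0.001", NA), freq = c(1, 1, 1))
)

# rbind() on the list itself: one row, one list cell per element
dim(rbind(freq_list))

# Tag each piece with its sample name, then stack the pieces row-wise
tagged <- Map(function(d, id) cbind(sample = id, d), freq_list, names(freq_list))
stacked <- do.call(rbind, tagged)
rownames(stacked) <- NULL
stacked
```

The stacked result is a long data frame with `sample`, `x`, and `freq` columns, whose `freq` column can then be summed per sample.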
Thanks very much.