How do I calculate the relative abundance of each variable for each sample in R? I would then like to create a new data frame with the relative abundances in each column. I have 1000 variables (columns) and 500 samples (rows), plus a total count for each sample.

    ID  var1    var2  var3  etc.    total count
    1   10      57     16               400
    2   8       66     34               412 
    3   7       88     57               405
    4   1       90     94               402
    5   20      44     33               488
    etc.

Expected output:

    ID  var1    var2  var3 etc.
    1   0.03    0.14  0.04
    2   0.02    0.16  0.08  
    etc
Bhargav Rao
AlexP

1 Answer

You can solve this with a simple for loop:

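    # Example data matching the question: an ID column, the count
    # columns, and the per-sample total in the last column.
    df <- data.frame(
      id = 1:5,
      var1 = c(10, 8, 7, 1, 20),
      var2 = c(57, 66, 88, 90, 44),
      var3 = c(16, 34, 57, 94, 33),
      total_count = c(400, 412, 405, 402, 488)
    )

    # Work on a copy so the raw counts in df stay untouched.
    abundance <- df
    # Skip column 1 (id) and the last column (total_count);
    # this relies on the column order shown above.
    for (i in 2:(ncol(df) - 1)) {
      abundance[i] <- abundance[i] / abundance$total_count
    }
    abundance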
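
If you would rather avoid the explicit loop, the same division can be done on all count columns at once. This is only a sketch under the same assumptions as above (one `id` column, one `total_count` column, everything else a count); the names `var_cols` and `rel_abundance` are made up for illustration. It relies on the fact that dividing a data frame by a vector of length `nrow(df)` recycles the vector down each column, so every row is divided by its own total.

```r
# Same example data as in the loop version.
df <- data.frame(
  id = 1:5,
  var1 = c(10, 8, 7, 1, 20),
  var2 = c(57, 66, 88, 90, 44),
  var3 = c(16, 34, 57, 94, 33),
  total_count = c(400, 412, 405, 402, 488)
)

# Select the count columns by name instead of by position
# (var_cols is a hypothetical helper name).
var_cols <- setdiff(names(df), c("id", "total_count"))

# Start the result with the ID column, then add the relative abundances.
# The total_count vector is recycled down each column, so each row is
# divided by its own total.
rel_abundance <- df["id"]
rel_abundance[var_cols] <- df[var_cols] / df$total_count
rel_abundance
```

Selecting the columns by name means the result no longer depends on `total_count` being the last column, and `total_count` is left out of the output, as in the expected output in the question.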
eastclintw00d
  • Thank you for this. For each variable I have 1722 rows (samples). How do I enter all of those rows, e.g. for var1 = c(10,8,7,1,20)? – AlexP Jul 05 '19 at 23:02
  • You mean how to get the data into a `data.frame`? You would normally read it directly from a .csv or .xlsx file or something similar. Check out functions like `read.table` and `read.csv`, or packages like `openxlsx`, `readr` or `readxl`. – eastclintw00d Jul 05 '19 at 23:12
  • Btw, alistaire's solution in the comments to your question is more elegant. – eastclintw00d Jul 05 '19 at 23:14
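
To illustrate the comment about reading the data from a file rather than typing it in, here is a rough sketch using base R's `read.csv`. The file contents, the column names `ID` and `total_count`, and the variable names `counts` and `var_cols` are all assumptions for the example; the temporary file only exists so the snippet runs on its own.

```r
# Write a small example file so the sketch is self-contained;
# in practice, pass the path of your own file to read.csv() instead.
path <- tempfile(fileext = ".csv")
writeLines(c(
  "ID,var1,var2,var3,total_count",
  "1,10,57,16,400",
  "2,8,66,34,412",
  "3,7,88,57,405"
), path)

# Read the whole table in one step; no need to type each column by hand.
counts <- read.csv(path)

# Every column except the ID and the total is treated as a count column.
var_cols <- setdiff(names(counts), c("ID", "total_count"))

# Keep the ID, then divide each row's counts by that row's total.
abundance <- counts[c("ID", var_cols)]
abundance[var_cols] <- counts[var_cols] / counts$total_count

# Round for display only; abundance itself keeps full precision.
round(abundance, 2)
```

With a real file, only the `path` line changes; the rest works for any number of rows and count columns as long as the ID and total columns carry the names assumed here.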