I have a table (using R in Spotfire) and I am trying to calculate an adjusted peak area from other rows in the same table. Below is an example of the table:

df <- data.frame(Sample_Name = c("Smpl 1", "Smpl 1", "Smpl 2", "Smpl 2"), 
                 Peak_Area = c(100, 101, 50, 51),
                 Analyte = c("Asn","Asn*","Leu","Leu*"),
                 Int_Std = c("Asn*","","Leu*",""))

To determine the adjusted peak area, I need to find the peak area of the internal standard for the same sample and analyte by matching the Int Std field against the Analyte field. For sample 1, the calculated value would be 100 / 101.

Essentially, I want to look at each row. If a row has values in both the "Analyte" and "Int Std" fields, I want to find the other row that has the same "Sample Name" and whose "Analyte" equals this row's "Int Std", and then divide the original row's "Peak Area" by the found row's "Peak Area" (100 / 101).
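To make the intended calculation concrete, here is a minimal base R sketch of the lookup described above, run on the example table. It assumes each Sample_Name/Analyte combination occurs only once; the names `row_key`, `std_key`, `std_row`, and `Adjusted_Peak` are introduced here purely for illustration.

```r
# Example table from above (stringsAsFactors = FALSE keeps text columns as character).
df <- data.frame(Sample_Name = c("Smpl 1", "Smpl 1", "Smpl 2", "Smpl 2"),
                 Peak_Area = c(100, 101, 50, 51),
                 Analyte = c("Asn", "Asn*", "Leu", "Leu*"),
                 Int_Std = c("Asn*", "", "Leu*", ""),
                 stringsAsFactors = FALSE)

# Build one key per row from sample + analyte, and one from sample + internal standard.
row_key <- paste(df$Sample_Name, df$Analyte, sep = "|")
std_key <- paste(df$Sample_Name, df$Int_Std, sep = "|")

# For each row, find the index of the row holding its internal standard
# (NA when the row has no Int_Std or no matching row exists).
std_row <- match(std_key, row_key)

# Divide each peak area by the peak area of its internal standard.
df$Adjusted_Peak <- df$Peak_Area / df$Peak_Area[std_row]
```

With this sketch, rows that are themselves internal standards (empty Int_Std) get `NA` rather than a ratio.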

  • I think most of the people here are not familiar with the problem you are investigating. Can you convert it into the language of programming? Does it mean that you want to divide the second value with the first value per ID? – tmfmnk Nov 03 '19 at 20:38
  • @tmfmnk correct, I want to look at each row. If it has both an "Analyte" and an "Int Std" field value, I want to find the other row that has the same "Sample Name" and where "Analyte" = "Int Std", and divide the original row's "Peak Area" value by the found row's "Peak Area" value (100 / 101). – Douglas L Scheesley Nov 03 '19 at 20:43
  • Might be good if you could provide a larger sample dataset. Am I assuming correctly that `aminoacid*` always implies an internal standard? – Dunois Nov 03 '19 at 21:09
  • @Dunois correct, the asterisk denotes an internal standard. Perhaps I could put a larger set together. – Douglas L Scheesley Nov 03 '19 at 21:13

1 Answer

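You could try the following, which assumes the column names contain spaces (`Sample Name`, `Peak Area`, `Int Std`), as in the output below: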

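library(dplyr)

df %>%
  # Convert factor columns to character so that %in% compares plain strings
  mutate_if(is.factor, as.character) %>%
  group_by(`Sample Name`) %>%
  mutate(
    # Within each sample, divide the analyte's peak area by that of its
    # internal standard (assumes exactly one analyte/standard pair per sample)
    `Adjusted Peak` = if (any(Analyte %in% `Int Std`)) {
      `Peak Area`[!Analyte %in% `Int Std`] / `Peak Area`[Analyte %in% `Int Std`]
    } else {
      NA_real_
    }
  )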

Output:

# A tibble: 4 x 5
# Groups:   Sample Name [2]
  `Sample Name` `Peak Area` Analyte `Int Std` `Adjusted Peak`
  <chr>               <dbl> <chr>   <chr>               <dbl>
1 Smpl 1                100 Asn     Asn*                0.990
2 Smpl 1                101 Asn*    ""                  0.990
3 Smpl 2                 50 Leu     Leu*                0.980
4 Smpl 2                 51 Leu*    ""                  0.980
arg0naut91
  • I am getting an error: "Error: Column `Adjusted_Peak` must be length 4 (the group size) or one, not 2. Execution halted." Thanks for your help! I added the code here: [link](https://rnotebook.io/anon/37e684a048bc0fdb/notebooks/Untitled.ipynb?kernel_name=ir) – Douglas L Scheesley Nov 04 '19 at 00:25
  • Please add a more complex example then that better reflects your real data. – arg0naut91 Nov 05 '19 at 21:03
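
The error reported above is consistent with a sample that contains more than one analyte/standard pair: the grouped expression then returns one ratio per pair (length 2) for a group of 4 rows. One way to avoid this, sketched below, is to join the table to itself so that each row looks up its own internal standard. This is an illustrative alternative rather than the answer's method; it assumes the underscore column names from the question, that each Sample_Name/Analyte combination is unique, and the extra Leu/Leu* rows in "Smpl 1" are hypothetical data added only to produce a four-row group.

```r
library(dplyr)

# Question's example, extended with a hypothetical second analyte pair in "Smpl 1".
df <- data.frame(
  Sample_Name = c("Smpl 1", "Smpl 1", "Smpl 1", "Smpl 1", "Smpl 2", "Smpl 2"),
  Peak_Area   = c(100, 101, 40, 80, 50, 51),
  Analyte     = c("Asn", "Asn*", "Leu", "Leu*", "Leu", "Leu*"),
  Int_Std     = c("Asn*", "", "Leu*", "", "Leu*", ""),
  stringsAsFactors = FALSE
)

# Lookup table: the peak area of every (sample, analyte) combination.
std_areas <- df %>%
  select(Sample_Name, Analyte, Std_Peak_Area = Peak_Area)

# Match each row's Int_Std against the lookup table's Analyte within the same sample,
# then divide. Rows without an internal standard find no match and get NA.
adjusted <- df %>%
  left_join(std_areas, by = c("Sample_Name", "Int_Std" = "Analyte")) %>%
  mutate(Adjusted_Peak = Peak_Area / Std_Peak_Area)
```

Because the lookup is done per row rather than per group, the result always has one value per input row, regardless of how many analytes a sample contains.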