
I have a data frame from a measurement where for each measurement a background is also measured:

 Wavelength   Background_1   1   Background_2   2   ...
 300          5              11  4              12  ...
 301          3              12  5              10  ...
 ...          ...            ... ...            ... ...

I want to subtract the appropriate "Background_xyz" column from the corresponding measurement column (e.g. subtract "Background_1" from "1"). The result would then look like this:

 Wavelength   1_corrected   2_corrected   ...
 300          6             8             ...
 301          9             5             ...
 ...          ...           ...           ...

I can get this far without a problem. The difficulty is that the number of measurements varies: sometimes there are 3 measurements (so 3 background columns and 3 "real" data columns), and sometimes there are only 1 or 2. I am looking for a way to have R "correct" the columns by subtracting the background regardless of how many columns there are. I was thinking that an if statement checking the column names might do the trick, but I am not experienced enough to figure out how to do that yet. Help is greatly appreciated!

Nuramon

1 Answer

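You can first use grep to find the columns whose names consist only of digits. From those names you can build the names of the corresponding "Background" columns and subtract them. Because the names are detected from the data, this works for any number of measurements.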

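# Measurement columns: names made up of digits only ("1", "2", ...)
cols <- grep('^\\d+$', names(df), value = TRUE)
# Names for the background-corrected output columns
new_cols <- paste0(cols, '_corrected')
# Subtract each "Background_<n>" column from its matching "<n>" column
df[new_cols] <- df[cols] - df[paste0('Background_', cols)]
df[c("Wavelength", new_cols)]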

#  Wavelength 1_corrected 2_corrected
#1        300           6           8
#2        301           9           5
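
If some numbered column might not have a matching "Background_" column, the subtraction above stops with an error. As a rough sketch, the same idea can be wrapped in a function that skips such columns with a warning. The function name `subtract_background`, its `prefix` and `suffix` arguments, and the three-measurement example data `df3` are made up for illustration and are not part of the original question. It also assumes the data frame has a "Wavelength" column.

```r
# Hypothetical helper (illustration only): subtract each "Background_<n>"
# column from its matching "<n>" column, for any number of measurements.
subtract_background <- function(df, prefix = "Background_", suffix = "_corrected") {
  # Measurement columns are those whose names consist only of digits.
  cols <- grep("^\\d+$", names(df), value = TRUE)

  # Keep only the measurements that actually have a background column.
  has_bg <- paste0(prefix, cols) %in% names(df)
  if (any(!has_bg)) {
    warning("No background column for: ", paste(cols[!has_bg], collapse = ", "))
  }
  cols <- cols[has_bg]

  # Nothing to correct: return just the wavelengths.
  if (length(cols) == 0) {
    return(df["Wavelength"])
  }

  new_cols <- paste0(cols, suffix)
  df[new_cols] <- df[cols] - df[paste0(prefix, cols)]
  df[c("Wavelength", new_cols)]
}

# Made-up example data with three measurements.
df3 <- data.frame(Wavelength = 300:301,
                  Background_1 = c(5L, 3L), `1` = 11:12,
                  Background_2 = 4:5,       `2` = c(12L, 10L),
                  Background_3 = c(2L, 1L), `3` = c(7L, 8L),
                  check.names = FALSE)

res <- subtract_background(df3)
print(res)
```

Note that `check.names = FALSE` is needed when building such a data frame with `data.frame()`, because otherwise R renames the columns "1", "2", "3" to "X1", "X2", "X3" and the digit-only pattern no longer matches.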

data

df <- structure(list(Wavelength = 300:301, Background_1 = c(5L, 3L), 
`1` = 11:12, Background_2 = 4:5, `2` = c(12L, 10L)), 
class = "data.frame", row.names = c(NA, -2L))
Ronak Shah
  • Thanks for the reply. I am just wondering why you use "new_cols": I don't see it defined anywhere, and I get an error when I try to use it as you suggest. – Nuramon Aug 03 '20 at 12:18
  • @Nuramon Sorry, I forgot to include that line. I have updated the answer. Can you check now? – Ronak Shah Aug 03 '20 at 12:21
  • I get a different error now: Can't subset columns that don't exist. x Columns `Background_1`, `Background_2`, and `Background_3` don't exist. – Nuramon Aug 04 '20 at 09:15
  • @Nuramon Please adjust the answer to your real data. Compare the column names in your data with the ones you have shown. Do you really have columns named "1", "2", etc., or are they named differently? If you are unable to adapt the answer to your data, then add your data using `dput(head(df))` instead of `.......` so that we can help. – Ronak Shah Aug 04 '20 at 09:38
  • I figured out the error (on my part), thank you for your help! – Nuramon Aug 04 '20 at 10:12