I have some code that takes two data frames and subtracts the values in a column named "Intensity" for molecules that share the same Composition. However, if a molecule is not present in the other data frame, its row is dropped entirely, and I'm not sure why.
blankdata3 and data3 are the two data frames I am subtracting. I want to subtract each molecule's Intensity, such as
(data3 - blankdata3) = datasubtracted
The code below subtracts Intensity for rows that have the same Composition. However, if data3 has a Composition that is not found in blankdata3, that row is missing entirely when I print datasubtracted. I'm not sure why, because wouldn't it just be subtracting zero if it's not found in blankdata3?
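To make the "subtracting zero" idea concrete: as far as I understand, a join marks a missing match as NA rather than 0, and any arithmetic with NA stays NA. Below is a minimal sketch of the behavior I am after, assuming dplyr is loaded. The data frames `d` and `b` are made-up two-column stand-ins for data3 and blankdata3, and the `.data`/`.blank` suffixes are just names I picked for illustration.

```r
library(dplyr)

# Hypothetical stand-ins for data3 and blankdata3 (only two columns kept)
d <- data.frame(Composition = c("C16 H22 O4 Na", "C8 H5 O3"),
                Intensity   = c(7646, 4083458.5),
                stringsAsFactors = FALSE)
b <- data.frame(Composition = c("C17 H26 O5 Na", "C8 H5 O3"),
                Intensity   = c(4324, 3056982.3),
                stringsAsFactors = FALSE)

# full_join keeps every Composition from both frames;
# the side without a match comes back as NA
joined <- full_join(d, b, by = "Composition", suffix = c(".data", ".blank"))

# coalesce() replaces NA with 0 so a missing match really is "subtracting zero"
result <- joined %>%
  mutate(Intensity = coalesce(Intensity.data, 0) - coalesce(Intensity.blank, 0)) %>%
  select(Composition, Intensity)

print(result)
```

With these toy values, a Composition found only in `d` keeps its Intensity, one found only in `b` comes out negative, and the shared one is the difference.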
#data3 looks like this but with more rows
m.z Intensity Relative Delta..ppm. RDB.equiv. Composition
301.14093 7646 100.00 -0.34 5.5 C16 H22 O4 Na
149.02331 4083458.5 23.60 -0.08 6.5 C8 H5 O3
279.15908 33256 18.64 -0.03 5.5 C16 H23 O4
#blankdata3 looks like this but with more rows
m.z Intensity Relative Delta..ppm. RDB.equiv. Composition
331.11233 4324 94.00 -0.33 6.5 C17 H26 O5 Na
149.02331 3056982.3 23.60 -0.08 6.5 C8 H5 O3
279.15908 20000 18.64 -0.03 5.5 C16 H23 O4
#This is the current code I have for subtraction
library(dplyr)

datasubtracted <- blankdata3 %>%
  left_join(select(data3, Intensity, Composition), by = "Composition") %>%
  mutate(Intensity = ifelse(is.na(Intensity.y), -Intensity.x, Intensity.y - Intensity.x)) %>%
  select(-Intensity.y, -Intensity.x) %>%
  bind_rows(
    anti_join(data3, blankdata3, by = "Composition") %>%
      mutate(Intensity = -Intensity)
  )
#I expect to see something like this
m.z Intensity Relative Delta..ppm. RDB.equiv. Composition
301.14093 7646 100.00 -0.34 5.5 C16 H22 O4 Na
331.11233 -4324 94.00 -0.33 6.5 C17 H26 O5 Na
149.02331 1026476.2 23.60 -0.08 6.5 C8 H5 O3
279.15908 13256 18.64 -0.03 5.5 C16 H23 O4
When running your code it gave me this
m.z Intensity Relative Delta..ppm. RDB.equiv. Composition
301.14093 7646 100.00 -0.34 5.5 C16 H22 O4 Na
149.02331 4083458.5 23.60 -0.08 6.5 C8 H5 O3
279.15908 33256 18.64 -0.03 5.5 C16 H23 O4
331.11233 -4324 94.00 -0.33 6.5 C17 H26 O5 Na
149.02331 -3056982.3 23.60 -0.08 6.5 C8 H5 O3
279.15908 -20000 18.64 -0.03 5.5 C16 H23 O4
It looks like it kept the data3 intensities intact and the blankdata3 intensities became negative. So it just combined both data frames, but it did not subtract Intensities for rows with the same Composition.
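The output above looks like the two frames stacked on top of each other with the blank intensities negated. Here is a rough sketch of how that stacked form could be collapsed into an actual subtraction by summing within each Composition. It assumes the Composition strings are exactly identical in both frames, which I have not verified for my real data. `d` and `b` are again hypothetical two-column stand-ins for data3 and blankdata3.

```r
library(dplyr)

# Hypothetical stand-ins for data3 and blankdata3 (only two columns kept)
d <- data.frame(Composition = c("C16 H22 O4 Na", "C8 H5 O3", "C16 H23 O4"),
                Intensity   = c(7646, 4083458.5, 33256),
                stringsAsFactors = FALSE)
b <- data.frame(Composition = c("C17 H26 O5 Na", "C8 H5 O3", "C16 H23 O4"),
                Intensity   = c(4324, 3056982.3, 20000),
                stringsAsFactors = FALSE)

# Quick check of which compositions are present in both frames
shared <- intersect(d$Composition, b$Composition)

# Stack the sample rows and the negated blank rows (the shape of the output above)
stacked <- bind_rows(d, mutate(b, Intensity = -Intensity))

# Summing within each Composition turns the stacked rows into data minus blank
collapsed <- stacked %>%
  group_by(Composition) %>%
  summarise(Intensity = sum(Intensity))

print(collapsed)
```

If `shared` came back empty on the real data, that would suggest the Composition values do not match exactly between the two frames, and no join or grouping on that column would pair them up.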
An exact replica of my data is shown below
#data3
m.z Intensity Relative Delta..ppm. RDB.equiv. Composition C H O N Na S
301.14093 7646 100.00 -0.34 5.5 C16 H22 O4 Na 16 22 4 0 1 0
149.02331 3056982.3 23.60 -0.08 6.5 C8 H5 O3 8 5 3 0 0 0
279.15908 33256 18.64 -0.03 5.5 C16 H23 O4 16 23 4 0 0 0
#blankdata3
m.z Intensity Relative Delta..ppm. RDB.equiv. Composition C H O N Na S
331.11233 4324 94.00 -0.33 6.5 C17 H26 O5 Na 17 26 5 0 1 0
149.02331 4083458.5 23.60 -0.08 6.5 C8 H5 O3 8 5 3 0 0 0
279.15908 13256 18.64 -0.03 5.5 C16 H23 O4 16 23 4 0 0 0