I tried the answer from @Onyambu at "Extract numbers from Chemical Formula in R", but ran into a new problem. The reference code is as follows:
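library(tidyverse)
library(stringr)

dat %>%
  # Append "1" to any element symbol that stands alone without a count (e.g. "Na" becomes "Na1")
  mutate(Composition = gsub("\\b([A-Za-z]+)\\b", "\\11", Composition),
         name = str_extract_all(Composition, "[A-Za-z]+"),
         value = str_extract_all(Composition, "\\d+")) %>%
  # One row per element, then one column per element
  unnest() %>%
  spread(name, value, fill = 0)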
       m.z Intensity Relative Delta..ppm. RDB.equiv.    Composition  C  H Na O
1 149.0233   4083459    23.60       -0.08        5.5       C8 H5 O3  8  5  0 3
2 279.1591        NA    18.64       -0.03        5.5     C16 H23 O4 16 23  0 4
3 301.1409        NA   100.00       -0.34        5.5 C16 H22 O4 Na1 16 22  1 4
That works for the example above. My question is how to process a formula written without spaces, like "C7H5NO4". I only get ("C", "H", "NO") and ("7", "5", "4"); the correct result is ("C", "H", "N", "O") and ("7", "5", "1", "4").
If a 1 could be inserted between the "N" and the "O", the problem might be solved, but I do not know how to do that.
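To make the idea concrete, here is a rough sketch of what inserting the 1 might look like. The helper name add_implicit_ones is hypothetical (it is not from the referenced answer), and the sketch assumes every element symbol is written as one uppercase letter optionally followed by one lowercase letter, so "NO" is read as N and O while "Na" stays one symbol.

```r
library(stringr)

# Hypothetical helper (an assumption, not part of the referenced answer):
# add an explicit "1" after every element symbol that has no count.
add_implicit_ones <- function(formula) {
  # [A-Z][a-z]? matches one element symbol. The negative lookahead skips
  # symbols already followed by a digit, and keeps "Na" from being split
  # into "N" + "a". perl = TRUE is needed for the lookahead.
  gsub("([A-Z][a-z]?)(?![a-z0-9])", "\\11", formula, perl = TRUE)
}

fixed <- add_implicit_ones("C7H5NO4")

# With the counts made explicit, symbols and numbers can be pulled out separately.
elements <- str_extract_all(fixed, "[A-Z][a-z]?")[[1]]
counts <- as.integer(str_extract_all(fixed, "\\d+")[[1]])
```

With this approach, the name pattern in the reference code would presumably also need to change from "[A-Za-z]+" to "[A-Z][a-z]?", otherwise adjacent symbols could still be merged into one name.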
Thanks
Hees