I have two dataframes, one with raw data labels and one with the correct adjusted values the data needs to be matched to. The labels are numeric but can differ up to +/- 2. I am trying to figure out how to write a coded if/then loop since amatch does not work well for numerics. The goal is to have a loop where for every value in the raw data, it will check against the values in the corrected data and match to the closest match if one is present where corrected - raw is between -2 and 2. I have pasted my attempted but very much nonfunctional attempt below.
My thought is that it may be possible to use amatch to select the best/closest match when one is found, since in some cases there are several data label values close together while in others there is a shift in the number up to 2.
Is there a way to write such a code, or another better way to accomplish this? The goal is to have a corrected column matched to the raw data labels that I can then use to merge with the raw data and the additional metadata attached to the corrected labels, but for my full list of labels only about 60% match without needing this adjustment (you can see in the sample data, for example, 1910 should match to 1911 and 2056 needs to match to 2057). Because of the nature of the data, the differences are not consistent and I want this to be a function that I can apply to more than just a single instance of data so that I do not have to go through and match every data label by hand.
raw <- c(1419, 1444, 1460, 1485, 1501, 1542, 1581, 1590,
1606, 1622, 1647, 1663, 1688, 1704, 1743, 1791,
1793, 1809, 1850, 1866, 1891, 1905, 1910, 1954,
1956, 1976, 1996, 2012, 2028, 2041, 2053, 2056,
2067, 2100, 2102, 2122)
corrected <- c(1419, 1444, 1460, 1485, 1501, 1542, 1562,
1581, 1590, 1606, 1622, 1630, 1647, 1663,
1688, 1704, 1743, 1792, 1793, 1809, 1825,
1834, 1850, 1866, 1891, 1905, 1911, 1914,
1938, 1954, 1955, 1971, 1976, 1996, 2012,
2019, 2028, 2053, 2057, 2100, 2101, 2122)
labelmatch <- function(x, y) {data.frame(glycan=x, glycan_name=
(for(i in 1:length(x)) {
for(n in 1:length(y)) {
if (n-i <= 2 & n-i >=-2) {
match(x, y)} else{
if (n-i >= 2 | n-i <=-2){
next}}}}))
}
labelmatch(raw, corrected)