
I have a large dataset as follows:

head(humic)
    SUERC.No GU.Number d13.C Age.(BP)    error Batch.Number AMS.USED Year Type Sampletype
400    32691       535   -28 3382.981 34.74480            1       S3 2011    2         ha
401    32701       536   -28 3375.263 34.86087            1       S3 2011    2         ha
402    32711       537   -28 3308.103 34.83100            1       S3 2011    2         ha
403    32721       538   -28 3368.721 31.58641            1       S3 2011    2         ha
404    32731       539   -28 3368.604 34.72326            1       S3 2011    2         ha
405    32741       540   -28 3314.713 32.83147            1       S3 2011    2         ha

tail(humic)
     SUERC.No GU.Number d13.C Age.(BP)    error Batch.Number AMS.USED Year Type Sampletype
5445    70880      3962 -28.4 3390.458 29.12815           34       S4 2016    2         ha
5446    70890      3963 -28.5 3358.861 37.14896           34       S4 2016    2         ha
5447    70900      3964 -28.5 3363.626 26.71573           34       S4 2016    2         ha
5448    70910      3965 -28.5 3408.907 26.69665           34       S4 2016    2         ha
5449    70920      3966 -28.5 3348.463 29.01492           34       S4 2016    2         ha
5450    70930      3967 -28.4 3375.247 26.78261           34       S4 2016    2         ha

I want to create a variable that identifies odd/even pairs based on the variable GU.Number. Each pair identifies duplicates of the same object, so both members have the same d13.C value.

For example, 535-536, 537-538, 3963-3964 and 3965-3966 are pairs.

Note that GU.Number is not a complete sequence; some numbers are missing.

  • The lab that collected the data identifies the odd as the original and the following even number as the duplicate. – Fanni Feb 09 '17 at 00:07

1 Answer

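# indices of the rows whose GU.Number is even
even.rows <- which(humic$GU.Number %% 2 == 0)

# 0 by default; set to 1 below for even rows with an adjacent GU.Number
has.pair  <- rep(0, nrow(humic))

for (i in even.rows) {
        # 1 if GU.Number + 1 or GU.Number - 1 also occurs in the data
        has.pair[i] <- max((humic$GU.Number[i] + c(1, -1)) %in% humic$GU.Number)
}

# add as column of data
humic$has.pair <- has.pair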

The has.pair column will be 1 if the GU.Number is even and the data also contains the GU.Number one less or one greater (which is necessarily odd). Otherwise it will be 0. As a one-liner:

humic$has.pair <- sapply(seq_len(nrow(humic)),
            function(x) with(humic, (!(GU.Number[x] %% 2)) * max((GU.Number[x] + c(1, -1)) %in% GU.Number)))
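
If, as the comment on the question says, the odd number is always the original and the following even number its duplicate, a vectorized variant might be closer to what is wanted. The following is only a sketch under that assumption and is not the code above: `humic_demo`, `pair.id` and `role` are hypothetical names, and the toy GU.Number values are partly taken from the question with gaps added for illustration. It matches each odd number only with the next even number, so 540 and 541 are not treated as a pair here, whereas the `has.pair` code above would flag 540.

```r
# Toy stand-in for `humic` (hypothetical data; only GU.Number is needed).
humic_demo <- data.frame(
  GU.Number = c(535, 536, 537, 538, 540, 541, 3963, 3964, 3965)
)

gu     <- humic_demo$GU.Number
is_odd <- gu %% 2 == 1

# An odd number has a partner if gu + 1 is present;
# an even number has a partner if gu - 1 is present.
has_partner <- ifelse(is_odd, (gu + 1) %in% gu, (gu - 1) %in% gu)

# Shared label for both members of a pair: the odd (original) GU.Number.
# Rows without a partner get NA.
humic_demo$pair.id <- ifelse(has_partner, ifelse(is_odd, gu, gu - 1), NA)

# Role within the pair, following the lab convention from the comment.
humic_demo$role <- ifelse(has_partner,
                          ifelse(is_odd, "original", "duplicate"),
                          NA_character_)
```

Because both members of a pair share the same `pair.id`, the column can be used directly for grouping, for example to check that the d13.C values within each pair agree.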
Ryan