I have a dataframe as follows:

Col1    Col2    Col3    Col4     Col5    Col6    Col7
   A       B       C                E       F      NA
   J               L      NA        P      NA      NA
   Z       M               P       NA       M      NA
   H               J      NA       NA      NA      NA
   A       B               D        B      NA      NA

How do I insert a new column stating whether the last non-NA value in each row also appears earlier in that row? I would like the final output to look like this:

Col1    Col2    Col3    Col4     Col5    Col6    Col7    Exist?
   A       B       C                E       F      NA        No
   J               L      NA       NA      NA      NA        No
   Z       M               P        M      NA      NA       Yes
   H               J      NA       NA      NA      NA        No
   A       B               D        B      NA      NA       Yes
nak5120

1 Answer

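We can use apply to loop over the rows (MARGIN = 1), remove the NA and blank elements (x[!is.na(x) & x != ""]), and check whether any of the remaining values are duplicated (anyDuplicated returns 0 when there are none). Comparing that result with 0 gives a logical value; adding 1 to it turns FALSE/TRUE into the index 1/2, which picks 'No' or 'Yes' from c("No", "Yes").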

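# For each row: drop NA and blank entries, then label the row "Yes"
# if any remaining value is duplicated and "No" otherwise
df1$Exist <- apply(df1, 1, FUN = function(x)
            c("No", "Yes")[(anyDuplicated(x[!is.na(x) & x != ""]) != 0) + 1])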
df1$Exist
#[1] "No"  "No"  "Yes" "No"  "Yes"
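
To see what each step of the one-liner above does, here is a small sketch that walks a single row through it by hand. The vector `x` is an assumption: it reproduces the third row of the question's input table as the character vector that `apply` would pass to the function. The names `kept`, `dup_index`, and `label` are illustrative only.

```r
# One row as apply() would pass it: a character vector
# (values assumed from the third row of the question's input table)
x <- c("Z", "M", "", "P", NA, "M", NA)

# Step 1: keep only the entries that are neither NA nor blank
kept <- x[!is.na(x) & x != ""]
kept
# [1] "Z" "M" "P" "M"

# Step 2: anyDuplicated() returns the index of the first duplicated
# element, or 0 if there is none
dup_index <- anyDuplicated(kept)
dup_index
# [1] 4

# Step 3: FALSE + 1 = 1 selects "No", TRUE + 1 = 2 selects "Yes"
label <- c("No", "Yes")[(dup_index != 0) + 1]
label
# [1] "Yes"
```

Note that this flags any repeated value in the row, not specifically a repeat of the last one.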
akrun
  • Thank you. I realized that there are also NAs and blanks in the middle of my dataset, which is messing this up. Is there a way to adapt this formula to account for that? – nak5120 Mar 24 '17 at 15:46
  • I modified the question – nak5120 Mar 24 '17 at 15:46
  • @NickKnauer You can change the `!is.na(x)` to `!is.na(x) & x != ""` Updated the post – akrun Mar 24 '17 at 15:48
  • That still didn't work for me, everything is coming up as yes even though it doesn't exist. I'm going to provide a new dataset. Give me a sec – nak5120 Mar 24 '17 at 16:08
  • @NickKnauer I tried with your dataset and it is giving me exactly the same output. Please check if you have blanks (`""`) or white spaces (`" "`) – akrun Mar 24 '17 at 16:10
  • Ok, I'm going to post a new dataset in a different question. Thank you! – nak5120 Mar 24 '17 at 16:10
  • @NickKnauer You can update in the same post. Please use `dput` to show the dataset so that we can reproduce the same data – akrun Mar 24 '17 at 16:11
  • This is how this entire question got started, if interested: http://stackoverflow.com/questions/43006044/match-dataframes-excluding-last-non-na-value-and-disregarding-order – nak5120 Mar 24 '17 at 18:19
  • Looking at the data, yours is definitely right. I realized that there were duplicates in my dataset before the last non-NA value in each row, which made it not work. Any suggestions on how to do this without using the duplicate option? I asked this question here: http://stackoverflow.com/questions/43022095/determine-if-value-in-final-column-exists-in-respective-rows – nak5120 Mar 25 '17 at 21:31
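
One possible way to approach the last comment, checking only the last non-NA value instead of any duplicate, is sketched below. This is not from the answer above. It assumes that blank and whitespace-only strings count as missing, and that "exists" means the last remaining value also appears earlier in the same row. The helper name `last_value_repeats` is hypothetical, and `df1` is rebuilt by hand from the question's input table.

```r
# Example data rebuilt from the question's input table (blanks entered as "")
df1 <- data.frame(
  Col1 = c("A", "J", "Z", "H", "A"),
  Col2 = c("B", "", "M", "", "B"),
  Col3 = c("C", "L", "", "J", ""),
  Col4 = c("", NA, "P", NA, "D"),
  Col5 = c("E", "P", NA, NA, "B"),
  Col6 = c("F", NA, "M", NA, NA),
  Col7 = NA_character_,
  stringsAsFactors = FALSE
)

# Hypothetical helper: does the last non-missing value of a row
# also occur among the values before it?
last_value_repeats <- function(x) {
  x <- trimws(x)                    # treat " " the same as ""
  kept <- x[!is.na(x) & x != ""]    # drop NA and blank entries
  n <- length(kept)
  if (n < 2) return("No")           # nothing earlier to compare against
  if (kept[n] %in% kept[-n]) "Yes" else "No"
}

df1$Exist <- apply(df1, 1, last_value_repeats)
df1$Exist
# [1] "No"  "No"  "Yes" "No"  "Yes"
```

Unlike the `anyDuplicated` version, a row such as `c("A", "A", "B")` yields "No" here, because the repeated value is not the last one.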