3

I have a dataframe with a few columns, where for each row only one column can have a non-NA value. I want to combine the columns into one, keeping only the non-NA value, similar to this post:

Combine column to remove NA's

However, in my case, some rows may contain only NAs, so in the combined column, we should keep an NA, like this (adapted from the post I mentioned):

data <- data.frame('a' = c('A','B','C','D','E','F'),
                   'x' = c(1,2,NA,NA,NA,NA),
                   'y' = c(NA,NA,3,NA,NA,NA),
                   'z' = c(NA,NA,NA,4,5,NA))

So I would have

  a  x  y  z
1 A  1 NA NA
2 B  2 NA NA
3 C NA  3 NA
4 D NA NA  4
5 E NA NA  5
6 F NA NA NA

And I would to get

 'a' 'mycol'  
  A   1  
  B   2  
  C   3  
  D   4  
  E   5  
  F   NA

The solution from the post mentioned above does not work in my case because of row F, it was:

cbind(data[1], mycol = na.omit(unlist(data[-1])))

Thanks!

arielle
  • 915
  • 1
  • 12
  • 29

2 Answers2

3

One option is coalesce from dplyr

library(tidyverse)
data %>% 
   transmute(a, mycol = coalesce(!!! rlang::syms(names(.)[-1])))
#    a mycol
#1 A     1
#2 B     2
#3 C     3
#4 D     4
#5 E     5
#6 F    NA

Or we can use max.col from base R

cbind(data[1], mycol= data[-1][cbind(1:nrow(data), 
       max.col(!is.na(data[-1])) * NA^!rowSums(!is.na(data[-1]))+1)])
#   a mycol
#1 A     1
#2 B     2
#3 C     3
#4 D     4
#5 E     5
#6 F    NA

Or only with rowSums

v1 <- rowSums(data[-1], na.rm = TRUE)
cbind(data[1], mycol = v1 * NA^!v1)

Or another option is pmax

cbind(data[1], mycol = do.call(pmax, c(data[-1], na.rm = TRUE)))

or pmin

cbind(data[1], mycol = do.call(pmin, c(data[-1], na.rm = TRUE)))
akrun
  • 874,273
  • 37
  • 540
  • 662
  • could you specify which package you are using for the syms function? I see different packages with a function of the same name. – arielle Apr 17 '18 at 16:20
3

Using base R...

data$mycol <- apply(data[,2:4], 1, function(x) x[!is.na(x)][1])

data
  a  x  y  z mycol
1 A  1 NA NA     1
2 B  2 NA NA     2
3 C NA  3 NA     3
4 D NA NA  4     4
5 E NA NA  5     5
6 F NA NA NA    NA
Andrew Gustar
  • 17,295
  • 1
  • 22
  • 32