Sum two columns into a new column in R, ignoring NA only if one value is NA

Question

I want to sum two columns, say "apinten" and "apmod". I want R to ignore an NA if it occurs in only one of the two columns, but to report NA if both columns are NA. So far I have tried:

etude1 <- within(etude1, {mvpascore <- rowSums(cbind(apinten, apmod), na.rm = TRUE, dims = 1)})

and

etude1<-within(etude1,{mvpascore<-apply(cbind(apinten,apmod), 1, sum, na.rm = TRUE)})

With either command, if only one value is missing, R puts the other value in the new column. But if both apinten and apmod are NA, R puts 0 in the new column. 0 is a real value, and I don't want it there.
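The behaviour described above can be reproduced with a few lines. This is a minimal sketch on made-up toy vectors (the values are assumptions, not the asker's data): with `na.rm = TRUE`, a row that contains only `NA` is summed over zero remaining values, and the sum of nothing is 0.

```r
# Toy data (assumed for illustration): row 2 has one NA, row 3 has two.
apinten <- c(1, NA, NA)
apmod   <- c(10, 20, NA)

# With na.rm = TRUE the all-NA row is summed over zero values, giving 0.
row_totals <- rowSums(cbind(apinten, apmod), na.rm = TRUE)
row_totals

# The same happens for a single all-NA vector.
empty_total <- sum(c(NA, NA), na.rm = TRUE)
empty_total
```

The third element of `row_totals` is 0 rather than `NA`, which is exactly the value the question wants to avoid.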

score 2 · Answer 1 · answered Nov 03 '17 at 16:03


Since you have not posted a dataset example, let me create one first.

etude1 <- data.frame(apinten = 1:5, apmod = 11:15)
etude1$apinten[2:3] <- NA
etude1$apmod[3:4] <- NA

Now just apply an anonymous function to each row of the data frame. The function determines whether all row values are NA, and if not, sums them.

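# Restrict to the two score columns so other columns in etude1 are not summed.
etude1$mvpascore <- apply(etude1[c("apinten", "apmod")], 1, function(x)
                       if (all(is.na(x))) NA else sum(x, na.rm = TRUE))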

answered Nov 03 '17 at 16:03

Rui Barradas


Thank you very much! I just don't understand what "function(x)" is. Could you explain? Thanks – Cyril Forestier Nov 03 '17 at 16:17
It's an anonymous function. `apply` applies a function to a certain `MARGIN`, in this case the first dimension of the data frame. So a function with no name will process each row `x` and return `NA` if all values are `NA`, their sum otherwise. – Rui Barradas Nov 03 '17 at 16:38
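To make the comment above concrete, here is a small sketch that writes the same logic twice: once as a named function and once inline. The name `sum_or_na` is made up for this illustration and does not appear in the answer.

```r
# Example data from the answer above.
etude1 <- data.frame(apinten = 1:5, apmod = 11:15)
etude1$apinten[2:3] <- NA
etude1$apmod[3:4] <- NA

# Named version of the anonymous function (the name is hypothetical).
sum_or_na <- function(x) {
  if (all(is.na(x))) NA else sum(x, na.rm = TRUE)
}

# MARGIN = 1 means "call the function once per row"; x is one row.
named_result <- apply(etude1[c("apinten", "apmod")], 1, sum_or_na)

# Passing the function inline, without a name, does the same thing.
inline_result <- apply(etude1[c("apinten", "apmod")], 1, function(x) {
  if (all(is.na(x))) NA else sum(x, na.rm = TRUE)
})

identical(named_result, inline_result)
```

The two results should be identical; the anonymous form simply skips the step of giving the function a name.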

score 0 · Answer 2 · answered Nov 03 '17 at 16:02

I made up two vectors to make this reproducible, since I don't know the format of your data, but you should get the idea. Substituting your own columns, if you are using a data frame, is straightforward. I have also assumed the two vectors are the same length.

I use a loop in this example because lately I'm obsessed with them, but you could also write a function and apply it to your vectors.

The trick is knowing how to work with the logical XOR function, which gives the desired output.

apinten <- c(NA, 0, 1, NA)
apmod <- c(0, NA, 1, NA)
sum_vector <- numeric(length(apinten))

for (i in seq_along(apinten)) {
  if (xor(is.na(apinten[i]), is.na(apmod[i]))) {
    # Exactly one value is missing: drop the NA and keep the other value.
    sum_vector[i] <- sum(apinten[i], apmod[i], na.rm = TRUE)
  } else {
    # Both present (plain sum) or both missing (the sum is NA).
    sum_vector[i] <- sum(apinten[i], apmod[i], na.rm = FALSE)
  }
}

You can check that the output, sum_vector, is (0, 0, 2, NA).

I know the code needs some adapting for your case and there are probably better solutions... but I think this can help you carry on.
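The "write a function and apply it to your vectors" alternative mentioned in this answer might look like the sketch below. It is not the answerer's code: the helper name `sum_unless_both_na` is hypothetical, and it replaces the XOR test with a direct check for "both values are NA", which gives the same result on these inputs.

```r
# Hypothetical helper: sum two scalars, returning NA only when both are NA.
sum_unless_both_na <- function(a, b) {
  if (is.na(a) && is.na(b)) NA else sum(a, b, na.rm = TRUE)
}

apinten <- c(NA, 0, 1, NA)
apmod   <- c(0, NA, 1, NA)

# mapply() calls the helper once per pair of elements.
sum_vector <- mapply(sum_unless_both_na, apinten, apmod)
sum_vector
```

This yields the same (0, 0, 2, NA) as the loop, without pre-allocating the result or indexing by hand.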

score 0 · Answer 3 · answered Nov 04 '17 at 03:49

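Using the example from @RuiBarradas's post, another option is a vectorised one-liner. rowSums on the logical matrix !is.na(etude1) counts the non-NA values in each row, and negating that count gives TRUE only for rows that are entirely NA. Raising NA to that logical turns it into a mask of 1 and NA, because NA^FALSE is 1 and NA^TRUE is NA. Multiplying the mask by the rowSums of the actual data then replaces the unwanted 0 with NA.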

(NA^!rowSums(!is.na(etude1))) * rowSums(etude1, na.rm = TRUE)
#[1] 12 12 NA  4 20
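The one-liner is dense, so here is the same computation split into named steps. This is only an unpacking of the expression above; the intermediate variable names are made up, and it assumes `etude1` holds just the two columns from Answer 1 (before any `mvpascore` column is added).

```r
# Example data from Answer 1 (two columns only).
etude1 <- data.frame(apinten = 1:5, apmod = 11:15)
etude1$apinten[2:3] <- NA
etude1$apmod[3:4] <- NA

n_present   <- rowSums(!is.na(etude1))      # non-NA values per row: 2 1 0 1 2
all_missing <- !n_present                   # TRUE only where the count is 0
mask        <- NA^all_missing               # NA^TRUE is NA, NA^FALSE is 1
totals      <- rowSums(etude1, na.rm = TRUE) # the all-NA row sums to 0 here

result <- mask * totals                     # 0 * NA becomes NA for that row
result
```

The final product matches the output shown in the answer: rows with at least one value keep their sum, and the all-NA row becomes `NA`.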
