
I have some variables that I want to add together, but some of them have missing observations, and when I add them, any row with one or more missing values ends up with a missing sum. For example, suppose I have the following, where the last column is the sum I currently get:

df <- matrix(c(23, NA, 56, NA, NA, 43, 67, NA, 11, 10, 18, 39), byrow = TRUE, nrow = 3)
colnames(df) <- c("X", "y", "z", "sum")
df
      X  y  z sum
[1,] 23 NA 56  NA
[2,] NA 43 67  NA
[3,] 11 10 18  39

Here is what I expect:

df2 <- matrix(c(23, NA, 56, 79,
                NA, 43, 67, 110,
                11, 10, 18, 39), byrow = TRUE, nrow = 3)

colnames(df2) <- c("X", "Y", "Z", "sum")

df2
      X  Y  Z sum
[1,] 23 NA 56  79
[2,] NA 43 67 110
[3,] 11 10 18  39

How can I get this result?

I am using R version 3.6 on Windows 10.
AbcAeffchen
    What code are you using to sum each row - `rowSums`? If so, are you including `na.rm = TRUE`? – Ben Jan 02 '20 at 21:52

1 Answer


As Ben pointed out, I think all you need is `na.rm = TRUE` in `rowSums()`, so something like this:

df <- matrix(c(23, NA, 56, NA, 43, 67, 11, 10, 18), byrow = TRUE, nrow = 3)
colnames(df) <- c("X", "y", "z")
cbind(df, summ = rowSums(df, na.rm = TRUE))
#       X  y  z summ
# [1,] 23 NA 56   79
# [2,] NA 43 67  110
# [3,] 11 10 18   39
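
One thing to keep in mind with `na.rm = TRUE`: a row in which every value is missing sums to 0 rather than NA. If that matters for your data, here is a minimal sketch of one way to handle it. The fourth, all-NA row and the names `m`, `s` and `all_missing` are made up for illustration and are not part of the question's data.

```r
# Same values as above, plus a hypothetical fourth row that is entirely NA
m <- matrix(c(23, NA, 56,
              NA, 43, 67,
              11, 10, 18,
              NA, NA, NA), byrow = TRUE, nrow = 4)
colnames(m) <- c("X", "y", "z")

# na.rm = TRUE drops the NAs, so the all-NA row sums to 0
s <- rowSums(m, na.rm = TRUE)

# Optional: put NA back where a row had no observed values at all
all_missing <- rowSums(!is.na(m)) == 0
s[all_missing] <- NA

cbind(m, summ = s)
```

Whether an all-missing row should count as 0 or NA depends on what the sum means in your analysis, so treat the last step as optional.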

Or, if you are working with a data frame, something like this:

    library(dplyr)
    df_frame <- data.frame(df)
    df_frame <- df_frame %>%
      mutate(summ = rowSums(., na.rm = TRUE))
    df_frame
    #    X  y  z summ
    # 1 23 NA 56   79
    # 2 NA 43 67  110
    # 3 11 10 18   39

Or this, if you only want to sum the numeric variables in the data frame:

    df_frame <- data.frame(df)
    df_frame <- df_frame %>%
      mutate(summ = rowSums(select_if(., is.numeric), na.rm = TRUE))
    df_frame
    #    X  y  z summ
    # 1 23 NA 56   79
    # 2 NA 43 67  110
    # 3 11 10 18   39
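
If you would rather not load dplyr, the same idea can be sketched in base R. The character column `id` and the helper name `num_cols` are hypothetical, added only to show that non-numeric columns are skipped.

```r
# Rebuild the example data as a data frame, with a hypothetical
# character column "id" added to show the filtering
df_frame <- data.frame(id = c("a", "b", "c"),
                       X = c(23, NA, 11),
                       y = c(NA, 43, 10),
                       z = c(56, 67, 18),
                       stringsAsFactors = FALSE)

# Logical vector marking which columns are numeric
num_cols <- vapply(df_frame, is.numeric, logical(1))

# Sum across the numeric columns only, ignoring NAs
df_frame$summ <- rowSums(df_frame[, num_cols, drop = FALSE], na.rm = TRUE)
df_frame
```

`drop = FALSE` keeps the selection a data frame even when only one column is numeric, which `rowSums()` needs.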
user63230