0

Suppose I have a 27 columns data frame. The first column is the ID, and the rest of columns (A to Z) are just data. I want to take out all the rows whose A to Z columns are NA. How should I do it? The straightforward way is just

data %>%
filter(!(is.na(A) & is.na(B) .... & is.na(Z)))

Is there a more efficient or easier way to do it?

This question is different from This one because I want to exclude rows whose value are ALL NA, and keep the rows whose value are partially NA.

Z.Lu
  • 67
  • 10

1 Answers1

0

Using tidyverse:

library(tidyverse)

Load data:

ID <- c(1:8)
Col1<-c(34564,NA,43456,NA,45655,6789,99999,87667)
Col2<-c(34565,43456,55555,NA,65433,22234,NA,98909)
Col3<-c(45673,88789,11123,NA,55676,76566,NA,NA)

mydf <- data_frame(ID,Col1,Col2,Col3)
mydf %>% 
    slice(which(complete.cases(.)))

Whether you want to preserve selected columns removing rows with all NAs you may run:

mydf %>% 
    mutate(full_incomplete_cases=rowSums(is.na(.[-1]))) %>% 
    filter(full_incomplete_cases<length(mydf[,-1])) %>% 
    select(ID:Col3)
Scipione Sarlo
  • 1,470
  • 1
  • 17
  • 31