How to keep all rows of all columns that have NAs

Question

Is there a way to apply a function to a data frame using dplyr so that I keep only the rows that contain at least one missing value?

Calum You · Answer 1 · 2018-05-08T20:25:22.717

4

Using @DJack's sample data, we can do this in dplyr with `filter_all`. `filter_all` takes a predicate wrapped in `all_vars` or `any_vars` and applies it to every column. Here, we keep every row for which `is.na` returns TRUE in at least one column.

m <- matrix(1:25, ncol = 5)
m[c(1, 6, 13, 25)] <- NA
df <- data.frame(m)
library(dplyr)
df %>%
  filter_all(any_vars(is.na(.)))
#>   X1 X2 X3 X4 X5
#> 1 NA NA 11 16 21
#> 2  3  8 NA 18 23
#> 3  5 10 15 20 NA

Created on 2018-05-08 by the reprex package (v0.2.0).
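
In more recent dplyr releases the scoped verbs such as `filter_all` are marked as superseded, and the same filter can be written with `if_any`. The following is a minimal sketch, assuming a dplyr version that provides `if_any` (1.0.4 or later, as far as I know) and reusing the sample data above; the name `rows_with_na` is only illustrative.

```r
library(dplyr)

# Same sample data as above
m <- matrix(1:25, ncol = 5)
m[c(1, 6, 13, 25)] <- NA
df <- data.frame(m)

# Keep rows where is.na() is TRUE in at least one column
rows_with_na <- df %>%
  filter(if_any(everything(), is.na))
```

This should keep the same three rows as the `filter_all` call.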

edited May 08 '18 at 20:25

answered May 08 '18 at 20:16

Calum You


That worked just fine, in a very elegant way. Any hints about these other two situations: removing all columns with missing values, and keeping only columns with missing values? – Joni Hoppen May 08 '18 at 20:31

Both are done with `select_if`. In `dplyr`, `filter` verbs allow you to keep rows, `select` verbs allow you to keep columns in various ways. – Calum You May 08 '18 at 20:34
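
As a rough illustration of the `select_if` suggestion, here is a minimal sketch for both column cases. It assumes the sample data from the answer above and a dplyr version in which `select_if` accepts formula-style predicates; the result names are only illustrative.

```r
library(dplyr)

# Same sample data as in the answer
m <- matrix(1:25, ncol = 5)
m[c(1, 6, 13, 25)] <- NA
df <- data.frame(m)

# Remove all columns with missing values (keep only complete columns)
cols_without_na <- df %>% select_if(~ !any(is.na(.)))

# Keep only columns with missing values
cols_with_na <- df %>% select_if(~ any(is.na(.)))
```

In newer dplyr versions, `select(where(...))` with the same predicates is the usual replacement for `select_if`.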

DJack · Answer 2 · 2018-05-08T21:04:19.883

3

Here is a (not dplyr) solution:

df[rowSums(is.na(df)) > 0,]

#  X1 X2 X3 X4 X5
#1 NA NA 11 16 21
#3  3  8 NA 18 23
#5  5 10 15 20 NA

Or as suggested by MrFlick:

df[!complete.cases(df),]
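
The two base R forms above rely on the same idea: a row is kept when it has at least one `NA`. A small sketch that checks they agree on the sample data (the variable names here are only illustrative):

```r
# Sample data from this answer
m <- matrix(1:25, ncol = 5)
m[c(1, 6, 13, 25)] <- NA
df <- data.frame(m)

# Rows with at least one NA, counted per row
by_row_sums <- df[rowSums(is.na(df)) > 0, ]

# Rows that are not complete cases
by_complete_cases <- df[!complete.cases(df), ]

# Compare the two results
same_rows <- identical(by_row_sums, by_complete_cases)
```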

Sample data

m <- matrix(1:25, ncol = 5)
m[c(1,6,13,25)] <- NA
df <- data.frame(m)
df

#  X1 X2 X3 X4 X5
#1 NA NA 11 16 21
#2  2  7 12 17 22
#3  3  8 NA 18 23
#4  4  9 14 19 24
#5  5 10 15 20 NA

edited May 08 '18 at 21:04

answered May 08 '18 at 20:09

DJack


score 2 · Answer 3 · answered May 08 '18 at 20:22

2

I don't know how to solve this with dplyr, but maybe this helps:

First, I created this df:

library(tibble)  # provides tribble()

# Small example with NAs in columns b and c
df <- tribble(
  ~a, ~b, ~c,
   1, NA,  0,
   2,  0,  1,
   3,  1, NA,
   4,  1,  0
)

Then, this will return only rows with NA:

df[!complete.cases(df),]

See more: Subset of rows containing NA (missing) values in a chosen column of a data frame

answered May 08 '18 at 20:22

Wlademir Ribeiro Prates


This is good; I'm just trying both approaches. I'll let you guys know how it goes. – Joni Hoppen May 08 '18 at 20:24
I was answering at the same time! But this validates your solution, right?! – Wlademir Ribeiro Prates May 08 '18 at 21:03
