
In R you can select all non-NA rows from a dataframe via:

set.seed(123)
# b contains NAs in random positions
nadf <- data.frame(a = 1:20, b = sample(c("a", "b", NA), 20, replace = TRUE))
# keep only rows with no NA in any column
nadf[complete.cases(nadf), ]

The nice thing is that I do not need to specify which columns to check: R checks every value in every row and column for NA. I would now like to do the same with dplyr.

In dplyr my first reaction would be to use the filter verb. However, with filter I would need to spell out every column I want to check. Assuming the dataframe has columns a...z, that would look something like this:

df = df %>% filter(!is.na(a), !is.na(b), ..., !is.na(z))

This is very verbose and I imagine there must be a better way to do this operation in dplyr. Is there?
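For reference, here is a minimal sketch of two less-verbose dplyr/tidyr idioms that match the base-R `complete.cases()` result on the example data above. This assumes reasonably recent package versions (`if_all()` requires dplyr >= 1.0.4); it is one possible approach, not the only one.

```r
library(dplyr)
library(tidyr)

set.seed(123)
nadf <- data.frame(a = 1:20, b = sample(c("a", "b", NA), 20, replace = TRUE))

# dplyr: keep rows where every column is non-NA
out1 <- nadf %>% filter(if_all(everything(), ~ !is.na(.x)))

# tidyr: drop_na() with no arguments checks all columns
out2 <- nadf %>% drop_na()
```

Both should return the same rows as `nadf[complete.cases(nadf), ]`, without naming any column explicitly.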

cantdutchthis
    You should provide some example dataset. I don't think `df[!is.na(df),]` would give the intended results, when the number of `NA`s are different for each column. But, you can get a vector out of `df[!is.na(df)]` – akrun Jan 26 '15 at 09:53
  • Wow, I'll refrain from writing questions without examples to check myself. My bad! Made an edit. – cantdutchthis Jan 26 '15 at 10:02
  • From the linked duplicate: "`na.omit()` takes 20x as much time as the other two solutions" – Henrik Jan 26 '15 at 10:04

0 Answers