I have the following data frame which I create after a count:

df <- structure(list(Procedure_priority = structure(c(4L, 1L, 2L, 3L, NA, 5L),
                                                    .Label = c("A", "B", "C", "D", "-1"), 
                                                    class = "factor"), n = c(10717L, 4412L, 2058L, 1480L, 323L, 2L)), 
                class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -6L), .Names = c("Procedure", "n"))


# A tibble: 6 x 2
  Procedure     n
  <fct>     <int>
1 D         10717
2 A          4412
3 B          2058
4 C          1480
5 <NA>        323
6 -1            2

I want to filter out the "-1" row. But if I filter on "-1", I also lose my NA row. That is:

df %>% 
  filter(Procedure!="-1")

# A tibble: 4 x 2
  Procedure     n
  <fct>     <int>
1 D         10717
2 A          4412
3 B          2058
4 C          1480

I need my NA's.

xhr489
`NA == "-1"` evaluates to `NA`, and `filter` removes anything that evaluates to `FALSE` or `NA`. Info here: https://stackoverflow.com/questions/32908589/why-does-dplyrs-filter-drop-na-values-from-a-factor-variable – Jack Brookes Feb 14 '19 at 14:01

2 Answers

From the Help file of filter()

...Only rows where the condition evaluates to TRUE are kept...

NA != -1
[1] NA

Since your condition returns NA for the missing row (and NA is not TRUE), you need a second condition joined with OR:

df %>% 
  filter(Procedure != "-1" | is.na(Procedure))
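
To see why the extra condition is needed, here is a minimal base R sketch of the same logic without dplyr. The vector `procedure` is a hypothetical stand-in for `df$Procedure`, not something from the question.

```r
# Hypothetical stand-in for df$Procedure (same values, same order)
procedure <- factor(c("D", "A", "B", "C", NA, "-1"))

# Comparing NA with anything gives NA, so the 5th element is NA, not TRUE
cond <- procedure != "-1"

# NA | TRUE is TRUE, so adding is.na() rescues the missing row
keep <- cond | is.na(procedure)

# Only the "-1" element is dropped; the NA is kept
kept <- procedure[keep]
print(kept)
```

Because `filter()` keeps only rows where the condition is TRUE, `keep` is the kind of logical vector it needs here: TRUE for every row except the "-1" one.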
RandowMalk

Your question is already answered, but if you want to keep a short list of values (i.e., you are excluding more than one value), you can use %in% and still keep the NAs.

# Keep A, D, and NA; aka dropping B, C, and -1
keep_these_procs <- c("A", "D", NA)

df %>%
  filter(Procedure %in% keep_these_procs)
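
The reason this works is that `%in%` is built on `match()`, which treats NA as a matchable value and always returns TRUE or FALSE, never NA. Below is a small base R sketch of that behaviour. `procedure` is a hypothetical stand-in for `df$Procedure`, and the negated form at the end is a variation on this answer rather than something it proposes.

```r
# Hypothetical stand-in for df$Procedure (same values, same order)
procedure <- factor(c("D", "A", "B", "C", NA, "-1"))
keep_these_procs <- c("A", "D", NA)

# %in% matches NA against NA, so the missing element comes back TRUE
in_keep <- procedure %in% keep_these_procs

# Variation: negating %in% excludes a single value and still keeps NA,
# because NA %in% "-1" is FALSE rather than NA
not_minus_one <- !(procedure %in% "-1")

print(in_keep)
print(not_minus_one)
```

Inside `filter()`, the negated form would read `filter(!(Procedure %in% "-1"))`, which may be convenient when the list of values to drop is shorter than the list to keep.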
Andrew