How to drop unused levels after filtering by factor?

Question

Here is an example that was taken from a fellow SO member.

# load dplyr for filter()
library(dplyr)
# data
f <- c("a","a","a","b","b","c")
s <- c("fall","spring","other", "fall", "other", "other")
v <- c(3,5,1,4,5,2)
# stringsAsFactors = TRUE is needed in R >= 4.0.0 to get factor columns
(dat0 <- data.frame(f, s, v, stringsAsFactors = TRUE))
#  f      s v
#1 a   fall 3
#2 a spring 5
#3 a  other 1
#4 b   fall 4
#5 b  other 5
#6 c  other 2
(sp.tmp <- filter(dat0, s == "spring"))
#  f      s v
#1 a spring 5
str(sp.tmp)
#'data.frame':  1 obs. of  3 variables:
# $ f: Factor w/ 3 levels "a","b","c": 1
# $ s: Factor w/ 3 levels "fall","other",..: 3
# $ v: num 5

The df resulting from filter() has retained all the levels from the original df.
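The retained levels can be seen directly on a single factor. This is a minimal base-R sketch, not code from the original post; the name `kept` is made up for illustration:

```r
# Rebuild the season column as a factor and subset it (base R only).
s <- factor(c("fall", "spring", "other", "fall", "other", "other"))
kept <- s[s == "spring"]

levels(kept)               # all three levels are still attached to the subset
table(kept)                # unused levels show up with a count of 0
levels(droplevels(kept))   # only "spring" remains after droplevels()
```

This matters in practice because functions such as `table()` and many plotting or modelling routines work from the level set, not from the values that are actually present.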

What would be the recommended way to drop the unused levels, i.e. "fall" and "other" in `s` (and likewise "b" and "c" in `f`), within the dplyr framework?

I have been using spreadsheets quite a lot for data pre-processing, but since I discovered `dplyr` that seems to have changed ;-) However, when one applies filters in a spreadsheet, the "hidden" range seems to be nonexistent for copy/paste operations. That's why I was surprised finding the filtered content partially transferred to the new df after applying `filter()`. Therefore I asked how to get the same effect *within* the `dplyr` framework, expecting that there might be an argument for that. — ils, Nov 09 '14 at 10:01
If it will declutter the environment I'll do so gladly. Hope that both helpers won't mind the downvote... — ils, Nov 09 '14 at 10:10
It seems that I can't downvote until the answers are edited :-/ — ils, Nov 09 '14 at 10:13
Just leave it as is. The answers show some additional implementation on `dplyr` — David Arenburg, Nov 09 '14 at 10:14
My understanding is that duplicate questions should be _closed_, not necessarily deleted because they might help others find the original question and answers in the future. — talat, Nov 09 '14 at 10:33

score 54 · Answer 1 · answered Nov 09 '14 at 09:48


You could do something like:

dat1 <- dat0 %>%
  filter(s == "spring") %>% 
  droplevels()

Then

str(dat1)
#'data.frame':  1 obs. of  3 variables:
# $ f: Factor w/ 1 level "a": 1
# $ s: Factor w/ 1 level "spring": 1
# $ v: num 5
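If only some factor columns should lose their unused levels, the same idea can be applied per column. The following is a sketch under that assumption; `dat2` and `dat3` are illustrative names, and the data frame is rebuilt here so the snippet runs on its own:

```r
library(dplyr)

dat0 <- data.frame(
  f = c("a", "a", "a", "b", "b", "c"),
  s = c("fall", "spring", "other", "fall", "other", "other"),
  v = c(3, 5, 1, 4, 5, 2),
  stringsAsFactors = TRUE
)

# Drop unused levels in `s` only; `f` keeps all three of its levels.
dat2 <- dat0 %>%
  filter(s == "spring") %>%
  mutate(s = droplevels(s))

# Alternative: the data-frame method of droplevels() takes `except`,
# the indices of columns whose levels should be left untouched.
dat3 <- droplevels(filter(dat0, s == "spring"), except = 1)

nlevels(dat2$s)  # levels left in `s`
nlevels(dat2$f)  # levels left in `f`
```

Keeping the full level set on some columns can be useful when several filtered subsets are later combined or compared.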


talat


Thanks! That seems quite a logical approach. – ils Nov 09 '14 at 09:57

score 4 · Answer 2 · answered Nov 09 '14 at 09:38


You could use `droplevels()` on the filtered data frame:

sp.tmp <- droplevels(sp.tmp)
str(sp.tmp)
#'data.frame':  1 obs. of  3 variables:
# $ f: Factor w/ 1 level "a": 1
# $ s: Factor w/ 1 level "spring": 1
# $ v: num 5


akrun


Thanks a lot! Isn't there a way to do this *while* `filter()`ing? – ils Nov 09 '14 at 09:43
@ils I guess `filter` does not have an argument to drop the levels (to my knowledge). – akrun Nov 09 '14 at 09:44
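Since `filter()` has no such argument, one way to get the effect in a single call is a small wrapper. This is only a sketch: `filter_drop()` is a hypothetical helper, not a dplyr function, and the data frame is rebuilt so the snippet is self-contained:

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): filter rows, then drop the
# factor levels that no longer occur in the result.
filter_drop <- function(.data, ...) {
  droplevels(filter(.data, ...))
}

dat0 <- data.frame(
  f = c("a", "a", "a", "b", "b", "c"),
  s = c("fall", "spring", "other", "fall", "other", "other"),
  v = c(3, 5, 1, 4, 5, 2),
  stringsAsFactors = TRUE
)

res <- filter_drop(dat0, s == "spring")
str(res)
```

The wrapper forwards its conditions through `...`, so it accepts the same filtering expressions as `filter()` itself.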
