Conditional filtering based on the level of a factor R

Question

I would like to clean up the following code. Specifically, I'm wondering if I can consolidate the three filter statements so that the final data.frame (the result of rbind()) contains, for each level of f, the "spring" row if it exists, the "fall" row if "spring" doesn't exist, and the "other" row if neither "spring" nor "fall" exists. The code below seems very clunky and inefficient. I am trying to free myself of for(), so hopefully the solution won't involve one. Could this be done using dplyr?

library(dplyr)
# define %not% to be the opposite of %in%
`%not%` <- Negate(`%in%`)
f <- c("a","a","a","b","b","c")
s <- c("fall","spring","other", "fall", "other", "other")
v <- c(3,5,1,4,5,2)
(dat0 <- data.frame(f, s, v))
# "spring" rows
sp.tmp <- filter(dat0, s == "spring")
# "fall" rows for groups that have no "spring" row
fl.tmp <- filter(dat0, f %not% sp.tmp$f, s == "fall")
# "other" rows for groups that have neither "spring" nor "fall"
ot.tmp <- filter(dat0, f %not% sp.tmp$f, f %not% fl.tmp$f, s == "other")
rbind(sp.tmp, fl.tmp, ot.tmp)

Is it ever possible there are multiple "spring"s within an "a", a "b", or a "c"? And if it is, do you want to keep all of them or just the first? — David Robinson, Jul 10 '14 at 15:08
It is not possible to have multiple "spring"s for "a", "b", ... — cdd, Jul 10 '14 at 15:11

score 3 · Accepted Answer · edited May 23 '17 at 11:58

It looks like within each group of f, you want to extract a single row: spring, fall, or other, in descending order of preference.

If you first make your ordering of preference the actual factor ordering:

dat0$s <- factor(dat0$s, levels=c("spring", "fall", "other"))
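As a quick check of what this reordering does, here is a minimal sketch in base R. The name s_fac is introduced here only for illustration. Once the levels are listed in order of preference, the underlying integer codes follow that order, so the smallest code in a group marks the preferred row.

```r
# Same season labels as in the question
s <- c("fall", "spring", "other", "fall", "other", "other")

# List the levels in order of preference: spring first, then fall, then other
s_fac <- factor(s, levels = c("spring", "fall", "other"))

levels(s_fac)      # "spring" "fall" "other"
as.integer(s_fac)  # 2 1 3 2 3 3  (spring = 1, fall = 2, other = 3)
```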

Then you can use this dplyr solution to get the minimum row (relative to that factor) within each group:

newdat <- dat0 %>% group_by(f) %>% filter(rank(s) == 1)
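For reference, here is a self-contained sketch of the whole approach on the question's data, written with the `%>%` pipe (older dplyr releases spelled it `%.%`). The second pipeline, newdat_ties, is not part of the answer. It is an assumed variant that compares the factor's integer codes directly, which would keep every tied row if a group ever had more than one row at its best level. With rank() and its default "average" tie handling, such ties would get a rank above 1 and the group would return no rows.

```r
library(dplyr)

# Rebuild the question's data
dat0 <- data.frame(
  f = c("a", "a", "a", "b", "b", "c"),
  s = c("fall", "spring", "other", "fall", "other", "other"),
  v = c(3, 5, 1, 4, 5, 2)
)

# Make the order of preference the factor's level order
dat0$s <- factor(dat0$s, levels = c("spring", "fall", "other"))

# Keep the most preferred row within each group of f
newdat <- dat0 %>%
  group_by(f) %>%
  filter(rank(s) == 1) %>%
  ungroup()

# Assumed variant: compare integer codes so tied rows would all be kept
newdat_ties <- dat0 %>%
  group_by(f) %>%
  filter(as.integer(s) == min(as.integer(s))) %>%
  ungroup()

newdat
# Rows kept: a/spring/5, b/fall/4, c/other/2
```

On this data both pipelines return the same three rows, because each group has at most one row per season.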

answered Jul 10 '14 at 15:13 by David Robinson

Thanks this does exactly what I want. – cdd Jul 10 '14 at 15:17
