Filtering a dataframe in R

Question

I have the following data frame (df):

     start     end
1    14379   32094
2   151884  174367
3   438422  449382
4   618123  621256
5   698271  714321
6   973394  975857
7   980508  982372
8   994539  994661
9  1055151 1058824
.   .       .
.   .       .
.   .       .

I also have a long logical vector (vec).

I would like to filter out all ranges in df that contain at least one TRUE value in the corresponding locations in vec.

In other words, a row with start=x and end=y should be kept if and only if !any(vec[x:y]) is TRUE.

Any ideas on how to accomplish that?

possible duplicate of [Adding a column to a dataframe in R](http://stackoverflow.com/questions/3651651/adding-a-column-to-a-dataframe-in-r) — hadley, Sep 06 '10 at 18:27
@hadley how is that a duplicate? I used the same dataframe but the question is different (filtering a dataframe vs. adding columns to a dataframe). — David B, Sep 07 '10 at 06:50

score 5 · Accepted Answer · edited May 23 '17 at 12:13

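This is the same question as [Adding a column to a dataframe in R](http://stackoverflow.com/questions/3651651/adding-a-column-to-a-dataframe-in-r), so it has the same answer: use apply, but with any instead of mean.

> # TRUE for each row whose range contains no TRUE value in vec
> keep <- apply(df, 1, function(row) !any(vec[row[1]:row[2]]))
> df[keep, ]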
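
A minimal, self-contained sketch of the apply approach, for anyone who wants to try it. The contents of `df` and `vec` below are made-up toy values for illustration, not the asker's data:

```r
# Toy data (assumed for illustration only)
df <- data.frame(start = c(1, 4, 8), end = c(3, 6, 10))
vec <- rep(FALSE, 10)
vec[5] <- TRUE  # position 5 lies inside the second range (4:6)

# For each row, TRUE when no element of vec in start:end is TRUE
keep <- apply(df, 1, function(row) !any(vec[row[1]:row[2]]))

# Keep only the rows whose range contains no TRUE value
result <- df[keep, ]
print(result)
```

Note that apply converts the data frame to a matrix first, so this relies on both columns being numeric.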

answered Sep 06 '10 at 14:18

Joshua Ulrich


score 2 · Answer 2 · answered Sep 06 '10 at 14:35

I have read your other posts on this topic. If you want to achieve this with plyr, try this:

library(plyr)
new.df <- adply(df, .margins = 1, function(x) { if (!any(vec[x$start:x$end])) return(x) })
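
If plyr is not installed, the same row-wise test can probably be sketched in base R with mapply, which also lets you refer to the columns by name. This is a swapped-in alternative rather than the code from this answer, and the toy `df` and `vec` are assumptions for illustration:

```r
# Toy data (assumed for illustration only)
df <- data.frame(start = c(1L, 4L, 8L), end = c(3L, 6L, 10L))
vec <- rep(FALSE, 10)
vec[5] <- TRUE  # position 5 lies inside the second range (4:6)

# Walk the two columns in parallel; TRUE when the range has no TRUE in vec
keep <- mapply(function(s, e) !any(vec[s:e]), df$start, df$end)

# drop = FALSE keeps the result a data frame even if one row remains
new.df <- df[keep, , drop = FALSE]
print(new.df)
```

Unlike apply, this does not coerce the data frame to a matrix, so other non-numeric columns in df would be left untouched.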

Gary Li
