2

I have a dataframe and I want to filter it:

employee <- c('John Doe','Peter Gynn','Jolie Hope', 'Michael')
salary <- c(21000, 23400, 26800, 25000)
number <- c(1,2,3,5)
df <- data.frame(employee,salary,number)

> df
    employee salary number
1   John Doe  21000      1
2 Peter Gynn  23400      2
3 Jolie Hope  26800      3
4    Michael  25000      5

I also have these vectors:

vectorMin <- c(22000,1.5)
vectorMax <- c(26000,4.5)

I want to keep only the rows whose salary is between 22000 and 26000 and whose number is between 1.5 and 4.5. In this case the result should contain only Peter Gynn. I have tried:

(df >= vectorMin) & (df <= vectorMax)

But this doesn't work. How can I do it?

Jaap
  • Well this is very easy and you just need proper R indexing syntax, but note that none of the records in your example would pass the filters you wrote (but then the ones you wrote conflict with the implied filters in your vectors -- 1.5 vs 2.5). Assuming you meant 1.5 not 2.5: `df[df$salary >= 22000 & df$salary <= 26000 & df$number >= 1.5 & df$number <= 4.5,]` – Hack-R Dec 22 '16 at 21:23
  • @Hack-R thanks for the response. I need to do it with the vectors Min and Max. In my case I have a very big datasheet, this is only an example, but my vectors are big and also my dataframe. For this I need to use the vectors. – ruber robert Dec 22 '16 at 21:29

3 Answers

5

You can use Map to apply a function element-wise across the columns and the corresponding elements of the bound vectors, then combine the resulting logical vectors with Reduce. First we define a function between (written for the specific needs of the question, bounds included) that returns a logical vector marking the values in the relevant range. Here df[-1] drops the employee column so that the remaining columns line up with vectorMin and vectorMax. You could also just use an anonymous function in Map.

between <- function(x, min, max) x >= min & x <= max

df[Reduce("&", Map(between, df[-1], vectorMin, vectorMax)), ]
#     employee salary number
# 2 Peter Gynn  23400      2
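
The call above relies on the bounded columns sitting in the same order as the bound vectors. As a rough sketch of a variation, and assuming you are free to name the elements of vectorMin and vectorMax after the columns they apply to (the names and the helper variables cols, checks and keep are my own additions, not part of the question), the columns can be matched by name instead of by position:

```r
# Example data from the question
df <- data.frame(
  employee = c("John Doe", "Peter Gynn", "Jolie Hope", "Michael"),
  salary   = c(21000, 23400, 26800, 25000),
  number   = c(1, 2, 3, 5)
)

# Assumption: each bound is named after the column it applies to
vectorMin <- c(salary = 22000, number = 1.5)
vectorMax <- c(salary = 26000, number = 4.5)

# Inclusive range check, as defined in the answer
between <- function(x, min, max) x >= min & x <= max

# One logical vector per bounded column, selected by name
cols   <- names(vectorMin)
checks <- Map(between, df[cols], vectorMin[cols], vectorMax[cols])

# A row is kept only if it passes the check for every bounded column
keep   <- Reduce(`&`, checks)
result <- df[keep, ]
```

Inspecting checks before reducing it shows which column excluded a given row, which may help when the bound vectors are long.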
Rich Scriven
3
vectorMin <- c(22000,1.5)
vectorMax <- c(26000,4.5)

df[df$salary > vectorMin[1] & df$salary < vectorMax[1]
   & df$number > vectorMin[2] & df$number < vectorMax[2],]
Hack-R
AidanGawronski
0

Easy to achieve with the formidable dplyr package:

library(dplyr)
df %>%
  filter(salary >= 22000, salary <= 26000, number >= 1.5, number <= 4.5)
# 1 Peter Gynn  23400      2
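
If the bounds have to come from the vectors rather than being typed in, one possible variation (a sketch only, assuming the first element of each vector bounds salary and the second bounds number, as in the question) is to index the vectors inside filter and use dplyr's inclusive between helper:

```r
library(dplyr)

# Example data and bound vectors from the question
df <- data.frame(
  employee = c("John Doe", "Peter Gynn", "Jolie Hope", "Michael"),
  salary   = c(21000, 23400, 26800, 25000),
  number   = c(1, 2, 3, 5)
)
vectorMin <- c(22000, 1.5)
vectorMax <- c(26000, 4.5)

# dplyr::between(x, left, right) tests left <= x <= right;
# conditions separated by commas in filter() are combined with AND
filtered <- df %>%
  filter(
    dplyr::between(salary, vectorMin[1], vectorMax[1]),
    dplyr::between(number, vectorMin[2], vectorMax[2])
  )
```

This still names each column by hand, so for many columns a programmatic approach such as Map with Reduce is likely to scale better.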
Jan
  • I didn't downvote, but if you read OP's comment on the post it appears that this method won't be useful. – figurine Dec 22 '16 at 21:43