I have a dataframe and a predictive model that I want to apply to it. However, I want to filter out records for which the model may not apply well. To do this, I have a second dataframe that contains, for every variable, the minimum and maximum observed in the training data. I want to remove from my new data any record in which one or more values fall outside the corresponding range.
To make my question clear, this is what my data might look like:
id   x    y
---- ---- ---------
1    2    30521
2    -1   1835
3    5    25939
4    4    1000000
This is what my second table, with the mins and maxes, could look like:
var   min   max
----- ----- -------
x     1     5
y     0     99999
In this example, I would want to flag the following records in my data: 2 (x is lower than the minimum) and 4 (y is higher than the maximum).
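For reproducibility, the example can be written as R code. This is only a sketch: the object names `dat` and `ranges` and the column `flag` are made up for illustration, the range bounds are assumed to be inclusive, and the loop is just a naive base R way to show the intended result, not the concise solution being asked for.

```r
# Example data (hypothetical object names: dat, ranges)
dat <- data.frame(
  id = 1:4,
  x  = c(2, -1, 5, 4),
  y  = c(30521, 1835, 25939, 1000000)
)

# Per-variable minimum and maximum observed in the training data
ranges <- data.frame(
  var = c("x", "y"),
  min = c(1, 0),
  max = c(5, 99999),
  stringsAsFactors = FALSE
)

# Start with nothing flagged, then flag any record that violates a range
# (assumption: values equal to min or max count as inside the range)
dat$flag <- FALSE
for (i in seq_len(nrow(ranges))) {
  v <- ranges$var[i]
  out_of_range <- dat[[v]] < ranges$min[i] | dat[[v]] > ranges$max[i]
  dat$flag <- dat$flag | out_of_range
}

dat$id[dat$flag]  # ids of the flagged records: 2 and 4
```

Keeping only the records inside all ranges would then be `dat[!dat$flag, ]`.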
How could I easily do this in R? I have a hunch there's some clever dplyr code that would accomplish this task, but I don't know what it would look like.