First, your data:
df1 <- read.table(text = "subject x y
7G001-0024-10 0,00 15
7G001-0024-10 97,29 18
7G001-0024-10 197,34 21
7G001-0024-10 314,66 22
7G001-0024-10 482,77 25
7G001-0030-10 0,00 12
7G001-0030-10 99,50 16
7G001-0030-10 184,37 20
7G001-0030-10 301,89 25
7G001-0030-10 585,67 27", header = TRUE, dec = ",")
df2 <- read.table(text = "subject Threshold
7G001-0024-10 177,08
7G001-0030-10 385,13", header = TRUE, dec = ",")
You can use a simple apply call to solve the task:
do.call("rbind", apply(df2, 1, FUN = function(a) {df1[a[1] == df1$subject & df1$x >= 0 & df1$x <= as.numeric(a[2]), ]}))
# subject x y
# 1 7G001-0024-10 0.00 15
# 2 7G001-0024-10 97.29 18
# 6 7G001-0030-10 0.00 12
# 7 7G001-0030-10 99.50 16
# 8 7G001-0030-10 184.37 20
# 9 7G001-0030-10 301.89 25
How does it work?
First, the function apply(df2, 1, FUN) applies a function to each row of the data frame df2. The value 1 means that the function is applied over the first dimension of the object, i.e. its rows (2 would mean columns).
Have a look at a simple function that just returns each row of df2 unchanged. Note that in the output, the rows are arranged as columns.
> apply(df2, 1, FUN = function(a) a)
[,1] [,2]
subject "7G001-0024-10" "7G001-0030-10"
Threshold "177.08" "385.13"
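To make the role of the second argument more concrete, here is a minimal sketch on a small numeric matrix. The matrix m is made up for illustration and is not part of your data.

```r
# Hypothetical 2 x 3 matrix, used only for illustration
m <- matrix(1:6, nrow = 2)

row_sums <- apply(m, 1, sum)  # MARGIN = 1: one call per row    -> 9 12
col_sums <- apply(m, 2, sum)  # MARGIN = 2: one call per column -> 3 7 11

# A function that returns its input unchanged shows the "rows become
# columns" effect: the result has 3 rows and 2 columns, like t(m)
tr <- apply(m, 1, function(r) r)
dim(tr)
```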
Since we want to extract a subset of df1, a more complex function is needed. So, I specified:
FUN = function(a) {df1[a[1] == df1$subject & df1$x >= 0 & df1$x <= as.numeric(a[2]), ]}
In this function, a represents a row of the data frame df2. The function is applied twice, once for each row of df2. Here, a[1] is the subject ID and a[2] is the corresponding threshold.
The function extracts a subset of rows of the data frame df1 by three criteria:
- The subjects are identical (a[1] == df1$subject)
- The x value is at least zero (df1$x >= 0)
- The x value is not higher than the threshold (df1$x <= as.numeric(a[2]))
Note: The value a[2] needs to be converted to a number with as.numeric. This is necessary because apply first converts df2 to a matrix, and since the subject column is not numeric, every value in the row (including the threshold) becomes a character string.
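The coercion can be checked directly. The sketch below rebuilds df2 with data.frame so that it runs on its own; as.matrix is the conversion apply performs internally on a data frame, and the names m2 and thr are only illustrative.

```r
# Same values as df2 above, rebuilt so the snippet is self-contained
df2 <- data.frame(subject = c("7G001-0024-10", "7G001-0030-10"),
                  Threshold = c(177.08, 385.13))

m2 <- as.matrix(df2)     # what apply() does to a data frame before looping
is.character(m2)         # TRUE: the numeric column has become text

a <- m2[1, ]             # what the function receives as `a` for the first row
thr <- as.numeric(a[2])  # convert the threshold back to a number
thr
```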
Each of these criteria returns a logical vector. These vectors are combined with & into a single logical vector indicating whether all three criteria are fulfilled. With df1[logical.vector, ], all rows of df1 where the logical vector is TRUE are selected. Since nothing is specified after the comma, all columns are selected.
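As a small illustration of this kind of logical indexing, consider the sketch below. The data frame d and the variable names are hypothetical and chosen only to keep the example short.

```r
# Hypothetical mini data frame, for illustration only
d <- data.frame(id = c("a", "a", "b", "b"), x = c(0, 5, 2, 9))

same_id  <- d$id == "a"         # TRUE  TRUE  FALSE FALSE
in_range <- d$x <= 4            # TRUE  FALSE TRUE  FALSE
keep     <- same_id & in_range  # TRUE  FALSE FALSE FALSE

subset_rows <- d[keep, ]        # rows where keep is TRUE, all columns
subset_rows
```

Only the first row satisfies both conditions, so subset_rows contains a single row.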
The apply call returns the rows of df1 for which all three criteria are fulfilled:
> apply(df2, 1, FUN = function(a) {df1[a[1] == df1$subject & df1$x >= 0 & df1$x <= as.numeric(a[2]), ]})
[[1]]
subject x y
1 7G001-0024-10 0.00 15
2 7G001-0024-10 97.29 18
[[2]]
subject x y
6 7G001-0030-10 0.00 12
7 7G001-0030-10 99.50 16
8 7G001-0030-10 184.37 20
9 7G001-0030-10 301.89 25
The function apply returns a list of two data frames, one for each row of df2.
In the last step, the data frames in the list are combined into one data frame. The function do.call("rbind", list) calls rbind and passes the elements of the list to it as arguments. For a list of length 2, this is equivalent to rbind(list[[1]], list[[2]]). In this way, both data frames in the list returned by apply are combined.
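The equivalence can be verified with a short sketch. The list parts below is a made-up stand-in for the list returned by apply, not your actual data.

```r
# Two hypothetical data frames standing in for the list returned by apply()
parts <- list(data.frame(id = "a", x = 1:2),
              data.frame(id = "b", x = 3:4))

combined <- do.call("rbind", parts)

# Same result as writing out the call by hand
same <- identical(combined, rbind(parts[[1]], parts[[2]]))
same
nrow(combined)
```

The advantage of do.call is that it works for a list of any length, so the code does not need to change when df2 has more than two rows.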