I am trying to add a list column to a tibble data frame. The list column is computed from two columns of the data frame and an external, independent vector.
Suppose the data frame and the vector are as follows:
library(dplyr)
library(magrittr)
dat <- tibble(A = c(12, 27, 22, 1, 15, 30, 20, 28, 19),
              B = c(68, 46, 69, 7, 44, 76, 72, 50, 51))
vec <- c(12, 25, 28, 58, 98)
Now I would like to add (mutate) a column y so that, for each row, y is a list containing the elements of vec between A and B (inclusive).
The not-so-proper way to do this is with a loop. I initialize the column y as a list and update it row by row based on the condition A <= vec & vec <= B:
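# Initialize y as a list column (every row starts with a copy of vec)
dat %<>%
  mutate(y = list(vec))
# Overwrite each row's entry with the elements of vec inside [A, B]
for (i in seq_len(nrow(dat))) {
  dat$y[[i]] <- vec[dat$A[i] <= vec & vec <= dat$B[i]]
}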
The result is a data frame in which y is a list of dbl vectors of variable length:
> dat
# A tibble: 9 x 3
      A     B y
  <dbl> <dbl> <list>
1    12    68 <dbl [4]>
2    27    46 <dbl [1]>
3    22    69 <dbl [3]>
4     1     7 <dbl [0]>
5    15    44 <dbl [2]>
6    30    76 <dbl [1]>
7    20    72 <dbl [3]>
8    28    50 <dbl [1]>
9    19    51 <dbl [2]>
The first four values of y are:
[[1]]
[1] 12 25 28 58
[[2]]
[1] 28
[[3]]
[1] 25 28 58
[[4]]
numeric(0)
Note: the 4th list is empty because no value of vec lies between A = 1 and B = 7.
As an intermediate step, I have tried getting the subscripts via which, using mutate(y = list(which(A <= vec & vec <= B))), and also a combination of seq and %in%, for instance mutate(y = list(vec %in% seq(A, B))). Neither attempt works; both give an error. In any case, I don't need the subscripts; I need a subset of vec.
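For reference, one possible loop-free sketch of what I am after. It assumes the purrr package is available (it is not loaded in my code above), and the name res is only illustrative. Inside mutate, A and B are whole columns, so comparing them with vec is not done row by row, which is probably why my attempts fail. map2 instead walks over A and B in parallel and returns one list element per row:

```r
library(dplyr)
library(purrr)  # assumption: purrr is installed; it is not used in the original code

dat <- tibble(A = c(12, 27, 22, 1, 15, 30, 20, 28, 19),
              B = c(68, 46, 69, 7, 44, 76, 72, 50, 51))
vec <- c(12, 25, 28, 58, 98)

# For each pair (a, b) taken from columns A and B, keep the elements
# of vec that satisfy a <= vec <= b. map2() always returns a list,
# so y becomes a list column with one numeric vector per row.
res <- dat %>%
  mutate(y = map2(A, B, function(a, b) vec[a <= vec & vec <= b]))

# Number of matching elements per row
lengths(res$y)
```

The per-row lengths should agree with the <dbl [n]> sizes shown in the printed tibble above, including the empty vector for the row with A = 1 and B = 7.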