Original Sample code:

v1 = c('Tom','Dick','Harry')
d1 <- data.frame(v1)

l1 <- c('20200101','20200202','20200303')
l2 <- c('20200101','20200202')
l3 <- c('20200101','20200202','20200303','20200404')
v2 = c(l1,l2,l3)
d1$v2 = v2
d2 <- d1

d2$len_v2 = c(3,2,4)

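Revised sample code that solves the Q1 problem. The issue was that I did not pass in a list of vectors.

v2 = c(l1,l2,l3) concatenates the three character vectors into a single vector of length 9, so assigning it to the 3-row data frame causes an error. Wrapping them in list() keeps one vector per row: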

library(tidyverse)
v1 <- c('Tom','Dick','Harry')
d1 <- data.frame(v1)

l1 <- c('20200101','20200202','20200303')
l2 <- c('20200101','20200202')
l3 <- c('20200101','20200202','20200303','20200404')
v2 <- list(l1,l2,l3)
d1$v2 <- v2
d2 <- d1 %>% mutate(len_v2 = lengths(v2))

I typically use tidyverse.

Q1 >> How do I calculate 'len_v2' as a variable in d1? Is there a rowwise operation to do this?

A1 >> SOLVED.

Q2 >> Are there ways to apply operations to the lists in the v2 variable, such as filtering, either resulting in a new list variable, or dropping rows that did not pass a condition on the v2 list variable?

For example, how would I return all rows that have 20200303 in v2, giving a resulting d3 with the first and third rows only?

Thanks!

tomPorter

1 Answer

If v2 is a list column, we can use lengths(), which returns the length of each list element

library(dplyr)
d2 %>%
   mutate(len_v2 = lengths(v2))

-output

#    v1                                     v2 len_v2
#1   Tom           20200101, 20200202, 20200303      3
#2  Dick                     20200101, 20200202      2
#3 Harry 20200101, 20200202, 20200303, 20200404      4
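Since the question also asks about a rowwise operation, here is a minimal sketch of that alternative, assuming dplyr 1.0.0 or later, where rowwise() lets mutate() see one row's list element at a time. The name d2_rowwise is only illustrative; lengths() as shown above is the simpler choice.

```r
library(dplyr)

# Rebuild the example data; v2 is a list column with one character vector per row
d1 <- data.frame(v1 = c('Tom', 'Dick', 'Harry'))
d1$v2 <- list(
  c('20200101', '20200202', '20200303'),
  c('20200101', '20200202'),
  c('20200101', '20200202', '20200303', '20200404')
)

# rowwise() evaluates mutate() once per row, so length(v2) sees a single vector
# d2_rowwise is a hypothetical name used only for this sketch
d2_rowwise <- d1 %>%
  rowwise() %>%
  mutate(len_v2 = length(v2)) %>%
  ungroup()
```

The ungroup() at the end drops the rowwise grouping so that later verbs operate on whole columns again.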

For filtering, we can use map_lgl to loop over the list, apply the logical expression to each element, and return a logical vector that filter uses to keep or drop rows

library(purrr)
d2 %>%
    filter(map_lgl(v2, ~ '20200303' %in% .x))

#     v1                                     v2 len_v2
#1   Tom           20200101, 20200202, 20200303      3
#2 Harry 20200101, 20200202, 20200303, 20200404      4

assuming that

v2 <- list(l1,l2,l3)

in the OP's code
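The other half of Q2, producing a new list variable rather than dropping rows, could be sketched with purrr::map, which returns a list and so keeps the result as a list column. The cutoff '20200202' and the column names v2_late and n_late are assumptions made up for illustration; the comparison relies on the dates being fixed-width digit strings, so string ordering matches date ordering.

```r
library(dplyr)
library(purrr)

# Same example data as in the question; v2 is a list column
d1 <- data.frame(v1 = c('Tom', 'Dick', 'Harry'))
d1$v2 <- list(
  c('20200101', '20200202', '20200303'),
  c('20200101', '20200202'),
  c('20200101', '20200202', '20200303', '20200404')
)

# map() filters inside each list element and returns a new list column
# v2_late and n_late are hypothetical names; '20200202' is an arbitrary cutoff
d3 <- d1 %>%
  mutate(v2_late = map(v2, ~ .x[.x >= '20200202']),
         n_late  = lengths(v2_late))
```

Here every row is kept and only the contents of each list element change; combining this with the filter(map_lgl(...)) call above would do both at once.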

akrun