R + dplyr nest + purrr - Evaluating a model across nestings within a dataframe row

Question

A reproducible example is difficult to provide here: there is no open-source data yet, and I'm not sure I'm allowed to share the data I have. I will do my best to explain the problem, and if that doesn't work, I can take some time to simulate some data later. Hopefully there's an easy solution, though...

Background

I am busy creating an R package for kinetic modelling in the field that I work in (https://github.com/mathesong/kinfitr). I am trying as best I can to make everything amenable to tidyverse tooling. However, there is one use case I can't figure out, because the model needs inputs that come from several sources with rather different structures, and these have to be brought together in a single model call.

In the README on the page, I present a solution for Reference Region models, where all inputs are of the same length and I can work with the following workflow:

data %>% 
  gather() %>%
  group_by() %>%
  do()

The Issue

However, for arterial models, the input arguments are as follows:

Brain kinetic data: times, values, weights - each a vector of the same length, in this case 38
Blood kinetic data: bloodinput - data frame of 4096 rows x 4 columns. For the sake of convenience, all models read this in as a data frame with all the information already interpolated.

Each of the models requires inputs of all three vectors, as well as the bloodinput data frame.

I currently have all the data stored in a list, with one element per measurement. Each element of the list contains (1) a data frame with the brain kinetic data (one column per brain region, let's say 3 regions) together with the times and weights, and (2) a data frame containing the bloodinput data. I create my final data frame as follows:

datdf <- map(dat, 'braindf') %>%  # Extract the brain data
  bind_rows(.id = "id") %>%   # Add an id column
  select(PET = id, Times = Times, Weights=weights, R1 = Region1, R2 = Region2, R3 = Region3) %>%  # Rename and select columns
  group_by(PET) %>%    # Group by each measurement
  nest() %>%    # Nest everything
  rename(braindata=data) %>%     # Rename
  mutate(Subjname = stringr::str_extract(....), # Add subject acronym
         PETNo = as.numeric(stringr::str_extract(....)), # Add measurement number
         bloodinput = map(dat, 'bloodinput'))  # Add blood input data frame as a nested column

This leaves me with the following

# A tibble: 6 × 5
     PET         braindata Subjname PETNo               bloodinput
   <chr>            <list>    <chr> <dbl>                   <list>
1 s1_1 <tibble [38 × 6]>     s1       1 <data.frame [4,096 × 4]>
2 s1_2 <tibble [38 × 6]>     s1       2 <data.frame [4,096 × 4]>
3 s2_1 <tibble [38 × 6]>     s2       1 <data.frame [4,096 × 4]>
4 s2_2 <tibble [38 × 6]>     s2       2 <data.frame [4,096 × 4]>
5 s3_1 <tibble [38 × 6]>     s3       1 <data.frame [4,096 × 4]>
6 s3_2 <tibble [38 × 6]>     s3       2 <data.frame [4,096 × 4]>

where each brain data contains the following:

head(datdf[1,]$braindata[[1]])

# A tibble: 6 × 6
  Times Weights       R1       R2       R3
  <dbl>   <dbl>    <dbl>    <dbl>    <dbl>
1     0     0   0.00E+00 0.00E+00 0.00E+00
2    22     0.3 1.12E-03 4.14E-03 4.78E-04
3    32     0.5 5.61E-01 4.08E-01 7.38E-01
4    42     0.7 4.53E+01 4.50E+01 5.61E+01
5    52     0.7 8.12E+01 8.07E+01 1.02E+02
6    62     0.9 1.03E+02 1.04E+02 1.31E+02

From this point, I cannot figure out how to fit the model for each row.

This is what I have tried:

R1_outcomes <- datdf %>%
  group_by(PET) %>%  # or rowwise()
  mutate(onetcmout = onetcm(t_tac=.$braindata[[1]]$Times/60,
                             tac=.$braindata[[1]]$R1,
                             input=.$bloodinput,
                             weights=.$braindata[[1]]$Weights))

R1_outcomes <- datdf %>%
  rowwise() %>%
  do(onetcmout = onetcm(t_tac=.$braindata[[1]]$Times/60,
                        tac=.$braindata[[1]]$R1,
                        input=.$bloodinput,
                        weights=.$braindata[[1]]$Weights))

I'm sure there's a way of doing this with the map functions, but I can't quite figure out how.

I would really appreciate any advice on how I might be able to do this. Thank you to anyone in advance!

I think your approach doesn't work because `braindata` is actually a list column, which breaks the `.$braindata` access. Have you tried mapping the model function, e.g. `datdf %>% mutate(onetcmout = purrr::map(braindata, onetcm))`? — davidski, Jan 13 '17 at 11:22
On second thought, you might need to wrap the `onetcm` function in your own function, which would unwrap the passed-in data frame `braindata` (alternatively, consider using `purrr::lift`). If you could post some simulated sample data, I can try to help you get this working. — davidski, Jan 13 '17 at 11:28
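A small illustration of one reason the `.$braindata[[1]]` attempt inside `mutate()` misbehaves, offered as a sketch and not as a full diagnosis: in a pipe, `.` stands for the whole piped-in data frame, so `.$data[[1]]` always picks the first row's nested tibble regardless of grouping. The `toy` tibble and its `id`, `data` and `x` names are made up for this example.

```r
library(dplyr)
library(purrr)
library(tibble)

# Hypothetical toy data: two rows, each holding a small nested tibble
toy <- tibble(
  id   = c("a", "b"),
  data = list(tibble(x = 1:3), tibble(x = 4:6))
)

# `.` is the whole piped-in data frame, so `.$data[[1]]` is row 1's tibble
# for every group, even after group_by()
via_dot <- toy %>%
  group_by(id) %>%
  mutate(first_x = .$data[[1]]$x[1])

# Mapping over the list column visits each row's own tibble
via_map <- toy %>%
  mutate(first_x = map_int(data, ~ .x$x[1]))

via_dot$first_x  # both rows report the value from the first nested tibble
via_map$first_x  # each row reports the value from its own nested tibble
```

Mapping over the list column is what lets each row see its own nested data, which is the idea behind the `map`-based suggestions in this thread.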
Thanks so much for the comments! I'll try to find some time this afternoon to simulate some data, otherwise I can try to do so over the weekend. Will get back to you asap! Thanks again! — Granville, Jan 13 '17 at 12:08
I'd guess `map2(braindata, bloodinput, ~onetcm(t_tac = .x$Times / 60, tac = .x$R1, input = .y, weights = .x$Weights))`. (within a `mutate` call, that is.) — Axeman, Jan 13 '17 at 12:09
@Axeman: that was it! That fixed everything! And it's done in so few lines and with so little fuss: super elegant solution. The solution was indeed to use `purrr::map2` to have the function call on both nested data frames simultaneously. Thank you so much! Do you want to write it as a proper answer so that I can mark it as answered, or should I just write up an edit containing the answer? I'm still a little new on here, and unsure of proper etiquette. Thanks again! — Granville, Jan 14 '17 at 23:54
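
For later readers, here is a minimal sketch of the `map2` approach on simulated data. Everything apart from the `map2` call pattern is an assumption made for illustration: `fake_model()` is a hypothetical stand-in that takes the same argument names as `onetcm` in the question but only returns simple summaries, and the simulated values and the blood column names (`Time`, `Blood`, `Plasma`, `ParentFraction`) are invented.

```r
library(dplyr)
library(purrr)
library(tibble)

# Hypothetical stand-in for onetcm(): same argument names as in the question,
# but it only summarises its inputs instead of fitting a kinetic model
fake_model <- function(t_tac, tac, input, weights) {
  list(
    weighted_mean_tac = weighted.mean(tac, weights),
    n_frames          = length(t_tac),
    n_blood           = nrow(input)
  )
}

set.seed(1)

# Simulated brain data: 38 frames with times, weights and three regions
make_brain <- function() {
  tibble(
    Times   = seq(0, by = 10, length.out = 38),
    Weights = runif(38),
    R1 = runif(38), R2 = runif(38), R3 = runif(38)
  )
}

# Simulated blood input: 4096 rows x 4 columns (column names are invented)
make_blood <- function() {
  data.frame(
    Time           = seq(0, 100, length.out = 4096),
    Blood          = runif(4096),
    Plasma         = runif(4096),
    ParentFraction = runif(4096)
  )
}

# One row per measurement, with both data sets as list columns
datdf <- tibble(
  PET        = c("s1_1", "s1_2"),
  braindata  = list(make_brain(), make_brain()),
  bloodinput = list(make_blood(), make_blood())
)

# map2() walks braindata (.x) and bloodinput (.y) in parallel, row by row
R1_outcomes <- datdf %>%
  mutate(onetcmout = map2(
    braindata, bloodinput,
    ~ fake_model(t_tac   = .x$Times / 60,
                 tac     = .x$R1,
                 input   = .y,
                 weights = .x$Weights)
  ))
```

The result keeps one row per measurement, with the model output stored in the `onetcmout` list column. Swapping `fake_model` for the real model function and changing `tac = .x$R1` to another region column should follow the same pattern.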
