How to use List of List of Dataframes

Question

I'm not sure whether this is possible, or how best to solve the following R problem.

Data / Background / Structure: I've collected a large dataset of project-based cooperation data, which maps specific projects to the participating companies (this can be understood as a bipartite edge list for social network analysis). For analytical reasons, it is advisable to split the whole dataset into subsets by location and time period. Therefore, I've created the following data structure:

sna.location.list 
[[1]]           (location1)
     [[1]]      (is a dataframe containing the bip. edge-list for time-period1)
     [[2]]      (is a dataframe containing the bip. edge-list for time-period2)
     ...
     [[20]]     (is a dataframe containing the bip. edge-list for time-period20)
[[2]]           (location2)
     ...         (same as 1)
 ...
[[32]]          (location32)
     ...

Every dataframe contains a project id and the corresponding company ids.

My goal now is to transform the bipartite edge lists into one-mode networks, then run some further SNA-related calculations (degree, centralization, status, community detection, etc.) and save the results.

I know how to do these calculation steps for one(!) specific network, but I'm struggling to automate the process for all of the networks in the described list structure at once, and to save the various outputs (node-level and network-level variables) in a similar structure.

I have already looked into several for-loop and apply approaches, but I still can't work out how to do this, and right now I feel quite helpless. Any help or suggestions would be highly appreciated. If you need more information or examples in order to give me a brief demo or code example showing how to tackle such a nested structure and run such SNA-related calculations/modifications for all of the aforementioned subsets in an efficient, automated way, please feel free to contact me.

score 1 · Answer 1 · answered Oct 28 '18 at 03:32

Let's say you have a function foo that you want to apply to each data frame. Those data frames are in lists, so lapply(that_list, foo) is what we want. But you've got a list of lists, so we actually want to lapply that first lapply across the outer list, hence lapply(that_list, lapply, foo). (The foo will be passed along to the inner lapply via the ... argument. If you wish to be more explicit, you can use an anonymous function instead: lapply(that_list, function(x) lapply(x, foo)).)

You haven't given a reproducible example, so I'll demonstrate by applying the nrow function to a nested list of built-in data frames:

d = list(
  list(mtcars, iris),
  list(airquality, faithful)
)

result = lapply(d, lapply, nrow)
result
# [[1]]
# [[1]][[1]]
# [1] 32
# 
# [[1]][[2]]
# [1] 150
# 
# 
# [[2]]
# [[2]][[1]]
# [1] 153
# 
# [[2]][[2]]
# [1] 272

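As you can see, the output is a list with the same structure. If you need names on the result, name the input lists: lapply carries the names over. (sapply with simplify = FALSE does the same, and additionally names the result when the input is a character vector.)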
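To illustrate how names behave, here is a small sketch using the same built-in data frames. The names loc1, loc2, p1 and p2 are made up for the example and stand in for locations and time periods:

```r
# Hypothetical names (loc1, p1, ...) added purely for illustration
d_named = list(
  loc1 = list(p1 = mtcars, p2 = iris),
  loc2 = list(p1 = airquality, p2 = faithful)
)

# Same nested call as before; names of both levels are carried over
result_named = lapply(d_named, lapply, nrow)

names(result_named)       # names of the outer list
names(result_named$loc1)  # names of an inner list
result_named$loc2$p1      # access a single result by name
```

With named lists, results can be looked up by location and period rather than by position.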

This covers applying functions to a nested list and saving the returns in a similar data structure. If you need help with calculation efficiency, parallelization, etc., I'd suggest asking a separate question focused on that, with a reproducible example.
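To connect this to the setting in the question, here is a rough sketch under stated assumptions. The column names project_id and company_id, the helper to_one_mode, and the toy data are all invented for illustration, and the projection is done with a base-R cross-product as a stand-in for whatever network package you actually use:

```r
# Hypothetical helper: project a bipartite edge list onto companies.
# Assumes columns named project_id and company_id (not from the question).
to_one_mode = function(edges) {
  # Incidence matrix: rows = projects, columns = companies
  inc = unclass(table(edges$project_id, edges$company_id))
  inc = (inc > 0) * 1          # numeric 0/1, ignoring duplicate rows
  adj = crossprod(inc)         # company x company: number of shared projects
  diag(adj) = 0                # drop self-ties
  adj
}

# Toy nested list: 2 "locations", each holding a list of "time periods"
toy = list(
  list(
    data.frame(project_id = c(1, 1, 2, 2), company_id = c("A", "B", "B", "C")),
    data.frame(project_id = c(1, 1, 1),    company_id = c("A", "B", "C"))
  ),
  list(
    data.frame(project_id = c(1, 1), company_id = c("X", "Y"))
  )
)

# Step 1: one-mode adjacency matrix for every data frame
one_mode = lapply(toy, lapply, to_one_mode)

# Step 2: a node-level measure (degree = number of distinct partners)
degree = lapply(one_mode, lapply, function(adj) rowSums(adj > 0))
```

Each step produces a list with the same nesting as the input, so one_mode[[i]][[j]] and degree[[i]][[j]] always refer to the same location and time period.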

Thank you very much, that really helped me understand! On the other hand, because I'm new to this, the apply family sometimes confuses me. So I first try to get the logic working in a normal for loop and then try out some of the apply functions. With that in mind, could this whole thing also be achieved with a nested loop like: for (i in seq_along(sna.location.list)) ... for (j in seq_along(sna.location.list[[i]])) ...? This seems more intuitive to me as a beginner, for example if I want to access and change attributes of the newly created igraph objects. — Mr.Morgan, Oct 28 '18 at 16:02
Yup, it sure could. If you need help with a `for` loop approach, I'd recommend (a) coming up with a toy example of your own that's fully reproducible, (b) attempting to solve it yourself with a for loop, and (c) if you have trouble, post a new question showing your reproducible example along with your best attempt. — Gregor Thomas, Oct 28 '18 at 18:00
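
For reference, a minimal sketch of what such a nested loop could look like, using the same built-in data frames and nrow as in the answer. The name result_loop is made up for the example:

```r
d = list(
  list(mtcars, iris),
  list(airquality, faithful)
)

# Pre-allocate the outer list, then fill it one inner list at a time
result_loop = vector("list", length(d))
for (i in seq_along(d)) {
  result_loop[[i]] = vector("list", length(d[[i]]))
  for (j in seq_along(d[[i]])) {
    # Replace nrow with whatever function you want to apply
    result_loop[[i]][[j]] = nrow(d[[i]][[j]])
  }
}

# Compare with the lapply version from the answer
identical(result_loop, lapply(d, lapply, nrow))
```

The loop indexes the outer list with i and the inner list with j, so the result keeps the same nesting as the input.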
