Return a list of data.frames created within a function in R

Question

I am trying to reduce duplicated code in a script and in order to do this I am creating some helper functions.

One function I am working on takes no arguments. Instead, it uses a data set already loaded into the global environment to create a few subsets and then returns those data.frames.

I have created a simple example below that doesn't do exactly what I am describing but will give an idea of how it is structured.

# Create function
my_func <- function(){
  a <- as.data.frame("ID" = c(1, 2, 3, 4, 5, 6), 
                     "TYPE" = c(1, 1, 2, 2, 3, 3), 
                     "CLASS" = c(1, 2, 3, 4, 5, 6))
  b <- as.data.frame("ID" = c(1, 2, 3, 4, 5, 6), 
                     "STATUS" = c(1, 1, 2, 2, 3, 3))
  return(list(a, b))
}

# Call to the function
list[a, b] <- my_func()

The issue I am having is not within the function, but rather when calling the function and trying to store the results. If I call the function like this:

my_func()

It prints the 2 data.frames as a list, however, when trying to assign them names it gives me the error that a does not exist. I am assuming I am just returning them incorrectly or trying to store them incorrectly.

Thanks!

UPDATE

For reference the reason I was trying to use this syntax is due to this post: How to assign from a function which returns more than one value?

Also, I was hoping to capture the return in 1 line instead of having to assign it individually.

For example, in this case it is easy enough to assign it as:

test <- my_func()
a <- test[[1]]; b <- test[[2]]

However, if I had a much longer list, this would get very tedious.

I'm not completely clear on what you're asking, but it seems like maybe you just want to change the `return` to `return(list(a = a, b = b))` — IceCreamToucan, Jun 27 '18 at 21:23
Either that or if you don't want to hard code it in the function, `setNames(my_func(),c("a","b"))`. — joran, Jun 27 '18 at 21:24
@Ryan I already tried it like that and it still gave me the same error. — , Jun 27 '18 at 21:25
The line giving the error is the call to the function. The problem is when trying to call those data frames within the list created, I must not be calling them correctly. As I said above if I call the function like `my_func()` it returns the list with no issues. — , Jun 27 '18 at 21:27
`list[a,b] <- something` won't work for several reasons: not only may `a` and `b` not exist in the global environment, but you also can't subset a function (`list`; the square brackets after `list` mean subsetting). You can assign the `my_func()` output to a variable, which will then be a list with 2 components (`a <- my_func()`; then `a[[1]]` will be what is called `a` within `my_func` and `a[[2]]` is the `b` from within `my_func`). — lebatsnok, Jun 27 '18 at 21:27
Yeah, I wasn't sure whether `list[a, b]` was an actual attempt at something or if it was just a generic placeholder, because that's not even valid R syntax. — joran, Jun 27 '18 at 21:30
Oddly enough I found that syntax in another post, so that is the reason I was using that. https://stackoverflow.com/questions/1826519/how-to-assign-from-a-function-which-returns-more-than-one-value — , Jun 27 '18 at 21:33
It might be a little hard to tell from how he wrote that answer, but that syntax depends on a custom function that he wrote (he linked to the definition) and that he subsequently added to one of his packages. But it's not a part of R itself. — joran, Jun 27 '18 at 21:38

score 3 · Accepted Answer · edited Jun 27 '18 at 21:47

The function as.data.frame() converts an existing object to a data frame. To build a data frame from column vectors, you need data.frame() instead. Quoting the column names is allowed but unnecessary, so the quotes can be dropped as well. Once as.data.frame() is replaced with data.frame(), the function works:

# Create function
my_func <- function(){
  a <- data.frame(ID = c(1, 2, 3, 4, 5, 6), 
                  TYPE = c(1, 1, 2, 2, 3, 3), 
                  CLASS = c(1, 2, 3, 4, 5, 6))
  b <- data.frame(ID = c(1, 2, 3, 4, 5, 6), 
                  STATUS = c(1, 1, 2, 2, 3, 3))
  return(list(a, b))
}

# Call to the function
test <- my_func()

R functions can only return a single value, so we join a and b into a list and return that. To access the data frames, you can select them by index:

test[[1]]  # returns data.frame 'a' (yes, indices in R start with 1)
test[[2]]  # returns data.frame 'b'
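
If indexing by position gets tedious for a longer list, one option is to return a named list and unpack it with base R's `list2env()`. This is only a sketch: the name `my_func_named` is hypothetical, chosen so it does not clash with `my_func` above, and it assumes that overwriting any existing objects called `a` and `b` is acceptable.

```r
# Hypothetical variant of my_func() that returns a *named* list
my_func_named <- function() {
  a <- data.frame(ID = c(1, 2, 3, 4, 5, 6),
                  TYPE = c(1, 1, 2, 2, 3, 3),
                  CLASS = c(1, 2, 3, 4, 5, 6))
  b <- data.frame(ID = c(1, 2, 3, 4, 5, 6),
                  STATUS = c(1, 1, 2, 2, 3, 3))
  list(a = a, b = b)
}

res <- my_func_named()
res$a        # access by name instead of by position
res[["b"]]   # equivalent double-bracket form

# Unpack every list element into a variable of the same name, in one line.
# Assumption: overwriting existing objects named a and b is acceptable.
invisible(list2env(my_func_named(), envir = globalenv()))
```

After the last line, `a` and `b` exist as separate data frames, however many elements the list has.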

Jimmy TwoCents · Answer 2 · 2020-08-17T14:48:08.080

Here is a solution for longer lists of data frames.

    my_func <- function(n) {
      # Build n data frames, one per loop iteration
      df_list <- list()
      for (i in seq_len(n)) {
        df_list[[i]] <- data.frame(ID = rep(i, n),
                                   sqrt = rep(sqrt(i), n),
                                   Class = rep(sample.int(i, 1), n))
      }
      # Return the list of data frames under the name my_df
      return(list(my_df = df_list))
    }

    output <- my_func(10)$my_df[[1]]  # first data frame in the list
    print(output)

If you want to put the dataframes "back together", you can loop through the list with the rbind function to return one frame. This is hopefully what you need. Let me know.
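
As a minimal sketch of putting the frames back together, `do.call()` can pass the whole list to `rbind()` in a single call instead of looping. The small list below is made-up illustration data, and the approach assumes that all data frames share the same column names.

```r
# Made-up example list of data frames with identical columns
df_list <- list(
  data.frame(ID = c(1, 1), Class = c(1, 1)),
  data.frame(ID = c(2, 2), Class = c(2, 2)),
  data.frame(ID = c(3, 3), Class = c(3, 3))
)

# rbind() receives every element of df_list as a separate argument
combined <- do.call(rbind, df_list)
print(combined)
```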
