I have a complex list structure in R that I'd like to put in a dataframe format. The data are below. The list contains eight columns of data on five different music albums. The number of rows in each list element is equal to the number of personnel who contributed to the album, which is my main interest.

What is an efficient way to transform the list into a dataframe with 12 rows and 8 columns? Ideally, I'd like to write a function that takes one argument (a list of any number of music albums with the same 8 columns) and returns a dataframe.

I've tried using flatten, unnest and map in various ways, but can't crack this one.

df<-list(NULL, structure(list(join = c("/", "/", "/", "/", "/", ""
), name = c("Art Tatum", "Lionel Hampton", "Harry Edison", "Buddy Rich",
"Red Callender", "Barney Kessel"), anv = c("", "", "", "", "",
""), tracks = c("", "", "", "", "", ""), role = c("", "", "",
"", "", ""), resource_url = c("https://api.discogs.com/artists/265634",
"https://api.discogs.com/artists/136133", "https://api.discogs.com/artists/258469",
"https://api.discogs.com/artists/57620", "https://api.discogs.com/artists/272014",
"https://api.discogs.com/artists/253476"), id = c(265634L, 136133L,
258469L, 57620L, 272014L, 253476L), `frset$title` = c("The Tatum Group Masterpieces, Vol. 5",
"The Tatum Group Masterpieces, Vol. 5", "The Tatum Group Masterpieces, Vol. 5",
"The Tatum Group Masterpieces, Vol. 5", "The Tatum Group Masterpieces, Vol. 5",
"The Tatum Group Masterpieces, Vol. 5")), class = "data.frame", row.names = c(NA,
6L)), structure(list(join = "", name = "Art Tatum", anv = "",
    tracks = "", role = "", resource_url = "https://api.discogs.com/artists/265634",
    id = 265634L, `frset$title` = "This Is...Art Tatum  - Vol 1"), class = "data.frame", row.names = 1L),
    structure(list(join = c("/", "/", ""), name = c("Oscar Peterson",
    "Erroll Garner", "Art Tatum"), anv = c("", "", ""), tracks = c("",
    "", ""), role = c("", "", ""), resource_url = c("https://api.discogs.com/artists/254394",
    "https://api.discogs.com/artists/262816", "https://api.discogs.com/artists/265634"
    ), id = c(254394L, 262816L, 265634L), `frset$title` = c("Great Jazz Pianists",
    "Great Jazz Pianists", "Great Jazz Pianists")), class = "data.frame", row.names = c(NA,
    3L)), structure(list(join = "", name = "Art Tatum", anv = "",
        tracks = "", role = "",
        resource_url = "https://api.discogs.com/artists/265634",
        id = 265634L, `frset$title` = "Art Tatum"), class = "data.frame", row.names = 1L),
    structure(list(join = "", name = "Art Tatum", anv = "", tracks = "",
        role = "", resource_url = "https://api.discogs.com/artists/265634",
        id = 265634L, `frset$title` = "Art!"), class = "data.frame", row.names = 1L))
df2<-unlist(lapply(df, function(x) if (length(x)==8) list(x) else x), recursive=FALSE) # remove NULL lists or those with wrong dimensions

albumdata<-list()
personnel<-function(df){
 for(i in 1:length(df))
  albumdata[[i]]<-as.data.frame(unlist(df[i+1],recursive=FALSE))
  return(albumdata)
  message("Album #",dim(albumdata[[2]])[1]) 
      }

This function returns a list again. Why?

Ben
  • I can't figure out why you thought it would return anything other than a list. (You do know that a dataframe is a list, right?) `albumdata` started out as a list and then you assigned something to it that might not have been a list, but that would not change the class of `albumdata`. I suspect you get no message, since that `message` call comes after the `return` statement. I was surprised you didn't get an error when you tried to `unlist(df[i+1], recursive=FALSE)`. – IRTFM Nov 04 '19 at 22:35
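The three points in the comment above can be checked with a small sketch. The names `album`, `f`, and `past_end` are made up for illustration; only the artist names and ids come from the question's data.

```r
# A data frame is a list of equal-length columns, so is.list() is TRUE for it.
album <- data.frame(name = c("Art Tatum", "Buddy Rich"),
                    id = c(265634L, 57620L))
frame_is_list <- is.list(album)

# Code placed after return() is never reached, so no message is emitted.
f <- function() {
  return("returned")
  message("never shown")
}
result <- f()

# Single-bracket indexing past the end of a list yields list(NULL),
# not an error, which is likely why df[i+1] ran silently.
albums <- list(album)
past_end <- albums[2]
```

Double-bracket indexing (`albums[[2]]`) would raise a "subscript out of bounds" error instead, which is probably what the commenter expected to see.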

1 Answer

After cleaning up the data, this worked fine:

jsonlite::rbind_pages(df2)
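
For the one-argument function the question asks for, a minimal base-R sketch might look like the following. The helper name `personnel_df` and the trimmed two-column sample are illustrative assumptions, not part of the answer above; the sample reuses names from the question but keeps only two of the eight columns.

```r
# Hypothetical helper: keep only the data frame elements, then stack them.
personnel_df <- function(albums) {
  albums <- Filter(is.data.frame, albums)    # drops NULL and other stray entries
  if (length(albums) == 0) return(data.frame())
  out <- do.call(rbind, albums)              # assumes identical column names
  rownames(out) <- NULL                      # reset to 1..n
  out
}

# Trimmed-down sample based on the question's data.
albums <- list(
  NULL,
  data.frame(name = c("Oscar Peterson", "Erroll Garner", "Art Tatum"),
             title = "Great Jazz Pianists",
             stringsAsFactors = FALSE),
  data.frame(name = "Art Tatum", title = "Art!",
             stringsAsFactors = FALSE)
)

result <- personnel_df(albums)
```

`do.call(rbind, ...)` is fine for a handful of albums; for long lists, a dedicated row-binding function such as the `jsonlite::rbind_pages` call shown above is likely to be faster.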
Carl Boneri