Converting a 1 level list with irregular items to a data frame

Question

I'm looking for a simple solution to the above. I seem to run into this problem frequently when APIs return JSON which is subsequently converted to a list.

Reprex data:

result_head <- list(
  list(name = "JEFFREY", gender = "male", probability = 1L, count = 932L),
  list(name = "Jan", gender = "male", probability = 0.6, count = 1663L),
  list(name = "Elquis", gender = NULL),
  list(name = "ELQUIS", gender = NULL),
  list(name = "Francisco", gender = "male", probability = 1L, count = 1513L)
)

The task is to convert this, as simply as possible, to a 5-row data frame. Because the items within each list element are irregular, NAs will need to be introduced for missing items, similar to how bind_rows works when stacking data frames with irregular columns.
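For reference, the bind_rows behaviour referred to here can be seen on two small tibbles with different columns. This is only an illustrative sketch: the tibbles `a` and `b` and the name `stacked` are made up, and it assumes dplyr is installed.

```r
library(dplyr)

# Two one-row tibbles with different sets of columns (illustrative data)
a <- tibble(name = "JEFFREY", gender = "male", count = 932L)
b <- tibble(name = "Elquis")

# bind_rows matches columns by name and fills the gaps with NA
stacked <- bind_rows(a, b)
stacked
```

The second row of `stacked` has NA in `gender` and `count`, which is the result wanted for the list elements that lack those items.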

What I've tried:

map_dfr(result_head, bind_rows)

do.call(bind_rows, result_head)

flatten(result_head)

bind_rows(flatten(result_head))

I asked a similar question here: Extracting to a data frame from a JSON generated multi-level list with occasional missing elements

... but the solution is totally over-engineered for a less complex list.

I'd like a solution that is as elegant as possible. I run into this kind of operation so often, and yet there doesn't seem to be a consistent way of doing it with few levels of function abstraction.

I realise questions around this may have already been asked and I may have missed something, but there doesn't seem to be a consistent and simple way of tackling what seems a common problem.

Thanks.

akrun · Accepted Answer · 2019-02-23T17:01:00.400

3

Here is another option: flatten each element inside map and convert it to a tibble.

library(tidyverse)
map_df(result_head, ~ flatten(.x) %>% as_tibble())
# A tibble: 5 x 4
#  name      gender probability count
#  <chr>     <chr>        <dbl> <int>
#1 JEFFREY   male           1     932
#2 Jan       male           0.6  1663
#3 Elquis    <NA>          NA      NA
#4 ELQUIS    <NA>          NA      NA
#5 Francisco male           1    1513

Or, as @G.Grothendieck mentioned in the comments:

map_dfr(result_head, flatten)

edited Feb 23 '19 at 17:01

answered Feb 23 '19 at 12:33

akrun


2

or just `library(purrr); map_dfr(result_head, flatten)` – G. Grothendieck Feb 23 '19 at 12:52
Thanks. Out of interest, why does the syntax of map call for .x instead of x? Likewise, the documentation mentions .x and .f. – nycrefugee Feb 23 '19 at 14:26
1

@nycrefugee It could be to prevent a clash with other identifiers named 'x'. It is less likely that an object is named `.x` – akrun Feb 23 '19 at 17:03
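To make the `.x` convention from the comments concrete: in purrr's formula shorthand, `.x` is the placeholder for the current element, so the two calls below should be equivalent. This is a minimal sketch; the input `1:3` and the names `add_one_formula` and `add_one_function` are made up for illustration.

```r
library(purrr)

# Formula shorthand: `.x` stands for the current element
add_one_formula <- map_dbl(1:3, ~ .x + 1)

# The same mapping written as an ordinary anonymous function
add_one_function <- map_dbl(1:3, function(x) x + 1)

# Both calls produce the same numeric vector
identical(add_one_formula, add_one_function)
```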

score 2 · Answer 2 · answered Feb 23 '19 at 12:20

library(purrr) # transpose and map_if
library(rlist) # list.stack

result_head <- list(
  list(name = "JEFFREY", gender = "male", probability = 1L, count = 932L), 
  list(name = "Jan", gender = "male", probability = 0.6, count = 1663L), 
  list(name = "Elquis", gender = NULL), 
  list(name = "ELQUIS", gender = NULL), 
  list(name = "Francisco", gender = "male", probability = 1L, count = 1513L)
)

list.stack(transpose(
  lapply(transpose(result_head), function(y) map_if(y, is.null, function(x) NA))
))

       name gender probability count
1   JEFFREY   male         1.0   932
2       Jan   male         0.6  1663
3    Elquis   <NA>          NA    NA
4    ELQUIS   <NA>          NA    NA
5 Francisco   male         1.0  1513
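If neither purrr nor rlist is available, the same NA-filling idea can be sketched in base R. This is not taken from either answer: the helper `get_field` and the names `fields`, `columns` and `df` are made up for illustration, and the sketch assumes every field holds at most one value per record.

```r
result_head <- list(
  list(name = "JEFFREY", gender = "male", probability = 1L, count = 932L),
  list(name = "Jan", gender = "male", probability = 0.6, count = 1663L),
  list(name = "Elquis", gender = NULL),
  list(name = "ELQUIS", gender = NULL),
  list(name = "Francisco", gender = "male", probability = 1L, count = 1513L)
)

# Every field name that appears in at least one record
fields <- unique(unlist(lapply(result_head, names)))

# Hypothetical helper: pull one field from one record,
# returning NA when the field is NULL or absent
get_field <- function(rec, field) {
  value <- rec[[field]]
  if (is.null(value)) NA else value
}

# Build one atomic vector per field, then assemble the data frame
columns <- lapply(fields, function(f) {
  unlist(lapply(result_head, get_field, field = f))
})
names(columns) <- fields
df <- as.data.frame(columns, stringsAsFactors = FALSE)
df
```

Building column by column lets `unlist` pick a common type for each field, so `probability` ends up as double and `count` stays integer, with NA where a record has no value.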
