I realize there are several questions on Stack Overflow that ask something similar, but I cannot seem to apply their answers to my specific problem. I'm trying to convert the following JSON-formatted data to a data frame. The data comes from the scraped Kickstarter datasets at: https://webrobots.io/kickstarter-datasets/

"{\"id\":704877813,\"name\":\"Wellmii\",\"is_registered\":null,\"chosen_currency\":null,\"avatar\":{\"thumb\":\"https://ksr-ugc.imgix.net/assets/022/981/694/75c6b5ca6616e3a3adaa295fcef9d318_original.png?ixlib=rb-1.1.0&w=40&h=40&fit=crop&v=1541445663&auto=format&frame=1&q=92&s=872ecbdca14ada8169b88c1794d29591\",\"small\":\"https://ksr-ugc.imgix.net/assets/022/981/694/75c6b5ca6616e3a3adaa295fcef9d318_original.png?ixlib=rb-1.1.0&w=160&h=160&fit=crop&v=1541445663&auto=format&frame=1&q=92&s=99039218188220e2690206b2b508b19f\",\"medium\":\"https://ksr-ugc.imgix.net/assets/022/981/694/75c6b5ca6616e3a3adaa295fcef9d318_original.png?ixlib=rb-1.1.0&w=160&h=160&fit=crop&v=1541445663&auto=format&frame=1&q=92&s=99039218188220e2690206b2b508b19f\"},\"urls\":{\"web\":{\"user\":\"https://www.kickstarter.com/profile/704877813\"},\"api\":{\"user\":\"https://api.kickstarter.com/v1/users/704877813?signature=1544762516.4e88d80e492ef75c79caff24e220b49c87d522c7\"}}}"

If I apply the following code to the data, I get a data frame where the "web" and "api" variables are tibbles. I just want the data in a regular data frame. How do I turn these two variables into ordinary data frame columns?

library(tidyverse)
library(jsonlite)

df <- data %>%

    # make json, then make list
    fromJSON() %>%

    # remove classification level
    purrr::flatten() %>%

    # turn nested lists into dataframes
    map_if(is_list, as_tibble) %>%

    # bind_cols needs tibbles to be in lists
    map_if(is_tibble, list) %>%

    # creates nested dataframe
    bind_cols()

The data frame should have the following variables: id, name, is_registered, chosen_currency, thumb, small, medium, web.user, api.user. The last two variables don't really need the .user suffix. "id" should have 704877813 as its value, name should have Wellmii, is_registered should be null or NA, etc. There are two larger sections in the data, "avatar" and "urls": the "avatar" section holds the thumb, small, and medium variables, and the "urls" section holds the web.user and api.user variables.

user8229029

1 Answer

I'm not sure the map_if calls are necessary, but you can use unnest to turn the list columns into standard vectors. This approach won't work if the list columns end up with different dimensions. In that case, you should extract the fields you need directly.

library(tidyverse)
library(jsonlite)

data <- "{\"id\":704877813,\"name\":\"Wellmii\",\"is_registered\":null,\"chosen_currency\":null,\"avatar\":{\"thumb\":\"https://ksr-ugc.imgix.net/assets/022/981/694/75c6b5ca6616e3a3adaa295fcef9d318_original.png?ixlib=rb-1.1.0&w=40&h=40&fit=crop&v=1541445663&auto=format&frame=1&q=92&s=872ecbdca14ada8169b88c1794d29591\",\"small\":\"https://ksr-ugc.imgix.net/assets/022/981/694/75c6b5ca6616e3a3adaa295fcef9d318_original.png?ixlib=rb-1.1.0&w=160&h=160&fit=crop&v=1541445663&auto=format&frame=1&q=92&s=99039218188220e2690206b2b508b19f\",\"medium\":\"https://ksr-ugc.imgix.net/assets/022/981/694/75c6b5ca6616e3a3adaa295fcef9d318_original.png?ixlib=rb-1.1.0&w=160&h=160&fit=crop&v=1541445663&auto=format&frame=1&q=92&s=99039218188220e2690206b2b508b19f\"},\"urls\":{\"web\":{\"user\":\"https://www.kickstarter.com/profile/704877813\"},\"api\":{\"user\":\"https://api.kickstarter.com/v1/users/704877813?signature=1544762516.4e88d80e492ef75c79caff24e220b49c87d522c7\"}}}"

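data %>%
 fromJSON() %>%        # parse the JSON string into a nested list
 purrr::flatten() %>%  # remove one level of nesting (avatar, urls)
 bind_rows() %>%       # build a one-row tibble; web and api are list columns
 unnest()              # expand the list columns into ordinary columns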

#> # A tibble: 1 x 7
#>        id name   thumb        small       medium       web      api        
#>     <int> <chr>  <chr>        <chr>       <chr>        <chr>    <chr>      
#> 1  7.05e8 Wellm… https://ksr… https://ks… https://ksr… https:/… https://ap…

Created on 2018-12-27 by the reprex package (v0.2.1)
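
For the case where the fields have to be extracted directly, a minimal base R sketch might look like the following. It assumes the same JSON structure as in the question. The avatar and API URLs are shortened placeholders to keep the example readable, and null_to_na is a hypothetical helper (not part of jsonlite) that keeps the two null fields as NA columns. Those two fields do not appear in the unnest output above.

```r
library(jsonlite)

# Shortened stand-in for the JSON string in the question.
# Assumption: the avatar and API URLs are placeholders, not the real values.
data <- '{"id":704877813,"name":"Wellmii","is_registered":null,"chosen_currency":null,
  "avatar":{"thumb":"https://example.com/thumb.png",
            "small":"https://example.com/small.png",
            "medium":"https://example.com/medium.png"},
  "urls":{"web":{"user":"https://www.kickstarter.com/profile/704877813"},
          "api":{"user":"https://api.kickstarter.com/v1/users/704877813"}}}'

# Parse the JSON object into a nested named list
parsed <- fromJSON(data)

# Hypothetical helper: a JSON null arrives as NULL in R; replace it with NA
# so the field can still become a data frame column
null_to_na <- function(x) if (is.null(x)) NA else x

# Pick each field explicitly and give it the column name we want
df <- data.frame(
  id              = parsed$id,
  name            = parsed$name,
  is_registered   = null_to_na(parsed$is_registered),
  chosen_currency = null_to_na(parsed$chosen_currency),
  thumb           = parsed$avatar$thumb,
  small           = parsed$avatar$small,
  medium          = parsed$avatar$medium,
  web             = parsed$urls$web$user,
  api             = parsed$urls$api$user,
  stringsAsFactors = FALSE
)

str(df)
```

Naming each column explicitly is more verbose than the pipeline, but it does not depend on every nested element having the same shape, and it gives direct control over the column names (for example, web instead of web.user).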

Jake Kaupp