
My JSON file is being read into R as a list by jsonlite::read_json().

To recreate my problem, save the code below as a .json file from any text editor. The file can then be read into R.

{
"data": [
{
"type": "invite",
"send_date": "2018-05-01"
},
{
"type": "reminder",
"send_date": "2018-05-03",
"tokens": {
"email_subject": "REMINDER: Franchise Exit Survey"
}
},
{
"type": "reminder",
"send_date": "2018-05-07",
"tokens": {
"email_subject": "REMINDER: Franchise Exit Survey"
}
}
],
"relationships": {
"invitee": {
"data": {
"id": "b292aa38"
}
}
}
}

You can then read the JSON file into R:

library(jsonlite)
library(dplyr)
library(readr)

file_json <- "json_saved_from_text_editor.json"

l_json <- read_json(file_json, simplifyVector = TRUE) 

# to view the data.frame portion of l_json whose third column is itself a data.frame:
l_json[[1]]

The first element of this list is of class data.frame, and its third column is also of class data.frame. I have worked with list columns in tibbles, but have never encountered a data.frame with a column of class data.frame. Importantly, this data.frame column behaves very differently from any other column class I've encountered: it cannot be unnested, and its values are sensitive to the dimensions of the entire data.frame.
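For anyone who wants to see the structure without saving a file, here is a minimal sketch that parses a trimmed, inline copy of the JSON with jsonlite::fromJSON(). The names txt and l_demo are made up for this illustration, and it assumes fromJSON()'s default simplification behaves the same way as read_json(simplifyVector = TRUE) does on the file.

```r
library(jsonlite)

# Inline copy of the "data" array from the JSON above (trimmed to two records)
txt <- '{"data": [
  {"type": "invite", "send_date": "2018-05-01"},
  {"type": "reminder", "send_date": "2018-05-03",
   "tokens": {"email_subject": "REMINDER: Franchise Exit Survey"}}
]}'

# fromJSON() simplifies arrays of records into data frames by default
l_demo <- fromJSON(txt)

class(l_demo$data)          # outer object is a data.frame
class(l_demo$data$tokens)   # the tokens column is itself a data.frame
names(l_demo$data$tokens)   # its only column is email_subject
str(l_demo$data)            # shows the nesting in one view
```

The nested object "tokens" in each record is what becomes the data.frame column; the record without "tokens" gets NA in that nested column.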

Is there a way to manipulate, create or just avoid this data.frame class of column?

My ultimate goal is to be able to recreate this small json file from a dataframe. But I can't figure out how to manipulate or create these data.frame columns.

SymbolixAU
Joe

1 Answer


Your JSON is nested in a couple of places, and you need to deal with each of them. Here df is the list returned by read_json() (l_json in the question). For convenience, I'm saving the actual data from df$data as df_data. It has a column tokens that is itself a data frame with one column, email_subject. If you run df_data %>% pull(tokens) %>% pull(email_subject), you'll get the vector of email subject lines, which you can put into a new data frame.

# df is assumed to be the list returned by read_json() (l_json in the question)
df <- l_json
df_data <- df$data

df_fix <- bind_cols(
    df_data %>% select(type, send_date),
    email_subject = df_data %>% pull(tokens) %>% pull(email_subject)
)

The output then looks like this:

      type  send_date                   email_subject
  invite   2018-05-01                            <NA>
  reminder 2018-05-03 REMINDER: Franchise Exit Survey
  reminder 2018-05-07 REMINDER: Franchise Exit Survey
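As an alternative sketch, jsonlite also ships a flatten() function that expands nested data frame columns into ordinary columns, and a data frame can be assigned back as a single column to rebuild the nested shape before calling toJSON(). The names txt, l_demo, flat and rebuilt below are made up for this example, and I have only looked at the "data" part of the file, so treat it as a starting point rather than a full round trip.

```r
library(jsonlite)

# Inline copy of the "data" array from the question
txt <- '{"data": [
  {"type": "invite", "send_date": "2018-05-01"},
  {"type": "reminder", "send_date": "2018-05-03",
   "tokens": {"email_subject": "REMINDER: Franchise Exit Survey"}},
  {"type": "reminder", "send_date": "2018-05-07",
   "tokens": {"email_subject": "REMINDER: Franchise Exit Survey"}}
]}'

l_demo <- fromJSON(txt)

# Expand the nested data frame column into a regular column.
# The jsonlite:: prefix avoids a clash with purrr::flatten() if purrr is loaded.
flat <- jsonlite::flatten(l_demo$data)
names(flat)   # the nested column becomes tokens.email_subject

# Going the other way: start from flat columns and assign a whole
# data frame as one column to recreate the nested structure.
rebuilt <- flat[, c("type", "send_date")]
rebuilt$tokens <- data.frame(
  email_subject = flat$tokens.email_subject,
  stringsAsFactors = FALSE
)

# Serialize back to JSON; the nested data frame becomes a nested object
toJSON(list(data = rebuilt), pretty = TRUE)
```

One thing to check in the output: the invite row has NA for email_subject, so its "tokens" entry may come out as an empty object rather than being omitted as in the original file.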
camille