
I want to create a JSON object from a data frame. The resulting JSON object (following HL7 FHIR version 4.0.1) should look something like this:

{
  "resourceType": "Patient",
  "name": [
    { "family": "Simpson", "given": [ "Homer", "Jay" ] }
     ],
  "gender": "male",
  "birthDate": "1956-5-12"
}

The "name" element is an array (ordered list) of objects. Each object has two fields: "family", a string, and "given", an array containing one or more strings.

If I try to create this JSON from a data frame, I run into some issues.

Using base R, I tried to incorporate a list into the data frame by protecting it with I(), which didn't work for nested lists.

library(jsonlite)


> # Base R
> patient_df <- data.frame(
+   resourceType = "Patient",
+   name = I(list(family= "Simpson", given = I(list("Homer", "Jay")))),
+   gender = "male",
+   birthDate = "1956-5-12"
+ )

> patient_df
       resourceType       name gender birthDate
family      Patient    Simpson   male 1956-5-12
given       Patient Homer, Jay   male 1956-5-12

This creates a data frame with two rows, so I tried calling toJSON as follows:

> toJSON(patient_df,dataframe="columns",pretty = T)
{
  "resourceType": ["Patient", "Patient"],
  "name": [
    ["Simpson"],
    [
      ["Homer"],
      ["Jay"]
    ]
  ],
  "gender": ["male", "male"],
  "birthDate": ["1956-5-12", "1956-5-12"],
  "_row": ["family", "given"]
} 

This creates a separate array for each value, repeats the other values once per row, and adds a "_row" element.

I also tried tibbles, since they can hold more complex structures in their columns:

library(tidyverse)
> patient_df_tidy <-tibble(
+   resourceType = "Patient",
+   name = tibble(family= "Simpson", given = tibble("Homer", "Jay")),
+   gender = "male",
+   birthDate = "1956-5-12"
+ )

> patient_df_tidy
# A tibble: 1 × 4
  resourceType name$family $given$`"Homer"` $$`"Jay"` gender birthDate
  <chr>        <chr>       <chr>            <chr>     <chr>  <chr>    
1 Patient      Simpson     Homer            Jay       male   1956-5-12

This gives me a single row, which is better. However, I still don't get the JSON right:

> toJSON(patient_df_tidy,
+        auto_unbox = T,
+        pretty = T)

[
  {
    "resourceType": "Patient",
    "name": {
      "family": "Simpson",
      "given": {
        "\"Homer\"": "Homer",
        "\"Jay\"": "Jay"
      }
    },
    "gender": "male",
    "birthDate": "1956-5-12"
  }
] 

It preserves the names of the list items, but it is still not structured correctly: "given" comes out as an object instead of an array of strings, and "name" is a single object rather than an array.

user12256545

1 Answer


Rather than using a tibble on the inside, use a `list`. For example:

patient_df_tidy <- tibble(
  resourceType = "Patient",
  name = list(list(family = "Simpson", given = list("Homer", "Jay"))),
  gender = "male",
  birthDate = "1956-5-12"
)

jsonlite::toJSON(patient_df_tidy,
                 auto_unbox = TRUE,
                 pretty = TRUE)

will produce

[
  {
    "resourceType": "Patient",
    "name": {
      "family": "Simpson",
      "given": [
        "Homer",
        "Jay"
      ]
    },
    "gender": "male",
    "birthDate": "1956-5-12"
  }
] 
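The double `list(list(...))` matters here: the outer list is the column itself (one element per row) and the inner named list is the value stored in that row. A minimal sketch to check this, assuming only the tibble package is installed and using a shortened version of the same data:

```r
library(tibble)

# Outer list() = the list column (one element per row);
# inner named list = the value held in that single row.
patient_df_tidy <- tibble(
  resourceType = "Patient",
  name = list(list(family = "Simpson", given = list("Homer", "Jay")))
)

nrow(patient_df_tidy)             # number of rows in the tibble
class(patient_df_tidy$name)       # class of the name column
names(patient_df_tidy$name[[1]])  # field names stored in the first row
```

Because the column stays a list with one entry, the tibble keeps a single row instead of spreading "family" and "given" across two rows.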

If you also need to get rid of the outer [], convert the tibble to a list first:

jsonlite::toJSON(as.list(patient_df_tidy),
                auto_unbox = T,
                pretty = T)
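
If the data frame is not needed for anything else, the same structure can also be written directly as a nested list. This is only a sketch of that alternative, not something the question requires; the variable names `patient`, `json` and `parsed` are made up for the example. The unnamed list around the name entry is what should turn "name" into an array of objects, and keeping "given" as a `list()` rather than a character vector avoids `auto_unbox` collapsing a single given name into a plain string.

```r
library(jsonlite)

# Named lists become JSON objects; unnamed lists become JSON arrays.
patient <- list(
  resourceType = "Patient",
  name = list(
    list(family = "Simpson", given = list("Homer", "Jay"))
  ),
  gender = "male",
  birthDate = "1956-5-12"
)

json <- toJSON(patient, auto_unbox = TRUE, pretty = TRUE)
json

# Round-trip without simplification to inspect the structure.
parsed <- fromJSON(json, simplifyVector = FALSE)
parsed$name[[1]]$family
length(parsed$name[[1]]$given)
```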
MrFlick