23

I'm trying to figure out the best way of producing a JSON file from R. I have the following dataframe tmp in R.

> tmp
  gender age welcoming proud tidy unique
1      1  30         4     4    4      4
2      2  34         4     2    4      4
3      1  34         5     3    4      5
4      2  33         2     3    2      4
5      2  28         4     3    4      4
6      2  26         3     2    4      3

  
tmp <- data.frame(
  gender = c(1L, 2L, 1L, 2L, 2L, 2L),
  age = c(30, 34, 34, 33, 28, 26),
  welcoming = c(4L, 4L, 5L, 2L, 4L, 3L),
  proud = c(4L, 2L, 3L, 3L, 3L, 2L),
  tidy = c(4L, 4L, 4L, 2L, 4L, 4L),
  unique = c(4L, 4L, 5L, 4L, 4L, 3L)
)

Using the rjson package, I run the line toJSON(tmp) which produces the following JSON file:

 {"gender":[1,2,1,2,2,2],
 "age":[30,34,34,33,28,26],
 "welcoming":[4,4,5,2,4,3],
 "proud":[4,2,3,3,3,2],
  "tidy":[4,4,4,2,4,4],
  "unique":[4,4,5,4,4,3]}

I also experimented with the RJSONIO package; the output of toJSON() was the same. What I would like to produce is the following structure:

  {"traits":["gender","age","welcoming","proud", "tidy", "unique"],
   "values":[   
            {"gender":1,"age":30,"welcoming":4,"proud":4,"tidy":4, "unique":4},
            {"gender":2,"age":34,"welcoming":4,"proud":2,"tidy":4, "unique":4},
            ....
            ]

I'm not sure how best to do this. I realize that I can parse it line by line using python but I feel like there is probably a better way of doing this. I also realize that my data structure in R does not reflect the meta-information desired in my JSON file (specifically the traits line), but I am mainly interested in producing the data formatted like the line

{"gender":1,"age":30,"welcoming":4,"proud":4,"tidy":4, "unique":4}

as I can manually add the first line.


EDIT: I found a useful blog post where the author dealt with a similar problem and provided a solution. This function produces a formatted JSON file from a data frame.

toJSONarray <- function(dtf){
clnms <- colnames(dtf)

name.value <- function(i){
quote <- '';
# if(class(dtf[, i])!='numeric'){
if(class(dtf[, i])!='numeric' && class(dtf[, i])!= 'integer'){ # I modified this line so integers are also not enclosed in quotes
quote <- '"';
}

paste('"', i, '" : ', quote, dtf[,i], quote, sep='')
}

objs <- apply(sapply(clnms, name.value), 1, function(x){paste(x, collapse=', ')})
objs <- paste('{', objs, '}')

# res <- paste('[', paste(objs, collapse=', '), ']')
res <- paste('[', paste(objs, collapse=',\n'), ']') # added newline for formatting output

return(res)
}
moodymudskipper
  • 46,417
  • 11
  • 121
  • 167
djq
  • 14,810
  • 45
  • 122
  • 157
  • Years have gone on since this question was asked. May I suggest looking at library(jsonlite) - somebody else provided the answer below. It doesn't require the apply() transforms and directly produces the expected JSON output. – ripvlan Mar 24 '16 at 18:24
  • 1
    @ripvlan - marked jsonlite as answer – djq Mar 03 '17 at 09:39

5 Answers5

18

Using the package jsonlite:

> jsonlite::toJSON(list(traits = names(tmp), values = tmp), pretty = TRUE)
{
  "traits": ["gender", "age", "welcoming", "proud", "tidy", "unique"],
  "values": [
    {
      "gender": 1,
      "age": 30,
      "welcoming": 4,
      "proud": 4,
      "tidy": 4,
      "unique": 4
    },
    {
      "gender": 2,
      "age": 34,
      "welcoming": 4,
      "proud": 2,
      "tidy": 4,
      "unique": 4
    },
    {
      "gender": 1,
      "age": 34,
      "welcoming": 5,
      "proud": 3,
      "tidy": 4,
      "unique": 5
    },
    {
      "gender": 2,
      "age": 33,
      "welcoming": 2,
      "proud": 3,
      "tidy": 2,
      "unique": 4
    },
    {
      "gender": 2,
      "age": 28,
      "welcoming": 4,
      "proud": 3,
      "tidy": 4,
      "unique": 4
    },
    {
      "gender": 2,
      "age": 26,
      "welcoming": 3,
      "proud": 2,
      "tidy": 4,
      "unique": 3
    }
  ]
} 
Alex
  • 15,186
  • 15
  • 73
  • 127
  • 1
    Hi Alex - I liked this answer the best. jsonlite understands data.frame and outputs without the alply()/apply() transform as described in @civilstat answer above. Also - I ran into an issue with RJSONIO that I haven't been able to resolve that causes all values to be arrays of 1. jsonlite didn't have this problem. – ripvlan Mar 24 '16 at 18:17
  • thanks for the feedback, I am glad you found it useful. – Alex Mar 25 '16 at 09:24
14

Building upon Andrie's idea with apply, you can get exactly what you want by modifying the tmp variable before calling toJSON.

library(RJSONIO)
modified <- list(
  traits = colnames(tmp),
  values = unname(apply(tmp, 1, function(x) as.data.frame(t(x))))
)
cat(toJSON(modified))
Richie Cotton
  • 118,240
  • 47
  • 247
  • 360
  • 1
    One problem I have had with both approaches (`rjson` and `RJSONIO`) is that they convert all numbers to strings. For example `2` becomes `"2"`. The code snippet I pasted in as part of my question avoids this problem as it checks for `numeric` and `integer`. Perhaps I should add it as an answer. – djq Nov 28 '11 at 15:21
  • @celenius: My solution applied to your variable `tmp` has integer integers, not converted-to-string integers. Are you sure you're not doing something silly? – Richie Cotton Nov 28 '11 at 16:18
  • @RichieCotton hmm. I must have been doing something silly - it was definitely returning integers as strings when I tried it first. No idea why, as when I ran it just now it worked fine (with integers as integers). I hadn't overwritten the `toJSON` function. – djq Nov 28 '11 at 16:58
9

Building further on Andrie and Richie's ideas, use alply instead of apply to avoid converting numbers to characters:

library(RJSONIO)
library(plyr)
modified <- list(
  traits = colnames(tmp),
  values = unname(alply(tmp, 1, identity))
)
cat(toJSON(modified))

plyr's alply is similar to apply but returns a list automatically; whereas without the more complicated function inside Richie Cotton's answer, apply would return a vector or array. And those extra steps, including t, mean that if your dataset has any non-numeric columns, the numbers will get converted to strings. So use of alply avoids that concern.

For example, take your tmp dataset and add

tmp$grade <- c("A","B","C","D","E","F")

Then compare this code (with alply) vs the other example (with apply).

civilstat
  • 416
  • 5
  • 9
4

It seems to me you can do this by sending each row of your data.frame to JSON with the appropriate apply statement.

For a single row:

library(RJSONIO)

> x <- toJSON(tmp[1, ])
> cat(x)
{
 "gender": 1,
"age":     30,
"welcoming": 4,
"proud": 4,
"tidy": 4,
"unique": 4 
}

The entire data.frame:

x <- apply(tmp, 1, toJSON)
cat(x)
{
 "gender": 1,
"age":     30,
"welcoming": 4,
"proud": 4,
"tidy": 4,
"unique": 4 
} {

...

} {
 "gender": 2,
"age":     26,
"welcoming": 3,
"proud": 2,
"tidy": 4,
"unique": 3 
}
Andrie
  • 176,377
  • 47
  • 447
  • 496
  • Thanks @Andrie, that's a great idea. I'm trying to figure out how to insert a `,` at the end of each line using something like this: `x <- apply(tmp, 1, function(tmp){paste(toJSON, ',')})`. Is there a way to append something to what is returned from `toJSON`? – djq Nov 27 '11 at 23:48
  • Just figured it out: `x <- apply(tmp, 1, function(tmp){paste(toJSON(tmp), ',')})`. Still getting the hang of `apply`! – djq Nov 27 '11 at 23:49
  • @celenius: That looks suspiciously like you are trying to write bits of JSON yourself, which will only end in tears. Better to manipulate the data into a form where calling `toJSON` gives you what you want. See my answer. – Richie Cotton Nov 28 '11 at 13:19
  • @RichieCotton I was just adding a `,` between `{...}{...}`. The output that was produced was not valid JSON. I agree in general though, it's probably not a robust way of doing it. – djq Nov 28 '11 at 15:22
  • The library RJSONIO gave me an issue that I haven't resolved yet. All values are output as array of 1. e.g. { "gender" : [ 2], "age" : [33] }.... However, jsonlite::toJSON() produced the expected output. { "gender" : 2, "age" : 33 }. I created by data via ddply(..summarise..) - but I also tried using a simple data.frame(a=c(1,2,3), b=("a","b","c")) and had the same result. – ripvlan Mar 24 '16 at 18:21
2

Another option is to use the split to split your data.frame with N rows into N data.frames with 1 row.

library(RJSONIO)
modified <- list(
   traits = colnames(tmp),
   values = split(tmp, seq_len(nrow(tmp)))
)
cat(toJSON(modified))
edwindj
  • 926
  • 7
  • 5