3

I have data.frame

df <- data.frame(a = c(1,3),b = c(2,4))

  a b
1 1 2
2 3 NA

and I want to receive a data.frame like this:

  a  b           json
1 1  2 {"a":1, "b":2}
2 3 NA        {"a":3}

I wonder if there is a way to get this result efficiently with

df <- df %>% dplyr::mutate(json = ?())

without pasting values myself. In Postgres there is a function json_strip_nulls(row_to_json(*)) to get this. Is there any equivalent in R?

thmschk
  • 614
  • 3
  • 16

2 Answers2

3

You can do:

library(jsonlite)
library(dplyr)

df <- data.frame(a = c(1,3),b = c(2,NA))

df %>%
  rowwise() %>%
  mutate(json = toJSON(across())) %>%
  ungroup()

# A tibble: 2 x 3
      a     b json           
  <dbl> <dbl> <json>         
1     1     2 [{"a":1,"b":2}]
2     3    NA [{"a":3}]  
Ritchie Sacramento
  • 29,890
  • 4
  • 48
  • 56
2

stream_out line by line using the awesome jsonlite package:

library(jsonlite)

df <- data.frame(a = c(1,3),b = c(2,NA))

tc <- textConnection("jsontxt", "w")
stream_out(df, con=tc)
df$json <- jsontxt
close(tc)

df
##  a  b          json
##1 1  2 {"a":1,"b":2}
##2 3 NA       {"a":3}

Should be much more efficient than looping by row inside of R:

df <- data.frame(a = c(1,3),b = c(2,NA))
df <- df[rep(1:2, 10000),]
rownames(df) <- NULL

system.time({
  tc <- textConnection("jsontxt", "w")
  stream_out(df, con=tc)
  df$json <- jsontxt
  close(tc)
})
##Complete! Processed total of 20000 rows.
##   user  system elapsed 
##   0.78    0.00    0.78 

library(dplyr)
system.time({
df %>%
  rowwise() %>%
  mutate(json = toJSON(across())) %>%
  ungroup()
})
##   user  system elapsed 
##  28.36    0.24   28.61 
thelatemail
  • 91,185
  • 12
  • 128
  • 188