For my PhD project I need to read a JSONL file into R (the extension is .jsonl, not .json) and convert it to a CSV. I tried this code based on jsonlite, but it gives me an error:

library(jsonlite)
data <- "/Users/chiarap/tweets.jsonl"
dat <- fromJSON(sprintf("[%s]", paste(readLines(data), collapse=",")))
Error: parse error: unallowed token at this point in JSON text
          EFEF","notifications":null}},,,{"in_reply_to_status_id_str":
                     (right here) ------^

Thanks

Rentrop
Chiara
  • First time I've heard of `jsonl`, but I guess it is just standard `json` in which each line is a separate `json` document. Try calling `fromJSON` on each line of the file. For instance, `content<-readLines(data)` and then `fromJSON(content[i])`, where `i` ranges from 1 to the number of lines in the file. – nicola Jan 26 '16 at 15:03
  • @nicola thank you very much, it works. Before writing here I tried to combine the readLines function with fromJSON, but I couldn’t find any example of how to write the script correctly. This was very helpful. – Chiara Jan 27 '16 at 18:36

1 Answer

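If you have a large file, pasting all of the lines together into one string may result in errors, and any blank line in the file leaves an empty element between commas (the `,,,` in your error message suggests this is what happened). You can process each line separately and then combine them into a data frame.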

library(jsonlite)
library(dplyr)

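lines <- readLines("myfile.jsonl")
lines <- lines[nzchar(trimws(lines))]  # drop blank lines, which fromJSON cannot parse
lines <- lapply(lines, fromJSON)       # parse each line as its own JSON document
lines <- lapply(lines, unlist)         # flatten nested fields into named vectors
x <- bind_rows(lines)                  # one row per line; missing fields become NA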

Then x is a data frame that you can continue to work with or write to file.
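Since the question asks for a CSV at the end, here is a minimal sketch of the full round trip using jsonlite's `stream_in()`, which is designed for newline-delimited JSON. The sample records, the temporary file paths, and the names `tweets` and `flat` are made up for illustration; substitute your own file. It assumes every field is a scalar or a nested object; fields that hold arrays would remain list columns after `flatten()` and would need extra handling before `write.csv()`.

```r
library(jsonlite)

# Hypothetical sample file standing in for the real .jsonl file
path <- tempfile(fileext = ".jsonl")
writeLines(c(
  '{"id": 1, "text": "hello", "user": {"name": "a"}}',
  '',
  '{"id": 2, "text": "world", "user": {"name": "b"}}'
), path)

# Read the file and drop blank lines before parsing
lines <- readLines(path)
lines <- lines[nzchar(trimws(lines))]

# Parse one JSON record per line into a data frame
tweets <- stream_in(textConnection(lines), verbose = FALSE)

# Turn nested data frame columns (here: user) into flat columns (user.name)
flat <- flatten(tweets)

# Write the flat data frame to CSV
csv_path <- tempfile(fileext = ".csv")
write.csv(flat, csv_path, row.names = FALSE)
```

Compared with the `unlist` approach, this keeps column types (numbers stay numeric) instead of coercing everything in a record to character.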

cmaimone