I have a 10 GB JSON file containing tweets, and I'm trying to load it into R using the `stream_in` function from the jsonlite package. I want to load only a certain number of rows, not the whole file. Is there a way to do this?
-
If *"some number"* is the first `n` lines, then `stream_in(paste(readLines(fname,n=50),collapse="\n"))` should work. Otherwise, your best bet is likely to use an external tool (since I'm assuming you are attempting to reduce R's memory usage). Thoughts: `sed -ne '5,15p' fname` gives you lines 5-15 (and is very good with memory and large files). – r2evans May 15 '18 at 19:25
-
Yes, the first few lines would be fine. However, `stream_in` requires its argument to be a _connection_, so this currently gives an error. – Nata May 15 '18 at 20:04
-
Sorry, untested at the time. This works for me: `stream_in(textConnection(readLines(fname,n=5)))`. – r2evans May 15 '18 at 20:10
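For reference, the `textConnection` approach above can be wrapped in a small function. This is only a sketch: `read_first_n` is a hypothetical helper name, and it assumes the file is newline-delimited JSON (one tweet per line), which is the format `stream_in` expects. The demo writes a tiny temporary file so that it runs on its own.

```r
library(jsonlite)

# Hypothetical helper (not part of jsonlite): parse only the first `n`
# records of a newline-delimited JSON file.
read_first_n <- function(path, n) {
  # readLines stops after n lines, so the rest of the file is never read
  lines <- readLines(path, n = n, warn = FALSE)
  con <- textConnection(lines)
  on.exit(close(con))
  stream_in(con, verbose = FALSE)
}

# Demo on a small temporary file standing in for the real tweet file
tmp <- tempfile(fileext = ".json")
writeLines(c('{"id":1,"text":"a"}',
             '{"id":2,"text":"b"}',
             '{"id":3,"text":"c"}'), tmp)

tweets <- read_first_n(tmp, 2)
print(tweets)
```

Only the `n` lines that were read are held in memory, so this should stay cheap even for a very large file, as long as each record really is on its own line.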
-
Check out the `jqr` package: https://cran.r-project.org/web/packages/jqr/vignettes/jqr_vignette.html – chinsoon12 May 16 '18 at 03:36