I have a JSON file of about 500 MB and need to process it in R using fromJSON.

I've tried the bigmemory package, but it still fails: R either crashes or hits the memory limit.

The code I used looks like this:

raw <- big.matrix(unlist(fromJSON("data.json")), ncol=24, type='integer', init=2, backingfile='data.bin')

Other information: Windows 8 64-bit, 6 GB of memory, R version 3.1.3.

Here is a sample record from the JSON file:

["UPGRADE(ONLINE)", "20150223", "5693", "000000", "FR", "fr-fr", "STARADDICT II Plus", "4.0.4", 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0]

1 Answer

If the data is not deeply structured (and if you can get it into a matrix, it probably isn't), you might be able to pre-process the JSON into a CSV file. There are probably better tools for handling large CSV files than large JSON files.

It may be as simple as replacing a few brackets with newlines. That could be done with command-line utilities or, more slowly, in R by reading and writing the file piece by piece. You could possibly even read it into R directly, one record at a time, parsing each record as you go.
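
As a rough sketch of that pre-processing idea in base R: the function below reads the file in fixed-size chunks, pulls out each complete inner `[...]` array, and writes it as one line of an output file. The function name `json_rows_to_lines`, its parameters, and the output file name are hypothetical, and it assumes the file is a single flat array of arrays in which no string field contains `[` or `]`.

```r
# Hypothetical helper: split a JSON array of flat arrays into one record per
# line, reading the input in chunks rather than all at once.
# Assumption: no string field contains "[" or "]".
json_rows_to_lines <- function(infile, outfile, chunk_chars = 1e6) {
  con_in <- file(infile, open = "rb")
  con_out <- file(outfile, open = "w")
  on.exit({
    close(con_in)
    close(con_out)
  })

  pattern <- "\\[[^\\[\\]]*\\]"  # an inner array with no nested brackets
  carry <- ""                    # text not yet part of a complete record

  repeat {
    chunk <- readChar(con_in, nchars = chunk_chars)
    if (length(chunk) == 0 || !nzchar(chunk)) break
    text <- paste0(carry, chunk)

    m <- gregexpr(pattern, text, perl = TRUE)[[1]]
    if (m[1] > 0) {
      ends <- m + attr(m, "match.length") - 1
      # Drop the surrounding brackets and write one record per line.
      writeLines(substring(text, m + 1, ends - 1), con_out)
      # Keep whatever follows the last complete record for the next pass.
      carry <- substring(text, max(ends) + 1)
    } else {
      carry <- text
    }
  }
  invisible(outfile)
}

# Example call (file names are placeholders):
# json_rows_to_lines("data.json", "data_rows.txt")
# readLines("data_rows.txt", n = 2)
```

Each output line still carries the JSON quoting and the space after each comma, so it is worth checking the quoting and the number of fields per line before handing the result to a CSV reader.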

Posting a sample of your data would help construct a fuller answer.

Spacedman
  • `["UPGRADE(ONLINE)", "20150213", , "5693", "000000", "RU", "ru-ru", "Lenovo A800", "4.0.4", 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] ["UPGRADE(ONLINE)", "20150223", "5693", "000000", "FR", "fr-fr", "STARADDICT II Plus", "4.0.4", 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0] ` – wanderlustwei Mar 30 '15 at 11:55
  • That's not valid JSON unless there are commas or something else separating the arrays. – Spacedman Mar 30 '15 at 13:10
  • The arrays are separated by commas, but the whole file seems to be on one line. Is that what caused the crash? – wanderlustwei Mar 31 '15 at 06:05
  • Possibly; it depends on where your `fromJSON` function is coming from. There are a few JSON readers in R. Can you not get this data as a CSV file instead of JSON? Also, I note that your first line above has an empty field after "20150213", which gives it more fields than the second one, so I don't see how this is going to read into a matrix (unless you've made a mistake while manually editing it here). – Spacedman Mar 31 '15 at 11:10