
I'm trying to parse big JSON files with the parse_stream() function from the rtweet library. It fails on large JSON files.

The error tends to occur when the files are of considerable size (200 MB to 1 GB). My streaming code looks like this:

    stream_tweets(
      "#google,#apple",
      timeout = 60 * 60 * 6,  # six hours
      file_name = json_filename,
      dir = "./raw_tweets/"
    )

    djt <- parse_stream(json_filename)

I didn't enable the parse = TRUE parameter because the documentation says it is not recommended for big JSON objects. I also tried streaming with parse = TRUE, and that also fails at parse_stream(). The error I'm getting is the following:

    Error: parse error: unallowed token at this point in JSON text
          ELDkx4-i7ysCAR_.mp4?tag=10"},,{"bitrate":2176000,"content_ty
                     (right here) ------^

I thought the problem was the double comma between the two curly braces. I searched in Atom (Ctrl+F) for the string ELDkx4-i7ysCAR_.mp4?tag=10"} and the only result did not contain ,,.
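The same check can also be done in R instead of an editor, which is easier on files of this size. The following is only a minimal sketch: `find_suspect_lines()` is a hypothetical helper, not part of rtweet, and it assumes the stream file holds one JSON object per line. If the parser joins lines with commas before parsing (an assumption about parse_stream() internals that I have not verified), an empty or truncated line could produce a `,,` that never appears literally in the file, so the sketch reports empty lines as well.

```r
# Hypothetical helper (not part of rtweet): report lines of a stream file
# that are empty or that contain the literal sequence ",,".
# Assumes one JSON object per line; reads the whole file into memory.
find_suspect_lines <- function(path) {
  lines <- readLines(path, warn = FALSE, encoding = "UTF-8")
  list(
    n_lines      = length(lines),
    empty        = which(!nzchar(trimws(lines))),          # blank lines
    double_comma = which(grepl(",,", lines, fixed = TRUE)) # literal ",,"
  )
}

# Small self-contained demo on a temporary file
demo_file <- tempfile(fileext = ".json")
writeLines(
  c('{"id":1,"text":"a"}', "", '{"id":2,"text":"b"},,{"id":3}'),
  demo_file
)
res <- find_suspect_lines(demo_file)
print(res)
```

A `,,` inside a tweet's text would also be reported, so any hit still needs to be inspected by hand. For the real file, the call would be `find_suspect_lines(json_filename)`.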

Why am I getting this parse error? Has anyone had the same issue and fixed it somehow?

fmigg

1 Answer


I think this happens when there are connection errors during the stream. I think that when it connects again, the JSON