
I'm trying to deserialize a large stream of Twitter Tweets.

Everything goes well for roughly the first 200 deserialized tweets, but after that the `Deserialize` method gets stuck and never returns. If I leave it running long enough, it eventually throws `System.Net.WebException: The operation has timed out` on the `HttpWebResponse`.

    var tweets = new List<Tweet>();
    using (var client = new WebClient())
    using (var stream = client.OpenRead(url))
    using (var streamReader = new StreamReader(stream))
    using (var reader = new JsonTextReader(streamReader))
    {
        reader.SupportMultipleContent = true;
        var serializer = new JsonSerializer();

        while (reader.Read() && tweets.Count < 500)
        {
            if (reader.TokenType == JsonToken.StartObject)
            {
                tweets.Add(serializer.Deserialize<Tweet>(reader));
            }
        }
    }

Any ideas why the JsonTextReader would get stuck? The stream I am reading seems to keep returning data when I consume it through the browser.
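One detail of the loop that may be worth ruling out: in `reader.Read() && tweets.Count < 500` the left operand is evaluated first, so even after the list has reached the limit the loop still calls `Read()` and waits on the network for the next token before it can exit. The sketch below swaps the operands so the count is checked first. It is only an illustration, not a confirmed fix for the hang: the `Tweet` class with a single `text` property and the `TweetReader.ReadTweets` helper are hypothetical, and an in-memory `StringReader` stands in for the Twitter stream.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;

// Hypothetical minimal model; the real Tweet class is not shown in the question.
public class Tweet
{
    [JsonProperty("text")]
    public string Text { get; set; }
}

public static class TweetReader
{
    // Reads at most `limit` top-level JSON objects from a sequence of
    // concatenated JSON values (for example, one object per line).
    public static List<Tweet> ReadTweets(TextReader source, int limit)
    {
        var tweets = new List<Tweet>();
        var serializer = new JsonSerializer();

        using (var reader = new JsonTextReader(source) { SupportMultipleContent = true })
        {
            // The count is checked first, so Read() is never called
            // (and never blocks on the source) once the limit is reached.
            while (tweets.Count < limit && reader.Read())
            {
                if (reader.TokenType == JsonToken.StartObject)
                {
                    tweets.Add(serializer.Deserialize<Tweet>(reader));
                }
            }
        }

        return tweets;
    }
}
```

Calling `TweetReader.ReadTweets(new StringReader("{\"text\":\"a\"}\r\n{\"text\":\"b\"}\r\n{\"text\":\"c\"}"), 2)` stops after the second object without attempting to read the third. Against a live stream, the same call would take the `StreamReader` from the question as `source`.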

Avi Meltser
  • I'd guess that Twitter is throttling your connection. If you're getting a `WebException`, why do you think it's the *text reader* that is getting stuck, and not the HTTP connection? –  Sep 30 '19 at 13:04
  • @Amy that might be right, although I can't see why Twitter would throttle my connection. Additionally, I am using the streaming Twitter API, which allows more flexible access rates than the regular API – Avi Meltser Sep 30 '19 at 17:51
  • https://developer.twitter.com/en/docs/tweets/filter-realtime/guides/connecting –  Sep 30 '19 at 18:20
  • @Amy The funny thing is that if I put a timer on the loop, then when it elapses I find my tweet list filled with tweets in proportion to the time elapsed. But a simple condition such as "tweets.Count < 500" in the while loop doesn't stop the loop. For some reason, the loop does not seem to iterate in the debugger after some deserialization, but when using the timer everything eventually works. – Avi Meltser Sep 30 '19 at 23:26
  • Have you inspected that WebException you're getting? That seems like a more productive avenue of inquiry than the timers. –  Sep 30 '19 at 23:34
  • @Amy I seem to get it only when no timer is set to interrupt the deserialization loop. The exception itself holds no further information, and I would assume Twitter is closing my connection at that point (so I assume the WebException is a natural symptom of keeping a connection open for too long) – Avi Meltser Oct 01 '19 at 08:56
  • @Amy the WebException (which sometimes comes back as `System.Net.WebException: 'The request was aborted: The connection was closed unexpectedly.'`) seems to happen occasionally but not always, so I assume the cause is a slow or unreliable connection (which Twitter states in its documentation it might abort) – Avi Meltser Oct 01 '19 at 09:20
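
Since the comments point at the connection being closed from the other side, one way to both inspect the exception and cope with occasional disconnects is to wrap the read loop in a retry with a growing delay. The following is a rough sketch under stated assumptions: `StreamRetry.Run` is a hypothetical helper, `consume` stands for whatever code opens the stream and runs the deserialization loop, and the attempt count and delays are illustrative values, not numbers taken from Twitter's documentation.

```csharp
using System;
using System.Net;
using System.Threading;

public static class StreamRetry
{
    // Runs `consume`; if it throws a WebException, logs the status,
    // waits, and tries again. The delay doubles after each failure,
    // capped at maxDelay. Returns the attempt number that succeeded.
    public static int Run(Action consume, int maxAttempts, TimeSpan initialDelay, TimeSpan maxDelay)
    {
        var delay = initialDelay;

        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            try
            {
                consume();
                return attempt; // finished without a connection error
            }
            catch (WebException ex)
            {
                // ex.Status distinguishes a timeout from a closed connection, etc.
                Console.Error.WriteLine($"Attempt {attempt} failed: {ex.Status} ({ex.Message})");

                if (attempt == maxAttempts)
                {
                    throw; // give up and surface the last error
                }

                Thread.Sleep(delay);
                var next = TimeSpan.FromTicks(delay.Ticks * 2);
                delay = next < maxDelay ? next : maxDelay;
            }
        }

        throw new ArgumentOutOfRangeException(nameof(maxAttempts), "maxAttempts must be at least 1");
    }
}
```

A call could look like `StreamRetry.Run(() => ReadTweetsFromStream(url), 5, TimeSpan.FromSeconds(1), TimeSpan.FromSeconds(30))`, where `ReadTweetsFromStream` is a placeholder name for the loop in the question. The logged `WebExceptionStatus` should at least show whether the failure is a timeout or a closed connection.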
