Design and architecture for multiple concurrent users subscribing to a data feed

Question

Here's a scenario: I have a 'data-feed', a REST/JSON service that updates periodically (let's say every 10 seconds or so). Whenever the data set changes, all subscribed listeners need to be updated.

It's currently implemented using long-polling over HTTP. That is a technicality; the main concept is that clients don't bother the server, and the server doesn't bother the clients, unless there's something to bother about. When there is something new, all clients get notified immediately. The stack is Java/Tomcat 7 with async IO (asyncResponse).
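To make the long-polling idea concrete, here is a minimal sketch of the "park until something changes" logic, using only the JDK. The class and method names (LongPollHub, poll, publish) are hypothetical and not taken from my actual code; in a real servlet, completing each future would be the point where the async response is written. Timeouts and cleanup of abandoned requests are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Hypothetical hub: parks long-poll requests until the shared payload changes. */
final class LongPollHub {

    /** Immutable snapshot: a version number plus the JSON built once for everyone. */
    static final class Snapshot {
        final long version;
        final String json;

        Snapshot(long version, String json) {
            this.version = version;
            this.json = json;
        }
    }

    private final Object lock = new Object();
    private Snapshot current = new Snapshot(0, "{}");
    private List<CompletableFuture<Snapshot>> waiting = new ArrayList<>();

    /** Answers at once if the client is behind; otherwise parks it until the next publish. */
    CompletableFuture<Snapshot> poll(long lastSeenVersion) {
        synchronized (lock) {
            if (lastSeenVersion < current.version) {
                return CompletableFuture.completedFuture(current);
            }
            CompletableFuture<Snapshot> parked = new CompletableFuture<>();
            waiting.add(parked);
            return parked;
        }
    }

    /** Called by the feed updater. Does nothing if the data is unchanged; returns how many clients were woken. */
    int publish(String json) {
        List<CompletableFuture<Snapshot>> toWake;
        Snapshot next;
        synchronized (lock) {
            if (json.equals(current.json)) {
                return 0; // nothing to bother the clients about
            }
            next = new Snapshot(current.version + 1, json);
            current = next;
            toWake = waiting;
            waiting = new ArrayList<>();
        }
        // Complete outside the lock so slow callbacks cannot block new polls.
        for (CompletableFuture<Snapshot> parked : toWake) {
            parked.complete(next);
        }
        return toWake.size();
    }
}
```

Each client sends the last version it saw; the JSON string is created once per update and the same object is handed to every waiting client.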

I think it works great: I can drive 10K concurrent sessions for about $0.07 per hour (AWS m3.medium instance).

(Question: I think it works great, but I would like to hear some benchmark numbers to verify. In other words, do you think it's good bang for the buck? Please share!)

If all my clients receive the same data set (the same JSON), is there a way I could optimize even more?

I'm thinking about IPv6 multicast, which would cut my bandwidth consumption by orders of magnitude. But is this practical?

To support 1 million concurrent users, for example, assuming there's an update every 10 seconds, I would need to serve 100K 'hits' (or responses) per second. If the response size is 10 KB, bandwidth becomes a big issue: 10 KB * 100K per second is about 1 GB/s, and 1 GB/s * 60 * 60 * 24 comes to roughly 86 TB per 24h.
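The estimate can be checked with a few lines of Java. This is only a back-of-the-envelope sketch: it takes 10K to mean 10,000 bytes, assumes every client receives every update, and ignores HTTP headers and TCP overhead. The class and method names are made up for illustration.

```java
/** Back-of-the-envelope bandwidth estimate for a feed pushed to every client. */
final class FeedBandwidth {

    /** Responses per second when every client gets one response per update interval. */
    static long responsesPerSecond(long clients, long updateIntervalSeconds) {
        return clients / updateIntervalSeconds;
    }

    /** Payload bytes sent per second (headers and protocol overhead not counted). */
    static long bytesPerSecond(long clients, long updateIntervalSeconds, long responseBytes) {
        return responsesPerSecond(clients, updateIntervalSeconds) * responseBytes;
    }

    /** Payload bytes sent over 24 hours. */
    static long bytesPerDay(long clients, long updateIntervalSeconds, long responseBytes) {
        return bytesPerSecond(clients, updateIntervalSeconds, responseBytes) * 60L * 60L * 24L;
    }

    public static void main(String[] args) {
        long clients = 1_000_000L;     // concurrent users
        long intervalSeconds = 10L;    // one update every 10 seconds
        long responseBytes = 10_000L;  // "10K" taken as 10,000 bytes

        long perSecond = bytesPerSecond(clients, intervalSeconds, responseBytes);
        long perDay = bytesPerDay(clients, intervalSeconds, responseBytes);
        System.out.printf("%d responses/s, %.1f GB/s, %.1f TB/day%n",
                responsesPerSecond(clients, intervalSeconds), perSecond / 1e9, perDay / 1e12);
    }
}
```

With these inputs the product is 8.64 * 10^13 bytes per day, so the daily total is in the tens of terabytes rather than gigabytes.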

There isn't really a single, focused question here (besides IPv6) - I would like to hear your thoughts, experience, and alternative approaches - I hate re-inventing the wheel, and I'm sure that the collective wisdom out there far surpasses my own.

Thanks.

score 0 · Answer 1 · answered Jul 06 '14 at 15:10

I'd like to suggest some other alternatives that are scalable. You should do cost estimates against your projected loads to see which one is viable:

1. Assuming feeds may be shared across several clients (I don't know if that's relevant to your application), use a CDN. See a description of this kind of solution here: http://tech.ftbpro.com/post/78969626647/growing-x20-without-spending-an-extra-penny-on-hosting
2. Use a specialized messaging service as your backbone. One of the more prominent ones in this field is PubNub.
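As a rough illustration of the CDN option, one way to make a shared feed cacheable is to serve it with a short cache lifetime and an ETag, so that most requests are answered by the CDN edge or with an empty 304. The sketch below is an assumption about how that could look, not the approach from the linked post; the class and method names are hypothetical, and it gives up the immediate push that long-polling provides, since clients may see data that is up to one interval old.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helpers for serving one shared JSON feed as a cacheable resource. */
final class CacheableFeed {

    /** ETag derived from the payload: identical JSON always yields an identical tag. */
    static String etagFor(String json) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256")
                    .digest(json.getBytes(StandardCharsets.UTF_8));
            StringBuilder tag = new StringBuilder("\"");
            for (int i = 0; i < 8; i++) { // a 64-bit prefix keeps the header short
                tag.append(String.format("%02x", hash[i] & 0xff));
            }
            return tag.append('"').toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    /** Cache-Control value that lets shared caches keep the feed for one update interval. */
    static String cacheControl(int updateIntervalSeconds) {
        return "public, max-age=" + updateIntervalSeconds;
    }

    /** 304 when the client's If-None-Match header matches the current payload, else 200. */
    static int statusFor(String ifNoneMatch, String json) {
        return etagFor(json).equals(ifNoneMatch) ? 304 : 200;
    }
}
```

The origin server would then see roughly one request per edge location per interval instead of one per client; whether that trade-off is acceptable depends on how fresh the data has to be.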
