Using EC2 instances (along with Amazon Auto Scaling and Elastic Load Balancing) I have several instances of a TCP server running in Amazon Web Services. Each EC2 instance has access to a centralized database running on Amazon RDS. To keep the backend scalable, EC2 instances of the TCP server are added and removed depending on demand.
The servers are built with the Python Twisted framework. The system powers a custom instant messaging service, with multiple group chats that users can join.
When a user starts using the service, they establish a TCP connection to one of the TCP servers. Each server keeps in memory the currently connected users (i.e. the open TCP sockets) and which ‘group chat’ each user is currently ‘in’ (and thus subscribed to). All chat data created is stored in the database.
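To make the setup concrete, the in-memory state on each server looks roughly like this Twisted sketch (class and attribute names are illustrative, not my actual code):

    from twisted.internet import protocol, reactor

    class ChatProtocol(protocol.Protocol):
        def connectionMade(self):
            # Track the open TCP socket for this user.
            self.factory.connections.add(self)
            self.group = None

        def connectionLost(self, reason):
            self.factory.connections.discard(self)
            if self.group is not None:
                self.factory.groups[self.group].discard(self)

        def joinGroup(self, group_id):
            # Record which group chat this socket is subscribed to.
            self.group = group_id
            self.factory.groups.setdefault(group_id, set()).add(self)

    class ChatFactory(protocol.Factory):
        protocol = ChatProtocol

        def __init__(self):
            self.connections = set()   # all open sockets on this server
            self.groups = {}           # group_id -> set of ChatProtocol objects

    reactor.listenTCP(8000, ChatFactory())
    reactor.run()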
The Problem
When UserA posts a message in GroupChatZ, all users ‘in’ GroupChatZ should receive it. This is simple if there is only one TCP server: that server would search its memory for all users ‘in’ that group chat and send them the new message. However, since there is more than one server, whenever a new message is created the server handling it also needs to pass the message on to all the other servers (i.e. the other EC2 instances), so that they can deliver it to their own connected users.
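With a single server, the delivery step is just a loop over the in-memory sets shown above; something like this hypothetical helper:

    def broadcast_local(factory, group_id, message_bytes):
        # Deliver a new message to every socket on *this* server that is
        # subscribed to the group chat.
        for conn in factory.groups.get(group_id, set()):
            conn.transport.write(message_bytes)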
What is the most efficient solution to this problem? Ideally it would use AWS components.
One solution I can think of is for each server to store its IP address in the database when it first starts up, fetch the IP addresses of all the other running servers, and open a TCP connection to each of them. Whenever a new message is received, the server handling it would send it to all the other servers it is connected to.
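Roughly, that idea would look like the following (the port and all names are made up, and peer discovery details are omitted):

    from twisted.internet import protocol, reactor

    peers = set()  # open connections to the other servers

    class PeerProtocol(protocol.Protocol):
        def connectionMade(self):
            peers.add(self)

        def connectionLost(self, reason):
            peers.discard(self)

    class PeerFactory(protocol.ClientFactory):
        protocol = PeerProtocol

    def connect_to_peers(peer_ips):
        # peer_ips would be read from the database table that every
        # server writes its own IP address into on startup.
        for ip in peer_ips:
            reactor.connectTCP(ip, 9000, PeerFactory())

    def forward_to_peers(message_bytes):
        # Relay a newly created chat message to every other server,
        # which would then broadcast it to its own local sockets.
        for peer in peers:
            peer.transport.write(message_bytes)

(The sketch deliberately ignores reconnects and servers scaling in and out.)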
However, TCP connections are not 100% reliable, and this solution adds complexity.
I suspect there is actually a good way to use some Amazon Web Services component to implement a simple publish-subscribe mechanism (think of the Observer design pattern), i.e. one where, when one server publishes something, all the other servers receive it in real time.
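For example, something along these lines, sketched here with Redis pub/sub (which Amazon ElastiCache can host); the endpoint and channel names are hypothetical, and broadcast_local is the helper sketched earlier:

    import redis

    r = redis.Redis(host="my-elasticache-endpoint.example.com", port=6379)

    def publish_message(group_id, message_bytes):
        # Called by whichever server received the new chat message.
        r.publish("groupchat:%s" % group_id, message_bytes)

    def listen_forever(factory):
        # Every server runs this loop (in a thread, or via a Twisted Redis
        # client such as txredisapi) and relays incoming messages to its
        # own local sockets.
        pubsub = r.pubsub()
        pubsub.psubscribe("groupchat:*")
        for item in pubsub.listen():
            if item["type"] == "pmessage":
                group_id = item["channel"].decode().split(":", 1)[1]
                broadcast_local(factory, group_id, item["data"])

The appeal is that the servers would not need to know about each other at all; each one would only need to know the pub/sub endpoint.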