I am working on a project where more than 50,000 devices need to communicate with my server over reverse SSH tunnels.
These devices will also generate and move heavy traffic across these tunnels, so they will consume significant network bandwidth and CPU on the server.
I am using the AWS EC2 stack and have chosen a moderate instance to start with (4 CPU cores and 16 GB RAM).
Since a single server cannot handle 50,000+ connections, I must find a way to load balance the traffic.
Assuming each EC2 instance can support up to 500 reverse SSH connections without choking, I would require 50,000 / 500 = 100 servers (let's treat 50,000 devices as the hard target for now).
While I will eventually need 100 servers, the rollout of devices will be gradual, so I don't need all 100 from day one.
The server count should grow gradually as the number of devices communicating with the server increases.
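To make the scaling math explicit, this is the back-of-the-envelope calculation I am working from (the figure of 500 tunnels per instance is just my assumption):

```python
import math

TUNNELS_PER_INSTANCE = 500  # assumed capacity of one EC2 instance

def instances_needed(device_count: int) -> int:
    """Number of EC2 instances required for a given number of devices."""
    return math.ceil(device_count / TUNNELS_PER_INSTANCE)

print(instances_needed(1_200))   # 3 instances early in the rollout
print(instances_needed(50_000))  # 100 instances at the hard target
```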
The obvious way to handle this is Elastic Load Balancing, or maybe Elastic IPs (the two concepts are quite different, but ELB seems like the way to go).
But ELB works on standard communication protocols such as HTTP/HTTPS/TCP.
My scenario is a bit different: each device is assigned its own port.
For example:
Dev 1 port = 2000
Dev 2 port = 2001
Dev 3 port = 2002
...
Dev 50000 port = 52000
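For context, each device currently opens its tunnel with roughly the following (a sketch of my setup; the tunnel user, key path, and the assumption that the device exposes its local SSH daemon on port 22 are specific to my project):

```python
import subprocess

def open_reverse_tunnel(device_port: int,
                        server: str = "ports.my-domain.com",
                        user: str = "tunnel",
                        key_path: str = "/etc/device/tunnel_key") -> None:
    """Ask the server to listen on device_port and forward it back to
    this device's local SSH daemon on port 22."""
    cmd = [
        "ssh", "-i", key_path,
        "-N",                                 # tunnel only, no remote command
        "-o", "ServerAliveInterval=30",       # keep the tunnel alive
        "-o", "ExitOnForwardFailure=yes",     # fail fast if the remote port is taken
        "-R", f"{device_port}:localhost:22",  # reverse forward: server port -> device
        f"{user}@{server}",
    ]
    subprocess.run(cmd, check=True)

# Dev 1 would call: open_reverse_tunnel(2000)
```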
I want the whole load balancing to happen on the reverse SSH tunnels' ports, which is a bit different from the standard ELB concept to begin with.
I am fine with a DNS name such as ports.my-domain.com.
This DNS name should then be the hub of the load balancing, starting/stopping EC2 servers whenever required and doing port forwarding like:
ports.my-domain.com
|
|- 1.1.1.1 (port range: 2000-2500)
|- 1.1.1.2 (port range: 2501-3000)
|- 1.1.1.3 (port range: 3001-3500)
...
Obviously, the servers 1.1.1.1, 1.1.1.2, 1.1.1.3, etc. are all started and managed by the load balancer.
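To make the mapping concrete, the routing I have in mind is just arithmetic on the port number; the sketch below assumes clean 500-port blocks starting at 2000, and the IPs are placeholders:

```python
BASE_PORT = 2000        # first device port
PORTS_PER_SERVER = 500  # tunnels one instance is assumed to handle

# Backend instances in the order they are brought up; this list grows with the fleet.
SERVERS = ["1.1.1.1", "1.1.1.2", "1.1.1.3"]

def server_for_port(device_port: int) -> str:
    """Return the backend instance responsible for a given device port."""
    index = (device_port - BASE_PORT) // PORTS_PER_SERVER
    if index < 0 or index >= len(SERVERS):
        raise LookupError(f"no instance provisioned yet for port {device_port}")
    return SERVERS[index]

print(server_for_port(2000))  # 1.1.1.1 (ports 2000-2499)
print(server_for_port(2600))  # 1.1.1.2 (ports 2500-2999)
```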
When Dev 1 establishes its reverse SSH tunnel on port 2000, it should always be routed to the same IP (1.1.1.1), so there is a "sticky port" requirement. Stickiness is supported by the AWS Classic Load Balancer (CLB), but it does not work this way for TCP ports...
I would prefer that all devices communicate with ports.my-domain.com and request tunneling like:
Dev 1 -> ports.my-domain.com:2000
Dev 2 -> ports.my-domain.com:2001
...
Dev 50000 -> ports.my-domain.com:52000
Internally, this load balancer would start server 1 (1.1.1.1) for the first 500 connections, then 1.1.1.2 for the next 500, and so on, until the 50,000th device is registered on (maybe) 1.1.255.200.
I only want to expose the single domain name ports.my-domain.com and expect AWS to handle the rest.
Online AWS tutorials point to various services, such as AWS CloudWatch, AWS Elastic Beanstalk, AWS CloudFormation, and AWS container services (again, the Docker port-mapping concept there is different from my requirement), but none of these explanations go exactly in this direction.
I would like to hear suggestions on which technology stack gives the best implementation for this requirement.
Appreciate all feedback...