I'm not sure if this will help you, but in our setup the clients talk to a load balancer. The LB knows which instances are live and which are dark, and forwards traffic accordingly. If a request carries a 'special' header, the LB sends it to the dark pool. We have this setup per application (just making this clear because, from the diagram you posted, some people might think that the whole platform is blue-green).
So a diagram of it would look like this, where the green cluster is live and the blue is dark (<3 ASCII art):
         [Client]  <- I assume this is internal, otherwise add a FW :).
             |
            \|/
[Application Load Balancer]  <- internal, per app
             |
             |\--------------\--------------\--------------\
            \|/             \|/            \|/            \|/
       [Node 1 G/L]    [Node 2 G/L]   [Node 3 B/D]   [Node 4 B/D]
G = Green    B = Blue
L = Live     D = Dark
The Application Load Balancer can be built with a number of technologies. It could be a gateway app (like Netflix Zuul) or a load-balancing web server (like Airbnb SmartStack, which uses HAProxy).
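To make the header rule concrete, here is a minimal sketch in Python of the decision the LB makes for each request. It isn't how Zuul or HAProxy are configured, just the logic. The header name `X-Dark-Pool`, the node names, and the round-robin choice inside each pool are all my assumptions for illustration.

```python
import itertools

# Hypothetical header name; our real setup just uses "a special header".
DARK_HEADER = "X-Dark-Pool"


class Balancer:
    """Toy per-application balancer with a live pool and a dark pool."""

    def __init__(self, live, dark):
        # Round-robin inside each pool is an assumption, not a requirement.
        self._live = itertools.cycle(live)
        self._dark = itertools.cycle(dark)

    def route(self, headers):
        """Return the node that should serve a request with these headers."""
        # HTTP header names are case-insensitive, so compare in lower case.
        names = {name.lower() for name in headers}
        if DARK_HEADER.lower() in names:
            return next(self._dark)
        return next(self._live)


lb = Balancer(live=["node1", "node2"], dark=["node3", "node4"])
normal_target = lb.route({"Accept": "text/html"})
dark_target = lb.route({"x-dark-pool": "1"})
```

Normal requests only ever land on the live nodes, so testers can hit the dark pool by adding the header without any client noticing.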
It's worth mentioning that if the live cluster goes up in flames, we don't automatically promote the dark cluster to live... What I'm trying to say is that we don't use blue/green as an alternative for High Availability. Is this your concern? (as you're using VIPs here and keepalived)
Edit
Thanks for the answers to the questions. Unfortunately, I don't think you'll be able to blue-green successfully with your constraint.
Have you considered having just one big environment and then doing some sort of hybrid between Canary Release and blue-green? With this approach, you initially have 5 servers serving live traffic and 1 serving dark traffic (I assume you have 6 boxes in total). The live nodes could be configured so that 3 take live traffic and 2 do the batch processing.
When you're happy with the code in the dark pool, you start upgrading the servers one by one until all of them are serving live traffic in the live pool. At that point, you might need to move the 2 batch processing servers to the live pool, unless you have a way of moving them more slowly (probably one job at a time?).
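If it helps, the rollout order can be sketched as a small Python simulation. This is only a sketch under my own assumptions: the node names, the `v1`/`v2` version labels, promoting the dark node first so capacity doesn't dip, and leaving the batch nodes until last are all choices I made for the example, not something your platform dictates.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    version: str
    role: str  # "web" or "batch"
    pool: str  # "live" or "dark"


def rolling_upgrade(nodes, new_version):
    """Yield one description per step while upgrading nodes one at a time."""
    # Step 1: the dark node already runs the verified code, so promote it.
    for node in nodes:
        if node.pool == "dark":
            node.pool = "live"
            yield f"promote {node.name}"
    # Step 2: upgrade the remaining nodes, web nodes first and batch last.
    # sorted() is stable, so nodes keep their original order within a role.
    pending = sorted(
        (n for n in nodes if n.version != new_version),
        key=lambda n: n.role == "batch",
    )
    for node in pending:
        node.pool = "dark"          # take the node out of live traffic
        node.version = new_version  # deploy the new code
        node.pool = "live"          # put it back once it is healthy
        yield f"upgrade {node.name}"


cluster = [
    Node("web1", "v1", "web", "live"),
    Node("web2", "v1", "web", "live"),
    Node("web3", "v1", "web", "live"),
    Node("batch1", "v1", "batch", "live"),
    Node("batch2", "v1", "batch", "live"),
    Node("canary", "v2", "web", "dark"),
]
steps = list(rolling_upgrade(cluster, "v2"))
```

In real life each step would wait on health checks (and on running batch jobs finishing) before moving to the next node; the sketch only shows the ordering.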
Just in case, I want to make something really clear, as this might come back to bite you (and I don't like fellow developers to be in pain). If your batch processing is a fundamental part of your platform, you don't have a true HA environment, for the reason I outlined in my original answer: if your live cluster fails for any reason (DB corruption?), you won't be able to run on the remaining hardware.