I agree with Ryan's recommendation to use the blue/green approach, though the term may be unfamiliar to those new to cloud server deployments. Martin Fowler summarizes the problem it addresses in BlueGreenDeployment:
One of the challenges with automating deployment is the cut-over
itself, taking software from the final stage of testing to live
production. You usually need to do this quickly in order to minimize
downtime. The blue-green deployment approach does this by ensuring you
have two production environments, as identical as possible. At any
time one of them, let's say blue for the example, is live. As you
prepare a new release of your software you do your final stage of
testing in the green environment. Once the software is working in the
green environment, you switch the router so that all incoming requests
go to the green environment - the blue one is now idle.
Solving this problem is one of the main benefits of PaaS.
That said, for historical context, it's worth noting this blue/green strategy isn't new to cloud computing. Allow me to elaborate on one of the "old" ways of handling this problem:
Let's assume I have a website hosted on a dedicated server, myexample.com
. My public-facing server's IP address ("blue") would be represented in the DNS "@" entry or as a CNAME
alias; another server ("green") would host the newer version of the application. To test the new application in a public-facing manner without impacting the live production environment, I simply update /etc/hosts
to map the top-level domain name to the green server's IP address. For example:
129.42.208.183 www.myexample.com myexample.com
Once I flush the local DNS entries and close all browsers, all requests will be directed to the green pre-production environment. Once I've confirmed all works as expected, I update the DNS entry for the live environment (myexample.com
in this case). Assuming the DNS has a reasonably short TTL value like 300 seconds, I update the A
record value if by IP or CNAME
record value if by alias and the change will be propagated to DNS servers in minutes. To confirm the propagation of the new DNS values, I comment out the aforementioned /etc/hosts
change, flush the local DNS entries, then run traceroute
. Assuming it correctly resolves locally, I perform a final double-check all is well in the rest of the world with the free online DNS checker (e.g., whatsmydns.net).
The above assumes an update to the public-facing content server (e.g., an Apache server connecting to a database or application server); the switch over from pre-production to production is more involved if the update applies to a central database or similar transactional data server. If it's not too disruptive for site visitors, I disable login and drop all active sessions, effectively rendering the site read-only. Then I go about updating the backend server in much the same manner as previously described, i.e., switching a pre-production green front end to reference a replication in the pre-production green backend, test, then when everything checks out, switch the green front end to blue and re-enable login. Voila.
The good news is that with Bluemix, the same strategy above applies, but is simplified since there's no need to fuss with DNS entries or separate servers.
Instead, you create two applications, one that is live ("blue") and one that is pre-production ("green"). Instead of changing your site's DNS entries and waiting for the update to propagate around the world, you can update your pre-production application (cf push Green
pushes the new code to your pre-production application), test it with its own URL (Green.ng.mybluemix.net
), and once you're confident it's production-ready, add the application to the routing table (cf map-route Green ng.mybluemix.net -n Blue
), at which point both applications "blue" and "green" will receive incoming requests. You can then take the previous application version offline by unmapping it (cf unmap-route Blue ng.mybluemix.net -n Blue
).
Site visitors will experience no service disruption and unlike the "old" way I outlined previously, the deployment team (a) won't have to bite their nails waiting for DNS entries to propagate around the world before knowing if something doesn't work and (b) can immediately revert to the previous known working production version if a serious problem is discovered post-deployment.