Zero Downtime - Application Deployment Headaches
During the design and development phase of any project, one of the most commonly overlooked things is the deployment and management strategy. The PM is concerned about delivery, the developer about application architecture, the manager about the costs of development and operations... they are all too busy fighting existing fires.
Over the years in development I have come across many deployment strategies and the pitfalls of choosing one option over another. Whilst I have been an advocate of "Zero Downtime", achieving it in practice is a totally different matter. You can employ various clever techniques, but with a lot of application functionality moving to the network (like cached JavaScript) or the user's browser, addressing a clean deployment strategy upfront is becoming much more important.
Whilst there are many tools on the market (like LiveRebel), they come at a cost, and introducing another system into the ecosystem introduces risk.
Below are some of the deployment techniques that I have come across, with their major benefits and concerns:
Note: This article doesn't go into the detail of each technique; it just highlights some options. There may be more techniques which I have not highlighted here. If you know of any, please do share.
Apache + Tomcat X
Single Tomcat deployment:
If you are servicing your application with a single Tomcat, downtime is unavoidable: deploying on a single instance takes out all sessions, and no requests are serviced until the server comes back up.
Multi Tomcat cluster behind an Apache proxy:
When an application is being serviced by multiple Tomcats behind an Apache proxy (using mod_jk), key consideration should be given to session management. Think about:
- Sticky Sessions
- Session Replication
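To make the sticky-session idea concrete, here is a minimal Python sketch of how a proxy could pick a backend. It relies on the Tomcat convention of appending the node's jvmRoute to the session ID (for example ABC123.node2), which mod_jk uses for stickiness. The function name pick_worker, the node names and the "first live worker" fallback are illustrative assumptions, not mod_jk's actual code.

```python
from typing import Optional, Sequence, Set


def pick_worker(session_id: Optional[str],
                workers: Sequence[str],
                live: Set[str]) -> str:
    """Return the worker that should serve a request.

    Sticky rule: if the session ID ends in ".<route>" and that worker is
    live, keep the user there. Otherwise fall back to the first live worker
    (a stand-in for whatever balancing rule the real proxy applies).
    """
    candidates = [w for w in workers if w in live]
    if not candidates:
        raise RuntimeError("no live workers: this is the downtime case")
    if session_id and "." in session_id:
        # The route is whatever follows the last dot in the session ID.
        route = session_id.rsplit(".", 1)[1]
        if route in candidates:
            return route
    return candidates[0]


workers = ["node1", "node2"]  # hypothetical worker names
print(pick_worker("ABC123.node2", workers, {"node1", "node2"}))  # sticky
print(pick_worker("ABC123.node2", workers, {"node1"}))           # failover
print(pick_worker(None, workers, {"node1", "node2"}))            # new user
```

In the failover case the user only stays logged in if the node they land on already holds a copy of their session, which is where session replication comes in.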
Using sticky sessions together with session replication makes deployments with effectively zero downtime possible: one node can be taken down and all users on that server are moved over to the next node. Key considerations for this strategy are:
- Session replication comes with application complexity and memory issues. More memory is required on each server to store the complete system session information (or that of a buddy server).
- Session replication increases node-to-node communication, which may have an impact on CPU performance.
- You might be better off taking a hit on user sessions than compromising server performance.
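The memory cost of session replication can be estimated with some back-of-the-envelope arithmetic. The Python sketch below assumes sessions are spread evenly across nodes. The inputs (4 nodes, 10,000 sessions, 50 KB per session) are made-up figures for illustration, not measurements, and the function name is hypothetical.

```python
def session_memory_per_node(total_sessions: int,
                            session_bytes: int,
                            nodes: int,
                            copies_held: int) -> float:
    """Bytes of session data each node holds.

    copies_held is how many nodes' worth of sessions a node stores:
      1      -> no replication (own sessions only)
      2      -> own sessions plus one buddy's
      nodes  -> full replication (every session on every node)
    Assumes sessions are evenly distributed across nodes.
    """
    if not 1 <= copies_held <= nodes:
        raise ValueError("copies_held must be between 1 and nodes")
    own_share = total_sessions / nodes
    return own_share * copies_held * session_bytes


NODES, SESSIONS, SIZE = 4, 10_000, 50 * 1024  # hypothetical inputs

for label, copies in [("no replication", 1),
                      ("one buddy", 2),
                      ("full replication", NODES)]:
    mib = session_memory_per_node(SESSIONS, SIZE, NODES, copies) / (1024 * 1024)
    print(f"{label:>16}: {mib:.0f} MiB per node")
```

Under these assumptions full replication costs N times the session memory of an unreplicated node, while a single-buddy scheme costs twice as much regardless of how many nodes are added.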
Apache + JBoss AS 7
JBoss AS 7 can run in two major modes: standalone mode and domain mode. Whilst this choice is independent of the concept of clustering, the deployment strategy depends very much on the choice made upfront.
JBoss in Standalone Mode:
In standalone mode, each server is responsible for its own configuration and resources and is managed independently of the other nodes in the cluster. The pros and cons of this are:
- Pro: The server is independent and can be taken down as and when required.
- Pro: Suitable for environments where there is a single standalone server (or maybe two).
- Con: If the number of servers increases, management and deployment become an issue.
- Con: Rollout of configuration or application changes needs to be done separately on each server.
- Con: Rollback is managed manually.
In terms of achieving zero downtime deployment, this model lends itself best to zero customer impact: as with the Tomcat setup mentioned above, each server can be taken down in turn if the sessions are shared between nodes. If the sessions are not shared, users will be logged out of their session, moved to another node automatically, and asked to log in again.
This strategy has the potential to bounce a single user up to N-1 times, where N is the number of nodes in the cluster.
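The difference between shared and unshared sessions during a node-by-node rollout can be illustrated with a small simulation. This is a sketch in Python under stated assumptions: users on the node being taken down move to the next node in a ring, and the restarted node rejoins immediately. The function and user names are hypothetical, and a real load balancer may fail users over differently.

```python
from typing import Dict, Set


def rolling_restart(num_nodes: int,
                    users_per_node: int,
                    sessions_shared: bool) -> Set[str]:
    """Restart nodes one at a time; return the users forced to log in again.

    Failover rule (an assumption): users on the node going down are moved
    to the next node in the ring. The restarted node rejoins immediately.
    """
    if num_nodes < 2:
        raise ValueError("a rolling restart needs at least two nodes")

    # Which node each user is currently pinned to.
    location: Dict[str, int] = {
        f"user{n}-{u}": n
        for n in range(num_nodes)
        for u in range(users_per_node)
    }
    relogins: Set[str] = set()

    for node in range(num_nodes):            # take each node down in turn
        target = (node + 1) % num_nodes      # next node in the ring
        for user, current in list(location.items()):
            if current == node:
                location[user] = target
                if not sessions_shared:
                    # The target has no copy of the session.
                    relogins.add(user)
    return relogins


print(len(rolling_restart(3, 2, sessions_shared=True)))
print(len(rolling_restart(3, 2, sessions_shared=False)))
```

With shared sessions nobody has to log in again. Without them, every user who was logged in when the rollout started is logged out at least once, because every node is restarted at some point.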
JBoss in Domain Mode:
For a cluster that contains several nodes and may scale horizontally (adding more nodes over time), domain mode is the best-suited deployment strategy. Below are some benefits and concerns:
- Pro: Management of all servers is done through a single master node (the domain controller), and changes are propagated to the client nodes.
- Pro: No single point of failure for the application.
- Pro: Deployment rollbacks are done automatically.
- Con: At the time of deployment, the cluster can become unavailable, as the application is deployed on all nodes by the master node.
To achieve zero downtime for any deployment in domain mode, much more thinking and planning is required.
This can be partially achieved by setting up a parallel cluster and deploying the new application on it whilst the old cluster services client requests. Once the parallel cluster is deployed and verified, it is made the primary cluster and the old cluster is taken out of service. The old cluster can act as the rollback option in case something goes wrong on the new cluster.
All of the above can be done through the JBoss console on the master node. Key considerations for this strategy are:
- Increased memory requirement to support two clusters.
- The deployment should be scheduled for the time when the servers are under the least load.
- Deployment timelines increase as more activities are required.
- If the application is load balanced through load balancers, additional load balancer configuration is required for the parallel deployment.
- An automated process is required for switching from one application cluster to the other.
- Separate DNS entries are required for the secondary cluster. These can be used to verify that the application is deployed correctly.
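The parallel-cluster switch described above can be sketched as a small piece of automation. This is a minimal Python illustration, not a JBoss API: the Router class stands in for the load balancer, and the deploy and verify callables are hypothetical hooks that would wrap whatever deployment and smoke-test steps you actually use.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Router:
    """Stand-in for the load balancer / proxy that fronts both clusters."""
    primary: str      # cluster currently serving client requests
    secondary: str    # parallel cluster, reachable on its own DNS entry


def deploy_and_switch(router: Router,
                      deploy: Callable[[str], None],
                      verify: Callable[[str], bool]) -> bool:
    """Deploy to the secondary cluster and switch only if it verifies.

    Returns True if traffic was switched. On failure the primary is left
    untouched, so clients never see the broken release.
    """
    deploy(router.secondary)
    if not verify(router.secondary):
        return False
    # Swap roles: the old primary stays up as the rollback target.
    router.primary, router.secondary = router.secondary, router.primary
    return True


def rollback(router: Router) -> None:
    """Point traffic back at the previous cluster."""
    router.primary, router.secondary = router.secondary, router.primary


deployed: List[str] = []                      # records where we deployed
router = Router(primary="cluster-a", secondary="cluster-b")
switched = deploy_and_switch(router, deployed.append, lambda cluster: True)
print(switched, router.primary)
```

Keeping the switch as a single swap makes the rollback the same operation in reverse, which is why the old cluster is worth keeping alive until the new release has proven itself.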