What practically works in technology!!: January 2014

Zero Downtime - Application Deployment Headaches

During the design and development phase of any project, one of the most commonly overlooked things is the deployment and management strategy. The PM is concerned about delivery, the developer about application architecture, and the manager about the costs of development and operations... they are all too busy fighting existing fires.

Over the years in development I have come across many deployment strategies and the pitfalls of choosing one option over another. Whilst I have been an advocate of "Zero Downtime", achieving it in practice is a totally different matter. You can employ various clever techniques, but with a lot of application functionality moving to the network (like cached JavaScript) or the user's browser, addressing a clean deployment strategy upfront is becoming much more important.

Whilst there are many tools on the market (like LiveRebel), their cost and the introduction of another system into the ecosystem bring risks of their own.

Below are some of the deployment techniques that I have come across, with their major benefits and concerns:

Note: This article doesn't go into the detail of each technique; it just highlights some options. There may be more techniques that I have not highlighted here. If you know of any, please do share.

Apache + Tomcat X

Single Tomcat deployment:

If you are serving your application from a single Tomcat, downtime is unavoidable: deploying on a single instance takes out all sessions, and no requests are served until the server comes back up.

Multi-Tomcat cluster behind an Apache proxy:

When an application is served by multiple Tomcats behind an Apache proxy (using mod_jk), key consideration should be given to session management. Think about:

Sticky Sessions
Session Replication

Using Sticky Sessions together with Session Replication can give zero downtime for deployments. One node can be taken down and all users on that server are moved over to the next node with their sessions intact. Key considerations for this strategy are:

Session replication comes with application complexity and memory issues. More memory is required on each server to store the session information for the whole system (or that of a buddy server).
Session replication increases node-to-node communication, which may have an impact on CPU performance.
You might be better off taking a hit on user sessions than compromising server performance.
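
To make the sticky-session-plus-replication idea a bit more concrete, here is a toy sketch in Python. It is not Tomcat or mod_jk code: the Node and Cluster classes, the node names and the "copy every session to every live node" behaviour are all assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    up: bool = True
    sessions: dict = field(default_factory=dict)  # session id -> session data


class Cluster:
    """Toy model (hypothetical): a sticky proxy in front of replicating nodes."""

    def __init__(self, names):
        self.nodes = {name: Node(name) for name in names}
        self.sticky = {}  # session id -> name of the node the proxy pinned it to

    def live_nodes(self):
        return [node for node in self.nodes.values() if node.up]

    def login(self, session_id, data):
        live = self.live_nodes()
        self.sticky[session_id] = live[0].name
        # Full replication: every live node holds a copy of the session,
        # which is where the extra memory per server goes.
        for node in live:
            node.sessions[session_id] = dict(data)

    def request(self, session_id):
        """Serve from the pinned node, or fail over to the next live one."""
        node = self.nodes[self.sticky[session_id]]
        if not node.up:
            node = self.live_nodes()[0]
            self.sticky[session_id] = node.name
        return node.name, node.sessions.get(session_id)

    def take_down(self, name):
        node = self.nodes[name]
        node.up = False
        node.sessions = {}  # in-memory sessions are lost on restart

    def bring_up(self, name):
        # Re-sync from a surviving node before rejoining the cluster.
        donor = self.live_nodes()[0]
        node = self.nodes[name]
        node.sessions = {sid: dict(d) for sid, d in donor.sessions.items()}
        node.up = True


def rolling_deploy(cluster, session_id):
    """Restart every node in turn and record who served the user meanwhile."""
    served = []
    for name in list(cluster.nodes):
        cluster.take_down(name)
        served.append(cluster.request(session_id))
        cluster.bring_up(name)
    return served
```

With two nodes, a user who logs in before the deployment keeps the same session data through both restarts; only the node serving them changes. With a single node, the failover lookup in this sketch has nowhere to go, which is the single-Tomcat downtime case described earlier.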

Whatever option is taken, remember that someone needs to manage this, so plan up front.

Apache + JBoss AS 7

JBoss AS 7 comes in two major flavors: standalone mode and domain mode. Whilst this is independent of the concept of clustering, the deployment strategy depends very much on the choices taken upfront.

JBoss in Standalone Mode:

In standalone mode, each server is responsible for its own configuration and resources and is not managed together with the other nodes in the same cluster. The pros and cons of this are:

Pro: The server is independent and can be taken down as and when required.
Pro: Suitable for environments where there is a single standalone server (or maybe two).
Con: As the number of servers increases, management and deployment become an issue.
Con: Rollout of configuration or application changes needs to be done separately on each server.
Con: Rollback is managed manually.

In terms of achieving zero downtime deployment, this model lends itself best to zero customer impact: as with Tomcat above, each server can be taken down in turn if the sessions are shared between nodes. If the sessions are not shared, users will be logged out of their session and automatically asked to log in on another node.

This strategy has the potential to bounce a single user N-1 times, where N is the number of nodes in the cluster.

JBoss in Domain Mode:

For a cluster that contains several nodes and has the potential for horizontal scaling (adding more nodes over time), domain mode is best suited as a deployment strategy. Below are its benefits and its main drawback:

Pro: Management of all servers is done through a single master node, and changes are propagated to client nodes.
Pro: No single point of failure for the application.
Pro: Deployment rollbacks are done automatically.
Con: At the time of deployment, the cluster can become unavailable, as the application is deployed on all nodes by the master node.

To achieve zero downtime for any deployment in domain mode, much more thinking and planning is required.
This can be partially achieved by standing up a parallel cluster and deploying the new application on it whilst the old cluster services client requests. Once the parallel cluster is deployed and verified, it is made the primary cluster and the old cluster is taken down. The old cluster can act as a rollback strategy in case something goes wrong on the new cluster.

All of the above can be achieved through the master node's JBoss console. Key considerations for this strategy are:

Increased requirement of memory to support two clusters.
The deployment should be scheduled for the time when the servers are under the least load.
Deployment timelines increase, as more activities are required.
If the application sits behind load balancers, additional load balancer configuration is required for the parallel deployment.
An automated process is required for switching from one application cluster to the other.
Separate DNS entries are required for the secondary service. These can be used to verify that the application is deployed correctly.
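
The parallel-cluster switch can be sketched roughly as below. This is only an outline of the control flow in Python, not anything JBoss provides: ClusterInfo, Router, parallel_deploy, rollback and the verify callback are hypothetical names, and the Router simply stands in for whatever load balancer or DNS change actually moves the traffic.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ClusterInfo:
    name: str
    version: str


class Router:
    """Stand-in (hypothetical) for the load balancer or DNS entry that
    decides which cluster receives client traffic."""

    def __init__(self, primary: ClusterInfo, secondary: ClusterInfo):
        self.primary = primary
        self.secondary = secondary

    def swap(self) -> None:
        self.primary, self.secondary = self.secondary, self.primary


def parallel_deploy(router: Router, new_version: str,
                    verify: Callable[[ClusterInfo], bool]) -> bool:
    """Deploy to the secondary cluster, verify it, then promote it.

    Returns True if the new version is now primary, False if verification
    failed and the old cluster was left serving traffic.
    """
    router.secondary.version = new_version  # deploy away from live traffic
    if not verify(router.secondary):        # e.g. smoke test via the secondary DNS entry
        return False
    router.swap()                           # old cluster is kept as the rollback target
    return True


def rollback(router: Router) -> None:
    """Send traffic back to the previous (old) cluster."""
    router.swap()
```

The point of the sketch is the ordering: clients only see the new version after verification has passed, and the old cluster stays untouched until you are happy to retire it.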

At the end of the day, it is a decision that balances the business, the cost and the technical ability of the team: whether to achieve zero downtime or to accept some customer impact.

Often software engineers are thought of as Code Monkeys, carrying the stigma that they sit in the darkest corner of the office with big over-the-ear 'funky' headphones, writing code at the speed of sound. They are always the people who are forgotten about at social lunches, with the "Sam, who?" syndrome in coffee corners (even though Sam has been chucking out code for the company for the last 15 years). Whilst being an engineer is different from being a code monkey, both terms are used side by side in many upper managements (not all :) ).

In the eyes of an engineer, the world is totally different. He feels he lives in two parallel dimensions: one where he has to interact with people, and the other where he needs to let his artwork do the talking by solving complex problems. The satisfaction of solving a mind-boggling problem is second to none, and he knows that. Whilst he wants to be part of the cool gang in the office, he is so involved in his pride in engineering the perfect solution that he forgets the etiquette of the office and lets a fart/burp come out at the most inappropriate time.

I've lived through being a code monkey and a team lead (being the head monkey of the troop), and now I've moved to the next stage, where I'm thinking more about the trees where these monkeys live.

With the new dawn of 2014, I would like to start this blog by saying Happy New Year to everyone. I would like to use this blog to share my experiences, both technical and non-technical, with the wider community, so that we can understand why the brains behind many technical companies are still misunderstood, and also what is involved in thinking about how the tree is grown for the monkeys to work on. :-) Many thanks.

Friday, 31 January 2014

Zero Downtime - Application Deployment Headaches

Sunday, 5 January 2014

From a Code Monkey to Head Monkey.