Zero Downtime, Instant Deployment and Rollback

Deployment to the cloud is an evolving area. While many tools are available that deploy applications to nodes (machines) in the cloud, zero deployment downtime is rare or nonexistent. In this post, we’ll take a look at this problem and propose a solution. The focus of this post is on web applications—specifically, the server-side applications that run on a port (or a shared resource).

In traditional deployment environments, when switching a node in the cloud from the current version to a new version, there is a window of time when the node is unusable in terms of serving traffic. During that window, the node is taken out of traffic, and after the switch it is brought back into traffic.

In a production environment, this downtime is not trivial. Capacity planning in advance usually accommodates the loss of nodes by adding a few more machines. However, the problem becomes magnified where principles like continuous delivery and deployment are adopted.

To provide effective and non-disruptive deployment and rollback, a Platform as a Service (PaaS) should possess these two characteristics:

  • Best utilization of resources to minimize deployment downtime as much as possible
  • Instantaneous deployment and rollback

Problem analysis

Suppose we have a node running Version1 and we are deploying Version2 to that node. This is how the lifecycle would look:

  [Figure: typical deployment workflow]

Every machine in the pool undergoes this lifecycle. The machine stops serving traffic right after the first step and cannot resume serving traffic until the very last step. During this time, the node is effectively offline.

At eBay, the deployment lifecycle takes about 9 minutes for a reasonably sized application. For an organization of any size, many days of availability can be lost if every node must go into an offline phase during deployment.
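To make the scale concrete, here is a rough back-of-the-envelope calculation. The 9-minute figure comes from the paragraph above; the fleet size and deployment count are hypothetical numbers chosen only for illustration.

```python
def offline_node_days(nodes: int, deployments: int, minutes_offline: float = 9.0) -> float:
    """Total time nodes spend off traffic, expressed in node-days."""
    total_minutes = nodes * deployments * minutes_offline
    return total_minutes / (60 * 24)


# Hypothetical fleet: 1,000 nodes, each redeployed 50 times a year.
print(offline_node_days(nodes=1000, deployments=50))  # 312.5
```

Under these assumed numbers, the fleet loses the equivalent of several hundred node-days of serving capacity per year to deployments alone.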

So, the more we minimize the off-traffic time, the closer we get to instant/zero-downtime deployment/rollback.

Proposed solutions

Now let’s look into a few options for achieving this goal.

A/B switch

In this approach, we have a set of nodes standing by. We deploy the new version to those nodes and switch the traffic to them instantly. If we keep the old nodes in their original state, we could do instant rollback as well. A load balancer fronts the application and is responsible for this switch upon request.

The disadvantage to this approach is that some nodes will be idle, and unless you have true elasticity, it will amplify the node wastage. When a lot of deployments are occurring at the same time, you may end up needing to double the capacity to handle the load.
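The A/B switch can be sketched as a load balancer holding two pools and swapping which one is live. This is only a toy model: the class name, node names, and modulo routing are hypothetical, and a real load balancer would also handle health checks and connection draining.

```python
class ABSwitch:
    """Hypothetical load-balancer view of two node pools."""

    def __init__(self, live, standby):
        self.live = list(live)        # nodes currently receiving traffic
        self.standby = list(standby)  # idle nodes holding the other version

    def switch(self):
        # One swap moves all traffic; calling it again is the rollback.
        self.live, self.standby = self.standby, self.live

    def route(self, request_id):
        # Simple modulo routing over whichever pool is live.
        return self.live[request_id % len(self.live)]


lb = ABSwitch(live=["v1-node-a", "v1-node-b"], standby=["v2-node-a", "v2-node-b"])
lb.switch()          # deploy: traffic now goes to the Version2 nodes
print(lb.route(0))   # v2-node-a
lb.switch()          # rollback: traffic returns to the Version1 nodes
print(lb.route(1))   # v1-node-b
```

The sketch also shows the cost described above: the standby pool does no work between switches.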

Software load balancers

In this approach, we configure the software load balancer fronting the application with more than one end point so that it can effectively route the traffic to one or another. This solution is elegant and offers much more control at the software level. However, applications will have to be designed with this approach in mind. In particular, the load balancer’s contract with the application is critical to a successful implementation.

From a resource standpoint, this approach and the previous one are similar; both use additional resources, like memory and CPU. The first approach needs a whole extra node, whereas this one is accommodated inside the same node.

Zero downtime

With this approach, we don’t keep a set of machines; rather, we delay the port binding. Shared resource acquisition is delayed until the application starts up. The ports are switched after the application starts, and the old version is also kept running (without an access point) to roll back instantly if needed.

Similar solutions exist already for common servers.

Parallel deployment – Apache Tomcat

Apache Tomcat added the parallel deployment feature in its version 7 release. It lets two versions of the application run at the same time and takes the latest version as the default. Tomcat achieves this capability through its context container. The versioning is simple and straightforward: append ‘##’ and a version to the WAR name. For example, webapp##1.war and webapp##2.war can coexist within the same context; and for rolling back to webapp##1, all that is required is to delete webapp##2.

Although this feature might appear to be a trivial solution, apps need to take special care with shared files, caches (as much write-through as possible), and lower-layer socket usage.

Delayed port binding

This solution is not currently available in web servers. A typical server first binds to the port, then starts the services. Apache Tomcat lets you delay binding to some extent by overriding bindOnInit, but the binding still occurs when the connector is started.

What we propose here is the ability to start the server without binding the port and essentially without starting the connector. Later, a separate command will start and bind the connector. Version 2 of the software can be deployed while version 1 is running and already bound. When version 2 is started later, we can unbind version 1 and bind version 2. With this approach, the node is effectively offline only for a few seconds.
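The proposal above can be sketched as a server whose startup and port binding are separate steps. This is a minimal illustration rather than how any existing web server works: the class LateBindingServer, its methods, and the switch helper are all hypothetical names.

```python
import socket


class LateBindingServer:
    """Hypothetical server that separates startup from port binding."""

    def __init__(self, version):
        self.version = version
        self.started = False
        self.listener = None  # no socket exists until bind() is called

    def start(self):
        # Slow application startup (loading config, warming caches) would go here.
        self.started = True

    def bind(self, host, port):
        if not self.started:
            raise RuntimeError("start() must complete before bind()")
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # SO_REUSEADDR lets us rebind even if old connections linger in TIME_WAIT.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((host, port))
        sock.listen()
        self.listener = sock
        return sock.getsockname()[1]

    def unbind(self):
        if self.listener is not None:
            self.listener.close()
            self.listener = None


def switch(old, new, host, port):
    """Hand the port from old to new; the gap between the two calls is the downtime."""
    old.unbind()
    return new.bind(host, port)
```

In this sketch version 2 can call start() while version 1 still owns the port, so only the short unbind/bind pair inside switch() takes the node off traffic.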

The lifecycle for delayed port binding would look like this:


However, there is still a few-second glitch, so we will look at the next solution.

Advanced port binding

Now that we have minimized the window of unavailability to a few seconds, we will see if we can reduce it to zero. The only way to do that would be to bring version 2 up before version 1 goes down. But first:

Breaking the myth:  ‘Address already in use’

If you’ve used a server to run an application, I am sure you’ve seen this exception at least once. Let’s consider this scenario: We start the server and bind to the port. If we try to start another instance (or another server with the same port), the process fails with the error ‘Address already in use’. We kill the old server and start it again, and it works.

But have you ever given a thought as to why we cannot have two processes listening to the same port? What could be preventing it? The answer is “nothing”! It is indeed possible to have two processes listening to the same port.


The reason we see this error in typical environments is that most servers bind with the SO_REUSEPORT option off. This option lets two (or more) processes bind to the same port, provided the first process to bind had this option set when it bound. If this option is off, the OS interprets the setting to mean that the port is not to be shared, and it blocks subsequent processes from binding to that port.
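The behavior is easy to try out. The sketch below assumes a platform where Python exposes socket.SO_REUSEPORT (for example Linux 3.9 or later, or a BSD); for brevity it opens both listeners in a single process, which exercises the same kernel check that two separate processes would hit. The helper names are made up for this example.

```python
import socket


def listen_on(port, reuse_port):
    """Open a TCP listener on localhost, optionally with SO_REUSEPORT set."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if reuse_port:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind(("127.0.0.1", port))
    sock.listen()
    return sock


def can_share(reuse_port):
    """Return True if a second listener can bind to the first one's port."""
    first = listen_on(0, reuse_port)  # port 0 asks the OS for any free port
    port = first.getsockname()[1]
    try:
        second = listen_on(port, reuse_port)
    except OSError:
        return False  # typically 'Address already in use'
    else:
        second.close()
        return True
    finally:
        first.close()


print(can_share(reuse_port=False))
print(can_share(reuse_port=True))
```

Without the option the second bind should fail with the familiar error; with the option set on both sockets, both listeners should coexist on the same port.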

The SO_REUSEPORT option also provides fair distribution of requests (important since threading suffers from bottlenecks in multi-cores). Both of the threading approaches—one thread listening and then dispatching, as well as multiple threads listening—suffer from the under/over utilization of cycles. An additional advantage of SO_REUSEPORT is that it takes care of sending the datagram from the same client to the same server process. However, it has a shortcoming:  packets might be dropped if new processes are added or removed on the fly. This shortcoming is being addressed.

You can find a good article about SO_REUSEPORT at this link. If you want to try this out yourself, see this post on the Free-Programmer’s Blog.

The SO_REUSEPORT option addresses two issues:

  • The small glitch between the application version switching:  The node can serve traffic all the time, effectively giving us zero downtime.
  • Improved scheduling:  Data indicates (see this article) that thread scheduling is not fair; the ratio between the busiest thread and the one with the fewest connections is 3:1.


Please note that SO_REUSEPORT is not the same as SO_REUSEADDR, and that, at the time of writing, it is not available in Java because not all operating systems support it.


Applications can successfully serve traffic during deployment, if we carefully design and manage those applications to do so. Combining both late binding and port reuse, we can effectively achieve zero downtime. And if we keep the standby process around, we will be able to do an instant rollback as well.
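Putting the two ideas together, a handover might look like the sketch below: version 2 binds with SO_REUSEPORT while version 1 is still listening, and only then does version 1 release the port. It assumes a platform that supports SO_REUSEPORT, uses hypothetical helper names, and stands in for the two application versions with two sockets in one process.

```python
import socket


def reusable_listener(port):
    """Open a localhost TCP listener with SO_REUSEPORT set."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind(("127.0.0.1", port))
    sock.listen()
    return sock


def port_accepts(port):
    """Return True if a TCP connection to the port completes."""
    try:
        with socket.create_connection(("127.0.0.1", port), timeout=2):
            return True
    except OSError:
        return False


def handover():
    """Probe the port at each stage of a version 1 to version 2 switch."""
    v1 = reusable_listener(0)          # version 1 owns the port
    port = v1.getsockname()[1]
    checks = [port_accepts(port)]
    v2 = reusable_listener(port)       # version 2 binds while version 1 still listens
    checks.append(port_accepts(port))
    v1.close()                         # version 1 lets go; version 2 keeps listening
    checks.append(port_accepts(port))
    v2.close()
    return checks


print(handover())
```

Each probe should succeed, because at no point is the port without a listener. Note the caveat mentioned earlier: connections that were queued on version 1 but not yet accepted when it closes can still be dropped, so a real handover would drain version 1 first. For an instant rollback, version 1 would stay running unbound instead of exiting and rebind the same way.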

22 thoughts on “Zero Downtime, Instant Deployment and Rollback”

  1. Darryl Stoflet

    You mentioned tomcat 7. Did you use this with Java by providing your own SocketImpl and SocketOptions implementation?

    1. Suresh Mathew

      We did not have to do this, since we stopped at late binding in Tomcat. But PORT REUSE is an option if you don’t use Java sockets. Apache has its own socket implementation and exposes the option to pass PORT REUSE. But for HTTP connections, it uses Java sockets and so cannot be easily customized.

      You are correct, we will have to extend or implement our own socket implementation if we want to use PORT REUSE. It may be worth taking a look at this bug too.

      Python readily supports it, though it lacks a variable for PORT REUSE.

  2. Shaun

    That’s a good solution to the application side of the deployment. However, for me the difficult part has always been on the database side. With a non-trivial database of any size, there will eventually be breaking changes between releases, especially in an agile environment. At least that has been my experience. That can be solved with data migration scripts. However, they will, by their very nature, migrate data and alter structures. I have used tools like liquibase to make this easier but that does have the problem that once they are run, the database is no longer useful for the previous release. You could do things like duplicate the database and deploy the new release to the duplicate database and reconcile changes that occurred during that process afterwards. But when the database is multi-TB in size, this is easier said than done. Any pointers?

    1. Suresh Mathew

      Thanks, Lars, for pointing this out. Sure. Most platforms support it, including Windows, in different ways, although BSD introduced the idea.

  3. Pingback: Errata: Friday Nov 22nd | inthecloud247

  4. Rohit

    An enlightening post for me; I didn’t know that a server port can be shared. SO_REUSEPORT looks promising and worth looking into. Thank you.

  5. Pingback: downtime and disaster recovery | Prt.Sc

  6. sayantan

    Nice explanation of Zero Downtime Deployment. Please provide slightly more detailed flowcharts if possible. Good stuff. Good work.

  7. Suresh Mathew

    Thanks, Sayantan, for your comments. Let me try to draw a more detailed diagram than the last one, titled “Zero Downtime.”

  8. SinhaA

    Excellent article, Suresh. Do you know if other cloud offerings like Cloud Foundry use this concept?

  9. Pingback: Session Draining support scheme for server deployment - 出家如初,成佛有余 - Following e-commerce, the wireless Internet, and new media

  10. Suresh

    Interesting. BTW, if you unbind v1 immediately after v2 is up, wouldn’t existing connections to v1 be broken? So, when can v1 unbind from the port? Should it wait until all connections currently served by v1 terminate? What guarantees that the kernel will notify only v2, and not v1, of newer connections while existing connections are served by v1? For it to be truly zero downtime, shouldn’t this require some kernel and application changes with new socket options? Am I missing something?

    1. Kalyan Vemuru

      You should be able to use session replication across clusters (say V1 and V2) to be able to do this. If you use WAS, you can share the session data across clusters or store it in the database temporarily, so that the session is retained even after the user connects to V2.

  11. Suresh Mathew

    @Suresh, @Kalyan,
    Good question, and a good answer from the domain standpoint.
    But from the kernel perspective, draining is best achieved by shutting down the socket with a value of 1, meaning for writing. This means it can continue to drain traffic but can’t read any more new connections.

    Hope that helps.

