Stateful Container High Availability: Good, Better, Best

A 2020 McKinsey global study found that, as a result of the COVID-19 pandemic, companies have been pushed “over the technology tipping point.” The research found that as digital adoption has ramped up in response to pandemic-related changes in the business environment, enterprises have accelerated the digitization of their customer and supply-chain interactions—as well as their internal operations—by as much as four years. 

Companies had already started increasing digital transformation (DX) projects—replacing manual IT tasks with software to automate the test, configuration and runtime processes—and for good reason. It’s been recognized for decades that DX can not only dramatically improve IT and business capabilities but can also save money. As a result, those organizations that leverage DX can create and maintain a competitive advantage by: 

  • Transforming the employee experience (EX) and customer experience (CX)
  • Increasing top-line growth 
  • Contributing a substantial boost to the bottom line

But while many companies had already started to recognize the value of a DX approach, the pandemic introduced a new sense of corporate urgency in meeting goals related to digital adoption. Organizations that previously had the luxury of remaining reluctant about e-commerce were suddenly forced to accelerate their DX work.

DX Drives Software Container Use

One result of all these DX initiatives has been an explosion of software container use, and “explosion” is not an exaggeration. A 2021 report from ResearchAndMarkets predicts that the application container market will register a compound annual growth rate (CAGR) of 29% during the five-year period between 2021 and 2026. Separately, the Cloud Native Computing Foundation (CNCF) Cloud Native Survey 2020 reported that the use of containers in production has increased 300% since 2016, reaching 92% in 2020, up from 84% in 2019.

The CNCF survey also revealed steady growth in the number of containers that organizations run. Those using more than 5,000 containers hit 23% in 2020, up 109% from 2016. Those using more than 250 containers hit 61% in 2020, up from 57% in 2019.

The Emergence of Stateful Containers

While containers were originally designed to be entirely stateless and ephemeral, DX initiatives are also driving the adoption of stateful containers. Previously, containers would spin up, do their job and disappear, leaving no record of what happened while they were running. 

The reason behind the shift to stateful containers is that most real-world applications need to retain state; CNCF survey respondents confirm this: 

  • 55% use stateful containers in production
  • Nearly a quarter (22%) only use stateless containers
  • 12% are evaluating stateful containers
  • 11% plan to use stateful containers in the next year
     

Good, Better, Best HA

But how can organizations move these projects from the development and test phase to production? The answer is that companies will need to deploy a good, better, best high availability (HA) methodology for their stateful containers. Keep in mind that the deployment architecture for containers at scale generally consists of three components:

  • Nodes, each of which is a virtual or physical machine that can host multiple pods.
  • Pods, each of which is a group of one or more containers with shared storage and network resources and a specification for how to run the containers.
  • A container orchestrator, such as Kubernetes or Docker Swarm, which schedules pods onto nodes and restarts them when they fail.
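To make the relationship between these three components concrete, here is a minimal sketch in Python. It is an illustrative data model only, not an orchestrator API: the class names, the pod and node names, and the "web-app" and "log-agent" images are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    name: str
    image: str


@dataclass
class Pod:
    """One or more containers that share storage and network resources."""
    name: str
    containers: list[Container] = field(default_factory=list)
    volumes: list[str] = field(default_factory=list)  # storage shared by the pod's containers


@dataclass
class Node:
    """A virtual or physical machine that can host multiple pods."""
    name: str
    kind: str  # "virtual" or "physical"
    pods: list[Pod] = field(default_factory=list)


@dataclass
class Cluster:
    """The set of nodes an orchestrator manages (hypothetical model)."""
    nodes: list[Node] = field(default_factory=list)

    def container_count(self) -> int:
        # Walk nodes -> pods -> containers and count the leaves.
        return sum(len(p.containers) for n in self.nodes for p in n.pods)


db_pod = Pod("db-0", [Container("mssql", "mssql-server")], volumes=["db-data"])
web_pod = Pod("web-0", [Container("web", "web-app"), Container("log-agent", "log-agent")])
cluster = Cluster([Node("node-a", "virtual", [db_pod, web_pod]), Node("node-b", "physical")])
print(cluster.container_count())  # 3
```

Note that the stateful pod ("db-0") is the one that carries a volume; that attachment is what has to be re-established whenever the pod is restarted or moved.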

“Good HA” looks like this: When a node fails, the container orchestrator (for example, Kubernetes) managing that node starts a replacement pod on a different node. The container reconnects to services such as storage and networking, and the application reconnects to the container. The full process is relatively time-consuming and can take several minutes to complete.
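The good HA behavior described above can be sketched as a simple rescheduling step. This is a toy simulation, not how Kubernetes or any other orchestrator implements scheduling: the function name, the node and pod names, and the "least-loaded node" placement rule are assumptions made for illustration.

```python
def reschedule_failed_node(placement, failed_node, healthy_nodes):
    """Return a new placement with the failed node's pods moved elsewhere.

    placement maps a node name to the list of pod names running on it.
    """
    if not healthy_nodes:
        raise RuntimeError("no healthy node available")

    # Keep every pod that was not on the failed node.
    new_placement = {n: list(pods) for n, pods in placement.items() if n != failed_node}
    for node in healthy_nodes:
        new_placement.setdefault(node, [])

    # Start a replacement for each lost pod on the least-loaded healthy node.
    for pod in placement.get(failed_node, []):
        target = min(healthy_nodes, key=lambda n: len(new_placement[n]))
        new_placement[target].append(pod)
    return new_placement


placement = {"node-a": ["db-0", "web-0"], "node-b": ["web-1"], "node-c": []}
after = reschedule_failed_node(placement, "node-a", ["node-b", "node-c"])
print(after)  # {'node-b': ['web-1', 'web-0'], 'node-c': ['db-0']}
```

The sketch only captures placement. In a real cluster, the minutes of downtime come from what follows: detecting the node failure, starting the replacement pod, reattaching storage and networking, and letting the application reconnect.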

You can build on good HA to get “better HA.” Here, when a container fails (rather than a whole node), the container orchestrator bootstraps another instance of the container and reattaches it to its services, and the application then reconnects to the container. This restart is generally faster than “moving” a pod to a different node as in good HA, but it can still take minutes to execute.
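As a rough illustration of better HA, the restart-in-place behavior can be modeled as a small supervision loop. This is a sketch under stated assumptions: `supervise`, `flaky_container` and the `max_restarts` limit are hypothetical names chosen for the example, and a real orchestrator would also reattach storage and networking after each restart.

```python
def supervise(start_container, max_restarts=3):
    """Start a container; if it crashes, bootstrap a new instance in place.

    Returns the container's state and the number of restarts that were needed.
    """
    restarts = 0
    while True:
        try:
            return start_container(), restarts
        except RuntimeError:
            if restarts >= max_restarts:
                raise  # give up and surface the failure
            restarts += 1


attempts = {"n": 0}


def flaky_container():
    # Hypothetical container that crashes on its first two starts.
    attempts["n"] += 1
    if attempts["n"] <= 2:
        raise RuntimeError("container crashed")
    return "running"


state, restarts = supervise(flaky_container)
print(state, restarts)  # running 2
```

Because the pod stays on the same node, nothing has to be rescheduled, which is why this path is usually quicker than the node-failure case.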

Ideally, you want “best HA,” in which, as part of deploying a containerized application, the containerized application instance (for example, mssql-server) is replicated across a group of containers. This group can be spread across separate container orchestrator clusters and, if deployed in one or more clouds, across multiple availability zones and regions. The advantage of best HA is that if the primary node, pod or container fails, the application is reconnected almost instantly to a secondary container running in a different pod on another node. This means the “good” and “better” HA scenarios are enhanced with near-zero downtime.

Optimizing Your Kubernetes Cluster

Best HA, as described above, is beyond the capabilities of container orchestrators such as Kubernetes and Docker Swarm. For medium and large organizations running enterprise database systems such as SQL Server on bare metal or VMs, database-level HA has traditionally been provided by SQL Server Availability Groups (AGs). However, highly available SQL Server AGs have not been supported in containers until recently, hindering organizations’ ability to take advantage of DX.

For SQL Server users, an ideal solution is one that accelerates an enterprise’s DX by speeding the adoption of HA stateful containers. The solution should provide highly available SQL Server AG support for SQL Server containers, including for Kubernetes clusters. 

Look for a system that helps customers deploy stateful containers to create new and innovative applications, while also improving operations with near-zero recovery-time objectives. This functionality allows a company to deliver better products and services more efficiently and at lower cost. The goal is to help organizations generate new revenue streams by allowing them to build distributed containerized AG clusters across availability zones and regions. The result: hybrid and multi-cloud environments that can rapidly adapt to ever-changing market conditions and consumer preferences.

Don Boxley

Don Boxley Jr is a DH2i co-founder and CEO. Don earned his MBA from the Johnson School of Management, Cornell University.
