Kubernetes and Distributed SQL: The Perfect Pair

Kubernetes together with distributed SQL enables developers to build applications to solve business problems and enable true digital transformation

The popularity of containers in the technology industry has increased greatly over the last 10 years. Many credit this growth to the numerous benefits containers provide for developers, DevOps teams and enterprises running (or looking to run) modern, microservices-based, cloud-native applications. Containerization helps organizations become more flexible, move faster and gain freedom in their choice of underlying infrastructure. In fact, it is predicted that more than 50% of companies will use container technology this year, up from less than 20% in 2017.

The power of containers stems from their portability, agility and ability to enable consistency across application environments. Today, containers and container orchestration technologies enable enterprises to adopt infrastructure as code and accelerate the pace at which they can take an app from development to test to production. A decade or so ago, however, the infrastructure as code movement required … a lot more coding. A typical enterprise ran its own data centers alongside AWS and had to write a lot of code just to deploy its applications to those locations, because each infrastructure provider had its own nuances to consider. DevOps teams spent a significant amount of time and effort writing code to decouple the app from the underlying infrastructure so that it could be deployed across different environments.

Then a few years later, the landscape changed. Enter Kubernetes (on the heels of Docker before it) and more public clouds: Azure and Google Cloud. Almost overnight, the old infrastructure as code era—even though it was better than the Everything Manual era before it—was no longer agile enough to develop applications and test and deploy them consistently from the laptop and across application environments, hosted in the cloud or multiple clouds and on-premises.

Kubernetes is the technology that allowed the old era of infrastructure as code to be replaced with a new, faster way of deploying and operating infrastructure. Over the years, much has been written about the war between the top three container orchestration contenders: Kubernetes, Docker Swarm and Apache Mesos. However, there is no denying that Kubernetes has come out on top, consistently holding the leading position as the most widely deployed container orchestration technology thanks to its open source model and diverse community of developers. Today, using Kubernetes, developers and DevOps engineers are truly able to build once and deploy many times because, for example, the way to request disk storage on AWS is similar to the way to request it on Google Cloud.
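To make the disk storage point concrete, here is a minimal sketch in Python that builds a Kubernetes PersistentVolumeClaim manifest as a plain dictionary for two clouds and reports which fields differ. The helper names and the two storage class names are hypothetical placeholders chosen for illustration; the class names available in a real cluster depend on how that cluster is configured.

```python
def build_pvc(name: str, size_gi: int, storage_class: str) -> dict:
    """Return a PersistentVolumeClaim manifest as a plain dict."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": f"{size_gi}Gi"}},
        },
    }


def flatten(d: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into dotted paths, e.g. 'spec.storageClassName'."""
    out = {}
    for key, value in d.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            out.update(flatten(value, path))
        else:
            out[path] = value
    return out


def differing_fields(a: dict, b: dict) -> list:
    """List the dotted paths whose values differ between two manifests."""
    fa, fb = flatten(a), flatten(b)
    return sorted(k for k in fa.keys() | fb.keys() if fa.get(k) != fb.get(k))


# Hypothetical storage class names; real ones are cluster-specific.
aws_claim = build_pvc("app-data", 100, "example-aws-ssd")
gcp_claim = build_pvc("app-data", 100, "example-gcp-ssd")

print(differing_fields(aws_claim, gcp_claim))  # ['spec.storageClassName']
```

Under these assumptions, the request for 100 GiB of storage is identical on both clouds apart from a single field naming the storage class.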

The infrastructure-as-code DevOps movement has evolved hand in hand with trends in the applications themselves. Transactional and user-facing applications increasingly require higher availability, instant scalability, the ability to run anywhere (including multi-cloud and hybrid cloud environments) and operational simplicity throughout an application's lifetime.

As infrastructure and the applications themselves have evolved, so, too, has the data tier. SQL has been the de facto language for relational databases (aka RDBMS) for decades. However, the original SQL databases, such as Oracle, PostgreSQL and MySQL, are single-node solutions that cannot automatically distribute data and queries across multiple instances to provide high availability and scale. On the path to scalability and resilience, NoSQL databases including MongoDB and Apache Cassandra came into prominence in the mid- to late 2000s. They were originally positioned as alternatives to the monolithic SQL databases of the time, and their distributed nature was attractive to applications and application developers. The various NoSQL languages focused on single-row (aka key-value) data models and gave up the relational, multi-row constructs of the SQL language. However, enterprises quickly realized that NoSQL databases had to coexist alongside SQL databases rather than replace them. The primary reason SQL databases remained necessary was the need for relational data modeling with support for single-row consistency as well as multi-row ACID transactions. The early 2010s saw the advent of NewSQL databases, also known as “scalable” SQL databases, built to support large-scale OLTP workloads where both data correctness and scalability were important; however, even NewSQL databases come with compromises, especially in Kubernetes-native, multi-cloud deployments.
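For readers less familiar with what a multi-row ACID transaction guarantees, the following is a minimal sketch using Python's built-in sqlite3 module. SQLite is a single-node embedded database and is used here only because it ships with Python; the table, the account names and the `transfer` helper are hypothetical. The point is that two rows change together or not at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " id TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; return True if committed."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit above is undone too.
        return False


def balances(conn):
    """Return all balances as a dict keyed by account id."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))


print(transfer(conn, "alice", "bob", 30))   # True
print(balances(conn))                       # {'alice': 70, 'bob': 80}
print(transfer(conn, "alice", "bob", 500))  # False
print(balances(conn))                       # {'alice': 70, 'bob': 80}
```

In the failed transfer, the credit to one row has already been applied when the debit to the other row is rejected, yet neither change survives. A purely single-row data model leaves that coordination to the application.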

As a result, enterprises have turned to distributed SQL databases to gain the combined capabilities of traditional single-node SQL systems (strong consistency, ACID transactions and support for SQL syntax), the distributed nature of NoSQL and the scalability of NewSQL.

With Kubernetes-driven orchestration of containerized applications, enterprises get the ability to automatically scale services, make them fault-tolerant, deploy upgrades with no downtime and more. This all makes sense when the application is stateless: control rests entirely with Kubernetes, which manages the entire life cycle.

When it comes to stateful applications (applications that store data; a database is one example), Kubernetes can offer the benefits of scale, fault tolerance and more, but the stateful app itself needs to be orchestration-ready and deliver on those promises as well. The stateful app has to be ready to be scalable and fault-tolerant, all without losing data.

A SQL database is a stateful application and is one of the most complex workloads to run in Kubernetes. The ephemeral nature of Kubernetes pods and the constant need to reschedule them onto a new Kubernetes host require the underlying database tier to become equally agile. Otherwise, the application will see outages, slowdowns and, worst of all, data loss and incorrect results. Most stateful SQL databases cannot derive the benefits of Kubernetes; developers essentially have to tell Kubernetes not to apply those benefits because the database can’t handle them. However, a distributed SQL database can solve these challenges and enable enterprises to take advantage of the inherent benefits that Kubernetes offers.
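The difference pod rescheduling makes can be illustrated with a deliberately simplified toy model in Python. It assumes that a rescheduled pod comes back on a new host with an empty local disk, and that a replicated store keeps a full copy of the data on every pod. The `ToyStore` class and the pod names are hypothetical, and the sketch omits everything a real distributed SQL database needs (consensus, sharding, transactions); it only shows why keeping copies on several pods lets data survive the loss of one.

```python
class ToyStore:
    """Toy key-value store that keeps a full copy of the data on every pod."""

    def __init__(self, pod_names):
        # Each pod's "local disk" is modeled as its own dict.
        self.pods = {name: {} for name in pod_names}

    def write(self, key, value):
        # Assumption: every write is applied to all pods.
        for disk in self.pods.values():
            disk[key] = value

    def read(self, key):
        # Return the value from the first pod that has it, else None.
        for disk in self.pods.values():
            if key in disk:
                return disk[key]
        return None

    def reschedule(self, name):
        """Simulate a pod moving to a new host and losing its local disk."""
        del self.pods[name]
        survivors = list(self.pods.values())
        # A fresh pod re-copies data from a surviving peer, if one exists.
        self.pods[name] = dict(survivors[0]) if survivors else {}


single = ToyStore(["db-0"])
single.write("order-1", "paid")
single.reschedule("db-0")
print(single.read("order-1"))      # None

replicated = ToyStore(["db-0", "db-1", "db-2"])
replicated.write("order-1", "paid")
replicated.reschedule("db-0")
print(replicated.read("order-1"))  # paid
```

In this model, the single-pod store loses its only copy when the pod moves, while the three-pod store rebuilds the rescheduled pod from a surviving peer.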

As enterprises move from cloud-hosted to cloud-native environments, the way DevOps teams build applications and store globally distributed data is changing. With Kubernetes as the industry’s standard container orchestration system, developers can efficiently build more complex applications and data storage environments. Pairing containerized environments with distributed SQL makes this process much easier, allowing enterprises to solve business problems and enable true digital transformation.

Karthik Ranganathan

Karthik Ranganathan is co-founder and CTO of Yugabyte. An ex-Facebook product leader, Karthik helped build the NoSQL platform that powered Facebook Messenger and its internal time series monitoring system, along with other Yugabyte co-founders, Kannan Muthukkaruppan (CEO) and Mikhail Bautin (Software Architect).
