Unless your application is entirely stateless, it will need to store and retrieve persistent data. This is where databases come in: most expose a query language that allows authorized users to retrieve and edit data. And now that inroads have been made to enable stateful Kubernetes deployments, many organizations are looking to bring the scalability advantages of containerization and Kubernetes to database management.
The Cloud Native Computing Foundation (CNCF) is home to a wide variety of helpful open source projects. These tools span cloud native networking, continuous integration, scheduling, orchestration and many other areas. Below, we’ll review three CNCF projects that can be used to support cloud-native databases. These tools specifically help scale the management of distributed databases in the cloud.
A highly scalable, low latency, easy-to-use key-value database
TiKV is a key-value database that excels at working with large amounts of data. It’s essentially a unified distributed storage layer designed to scale to petabyte-scale deployments or trillions of rows. TiKV provides an ACID-compliant transactional key-value API and boasts minuscule response times.
TiKV’s structure includes TiKV nodes, which store key-value pairs, and Placement Driver (PD) nodes, which manage TiKV clusters. TiKV was inspired by Google’s Spanner, a globally distributed database. Projects like TiDB, Zetta, Tidis, Titan and JuiceFS all use TiKV. To get started with TiKV, you can follow the step-by-step instructions here. Initially created by PingCAP, TiKV is now a graduated project under the CNCF umbrella.
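To make the split between storage nodes and a placement service more concrete, here is a toy Python sketch. It assumes the keyspace is divided into contiguous key ranges, each owned by one storage node, and that a directory answers "which node holds this key?". The class name `RegionDirectory`, the split keys and the node names are all hypothetical; the sketch does not use any TiKV client library and is not a description of TiKV's internals.

```python
from bisect import bisect_right


class RegionDirectory:
    """Toy stand-in for a placement service: maps key ranges to storage nodes."""

    def __init__(self, split_keys, nodes):
        # N sorted split keys define N + 1 contiguous key ranges.
        if len(nodes) != len(split_keys) + 1:
            raise ValueError("need exactly one node per key range")
        if list(split_keys) != sorted(split_keys):
            raise ValueError("split keys must be sorted")
        self.split_keys = list(split_keys)
        self.nodes = list(nodes)

    def locate(self, key: bytes) -> str:
        # bisect_right gives the index of the range that contains the key;
        # a key equal to a split key belongs to the range that starts there.
        return self.nodes[bisect_right(self.split_keys, key)]


# Hypothetical layout: three ranges, (-inf, "g"), ["g", "p"), ["p", +inf).
directory = RegionDirectory(
    split_keys=[b"g", b"p"],
    nodes=["store-1", "store-2", "store-3"],
)

for key in (b"apple", b"kiwi", b"zebra"):
    print(key, "->", directory.locate(key))
```

In this sketch a client asks the directory where a key lives and then talks to that storage node directly, which is one common way to keep the placement service off the data path.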
A clustering system for horizontal scaling of MySQL
Vitess was born in 2010 to respond to issues encountered with running MySQL at scale at YouTube. Vitess is described as a sharding middleware for MySQL. Using Vitess, you can shard a MySQL database while your application still appears to communicate with a single, unified database, all without adding sharding logic to the application itself.
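The idea of a middleware that hides sharding from the application can be illustrated with a small Python sketch. It assumes simple hash-based routing on a sharding key; the `ShardRouter` class, the shard names and the query are hypothetical, and this is not how Vitess is implemented or configured. It only shows why the application can keep issuing ordinary queries while something in between decides which shard receives them.

```python
import hashlib


class ShardRouter:
    """Toy routing layer: picks a shard from a hash of the sharding key."""

    def __init__(self, shard_names):
        if not shard_names:
            raise ValueError("at least one shard is required")
        self.shard_names = list(shard_names)

    def shard_for(self, sharding_key: str) -> str:
        # A stable hash keeps the same key on the same shard across calls
        # and across processes (unlike Python's built-in hash()).
        digest = hashlib.sha256(sharding_key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.shard_names)
        return self.shard_names[index]

    def route(self, sharding_key: str, sql: str):
        # The caller passes an ordinary query; the router adds the target shard.
        return self.shard_for(sharding_key), sql


router = ShardRouter(["shard-0", "shard-1", "shard-2"])
target, query = router.route("user:42", "SELECT * FROM orders WHERE user_id = 42")
print(target, query)
```

Plain modulo hashing like this reshuffles most keys whenever the shard count changes, which is one reason production systems tend to use more elaborate key-to-shard mappings.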
Since MySQL doesn’t natively support sharding, Vitess is a helpful middleware for retaining MySQL while enabling more distributed database architectures. Vitess also includes other intelligent features, such as automatically handling failovers and backups and rewriting queries that might cause poor performance. YouTube, Weave, Square, Slack and HubSpot are just some of the many companies using Vitess.
A Kubernetes operator for declarative database schema management
Described as an “open source database schema migration tool,” SchemaHero is last in our short — but sweet — list of CNCF cloud-native database tools. SchemaHero was developed in response to potential issues around database schema changes, migration failures, and schema auditing when working with various languages and development platforms.
SchemaHero is a GitOps-friendly, declarative utility for managing database schemas. It’s deployed as a Kubernetes operator and is language neutral, meaning it’s compatible with whatever language developers are using, whether it’s Python, Rust, Go or Java. By using SchemaHero, engineers describe the schema they want instead of writing sequenced migration scripts that must be compatible with every environment.
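The difference between declarative schema management and sequenced migration scripts can be sketched in a few lines of Python. The function below compares a desired column set with the columns a database currently has and derives the statements needed to reconcile them. The `plan_changes` function, the table and the column definitions are hypothetical, the generated SQL is PostgreSQL-flavored, and this is only an illustration of the general approach, not SchemaHero's code or its manifest format.

```python
def plan_changes(table, current, desired):
    """Return ALTER TABLE statements that move `current` columns to `desired`.

    Both arguments map column name -> column type.
    """
    statements = []
    for name, col_type in desired.items():
        if name not in current:
            statements.append(f"ALTER TABLE {table} ADD COLUMN {name} {col_type}")
        elif current[name] != col_type:
            statements.append(
                f"ALTER TABLE {table} ALTER COLUMN {name} TYPE {col_type}"
            )
    for name in current:
        if name not in desired:
            statements.append(f"ALTER TABLE {table} DROP COLUMN {name}")
    return statements


# What one environment happens to have today.
current = {"id": "integer", "name": "varchar(50)", "legacy_flag": "boolean"}
# What the checked-in, declarative definition says it should be.
desired = {"id": "integer", "name": "varchar(100)", "email": "text"}

plan = plan_changes("users", current, desired)
for statement in plan:
    print(statement)
# ALTER TABLE users ALTER COLUMN name TYPE varchar(100)
# ALTER TABLE users ADD COLUMN email text
# ALTER TABLE users DROP COLUMN legacy_flag
```

Because the plan is computed from whatever state each environment is in, the same desired definition can be applied to a fresh database and to one that is several changes behind, without maintaining an ordered chain of scripts.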
At the time of writing, SchemaHero is a sandbox project within the CNCF. For more information, you can consult the documentation or get started with this introductory walkthrough.
Above, we’ve reviewed a few cloud-native database projects hosted by CNCF. As we’ve seen, these tools help support database operations and can extend traditional database types. Of course, there are many other open source options available. Some of the more popular modern databases of note include Neo4j, PostgreSQL, Cassandra, YugabyteDB, Couchbase, MongoDB, CockroachDB and FaunaDB.