March 26, 2017

Some call me a Docker skeptic. But I prefer to think of myself as so excited about the technology that ineffective adoption is a bummer. While Docker has dominated the DevOps conversation, by and large its adoption is limited, both in the number of companies using it and in how they are using it. If you are using Docker, chances are it is for Dev/Test only, and for front-end applications only. However, given the tooling available from Docker directly and from third parties, there is no reason its use should be that limited.

By my estimation, culture and the neglect of elements critical to proper implementation are the real culprits behind its limited use. Specifically, we have not addressed how to store and manage containers and how to leverage server-level resources to build and run Docker containers more robustly.

Docker and Storage

For a while I’ve been talking about the private registry from a top-down governance perspective. But there is another way to look at it: the bottom-up, physical storage perspective. That means finding a way to:

  1. Extend container benefits to the whole app, including back ends and other styles of computation, such as Hadoop and big data
  2. Use bare metal to really make Docker scream
  3. Let Ops police resources, not containers
  4. Support back-end scale and redundancy

Containers hermetically seal applications and allow resource controls, to some degree, and that’s something that needs to be considered when it comes to storage infrastructure. But every application has state: something to store and something to back up. State also needs to be sealed, encapsulated and portable.

Forcing customers to separate out infrastructure into stateless portions (of the app, in containers) and stateful portions (in VMs) is more than just a bummer: It creates unnecessary duplication and makes it too easy to mismanage.

And there are vendors that have recognized this challenge, because they have been dealing with it all along. Portworx, for example, has attacked this issue head-on. (And from the storage up.)

Its team lives and breathes storage and distributed systems. Gou Rao was an early Linux contributor and later a storage CTO at a startup called Ocarina, picked up by Dell. Vinod Jayaraman designed high-speed networking products for F5 and has been applying distributed systems and networking principles to storage at Ocarina, Dell and now Portworx. Murli Thirumale was the CEO at Ocarina and, before that, at Net6, which Citrix acquired. Before that he was a GM at Hewlett-Packard. Eric Han was the first product manager at Kubernetes and co-founder of Google’s Container Engine, a multi-cloud container orchestration system. Needless to say, Portworx has history.

Portworx and the like are building multi-cloud storage for containers. The interesting thing is that they’re building storage not just for containers but implemented as a container. Storage spins up quickly: the container provisions the storage and cuts out super-complicated hardware configuration steps. It also lets admins manage storage separately, for portability (think across clusters and, someday, data centers) and control. The company just announced a first release for DevOps called PX-Lite. We’ll see how that does. I think it’s got a shot.

The point is, considering how your images are stored is very important for extending and stabilizing Docker implementations. The benefits of considering container storage are:

  1. Extending the benefits of Docker to back-end applications
  2. Increasing Docker adoption and reliability
  3. Boosting performance by using bare metal
  4. Allowing cloud infrastructure to just be a resource pool (i.e., not impact application architecture)

An Example

Let’s take a simple example I’ve talked about before: the Docker private registry. The registry is the fifth most popular container. So I’m a Docker shop; how do I run the registry? There’s definitely state, as the registry stores my container images. Every time a developer updates an app and wants to share it, these images are the artifacts.
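To make the "where does the state live" question concrete, here is a minimal sketch in Python that only assembles the `docker run` command for a registry whose images sit on the host rather than inside the container. The host path `/mnt/registry-data`, the container name and the function name are my own illustrative assumptions; the `registry:2` image and its `/var/lib/registry` data directory are the stock ones.

```python
import shlex


def registry_run_command(data_dir, name="registry", image="registry:2", host_port=5000):
    """Build the `docker run` argv for a registry that keeps its images in data_dir.

    data_dir is a hypothetical host directory; the registry image stores its
    data under /var/lib/registry inside the container.
    """
    return [
        "docker", "run", "-d",
        "--name", name,
        "--restart", "always",
        "-p", f"{host_port}:5000",               # host port -> registry port
        "-v", f"{data_dir}:/var/lib/registry",   # state lives outside the container
        image,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch is safe to try anywhere.
    print(shlex.join(registry_run_command("/mnt/registry-data")))
```

The container itself is disposable here; everything that matters is whatever backs that one mounted directory, which is exactly the part the storage layer has to take care of.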

Let’s say there’s an update to the registry image itself. (I realize this is starting to feel circular.) Shouldn’t I “pull” the new registry and deploy as I would any good container? What if I have a hard drive failure on that server? Just like storage did with virtual machines (VMs), I should be able to snapshot and back up my state. (Now I feel like I’m waxing philosophical.) I want to fail over easily, too. That’s not too much to ask. Even if I think I’m using Docker just for web apps, I still have an app (the registry) in my environment that is totally stateful.
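As a rough sketch of what "back up my state, then pull and redeploy" could look like by hand, the Python below archives a registry data directory and lists the Docker commands for swapping in a newer image. It assumes the registry's data sits in a host directory mounted at `/var/lib/registry`; the paths, container name and function names are hypothetical. A plain file archive is also only a stand-in for a real storage-level snapshot, and it is only consistent if nothing is writing to the registry while it runs.

```python
import tarfile
import time
from pathlib import Path


def snapshot_state(data_dir, backup_dir):
    """Archive the registry data directory into a timestamped .tar.gz file."""
    data_dir, backup_dir = Path(data_dir), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = backup_dir / f"registry-{stamp}.tar.gz"
    with tarfile.open(str(archive), "w:gz") as tar:
        # Store paths relative to the directory name, not the full host path.
        tar.add(str(data_dir), arcname=data_dir.name)
    return archive


def upgrade_steps(data_dir, name="registry", image="registry:2"):
    """List the commands that replace the container while keeping its state."""
    return [
        ["docker", "pull", image],   # fetch the updated registry image
        ["docker", "stop", name],    # stop the old container
        ["docker", "rm", name],      # remove it; the data directory is untouched
        ["docker", "run", "-d", "--name", name, "-p", "5000:5000",
         "-v", f"{data_dir}:/var/lib/registry", image],
    ]


if __name__ == "__main__":
    for step in upgrade_steps("/mnt/registry-data"):
        print(" ".join(step))
```

Doing this per server and per app is the manual version of the problem; snapshots and failover that follow the container around are what a container-aware storage layer is meant to provide.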

Now, I put all those containers, registry, web app, WordPress and databases together. Why was I wasting compute cycles and server hardware on VMs? Hypervisors were great—when apps polluted the OS. Now that I can package and keep my apps portable, I want the old benefits of storage, but in a way that plays in this new world.

There is much more to adopting Docker than the big green Go button. Strangely, its adoption is shaped largely by who introduces the technology (Operations or Dev), its initial use case (dev/test/prod) and its perceived benefit.

For Docker to expand across your entire delivery chain and support production long into the future, there are some bigger considerations that must be made.

Some are quite obvious: How are you going to manage the storage of your images and layers? How are you going to distribute resources at the infrastructure level to the containers that run on it? The latter requires a storage layer for Docker that is node-independent, that has support tools to give you high visibility into all your containers, and that can support the growth and speed of a modern delivery chain.

Chris Riley

Chris Riley (@HoardingInfo) is a technologist and DevOps analyst for Fixate IO. He helps organizations make the transition from traditional development practices to a modern set of culture, tooling, and processes that increase the release frequency and quality of software. He is an O'Reilly author, speaker, and subject matter expert in the area of DevOps Strategy, Machine Learning, and Information Management.