In case you are wondering what Docker is about and how it relates to DevOps, here’s a quick take on how Docker bolsters the DevOps mission by helping operations become more agile.
The pain of operations lies in the complexity of environmental configuration and setup. An application makes assumptions about the environment in which it is meant to operate, and it may be meant for more than one environment.
Loosely stated, an environment is everything that is required for the application to run, but is not included in the application build. This usually includes the operating system, the database, the web server, the browser, the run-time libraries, services from other applications, data files, metadata, style-sheets, and various files, scripts, properties, registry settings, and several other resources. Defining the environment implies identifying the versions of each of the components that are required for the application to run.
Once an environment is defined, it has to be set up. And then the application has to be moved into that environment. The first two steps are sometimes called environment configuration and the third step is called software deployment, though the term is often used to embrace the previous two steps as well, which is how I use it in this piece.
Deployment into production is complex for several reasons, among them:
- The environment may be difficult to configure and therefore difficult to set up.
- It is difficult to ensure the stability of the environmental settings, especially when multiple applications are hosted by the same operating system. For instance, one application running in the same environment may change some of the resources assumed by another application.
- As customers vary in their choice of features, and as software vendors race to keep pace with new demands, living with different versions of an application is a modern-day necessity. Flexibility is often introduced at build-time through strategies like code forks and switches. However, a lot of the flexibility can (and often should) only be provided by introducing install-time or run-time parameters that depend on environmental variables. The downside of such flexibility is the environmental complexity it creates, because different versions of an application will now need to be released into different environments. Matching environmental variables to build versions is no trivial task.
Reasons of budget and speed often dictate different kinds of environments at various stages of delivery. For instance, developers often leave the heavyweight performance tests to the testers and are unlikely to need complex production servers for their commits. Providing production-sized servers to developers affects both the budget and the speed of deployment. Imagine the time and resources that would have to be spent on provisioning and setting up a production server every time a build needs to be executed in development (which is ever so frequent). So development environments are usually different from production.
And so are most of the other environments in an organization’s deployment pipeline. Unit testing, integration testing, and user acceptance testing environments would all normally differ from production unless, of course, you have deep pockets. The only exception could be pre-production staging, which should be as close to the production environment as realistically possible.
So environmental variations during the lifecycle exist along with variations after release. These variations are what make life difficult for Operations in a continuous delivery pipeline.
- Tracking and linking the builds at various stages of delivery to their environmental properties is the first task for Operations.
- Once such a ‘configuration map’ is prepared, the next task of the operations team is to make sure that the correct environments are set up at the proper juncture. The ability to set up environments presupposes the ability to dismantle them as well. This is easy if you are using virtual machines but not if you are on bare metal.
- The final task is to move the builds into the environments set up for them. Often this is a matter of copying a build onto the right machine, but in a scenario where there can be scores or even hundreds of builds, this apparently simple task can be mind-boggling.
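At its simplest, the ‘configuration map’ described above is just a lookup from build version to environment artefact. A minimal sketch in shell (the function name, version strings, and file paths are all hypothetical, not taken from any real tool):

```shell
#!/bin/sh
# Hypothetical configuration map: build version -> environment artefact.
# In practice a CLM tool would maintain these linkages under version control.
lookup_env_artefact() {
  case "$1" in
    2.1.0)     echo "envs/production.env" ;;
    2.2.0-rc1) echo "envs/staging.env" ;;
    *)         echo "envs/dev.env" ;;      # default for developer builds
  esac
}

lookup_env_artefact 2.2.0-rc1   # prints envs/staging.env
```

A deployment tool would consult such a map at each stage of the pipeline to decide which environment to set up before moving the build into it.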
Various tools and practices have been in vogue to make these tasks simpler. One must-do practice is to encapsulate the environment properties as some kind of artefact (often a simple file) and keep it under version control. Sophisticated tools like IBM Collaborative Lifecycle Management (CLM) integrate these artefacts into their overall configuration management function and maintain a map of linkages between environment configuration artefacts and build versions.
It would then theoretically be possible for a deployment tool to look up the artefact repository based on, say, the build version, access the environmental properties, and run a deployment script by passing the appropriate environment variables to it (to represent this rather simplistically). This approach is a big advance over manual deployment, but it is not always fail-safe. One of the problems is that it is not always possible to clean up automatically afterwards. For instance, if a particular script installs a library file, it cannot always uninstall the library later (say, when you are done with testing an application). Most deployment scripts make some assumptions about the pre-existing environment, which, if violated, can lead to unpredictable consequences. Most operations personnel would concur (for this very reason) that it is better to install an environment from scratch than to attempt to repair or modify an existing one.
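The script-driven approach just described can be sketched as follows: environment properties live in a version-controlled file, and the deployment step sources it before launching the application. The file name and variable names here are purely illustrative:

```shell
#!/bin/sh
# A hypothetical environment-properties artefact, normally kept under
# version control alongside the build it belongs to.
cat > staging.env <<'EOF'
DB_HOST=staging-db.example.com
DB_PORT=5432
LOG_LEVEL=debug
EOF

# Export every property so the deployment script can read them.
set -a
. ./staging.env
set +a

echo "deploying against $DB_HOST:$DB_PORT"
```

Note that this sketch illustrates the fragility discussed above as well: it configures the environment, but nothing here can undo side effects the deployment itself might have on the host.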
Virtual machines (VMs) tackle this problem. Because you can set up a VM and shut it down fairly easily, running a particular test in a particular environment is a relatively easy task. All you need to do is boot up a VM from scratch with the right configuration (no need to bother about pre-existing configurations), run the application on the VM, and drop the VM after you are done with the application. Everything on the VM goes away along with it. Unfortunately, that could often mean setting up a VM for every separate service that needs to be tested, which is a considerable waste of resources. Additionally, VM provisioning time can often reduce deployment velocity.
One technology that has the potential to solve this problem is containerization. A container is a bit like a VM. It sits on top of a host operating system (OS) and provides all the services that are required to run an application. It hosts an application and provides the environmental resources required to run the application. A containerized application does not talk to the OS or to other resources outside the container but only to the services provided by the container.
Docker’s implementation of the container concept is elegant, and perfect for continuous delivery. Docker bolsters the core container technology with an easy way to containerize an application and to deploy a container. The corollary of this is that it is now possible to pre-configure the environment in which an application is meant to run, bundle it with the application executable into a single image, push the combined package onto a host system, and then use the image to install the environment and launch the application. Such an application is said to be ‘dockerized’.
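To make this concrete, dockerizing an application typically starts from a Dockerfile that declares the environment alongside the application build. A minimal sketch, assuming a Python application (the base image, file names, and start command are illustrative and will differ for your stack):

```dockerfile
# Declare the base environment: a slim OS layer plus the run-time.
FROM python:3.12-slim

# Bake the environment configuration and the application into one image.
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .

# Launching a container from this image starts the application inside
# its pre-configured environment.
CMD ["python", "app.py"]
```

Building this with `docker build -t myapp .` and launching it with `docker run myapp` reproduces the same environment on any host that runs a Docker daemon, which is precisely the ‘bundle environment with executable’ idea described above.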
The benefits of dockerization are relevant for DevOps and continuous delivery:
- First, the container eliminates the need for VMs. It provides applications with a trimmed-down version of the operating system. The entire OS is not virtualized as in VMs, but only some select services that run in user space. These services then interact with the underlying host OS and hardware by translating the service and resource requests made by the application into the appropriate host OS routines.
- Second, Docker reduces the vagaries of deployment because it spins up a new environment from scratch, and does not attempt to ‘repair’ a pre-existing environment (which as we discussed can be unpredictable).
- Third, it makes container creation and deployment relatively easy, which is what makes it amenable to DevOps. Almost anyone can, with a bit of training, dockerize an application, create its image, spin up the container on a host machine, run the application within the container (and test it if required), and spin down the container (and all its contents) when done. This not only reduces deployment time but, equally importantly, also blurs the dividing line between development and operations, because developers can now deploy and test their applications without depending on operations personnel to set up the appropriate environment at the appropriate time.
- Finally, Docker brings nimbleness to operations. Since Docker makes it easy to set up and tear down environments, it gives operations teams the confidence to work faster, make changes, roll back configurations, and, if need be, start over from scratch. As a result, agility need no longer be a buzzword confined to design meetings or developer scrums; it can be a visible characteristic of operations as well. This agility, manifesting itself in short concept-to-cash cycle times, frequent incremental releases, and happier customers, is DevOps in its full-bloom glory.
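The spin-up/tear-down cycle described in the points above maps onto a handful of CLI commands. A sketch (the image and container names are hypothetical, and a running Docker daemon is assumed):

```shell
docker build -t myapp:2.1.0 .                   # containerize the application as an image
docker run -d --name myapp-test myapp:2.1.0     # spin up a fresh environment from scratch
# ...run the application's tests against the container...
docker stop myapp-test && docker rm myapp-test  # tear it down; everything inside goes with it
```

Because the container is always created from the same image, every run starts from a known-good environment rather than a ‘repaired’ one.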
So back to our core point – Docker takes the pain out of operations at various stages of the delivery lifecycle – development, testing, integration, and release. However, it needs a lifecycle collaboration and management tool to help it merge into the lifecycle – communicate with other tasks, access common resources, and be visible to relevant stakeholders. While Docker automates the deployment process, reduces risk, and thus enables frequent yet reliable deployments, you need a mature and stable management tool that can interoperate with it and accomplish the important tasks of mapping image files to builds, tracking and sharing them, and providing appropriate visibility to all stakeholders. These latter tasks, and other related ones, are taken up by CLM.
CLM tools and Docker complement each other. CLM provides the tools that are required to manage activities and artefacts across the entire delivery pipeline (including deployment). Docker makes deployment simple, quick, economical, and therefore ideal for agile processes. CLM has long been a tool of choice in agile development. Coupled with Docker it can bring agility to operations and create a true end-to-end DevOps culture in which operations and development teams together respond agilely to customer demands.
About the Author / Vidhya V Kumar
Vidhya is an advisory software engineer at IBM. An evangelist and practitioner of agile methodologies, continuous delivery, and documentation, she has around ten years of experience in the IT industry. She is a subject matter expert in DevOps, Application Lifecycle Management, and Search methodologies.