The world of virtualization has continued to evolve over the last 15 years, with VMware leading the charge. Although containerization has been part of the operations world for over 20 years, it has really only taken hold in the last couple of years. One of the main drivers has been the rise of Docker.
Essentially, the open source Docker engine runs applications inside software containers, using resource isolation and allocation features to improve portability and efficiency. A container is similar to a lightweight virtual machine (VM), but with a different architectural approach.
Unlike VMs, containers share the host operating system (OS) kernel. Each container runs as an isolated process in userspace on the host OS while packaging the application and its dependencies, which enhances portability and reduces VM-like overhead. Because the code runs directly on the host OS and the container includes everything the application requires in a complete filesystem (code, system libraries and tools), the application will run the same no matter the environment.
As more and more companies look to take advantage of containers in their ecosystems, the question of enterprise readiness gets more important.
There are a number of areas to look at and plan for as you take Docker to production, including Security, Storage, Orchestration, Networking and Service Discovery.
One of the first and most important aspects of any production environment is the security of the system.
Initially, from a security perspective, many Docker advocates point out that containers add a layer of application protection simply because they isolate each application from the next.
There are a number of areas to consider with regard to Docker security, for example: the security of the kernel; the Docker daemon; gaps in the container configuration profile, whether default or customized by the user; and the kernel's hardening features and how they work together with containers.
Additionally, for Docker, this invariably means considering several aspects of running containers:
- User permissions inside containers
By default, Docker starts processes within a container with root user privileges. While a root user inside a container is not meant to be able to escalate that status to the host, basic precautions should still be taken to enhance security: for example, run processes inside containers as a non-privileged user, remove unnecessary system capabilities from the container, and use a “hardened” Linux security module whenever necessary.
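The precautions above can be sketched as a small helper that assembles a `docker run` command line. This is only an illustration under stated assumptions: the image name, UID/GID values and AppArmor profile name are hypothetical, and the helper builds the argument list without executing anything.

```python
from typing import List, Optional


def hardened_run_args(image: str, uid: int, gid: int,
                      apparmor_profile: Optional[str] = None) -> List[str]:
    """Build a `docker run` argument list that avoids running as root.

    `--user` and `--security-opt apparmor=...` are standard `docker run`
    options; every value passed in here is a placeholder.
    """
    if uid == 0:
        raise ValueError("UID 0 is root; choose an unprivileged UID")
    # Start the container process as UID:GID instead of the default root.
    args = ["docker", "run", "--rm", "--user", f"{uid}:{gid}"]
    if apparmor_profile is not None:
        # Optionally confine the container with a named AppArmor profile.
        args += ["--security-opt", f"apparmor={apparmor_profile}"]
    args.append(image)
    return args


if __name__ == "__main__":
    # Hypothetical image name and IDs, printed rather than executed.
    print(" ".join(hardened_run_args("example/app:1.0", 1000, 1000)))
```

The resulting list could be handed to `subprocess.run` on a host where Docker is installed; keeping it as a list avoids shell quoting problems.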
- Filesystem permissions
Container programs run as root by default, with the same UID as root on the host system. As a result, files in a Docker-based development project start out owned by root. Setting and changing permissions on filesystems and data volumes also depends on whether they are shared, whether they reside on the host machine, and whether the UID of the user in the container matches the UID on the host.
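One way to reason about the UID question is to compare the owner of a host directory with the UID the container process will use. The sketch below assumes a Linux host and uses hypothetical helper names; it only inspects ownership and suggests a matching `--user` value.

```python
import os
from typing import List


def owner_matches(path: str, container_uid: int) -> bool:
    """Return True if `path` on the host is owned by `container_uid`."""
    return os.stat(path).st_uid == container_uid


def user_flag_for(path: str) -> List[str]:
    """Suggest a `--user UID:GID` pair matching the owner of `path`,
    so files written to a shared volume keep the host owner."""
    st = os.stat(path)
    return ["--user", f"{st.st_uid}:{st.st_gid}"]


if __name__ == "__main__":
    # The current directory stands in for a project directory
    # that would be shared with a container.
    print(owner_matches(".", 0), user_flag_for("."))
```

If `owner_matches(path, 0)` is false for a directory you plan to share, a process running as root in the container will create files there that the host user may not be able to modify.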
- Docker daemon access
Running Docker containers necessitates running the Docker daemon, which in turn requires root privileges. Because of this, the Docker daemon should only be controlled by trusted users. One significant reason is that Docker allows a directory on the host to be shared with a guest container without limiting the access rights of the container, so anyone who controls the daemon can reach the host's files.
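A simple audit along these lines is to check that the daemon's control socket is not open to every local user. The sketch below assumes the conventional socket path `/var/run/docker.sock`; the function names are illustrative, and a real audit would also review who belongs to the group that owns the socket.

```python
import os
import stat
from typing import Optional

# Conventional location of the Docker daemon socket (an assumption here).
DOCKER_SOCKET = "/var/run/docker.sock"


def open_to_everyone(mode: int) -> bool:
    """True if the 'other' permission bits allow reading or writing."""
    return bool(mode & (stat.S_IROTH | stat.S_IWOTH))


def socket_open_to_everyone(path: str = DOCKER_SOCKET) -> Optional[bool]:
    """Check the socket's permissions; None if it cannot be inspected."""
    try:
        return open_to_everyone(os.stat(path).st_mode)
    except OSError:
        return None


if __name__ == "__main__":
    print(socket_open_to_everyone())
```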
- Container configuration and kernel capabilities
Docker containers start with a limited set of capabilities by default, which turns the binary root or non-root question into a finer-grained access control system. These capabilities can cover, for example, the ability to mount file systems inside a container or access certain /proc files.
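As an illustration of tightening the default capability set, the following sketch builds `--cap-drop` and `--cap-add` options for `docker run`. Dropping everything and adding back only what an application needs is one common pattern, not the only one; the capability chosen in the example is a placeholder.

```python
from typing import Iterable, List


def capability_args(keep: Iterable[str]) -> List[str]:
    """Drop all capabilities, then add back only those listed in `keep`."""
    args = ["--cap-drop", "ALL"]
    # Sort and de-duplicate so the output is stable.
    for cap in sorted(set(keep)):
        args += ["--cap-add", cap]
    return args


if __name__ == "__main__":
    # Example: a service that only needs to bind to a port below 1024.
    print(" ".join(capability_args(["NET_BIND_SERVICE"])))
```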
- Docker key/value labels
Labels are used to apply metadata to containers or daemons. Labels have a myriad of uses, such as host identification or providing licensing information. Each label is a key/value pair. Keys must be unique, even when multiple labels are specified; otherwise the value will be overwritten. Assigning
- Signed Docker images / Docker Content Trust
Docker containers are based on binary Docker images, which essentially define a set of files, metadata and requirements for running a single Docker container. Unless additional access privileges are granted to the container when it is created, it will only have access to resources defined in the image. To address the issue of secure distribution, Docker supports signing your own Docker images with a tool called Notary.
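On the client side, Docker Content Trust is switched on through the `DOCKER_CONTENT_TRUST` environment variable. The sketch below, with a hypothetical helper name, prepares such an environment for a child process; it does not itself sign or verify anything.

```python
import os
from typing import Dict, Mapping, Optional


def content_trust_env(base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with content trust enabled.

    With DOCKER_CONTENT_TRUST=1 set, the docker client works with
    signed image tags when pulling and pushing.
    """
    env = dict(os.environ if base is None else base)
    env["DOCKER_CONTENT_TRUST"] = "1"
    return env


if __name__ == "__main__":
    print(content_trust_env({"PATH": "/usr/bin"}))
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when invoking the docker client, so that only that invocation is affected.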
- Docker Keywhiz integration
Secrets can be managed and distributed using Keywhiz. Secrets are stored centrally, encrypted in a database shared by a cluster of Keywhiz servers. The Keywhiz plugin manages secrets and supplies credentials using Keywhiz as a central repository.
Container filesystem storage is an interesting problem for Docker because, in theory, all containers are ephemeral and can be removed at any time. This means that a third-party storage system is usually the route enterprises should look at. There are a number of good options for persistent storage.
- Flocker + ScaleIO
Flocker is an open source data volume manager for Docker container applications that provides tools for data migration. With Flocker, data volumes are portable and can be managed together with Docker containers. In combination with Flocker, the scalable and performant software-defined storage infrastructure ScaleIO can turn commodity hardware into shared block storage.
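From the Docker side, a volume managed by an external driver is requested with the `--volume-driver` option of `docker run`. The sketch below assumes a driver registered under the name `flocker`; the volume name, mount path and image are placeholders, and the command is only assembled, not run.

```python
from typing import List


def volume_run_args(image: str, volume: str, mount_path: str,
                    driver: str = "flocker") -> List[str]:
    """Build a `docker run` command that mounts a named volume
    provided by a volume driver."""
    if not mount_path.startswith("/"):
        raise ValueError("mount_path must be an absolute container path")
    return [
        "docker", "run", "--rm",
        "--volume-driver", driver,      # which plugin provides the volume
        "-v", f"{volume}:{mount_path}",  # named volume -> path in container
        image,
    ]


if __name__ == "__main__":
    print(" ".join(volume_run_args("example/db:1.0", "dbdata", "/var/lib/data")))
```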
- GlusterFS
GlusterFS is an open source distributed storage file system. It can be attached to a network, scaled out to serve thousands of clients, and used with a variety of applications, including content delivery and cloud computing.
- Ceph
Ceph is a free software storage platform that builds a self-managing, self-healing data storage system on commodity hardware. It stores data across a single distributed computer cluster so that the system is fault-tolerant.
- Rancher
The open source Rancher software delivers Docker orchestration to users through the deployment of a private container service. Resource pools can be created from any host, and authorized users can deploy containers into them, which gives users complete control over the deployment of their applications.
- Docker GlusterFS storage plugin
One of Docker’s many plugins is the GlusterFS plugin, which allows for multi-host volume management.
Orchestration is the automation of manual processes: essentially, a manual information technology management task is programmed so that it no longer has to be carried out by hand.
There are a number of tools and frameworks available for orchestrating with Docker, including Kubernetes and Docker Swarm.
- Kubernetes
Kubernetes is an open source container orchestration system developed by Google. It is aimed mainly at deploying and managing containers at scale: it actively manages the nodes of a computer cluster, schedules containers onto them, and groups application containers into logical units.
- Docker Swarm
Seen as best suited for smaller environments made up of just a few Docker hosts, Docker Swarm is an easy tool to set up for running containers on an orchestrated cluster. It integrates with the standard Docker command line tools, and it ships with a service registry while at the same time supporting many leading registries.
Networking and seamless communication between all nodes on a network is also key to the efficient use of Docker containers.
- Weave
With Weave, applications behave as if their containers were all attached to the same network switch. Weave creates a virtual network without the need to configure settings and ports, lets containers discover each other automatically, and still allows application services to be accessed by users outside the network. From a security perspective, data traffic can be encrypted so that hosts can communicate over untrusted networks.
- Docker Networking
Containers can essentially be run in four Docker network modes: Bridge mode, Host mode, Mapped Container mode, and None.
- Bridge mode is the default and attaches containers to the docker0 bridge.
- Host mode essentially makes all of the host’s network interfaces accessible to the container, which is placed in the host’s network stack.
- Mapped Container mode places a second container in the network stack of a first container, so that the two share the first container’s network resources.
- None mode puts the container in its own network stack but does not configure its network interfaces, which leaves room for custom network configuration.
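The four modes listed above correspond to values of the `--network` option of `docker run` (older releases spell it `--net`). The mapping can be sketched as follows; the function name and the container name in the example are illustrative.

```python
from typing import List, Optional


def network_args(mode: str, peer: Optional[str] = None) -> List[str]:
    """Translate a network mode into `docker run` options.

    mode: "bridge", "host", "none", or "container" (mapped container).
    peer: name or ID of the container whose network stack is shared;
          required for "container" mode and rejected otherwise.
    """
    if mode in ("bridge", "host", "none"):
        if peer is not None:
            raise ValueError(f"mode {mode!r} does not take a peer container")
        return ["--network", mode]
    if mode == "container":
        if not peer:
            raise ValueError("mapped container mode needs a peer container")
        return ["--network", f"container:{peer}"]
    raise ValueError(f"unknown network mode: {mode!r}")


if __name__ == "__main__":
    for mode, peer in [("bridge", None), ("host", None),
                       ("container", "web"), ("none", None)]:
        print(mode, "->", " ".join(network_args(mode, peer)))
```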
Cluster processes and services can be managed by service discovery tools, which essentially locate and facilitate communication between those components. A number of tools are available, including Consul, Etcd + Docker Swarm, and SkyDNS.
- Consul
With properties optimized for cluster configuration and coordination, Consul uses a specific class of datastore that makes it a powerful tool for building distributed systems.
- Etcd + Docker Swarm
Written in Go and utilizing the Raft consensus algorithm, etcd is a distributed, consistent key-value store for shared configuration and service discovery. Running etcd within a Docker container is an effective way to test a sample cluster. A Docker Swarm cluster can, in turn, leverage etcd for node and service discovery.
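Because Raft needs a majority of members to agree before a write is committed, the size of a test cluster determines how many member failures it can survive. The arithmetic is sketched below; the function names are illustrative.

```python
def quorum(members: int) -> int:
    """Smallest number of members that forms a majority."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """Members that can fail while a majority remains available."""
    return members - quorum(members)


if __name__ == "__main__":
    for n in (1, 3, 4, 5):
        print(f"{n} members: quorum {quorum(n)}, "
              f"tolerates {tolerated_failures(n)} failure(s)")
```

Note that four members tolerate no more failures than three, which is why odd cluster sizes are the usual choice.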
- SkyDNS
Etcd features again with SkyDNS, which is used for distributed service discovery. SkyDNS runs on top of etcd, and available services are discovered using DNS queries that return SRV records.
Container popularity continues to rise, and as a company that runs thousands of Docker containers in production, we’ve seen many of the potential issues of running Docker in enterprises.
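To make the relationship between DNS names and etcd concrete, the sketch below assumes the key layout used by SkyDNS version 2, where a service name is reversed label by label and stored under a `/skydns` prefix with a small JSON value. The service name, host and port are made up, and nothing is written to etcd here.

```python
import json


def skydns_key(name: str, prefix: str = "/skydns") -> str:
    """Map a DNS name to an etcd key by reversing its labels
    (layout assumed from SkyDNS version 2)."""
    labels = [label for label in name.lower().rstrip(".").split(".") if label]
    if not labels:
        raise ValueError("empty DNS name")
    return prefix + "/" + "/".join(reversed(labels))


def skydns_value(host: str, port: int) -> str:
    """JSON document describing where the service can be reached."""
    return json.dumps({"host": host, "port": port}, sort_keys=True)


if __name__ == "__main__":
    # Hypothetical service instance.
    print(skydns_key("1.rails.production.east.skydns.local"))
    print(skydns_value("10.0.0.5", 8080))
```

A DNS client would then ask for the SRV records of `rails.production.east.skydns.local` and receive the host and port of each registered instance.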
About the Author/Aaron Brongersma
Aaron Brongersma is a Senior Infrastructure Manager at Modulus, a Progress company that specializes in application hosting PaaS for app developers and enterprises, with a specialty in Node.js and Docker.