Securing Container Images in the DevOps World

February 19, 2019 | Tae Jin Kang | code fingerprinting, code vulnerabilities, container security, open source

by Tae Jin Kang

According to 451 Research, the application container market will experience significant growth over the next five years. In its “2017 Cloud-Enabling Technologies Market Monitor & Forecast report,” the research firm noted that “annual revenue is expected to increase by 4x, growing from $749m in 2016 to more than $3.4bn by 2021, representing a CAGR of 35%.”

Automated deployment is a must-have capability for SMBs and enterprises alike. Container automation has reshaped how quickly and effectively an organization can use internal and external virtual environments.

Containerization is now a widely adopted DevOps trend because it substantially reduces the time and resources required for deployment. Virtual machine hypervisors emulate virtual hardware, which means they use a significant amount of system capacity.

Containers use shared operating systems, which makes them much more efficient than hypervisors in terms of system resource use and much easier to deploy. Additionally, containers are better-suited for continuous integration/continuous deployment (CI/CD). The lightweight nature of containers allows applications to be modularized in independently deployable micro-services that can be instantiated and torn down in a just-in-time fashion. This DevOps model encourages developers to integrate their code into a shared repository early and often, and enables the code to be deployed quickly and efficiently.

According to James Bottomley, CTO of Parallels and a leading Linux kernel developer, optimally tuned containers can run four to six times as many server application instances as hypervisor-based virtual machines on the same hardware.

Convenience at a Cost

However, the convenience of containerization comes at a cost—specifically, a security cost. Unlike hypervisors, containers use a shared OS kernel and are isolated at the process level. Therefore, a security breach could affect the entire system.

DevOps teams deploying containers must address potential security risks at three levels: the cloud infrastructure, the container management/orchestration layer and the container image itself.

Currently, the cloud infrastructure and container management levels are well-addressed by the corresponding solution providers. By following the best practices recommended by the providers, security risks can be adequately mitigated.

The Container Image

The most significant and least addressed level, from a security standpoint, is the container image. It is more challenging to address than the previously mentioned levels because many developers—especially the smaller development houses—download and use images from a container image repository/hub, such as Docker Hub. In these cases, deployments are vulnerable to any security problems that exist in the binaries within the images being used. This is not uncommon, as many developers don’t know exactly what is in these images.

Given that Docker Hub has more than 100,000 images and there are numerous corporate (private) repositories, it is challenging to ensure that someone with access did not inadvertently include a software component with a vulnerability. In a worst-case scenario, someone could have deliberately included a vulnerable component with malicious intent.

To address the possibility of a software vulnerability in a container image hosted by a repository or a hub, vendors offer paid plans that use security scanning tools to check the binaries in the images for known vulnerabilities. However, the current scanning technology used by the vendors produces sub-optimal scanning results—having too many false positives and failing to detect many known open source vulnerabilities altogether.

Scanning proprietary binary files for known vulnerabilities is relatively easy because only a few variations of each officially released binary exist. The scanner only needs to determine whether a binary is an official version, which can be done with a hash-value scan.
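As a rough illustration of a hash-value scan, the Python sketch below indexes a handful of official releases by their SHA-256 digest and looks up a target binary in that index. The product name, versions and byte strings are made-up placeholders, and the function names are hypothetical rather than taken from any particular scanner.

```python
import hashlib


def sha256_of(blob: bytes) -> str:
    """Return the hex SHA-256 digest of a binary's contents."""
    return hashlib.sha256(blob).hexdigest()


def build_index(releases):
    """Map digest -> (product, version) for each officially released binary.

    `releases` is an iterable of (product, version, binary_bytes) tuples.
    """
    return {sha256_of(blob): (product, version) for product, version, blob in releases}


def identify(index, blob: bytes):
    """Return (product, version) if the binary is a known official release, else None."""
    return index.get(sha256_of(blob))


# Placeholder data standing in for real vendor binaries (assumption for the demo).
official = [
    ("acme-agent", "1.0", b"\x7fELF official build 1.0"),
    ("acme-agent", "1.1", b"\x7fELF official build 1.1"),
]
index = build_index(official)

print(identify(index, b"\x7fELF official build 1.1"))            # ('acme-agent', '1.1')
print(identify(index, b"\x7fELF official build 1.1 (rebuilt)"))  # None
```

The second lookup also shows the limit of this approach: any change to the bytes, such as a rebuild with different options, produces a different digest and no match.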

While existing vulnerability scanning tools do a good job of detecting vulnerabilities for proprietary software, the same cannot be said for detecting open source software (OSS) vulnerabilities. This is due largely to the inherent permissiveness of OSS and the complex supply chain that comprises software created with open source components—the modules, libraries and building blocks used in today’s open source code.

For example, any part of the source code can be taken from one open source project and used in another, totally unrelated project. This code can then be distributed as a binary and linked with other projects, creating a complex supply chain of OSS. This makes tracking the OSS components incredibly challenging and makes detecting software vulnerabilities increasingly difficult as one goes down the supply chain.

Given that open source is now almost universally used, with Forrester estimating that more than 90 percent of today’s software contains at least one open source component, the potential impact of a single vulnerability can be far-reaching and damaging.

Moreover, even the most experienced and well-resourced enterprises with a formal OSS management program are likely to have challenges securing container images. Manual audits and existing scanners are likely to overlook small traces of OSS introduced through unexpected means, such as a developer copying an open source code snippet.

Fingerprint-Based Binary Code Scanning

DevOps and security teams can leverage binary code scanners that use code fingerprinting technology. These tools extract “fingerprints” from the target binary to be examined and compare them to the fingerprints collected from OSS components hosted in well-known, open source repositories.

Existing binary scanning approaches have a difficult time detecting variations of binaries built from the same OSS project. It is harder still to detect, in an arbitrary target binary, OSS components that were added to a project as source code. Fingerprint-based scanning, however, can accurately detect even minute traces of OSS components, which allows far greater OSS detection coverage with fewer false positives for any target binary file.
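The article does not say what a fingerprint consists of, so the Python sketch below makes a simplifying assumption: a fingerprint is the set of printable strings embedded in a binary (symbol names, version banners and the like). It then scores each known component by how much of its fingerprint set appears in the target. The component names, byte strings and the 0.5 threshold are hypothetical, and a production scanner would likely use more robust features.

```python
import re


def extract_fingerprints(blob: bytes, min_len: int = 6) -> set:
    """Collect runs of printable ASCII of at least `min_len` bytes (assumed fingerprint)."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return set(pattern.findall(blob))


def containment(component_fp: set, target_fp: set) -> float:
    """Fraction of the component's fingerprints that also occur in the target."""
    if not component_fp:
        return 0.0
    return len(component_fp & target_fp) / len(component_fp)


def match_components(target_blob: bytes, known: dict, threshold: float = 0.5):
    """Return (component, score) pairs scoring at or above `threshold`, best first."""
    target_fp = extract_fingerprints(target_blob)
    hits = []
    for name, component_fp in known.items():
        score = containment(component_fp, target_fp)
        if score >= threshold:
            hits.append((name, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical fingerprint database built from placeholder OSS components.
known = {
    "libfoo 1.2": extract_fingerprints(
        b"libfoo_init\x00libfoo_parse_header\x00libfoo version 1.2\x00"
    ),
    "libbar 0.9": extract_fingerprints(b"libbar_open\x00libbar_close\x00"),
}

# A target binary that embeds libfoo's code alongside its own, in a different order.
target = (
    b"\x00\x01myapp_main\x00libfoo_parse_header\x00"
    b"\x02\x03libfoo version 1.2\x00libfoo_init\x00"
)

print(match_components(target, known))
```

Because the score depends on shared features rather than on the whole file's digest, the match survives even though the target is not byte-identical to any released libfoo binary.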

Once a component and its version are identified through this fingerprint matching, DevOps and security teams can easily find the known security vulnerabilities associated with the component in vulnerability databases, such as the National Vulnerability Database (NVD). Of course, the next step is to address the vulnerability through patching or other workarounds.
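The lookup step can be sketched in Python as follows. To stay self-contained, the example uses a tiny hand-made table instead of querying a real vulnerability database. Its single entry records that OpenSSL 1.0.1f is affected by CVE-2014-0160 (Heartbleed). The table layout and function names are assumptions for illustration only.

```python
# Hand-made excerpt for illustration only. A real scanner would query a
# vulnerability database such as the NVD rather than a hard-coded table.
KNOWN_VULNS = {
    ("openssl", "1.0.1f"): ["CVE-2014-0160"],  # Heartbleed
}


def lookup(component: str, version: str):
    """Return the CVE IDs recorded for (component, version).

    Returns None when the pair is absent from the local table, which means
    "unknown", not "free of vulnerabilities".
    """
    return KNOWN_VULNS.get((component.lower(), version))


def report(identified):
    """Build a {(component, version): CVE list or None} report for scanner output."""
    return {(name, version): lookup(name, version) for name, version in identified}


# Components as a fingerprint scanner might report them (libfoo is a placeholder).
found = [("OpenSSL", "1.0.1f"), ("libfoo", "1.2")]
for (name, version), cves in report(found).items():
    status = ", ".join(cves) if cves else "no entry in local table"
    print(f"{name} {version}: {status}")
```

Returning None for pairs missing from the table keeps "not checked" distinct from "checked and clean", which matters when the local data is incomplete.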

At some point, the container image repositories are likely to leverage a fingerprint-based binary scanning tool to hunt down and eliminate known open source software vulnerabilities. Until that time, a best practice for DevOps teams is to use a fingerprint-based binary code scanner separately, prior to including code in their containers, to reduce the risk of a costly security breach.