ALPS blog

Enhancing Cloud Security by Reducing Container Images Through Distroless Techniques

Since its launch in 2013, Docker has transformed how developers use containers, and Docker Hub has shaped how they share container images. So as not to reinvent the proverbial wheel, most developers who deploy their code to containers build on a publicly available image from Docker Hub. Some of the most popular images in Docker Hub, such as the official image of a widely used key-value store server, follow the same trend and use the official Debian image as their base. While reusing images seems like a good idea, verifying the footprint of each image is not a cloud security practice that developers regularly implement.

The official images are often maintained by a community of developers. Because most application images are built on top of other images, starting from a base operating system (OS) image, the vulnerabilities and security weaknesses found in the base images are carried over to the application images that use them.

In this article, we take a closer look at the Distroless technique for optimizing, among other things, security in container images and offer an alternative approach that can reduce both the size of container images and the attack surface for malicious actors seeking to exploit cloud-native applications.

A brief background on current industry practices

The software bill of materials (SBOM) has recently become a popular concept in the information security community. An SBOM is a list of all the packages installed in a specific container or file system, and Syft is emerging as a de facto standard tool for generating one.

We used Syft to generate a package list from the official public image of Debian:

Figure 1. A package list from the official public image of Debian generated using Syft; note that more packages are listed.

Figure 1 shows that there are 96 packages installed in this image. We can also use Grype, also an increasingly popular tool, to analyze the SBOM generated by Syft to scan the original image for vulnerabilities.
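A package count like the one above can also be read programmatically. The following is a minimal sketch, assuming Syft was run with JSON output (for example, `syft <image> -o json`) and that the report keeps its packages in a top-level `artifacts` array with `name`, `version`, and `type` fields. The inline sample, including its version strings, is made up for illustration and is not real scan output.

```python
import json
from collections import Counter


def summarize_sbom(sbom: dict) -> tuple[int, Counter]:
    """Return (total packages, packages per type) for a Syft-style JSON SBOM."""
    artifacts = sbom.get("artifacts", [])
    per_type = Counter(pkg.get("type", "unknown") for pkg in artifacts)
    return len(artifacts), per_type


# Made-up sample shaped like Syft JSON output; not real scan data.
SAMPLE_SBOM = json.loads("""
{
  "artifacts": [
    {"name": "apt", "version": "0.0.1", "type": "deb"},
    {"name": "bash", "version": "0.0.2", "type": "deb"},
    {"name": "coreutils", "version": "0.0.3", "type": "deb"}
  ]
}
""")

if __name__ == "__main__":
    total, per_type = summarize_sbom(SAMPLE_SBOM)
    print(f"{total} packages")
    for pkg_type, count in per_type.most_common():
        print(f"  {pkg_type}: {count}")
```

Running the same function on reports for two candidate base images gives a quick, if rough, comparison of how much software each one ships.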

Figure 2. A vulnerability list for the official public image of Debian generated using Grype; note that more vulnerabilities are listed.

The extent of the risk of using Debian-based images is plain to see: The more packages there are, the larger the attack surface becomes. More packages also mean a bigger disk and bandwidth footprint, which has pushed many developers to migrate from Debian-based images to Alpine-based ones. For newcomers, Alpine Linux is a security-oriented, lightweight Linux distribution based on musl libc and BusyBox.
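One way to put numbers on such a comparison is to tally scanner findings by severity for each candidate base image. The sketch below assumes Grype was run with JSON output (for example, `grype <image> -o json`) and that each entry in the report's top-level `matches` array carries a `vulnerability` object with a `severity` field. The two sample reports and their identifiers are placeholders, not real findings.

```python
from collections import Counter


def severity_counts(report: dict) -> Counter:
    """Tally vulnerability matches by severity in a Grype-style JSON report."""
    return Counter(
        match.get("vulnerability", {}).get("severity", "Unknown")
        for match in report.get("matches", [])
    )


def compare_images(reports: dict[str, dict]) -> dict[str, Counter]:
    """Map each image label to its severity tally."""
    return {label: severity_counts(report) for label, report in reports.items()}


# Placeholder reports shaped like Grype JSON output; IDs and packages are invented.
SAMPLE_REPORTS = {
    "debian-based": {
        "matches": [
            {"vulnerability": {"id": "CVE-0000-0001", "severity": "High"},
             "artifact": {"name": "pkg-a"}},
            {"vulnerability": {"id": "CVE-0000-0002", "severity": "Low"},
             "artifact": {"name": "pkg-b"}},
            {"vulnerability": {"id": "CVE-0000-0003", "severity": "Low"},
             "artifact": {"name": "pkg-b"}},
        ]
    },
    "alpine-based": {"matches": []},
}

if __name__ == "__main__":
    for label, tally in compare_images(SAMPLE_REPORTS).items():
        print(label, dict(tally) or "no findings")
```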

We can see the benefits that Alpine Linux offers here:

Figure 3. Alpine-based images come with fewer packages and vulnerabilities by default.

As of this writing, the scan of the Alpine-based image reports zero vulnerabilities.

Figure 4. Use of Alpine Linux as a security-oriented, lightweight Linux distribution

The security improvement that Alpine Linux provides is great news. Alpine Linux has also been releasing timely updates.

Current issues

It would be naive to imagine that the only vulnerable pieces of code inside a given container are the packages of the original base image. The applications written by developers introduce potential vulnerabilities to the container as well. To visualize the potential problems, let us suppose that any container might be running a vulnerable application that allows remote code execution (RCE) inside the container.

If a developer has an application running on a Debian-based base image, one of the packages available for an attacker to use is apt, Debian’s package manager, which creates opportunities for malicious actors to exploit. Alpine-based official images are not that different, since they contain apk, Alpine’s package manager, and BusyBox, which combines tiny versions of many common UNIX utilities, such as wget, into a single small executable.

Malicious actors will look for any vulnerability they can exploit, hence the need to remove every opportunity that could carry them to the next phase of an attack.

From the standpoint of an attacker trying to gain access to the shell of a potentially exposed container, package managers are seen as obstacles that need to be overcome. But that was not the only concern we had when we attempted to map the attack surface. Depending on the base image, there were also several native Linux tools that can be used for malevolent purposes, so the images would be more secure without them.

One approach to solving this issue involves mapping out those tools and removing the actual binaries during the build. There are, however, two issues with such an approach: the effort of mapping all available tools and the creativity of attackers in using what is left.
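The mapping step can be partly automated. The sketch below checks which entries of a watch list resolve to an executable on the search path. The list `DUAL_USE_TOOLS` is an illustrative, deliberately incomplete assumption, which is exactly the weakness described above: anything not on the list goes unnoticed.

```python
import shutil
from typing import Callable, Iterable, Optional

# Illustrative, incomplete watch list of tools an intruder could reuse.
DUAL_USE_TOOLS = ("apt", "apk", "wget", "curl", "base64", "sh", "bash", "nc")


def find_present_tools(
    tools: Iterable[str] = DUAL_USE_TOOLS,
    lookup: Callable[[str], Optional[str]] = shutil.which,
) -> dict[str, str]:
    """Map each watched tool that is found to its resolved location."""
    found = {}
    for tool in tools:
        path = lookup(tool)
        if path is not None:
            found[tool] = path
    return found


if __name__ == "__main__":
    for tool, path in find_present_tools().items():
        print(f"{tool}: {path}")
```

The `lookup` parameter exists only so the function can be exercised without touching the real file system. In practice a check like this would have to run in a build stage, since a fully stripped image has no interpreter to run it.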

A simple yet powerful example is the base64 command, which is present in most container base images as well as in full Linux distributions. Its purpose is to encode and decode data for ease of transfer. We have also observed that attackers focused on cloud-native targets use it on a large scale: they download or drop parts of their arsenal encoded in base64, on the assumption that the victim has the command installed, and then decode them at runtime to exploit the container further.
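To make the pattern concrete, the sketch below flags command lines in which a base64 decode step is piped straight into a shell. It is a simple heuristic written for illustration, not the analysis tooling used in this research, and it is easy to evade (for example, by decoding to a file first). The sample command lines in the usage block are invented.

```python
import re

# Heuristic: "base64 -d" or "base64 --decode" whose output is piped into sh or bash.
DECODE_TO_SHELL = re.compile(
    r"base64\s+(?:-d|--decode)\b[^|]*\|\s*(?:ba)?sh\b"
)


def looks_like_encoded_dropper(command_line: str) -> bool:
    """Return True if the command line decodes base64 data directly into a shell."""
    return DECODE_TO_SHELL.search(command_line) is not None


if __name__ == "__main__":
    samples = [
        "echo aGk= | base64 -d | sh",
        "base64 -d archive.b64 > archive.tar",
    ]
    for sample in samples:
        print(looks_like_encoded_dropper(sample), sample)
```

Removing the base64 binary from the image makes this particular chain fail outright, which is a stronger control than trying to detect it after the fact.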

Figure 5. Screenshot of our analysis tools showing the threat actors using native Linux tools to unfold their attack.

Another notable issue is that many function-as-a-service offerings from cloud service providers (CSPs) also run in containers or micro virtual machines (VMs) that are based on images with more than the minimal required packages installed.

The stage for a cyberattack is set if the application running on the exposed container is breached, as this enables malicious actors to use the tools inside the container to advance to the next level, whether the application is running on premises or through a CSP.

How can we address the security issues?

Clearly, the attack surface needs to be reduced. Google created Distroless container images, which are images that contain only the application and its runtime dependencies. Unlike images for standard Linux distributions, Distroless container images do not have package managers, shells, or other programs.

Figure 6. A sample of Google’s Distroless Debian image generated using Syft, highlighting the low number of packages
Figure 7. A sample of Google’s Distroless Debian image analyzed using Grype, highlighting the low number of packages and vulnerabilities

The Amazon Web Services (AWS) images shown in Figures 8 and 9 are not necessarily identical to what AWS runs in its hosted offerings, but they are base images that AWS provides so that users can create their own:

Figure 8. A sample of a base image that AWS provides so that users can create their own, analyzed using Syft, highlighting the high number of packages
Figure 9. A sample of a base image that AWS provides so that users can create their own, analyzed using Grype, highlighting the high number of packages and few vulnerabilities
Figure 10. A sample of an AWS node base image with many pre-installed packages and few high-severity vulnerabilities, analyzed using Syft
Figure 11. A sample of an AWS node base image with many pre-installed packages and few high-severity vulnerabilities, analyzed using Grype

The Distroless approach allows us to tackle the two main security issues that we have observed. First, we can significantly reduce the number of packages inside the image and retain only what is necessary for the intended application to run, which decreases the attack surface that cybercriminals can exploit. Second, we can drastically reduce the number of vulnerabilities, even bringing it down to zero in most cases. Both make the application more secure when deployed.

An alternative approach to Distroless

When we started this research, we noticed that most of the Distroless approaches we analyzed sought to achieve lighter and faster containers. In many cases, we observed that the container images did not have unnecessary tools and libraries, while some even used scratch images with just a few base file systems mounted afterward as layers.

We propose an alternative approach to Distroless, which is to use a multistage build technique plus a scratch image that contains only the necessary supporting binaries for the intended application to run.
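A practical question with this technique is how to find the supporting binaries in the first place. The sketch below is one possible starting point, not the procedure used in this research: it parses `ldd`-style output for a dynamically linked binary and renders the final `FROM scratch` stage of a multistage Dockerfile that copies only those files. The stage name `build`, the path `/app/server`, and the sample `ldd` output are hypothetical, and real images may need additional files that `ldd` does not report.

```python
import re

# Matches "libfoo.so => /path/libfoo.so (0x...)" and "/lib64/ld-linux... (0x...)".
_LDD_LINE = re.compile(r"^\s*(?:\S+\s+=>\s+)?(/\S+)\s+\(0x[0-9a-fA-F]+\)\s*$")


def parse_ldd(output: str) -> list[str]:
    """Return the absolute paths of shared objects listed in ldd-style output."""
    paths: list[str] = []
    for line in output.splitlines():
        match = _LDD_LINE.match(line)
        if match and match.group(1) not in paths:
            paths.append(match.group(1))
    return paths


def render_scratch_stage(binary: str, libs: list[str], build_stage: str = "build") -> str:
    """Render the final stage of a multistage Dockerfile based on scratch."""
    lines = ["FROM scratch"]
    for path in [*libs, binary]:
        # Copy each file from the earlier (assumed) build stage to the same path.
        lines.append(f"COPY --from={build_stage} {path} {path}")
    lines.append(f'ENTRYPOINT ["{binary}"]')
    return "\n".join(lines)


# Hypothetical ldd output for illustration; addresses and paths vary per system.
SAMPLE_LDD = (
    "\tlinux-vdso.so.1 (0x00007ffc0a5f2000)\n"
    "\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f1c2a000000)\n"
    "\t/lib64/ld-linux-x86-64.so.2 (0x00007f1c2a400000)\n"
)

if __name__ == "__main__":
    print(render_scratch_stage("/app/server", parse_ldd(SAMPLE_LDD)))
```

The virtual `linux-vdso` entry is skipped on purpose because it has no file on disk to copy.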

This approach dovetails neatly with serverless computing, whose core concept is to break applications down into smaller functions and use those functions to process data. In other words, each function has only one purpose. While this is the intended use, it might not reflect real-world usage for all users.

With the desired usage for container images in mind, there are two requirements for the function to run: the language interpreter and the CSP’s internal application programming interface (API) binaries. Our test results showed that we can drastically reduce the size of the container as well as the attack surface and vulnerabilities found on the CSP-provided images.

Conclusion

The concept of Distroless container images may have been in existence for quite some time, but it is far from being the norm. As the body of research on container security is slowly being built, we continue to channel our expertise into the implications container security has for the cloud infrastructure. Our research showed the potential of Distroless images and how they can be adopted for resource optimization and for addressing security concerns. However, given the perceived shortfalls in the Distroless approach, we devised an alternative technique that uses a multistage build with a scratch image that contains only the essential supporting binaries for the intended application to run. If properly implemented, this approach can address vulnerability management issues and the need to minimize the attack surface that malicious actors targeting cloud-native applications exploit.

The multistage build with scratch image technique we discussed to optimize container images offers the following benefits for developers who strive to improve cloud security:

  • It can work well with serverless, whose core concept is to segment applications into smaller functions and use those functions to process data.
  • It can also significantly reduce not only the size of the container but also the attack surface and vulnerabilities found on the CSP-provided images.