What are the problems with typical base images?

When we package our application into a container, we have to find a suitable base image. Normally, we would take an official image containing the runtime our application needs. For example, typical base images for a Java application are Amazon Corretto, Eclipse Temurin or the Bellsoft Liberica JRE. They are good choices since they are well maintained and battle-tested. Nevertheless, they come with some disadvantages. They are usually based on well-known distributions and therefore contain more tools than are actually necessary to run the application. For example, let’s look at the Bellsoft Liberica JRE image.

docker sbom bellsoft/liberica-openjre-debian:19.0.2
Syft v0.43.0
 ✔ Loaded image
 ✔ Parsed image
 ✔ Cataloged packages      [121 packages]
NAME                    VERSION              TYPE
...
apt                     2.2.4                deb
bash                    5.1-2+deb11u1        deb
curl                    7.74.0-1.3+deb11u3   deb
grep                    3.6-1                deb
sed                     4.7-1                deb
tar                     1.34+dfsg-1          deb
...

As the name already suggests, it is based on Debian and comes with packages such as those shown above, which may help with debugging but are not required to run the application, in particular a shell and a package manager. Alpine-based images look better, but even they contain a package manager (APK) and a shell.

docker sbom bellsoft/liberica-openjre-alpine:19.0.2
Syft v0.43.0
 ✔ Loaded image
 ✔ Parsed image
 ✔ Cataloged packages      [15 packages]
NAME                    VERSION      TYPE
...
apk-tools               2.12.9-r3    apk
busybox                 1.35.0-r17   apk
...
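
The check we just did by eye can also be scripted. The following is a minimal sketch, assuming the tabular output format of docker sbom shown above; the function name find_debug_tools and its watch list of package names are illustrative assumptions, not an exhaustive list of shells and package managers.

```shell
#!/bin/sh
# List packages from a Syft-style table (NAME VERSION TYPE) on stdin that
# provide a shell or a package manager.
# Assumption: this watch list is illustrative, not exhaustive.
find_debug_tools() {
    awk '
        BEGIN {
            n = split("apt apk-tools bash busybox dash dpkg", names, " ")
            for (i = 1; i <= n; i++) watch[names[i]] = 1
        }
        ($1 in watch) { print $1 }
    '
}

# Sample rows in the same format as the listings in this article.
find_debug_tools <<'EOF'
NAME                    VERSION      TYPE
apk-tools               2.12.9-r3    apk
busybox                 1.35.0-r17   apk
zlib                    1.2.13-r3    apk
EOF
```

For the sample rows this prints apk-tools and busybox. Against a real image, the output of docker sbom can be piped straight into the function, e.g. docker sbom bellsoft/liberica-openjre-alpine:19.0.2 | find_debug_tools.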

These additional tools bring problems with them: they make the images larger, and they add vulnerabilities in software the application never uses.

amazoncorretto                     19.0.2-al2               ...   38 hours ago    509MB
eclipse-temurin                    19.0.2_7-jre-jammy       ...   2 days ago      269MB
bellsoft/liberica-openjre-debian   19.0.2                   ...   8 days ago      253MB
bellsoft/liberica-openjre-alpine   19.0.2                   ...   8 days ago      133MB

docker scan bellsoft/liberica-openjre-debian:19.0.2
...
✗ High severity vulnerability found in curl/libcurl4
  Description: Cleartext Transmission of Sensitive Information
  Info: https://security.snyk.io/vuln/SNYK-DEBIAN11-CURL-3066040
  Introduced through: curl@7.74.0-1.3+deb11u3
  From: curl@7.74.0-1.3+deb11u3 > curl/libcurl4@7.74.0-1.3+deb11u3
  From: curl@7.74.0-1.3+deb11u3

✗ High severity vulnerability found in curl/libcurl4
  Description: Cleartext Transmission of Sensitive Information
  Info: https://security.snyk.io/vuln/SNYK-DEBIAN11-CURL-3179181
  Introduced through: curl@7.74.0-1.3+deb11u3
  From: curl@7.74.0-1.3+deb11u3 > curl/libcurl4@7.74.0-1.3+deb11u3
  From: curl@7.74.0-1.3+deb11u3

✗ Critical severity vulnerability found in curl/libcurl4
  Description: Exposure of Resource to Wrong Sphere
  Info: https://security.snyk.io/vuln/SNYK-DEBIAN11-CURL-3065656
  Introduced through: curl@7.74.0-1.3+deb11u3
  From: curl@7.74.0-1.3+deb11u3 > curl/libcurl4@7.74.0-1.3+deb11u3
  From: curl@7.74.0-1.3+deb11u3
  Fixed in: 7.74.0-1.3+deb11u5
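
With longer reports it helps to summarize the findings per severity. Here is a small sketch, assuming that every finding in the text output of docker scan starts with a line of the form shown above; the function name count_by_severity is made up for this example.

```shell
#!/bin/sh
# Count findings per severity in `docker scan` text output read from stdin.
# Assumption: each finding starts with a line like
#   "✗ High severity vulnerability found in <package>"
count_by_severity() {
    awk '/severity vulnerability found in/ { count[$2]++ }
         END { for (s in count) print s, count[s] }' | sort
}

# The three findings from the scan above.
count_by_severity <<'EOF'
✗ High severity vulnerability found in curl/libcurl4
✗ High severity vulnerability found in curl/libcurl4
✗ Critical severity vulnerability found in curl/libcurl4
EOF
```

For the three curl findings this prints "Critical 1" and "High 2". The same function can be fed directly from a scan, e.g. docker scan bellsoft/liberica-openjre-debian:19.0.2 | count_by_severity.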

What are distroless images?

Distroless images try to address these problems by containing only the tools and libraries really needed to run our application. Probably the best-known distroless images are those from the Google distroless project. It provides base images for the most common language runtimes like Python, Java or Node.js, as well as some core images that all of its distroless images build on. For a deep dive into what they do and why they are needed, I recommend a look at Ivan Velichko’s blog.

The project focuses on the LTS versions of the various language platforms, and its images are a good starting point.

Another option is the Chainguard distroless images, whose main focus is on secure and reproducible builds. They can be used in the same way as the Google distroless images. Here is the full list of packages in the Chainguard JRE image.

docker sbom cgr.dev/chainguard/jre
Syft v0.43.0
 ✔ Loaded image
 ✔ Parsed image
 ✔ Cataloged packages      [15 packages]
NAME                    VERSION          TYPE
bzip2                   1.0.8-r4         apk
ca-certificates-bundle  20220614-r4      apk
expat                   2.5.0-r2         apk
fontconfig              2.14.1-r1        apk
freetype                2.12.1-r1        apk
glibc                   2.36-r6          apk
glibc-locale-en         2.36-r6          apk
glibc-locale-posix      2.36-r6          apk
jrt-fs                  17.0.6-internal  java-archive
libbrotlicommon1        1.0.9-r1         apk
libbrotlidec1           1.0.9-r1         apk
libpng                  1.6.39-r1        apk
openjdk-17-jre          17.0.6-r0        apk
wolfi-baselayout        20221118-r1      apk
zlib                    1.2.13-r3        apk

So no shell, no package managers or other debug tools. How many vulnerabilities are in this image?

docker scan cgr.dev/chainguard/jre

Testing cgr.dev/chainguard/jre...

Package manager:   apk
Project name:      docker-image|cgr.dev/chainguard/jre
Docker image:      cgr.dev/chainguard/jre
Platform:          linux/arm64

✔ Tested 15 dependencies for known vulnerabilities, no vulnerable paths found.

Note that we currently do not have vulnerability information for Wolfi 20221118, which we detected in your image.

So from a security point of view they seem to be a better fit. But interestingly enough, they are not the smallest possible images.

amazoncorretto                     19.0.2-al2               ...   38 hours ago    509MB
eclipse-temurin                    19.0.2_7-jre-jammy       ...   2 days ago      269MB
bellsoft/liberica-openjre-debian   19.0.2                   ...   8 days ago      253MB
bellsoft/liberica-openjre-alpine   19.0.2                   ...   8 days ago      133MB
cgr.dev/chainguard/jre             latest                   ...   9 hours ago     175MB
gcr.io/distroless/java17-debian11  latest                   ...   3 days ago      226MB

Both the Chainguard and the Google distroless JRE images are bigger than the Alpine-based one, so there is room for improvement. But before we look at how to solve that, let’s talk about how we can debug these images.

How to debug images based on distroless?

Although these images are generally preferable, they come with a downside. Sometimes, especially during development, we would still like to have a way to look inside a running container. Shell access is often the quickest way to analyze issues, but how do we get it without sacrificing the improved security? We could use a different base image for development than for the final deployment, but this would go against the idea of pushing one identical artifact through all development and deployment stages. In the end, we want to run the same image in production that we work with during development, right?

Are there better ways to debug a distroless image?

Probably one of the most interesting tools out there is cdebug, also created by Ivan Velichko, which provides a way to debug a distroless container without changing the original image. If, for example, we have a running container called my-app based on a distroless image, we can open a debug shell session like this:

cdebug exec -it my-app

It already supports most of the common runtime platforms like Docker, containerd and, in a limited way, Kubernetes, so it looks quite promising and is worth a shot. To understand how cdebug works, have a look at the official documentation on GitHub.
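
If installing an extra tool is not an option, plain Docker offers a rough manual alternative: start a throwaway container that shares the PID and network namespaces of the target. The sketch below only prints the command instead of running it. It is not a description of how cdebug is implemented; the function name debug_sidecar_cmd and the choice of busybox as debug image are assumptions for illustration.

```shell
#!/bin/sh
# Print the docker command that would start a throwaway debug container
# sharing the PID and network namespaces of a running target container.
# Usage: debug_sidecar_cmd <target-container> [debug-image]
debug_sidecar_cmd() {
    target="$1"
    image="${2:-busybox}"   # assumption: any image that ships a shell works
    printf 'docker run --rm -it --pid=container:%s --network=container:%s %s sh\n' \
        "$target" "$target" "$image"
}

debug_sidecar_cmd my-app
```

For my-app this prints docker run --rm -it --pid=container:my-app --network=container:my-app busybox sh. Running that command gives us a shell that can see the processes and network interfaces of my-app, while the distroless image itself stays untouched.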

How can I build my own distroless image?

What if we need a distroless image for a language version not supported by Chainguard or Google? Or what if the provided images are still too big? They focus on LTS versions, so there is, for example, no distroless image for the latest Java version (19.0.2). So how can we build our own distroless image?

We could use the tooling used to build the Google distroless images, but it is based on Bazel, which is not everyone’s cup of tea (especially not mine). The easier way is to use the tooling of the Chainguard project.

It is based on apko, which can create an image from a list of apk packages without the need for a Dockerfile.

Let’s try to create our own JRE 19 image based on Bellsoft Liberica. The Bellsoft Liberica JRE is available as an APK package to start with. To build a distroless image with apko, we need a configuration file describing which packages should be part of the final image and some additional configuration parameters.

# jre.yaml
contents:
  repositories:
    - https://dl-cdn.alpinelinux.org/alpine/edge/main
    - https://dl-cdn.alpinelinux.org/alpine/edge/community
    - https://apk.bell-sw.com/main
  packages:
    - bellsoft-java19-runtime-lite
entrypoint:
  command: java -jar
environment:
  PATH: /usr/lib/jvm/bellsoft-java19-runtime-lite/bin

This is all we need to create our own distroless image. We can run apko with the following command:

docker run -v "$PWD":/work cgr.dev/chainguard/apko build \
    -k https://apk.bell-sw.com/[email protected] \
    jre.yaml myjre:19 myjre.tar

The -k option appends the key for the Bellsoft repository to the system keyring (which already contains the keys for the Alpine repositories); otherwise we would get build errors. The created tar file can be loaded into Docker with

docker load < myjre.tar

So let’s have a look at the final image size:

amazoncorretto                     19.0.2-al2               ...   38 hours ago    509MB
eclipse-temurin                    19.0.2_7-jre-jammy       ...   2 days ago      269MB
bellsoft/liberica-openjre-debian   19.0.2                   ...   8 days ago      253MB
bellsoft/liberica-openjre-alpine   19.0.2                   ...   8 days ago      133MB
cgr.dev/chainguard/jre             latest                   ...   9 hours ago     175MB
gcr.io/distroless/java17-debian11  latest                   ...   3 days ago      226MB
myjre                              19                       ...   53 years ago    78.2MB
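
To put the sizes in this list into relation, a little arithmetic helps. This is a minimal sketch with a made-up helper called size_reduction; the sizes are copied by hand from the listing above.

```shell
#!/bin/sh
# Print how much smaller (in percent) one image is than another.
# Usage: size_reduction <smaller-MB> <larger-MB>
size_reduction() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.0f%%\n", (1 - a / b) * 100 }'
}

size_reduction 78.2 133   # myjre:19 vs. the Alpine-based Liberica image
size_reduction 78.2 175   # myjre:19 vs. the Chainguard JRE image
```

This prints 41% and 55%: our own image is about two fifths smaller than the Alpine-based Liberica image and less than half the size of the Chainguard JRE image.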

The final image is about 40% smaller than the smallest JRE image we had so far (78.2MB versus 133MB). Not so bad. Is it working?

docker run myjre:19
openjdk 19.0.2 2023-01-17
OpenJDK Runtime Environment (build 19.0.2+9)
OpenJDK 64-Bit Server VM (build 19.0.2+9, mixed mode)

Additionally, let’s check the installed packages.

docker sbom myjre:19
Syft v0.43.0
 ✔ Loaded image
 ✔ Parsed image
 ✔ Cataloged packages      [7 packages]
NAME                          VERSION       TYPE
bellsoft-java19-runtime-lite  19.0.2_p9-r0  apk
busybox                       1.36.0-r3     apk
busybox-binsh                 1.36.0-r3     apk
java-common                   0.5-r0        apk
jrt-fs                        19.0.2        java-archive
musl                          1.2.3-r4      apk
zlib                          1.2.13-r0     apk

It still contains a shell as part of busybox, which seems to be a dependency we can’t avoid: currently there is no way to exclude direct or transitive dependencies. As a final step, let’s have a last look at the vulnerabilities:

docker scan myjre:19

Testing myjre:19...

Package manager:   apk
Project name:      docker-image|myjre
Docker image:      myjre:19
Platform:          linux/arm64

✔ Tested 6 dependencies for known vulnerabilities, no vulnerable paths found.

Note that we do not currently have vulnerability data for your image.

So the result already looks quite promising, and apko is an easy way to generate our own distroless image. It also provides more features (e.g. multi-platform builds, integration of our own apk packages via melange, SBOM support and multi-process images) that we do not address here. They can further simplify integration into our development process and are probably worth a dedicated article.

So is there a reason not to use distroless images?

Actually, no. With the tools now in place to improve debugging and to build your own images, I think there are no excuses anymore: the security and performance benefits outweigh possible inconveniences during development.