Tiny Golang containers with distroless
I’ve recently moved some of the components that help run bgp4_table and bgp6_table into ‘the cloud’. Specifically, I migrated the tweeter service, which compiles the data and tweets it out, and the grapher service, which creates the images that go with those tweets.
tweeter is written in Go, while grapher is written in Python. I’d prefer to use Go as much as possible, but Python still has better graphing libraries. For graphing I use Matplotlib.
The cloud service I’m using for both is Google’s Cloud Run. You place your app inside a Docker container, upload the image and invoke it from there. I use the HTTPS endpoint for the tweeter service, and that calls the grapher service container over gRPC. I tweet 180 times a month and Google gives you 2 million invocations a month free, so I’m well within the free tier here.
When moving to Docker containers, most online guides will show you how to pull in a fairly large base image for your particular project, something like docker pull python:3.8.3-buster or docker pull golang:latest. This is fine for initial testing, but the resulting containers can be pretty large. Not only that, but they ship a load of libraries and tools you don’t need, increasing the attack surface of your container.
Distroless
Google has created some tiny containers instead that give you just enough OS to run your app and nothing more. I’ve been using them with both Go and Python and there are quirks to both. I’ll go over the two main ways I use them with Go, which is the easier of the two, and then show what I do with my Python containers.
Go
Go is compiled. This means once you have a fully functioning application, you can compile it to a single binary and run it somewhere else. Go also allows you to cross-compile your binaries, meaning you can create a binary that’ll run on an ARM Linux server from an x86_64 Windows machine. Both of these attributes make it very easy to move to distroless containers in the Go world.
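To make the cross-compilation point concrete, here is a minimal sketch of a program that reports the target it was built for. The formatTarget helper is a name I’ve made up for illustration; runtime.GOOS and runtime.GOARCH are part of the standard library.

```go
package main

import (
	"fmt"
	"runtime"
)

// formatTarget joins an OS and an architecture into the "os/arch" form
// that `go tool dist list` prints, e.g. "linux/arm64".
// (Hypothetical helper, introduced only for this sketch.)
func formatTarget(goos, goarch string) string {
	return goos + "/" + goarch
}

func main() {
	// runtime.GOOS and runtime.GOARCH are constants fixed at compile time,
	// so they describe the target the binary was built for,
	// not the machine that ran the compiler.
	fmt.Println("built for", formatTarget(runtime.GOOS, runtime.GOARCH))
}
```

Building this with something like GOOS=linux GOARCH=arm64 go build on any machine should give a binary that reports that target when it is run on an ARM Linux host.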
Option 1 - Multistage build
The first option is to use a multistage build. This means you use a regular container to compile your app, and then you copy all the relevant parts over to a distroless container. In fact this is how the distroless Go FAQ shows it. This is an example Dockerfile:
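# Stage 1: compile the binary inside the full Go image
FROM golang:1.14 AS build-env
WORKDIR /go/src/app
COPY . /go/src/app
RUN go get -d -v ./...
RUN go build -o /go/bin/app
# Stage 2: copy only the compiled binary into the distroless image
FROM gcr.io/distroless/base
COPY --from=build-env /go/bin/app /
CMD ["/app"]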
Here we have an initial container using golang:1.14 and we name it build-env. This is a temporary container. Our source code is passed into this container, we get our dependencies, and then we build the app. The second part then pulls the distroless/base container and copies the binary from the build-env container. What you’re left with is a tiny distroless container with a compiled binary ready to be invoked.
Option 2 - Makefile
There is a faster way to do this though. Remember Go allows cross-compilation, so you could just compile the binary on your machine and copy that into a new distroless container. It doesn’t matter what OS/Arch you’re running vs the container, as we just compile to the target container OS/Arch. This is an example Makefile:
build:
	GOOS=linux go build -o app
	docker build -t xxxx/xxxx/appname .
	rm -f app
And this is the Dockerfile:
FROM gcr.io/distroless/base
ADD app /app
ADD config.ini /
ENTRYPOINT ["/app"]
Now all I need to do is run make build and I have a container ready to go. In this instance I’m setting the target OS to linux and compiling the app for it. docker build then reads the Dockerfile, builds a new container and copies over the binary. I’m also copying a local config.ini used in this particular app. The Makefile then deletes the binary on my local machine. This process is extremely fast as the compiling is done locally and only the resulting binary is copied into a distroless container.
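With the Dockerfile above, the binary lands at /app and config.ini at /, so both sit in the same directory. How the app finds the file isn’t covered here, so the following is only a sketch of one possible approach: resolve config.ini relative to the running executable. The configPath helper is a hypothetical name introduced for illustration; os.Executable, os.Stat and filepath.Join are standard library calls.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// configPath returns the location of config.ini inside dir.
// (Hypothetical helper, introduced only for this sketch.)
func configPath(dir string) string {
	return filepath.Join(dir, "config.ini")
}

func main() {
	// os.Executable reports the path of the running binary,
	// which would be /app inside the image built above.
	exe, err := os.Executable()
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot locate executable:", err)
		return
	}

	// Look for config.ini next to the binary rather than relying
	// on the container's working directory.
	path := configPath(filepath.Dir(exe))
	info, err := os.Stat(path)
	if err != nil {
		// A real service would probably exit non-zero here.
		fmt.Fprintln(os.Stderr, "config not found:", err)
		return
	}
	fmt.Printf("found %s (%d bytes)\n", path, info.Size())
}
```

Resolving the path from the executable means the lookup doesn’t depend on the image’s working directory.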
Python
Unfortunately Python is not so straightforward. Python does not compile to a standalone binary, so you need to have all your dependencies inside the distroless container. I did try PyInstaller but I couldn’t get it to work. Maybe that could be an option if you manage to get it working. In my case I reverted to a multistage build, then copied the libraries over to my distroless container. The distroless Python container runs Python 3.7, so I just match that in my initial container. No Makefile this time, just a Dockerfile:
FROM python:3.7-slim AS build-env
WORKDIR /app
ADD ./ ./
RUN pip3 install --upgrade pip && \
    pip install -r ./requirements.txt
FROM gcr.io/distroless/python3-debian10
COPY --from=build-env /app /app
COPY --from=build-env /usr/local/lib/python3.7/site-packages /usr/local/lib/python3.7/site-packages
WORKDIR /app
ENV PYTHONPATH=/usr/local/lib/python3.7/site-packages
CMD ["app.py"]
requirements.txt contains a list of the Python libraries I need installed. My example:
grpcio
grpcio-tools
protobuf
matplotlib
The end result is a distroless Python container with both my app and the required dependencies ready to go.