Docker Container image size optimization.

Ripping out Python and Reducing Our Docker Image Size by ~87%

Sep 19, 2023 | Web App Development

Background

We built an OSS containerized tool, cloud-concierge, for automating Terraform best practices, and primarily implemented the solution in Go, given that Go generally is the “lingua franca” for Terraform tooling. With some gaps in Go, however, we built a few components of the container with Python in order to quickly ship a functioning product. Once we had Python packaged into the container, it became natural to add additional functionality that relies on Python to keep moving fast, which was okay until…

The Bloated Image

While we were able to release quickly and get valuable initial feedback from users, it became clear that the cloud-concierge image we made available was bloated. Its statistics:

  • 942 MB on DockerHub
  • 2.69 GB fully extracted

Why this was bad:

  1. Users faced a 5–10 minute wait just to pull and extract the latest cloud-concierge image to run locally.
  2. We called Python scripts for select functionality through Go's os/exec package. While this worked, it led to unwieldy error logs when bugs arose, and it increased the overall complexity of the code base.
  3. All else being equal, a smaller container is more secure.
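To illustrate the second point, here is a minimal sketch of what shelling out to a Python script from Go tends to look like. It is not the actual cloud-concierge code: the runScript helper and the scripts/example.py path are hypothetical. The only way to surface a Python failure is to capture the child's stderr and fold it into the Go error, which is where the unwieldy logs come from.

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
)

// runScript runs an external command and returns its stdout.
// On failure, the child's stderr (for Python, usually a full traceback)
// is folded into the returned Go error.
func runScript(name string, args ...string) (string, error) {
	var stdout, stderr bytes.Buffer

	cmd := exec.Command(name, args...)
	cmd.Stdout = &stdout
	cmd.Stderr = &stderr

	if err := cmd.Run(); err != nil {
		return "", fmt.Errorf("running %s: %w (stderr: %q)", name, err, stderr.String())
	}
	return stdout.String(), nil
}

func main() {
	// Hypothetical script path, used only for illustration.
	out, err := runScript("python3", "scripts/example.py")
	if err != nil {
		fmt.Println("script failed:", err)
		return
	}
	fmt.Print(out)
}
```

Every call like this also requires the interpreter and the script's dependencies to be present in the image, which is the link between this pattern and the image size.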

Conversion to Go Only

We initially tried to use the slim tool to shrink the container, but it is best suited for HTTP servers, so it did not work for cloud-concierge, which runs as a run-until-complete job.

With no easy fixes available, the most obvious way to reduce image size while preserving functionality was to remove all Python elements from the image.

This included addressing the following components:

This was a relatively straightforward conversion of the implementation from Python to Go, and it eliminated the need to include some bulkier packages like Pandas and NumPy. Of course, nothing in Go matches the ease of data manipulation in Pandas, but the package-elimination juice was worth the squeeze here.

Unfortunately, this script is tied to a very specific Python package, spaCy, that cannot be shifted directly to Go. As a result, we ended up pulling this script into its own service and hosting it within a Google Cloud Function that the cloud-concierge container now calls as an endpoint. Offloading this script to a different service reduced the image size by ~300 MB.

Both of these CLIs for major cloud providers require Python runtimes, and gcloud in particular was onerous enough to install in the container that the easiest implementation was to use the google/cloud-sdk image as the base image in our original build.

Replacing the CLI calls with the equivalent Go SDKs required a bit of digging through documentation, but it let us swap our base image for Alpine, which saved hundreds of megabytes of space.

Tale of the Tape

After implementing each of these optimizations to rid the image of Python, the results were in:

  • 125.44 MB on DockerHub, a ~7.5x reduction
  • 470 MB fully extracted, a ~5.7x reduction

From release v0.1.3 to v0.1.5, you can see the steep decline in image size for yourself on DockerHub. Because both the download and the extraction are now much faster, this reduction dramatically lowers time to value for users.

Conclusion

We initially leaned on Python for select functionality to ship our product as quickly as possible. This came at the cost, however, of a bloated Docker image. By methodically removing all Python code from the container, we shrank the image size dramatically. This yielded much faster download and extraction times for users, all without compromising on value delivered.

dragondrop.cloud’s mission is to automate developer best practices while working with Infrastructure as Code. Our flagship OSS product, cloud-concierge, allows developers to codify their cloud, detect drift, estimate cloud costs and security risks, and more — while delivering the results via a Pull Request. For enterprises running cloud-concierge at scale, we provide a management platform. To learn more, schedule a demo or get started today!
