This post will be the first of several addressing Docker image optimizations for different project types. It stems from my recent experiences with badly written Dockerfiles, which result in sitting around for 10 minutes every time you build, only to then need to upload images of over 1GB in size. These are extreme examples, but it's likely in the best interest of any developer to have an optimally written Dockerfile (if only because time is precious). What follows is mainly a collection of notes picked up for those working with Rust and Cargo builds. Other posts will follow for other languages as I learn the necessary practices.
Let's look at a base Dockerfile for a typical Cargo project. I have opted to use a real project I work on as an example (although I can't go into too much detail). This project is relatively small in terms of code size, but does have several dependencies on projects such as futures-rs, tokio, etc.
A very simple Dockerfile (and I'm sure one we have all written) could look something like this:
    # select image
    FROM rust:1.23

    # copy your source tree
    COPY ./ ./

    # build for release
    RUN cargo build --release

    # set the startup command to run your binary
    CMD ["./target/release/my_project"]
It's minimal, but works just fine - you copy across your project and build it ready for release. Let's look at how a build like this fares (assuming you have the base image rust:1.23 already downloaded):
    Cargo build time: 227.7s
    Total time taken: 266.6s
    Final image size: 1.66GB
That's roughly a 4.5 minute build, and a very large image (I was genuinely surprised by that number). But now we have our baseline! Let's see what we can do to improve it.
Optimizing Build Times
The first aspect we're going to look at is the amount of time taken to build an image (partly because otherwise I'll have to suffer 5 minute builds for the rest of this post).
It might surprise you to learn that given the base Dockerfile above, the Docker cache is almost totally useless. Every time you copy over your ./ project, if anything in there has changed, the cache is invalidated for that step and every step after it. This means you'll have to sit through that whole build all over again. Disaster!
So the first (and most important) thing we need to do is avoid that cache invalidation. Part of the reason the build is so long is that Cargo is building all of your dependencies, as they're pulled in at the same time as your source is being compiled. Lucky for us, there's a neat little trick which can get your dependencies into the cache (and therefore speed up your builds).
In essence, you need to create a new Cargo project inside the image, with all of the same dependencies as yours, and compile that before you move your source code across. An example could look something like this:
    # select image
    FROM rust:1.23

    # create a new empty shell project
    RUN USER=root cargo new --bin my_project
    WORKDIR /my_project

    # copy over your manifests
    COPY ./Cargo.lock ./Cargo.lock
    COPY ./Cargo.toml ./Cargo.toml

    # this build step will cache your dependencies
    RUN cargo build --release
    RUN rm src/*.rs

    # copy your source tree
    COPY ./src ./src

    # build for release
    RUN cargo build --release

    # set the startup command to run your binary
    CMD ["./target/release/my_project"]
This image will build all dependencies before you introduce your source code, which means they'll be cached most of the time. They only need to be recompiled when your actual dependencies change (that is, when Cargo.toml or Cargo.lock changes). Note that we now copy a specific set of files rather than the whole directory, to avoid accidentally invalidating the cache. Let's see how this fares:
    Cargo build time: 191.6s + 30.6s (222.2s)
    Total time taken: 262.3s
    Final image size: 1.67GB
This looks almost the same, which is expected because there's nothing in the cache yet. On the first build you'll see almost no improvement; it's the next build where these changes really shine (to test this, make sure to change a file in your src tree):
    Cargo build time: 0s + 30.9s (30.9s)
    Total time taken: 33.6s
    Final image size: 1.67GB
The dependencies are the same so your cache is hit, and bang, the whole build now takes roughly 13% of the original time (33.6s versus 266.6s). As far as I know (at this point), there's very little else we can do to gain a faster build time.
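One caveat worth checking with the dependency-caching Dockerfile above: Cargo decides what to rebuild largely from file modification times, and COPY keeps the timestamps from your machine. In some setups that can lead the final build to treat the placeholder binary as up to date and skip compiling your real code. A minimal sketch of a guard, assuming your entry point is src/main.rs (adjust for your own layout), is to touch that file before the final build:

```dockerfile
# select image
FROM rust:1.23

# create a new empty shell project
RUN USER=root cargo new --bin my_project
WORKDIR /my_project

# copy over your manifests and build only the dependencies
COPY ./Cargo.lock ./Cargo.lock
COPY ./Cargo.toml ./Cargo.toml
RUN cargo build --release
RUN rm src/*.rs

# copy your source tree
COPY ./src ./src

# assumption: src/main.rs is the entry point; bump its timestamp so Cargo
# sees it as newer than the placeholder build, then build for release
RUN touch src/main.rs && cargo build --release

# set the startup command to run your binary
CMD ["./target/release/my_project"]
```

If your builds already pick up source changes reliably, you likely don't need this.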
Optimizing Build Sizes
Even with our changes to speed up builds, the image sizes are still large. There are a couple of things we can do at this point, with the major one being to make use of a multi-stage Docker build. These builds allow you to "chain" builds to copy artifacts from one build to another, thus lowering the amount of churn in your final image.
It's super simple to change, too. Here's our Dockerfile from before, but with an extra stage added to hold only the build artifact (no Cargo caches, etc):
    # select build image
    FROM rust:1.23 as build

    # create a new empty shell project
    RUN USER=root cargo new --bin my_project
    WORKDIR /my_project

    # copy over your manifests
    COPY ./Cargo.lock ./Cargo.lock
    COPY ./Cargo.toml ./Cargo.toml

    # this build step will cache your dependencies
    RUN cargo build --release
    RUN rm src/*.rs

    # copy your source tree
    COPY ./src ./src

    # build for release
    RUN cargo build --release

    # our final base
    FROM rust:1.23

    # copy the build artifact from the build stage
    COPY --from=build /my_project/target/release/my_project .

    # set the startup command to run your binary
    CMD ["./my_project"]
At this point, our image has been cut down from 1.67GB to 1.45GB, which is an improvement but still not what we're looking for. This is actually the base size of rust:1.23, which gives us a hint of where to look next. The tag we're using pulls in the stretch version of the image. We can optimize here by moving to the jessie version, since we don't care about the extra stuff. Simply changing "rust:1.23" to "rust:1.23-jessie" in the Dockerfile above gets us from 1.45GB to 1.23GB, so we're at roughly 75% of where we started.
For some projects (i.e. not Rust), this is as far as you can get - we're basically the same size as the official builds. However, we have one last trick up our sleeve, and it's related to the fact that these Rust projects compile to a single (executable) binary. Once we have this binary, we don't actually need any of the Rust toolchains, nor Cargo, etc. This means that we can just use a raw Debian jessie image as the base image of the second stage in our Dockerfile, so let's substitute in debian:jessie-slim. The results are amazing:
    REPOSITORY    TAG     IMAGE ID        CREATED           SIZE
    my-project    dev     9bc6aeae4190    50 seconds ago    86.5MB
    my-project    base    b881102368b6    37 minutes ago    1.66GB
That's not a typo, our image is functionally the same (for the purposes of our project), and yet it's only 87MB in size. Fantastic!
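For reference, the Dockerfile behind that final image might look roughly like the sketch below. It assumes the build stage uses the rust:1.23-jessie tag mentioned earlier (so the binary is built and run against the same Debian release) and that the final stage uses the debian:jessie-slim image; the project name my_project is the same placeholder used throughout.

```dockerfile
# build stage: full Rust toolchain on a jessie base
FROM rust:1.23-jessie as build

# create a new empty shell project
RUN USER=root cargo new --bin my_project
WORKDIR /my_project

# copy over your manifests
COPY ./Cargo.lock ./Cargo.lock
COPY ./Cargo.toml ./Cargo.toml

# this build step will cache your dependencies
RUN cargo build --release
RUN rm src/*.rs

# copy your source tree
COPY ./src ./src

# build for release
RUN cargo build --release

# final stage: plain Debian, no Rust toolchain or Cargo
FROM debian:jessie-slim

# copy the build artifact from the build stage
COPY --from=build /my_project/target/release/my_project .

# set the startup command to run your binary
CMD ["./my_project"]
```

If your binary links against system libraries (OpenSSL is a common one), those would also need to be installed in the final stage.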
When we first looked at the build results for our base Dockerfile, it didn't seem so bad (but I bet it does now!). Below is a comparison of the base and the final build times and sizes:
    # base statistics
    Cargo build time: 227.7s
    Total time taken: 266.6s
    Final image size: 1.66GB

    # optimized statistics
    Cargo build time: 30.9s (~13.5%)
    Total time taken: 33.6s (~12.6%)
    Final image size: 86.5MB (~5.2%)
Perhaps the best part here is that all of this can be changed in a half hour (in fact, this blog post took me ~45 minutes whilst also working back through the changes myself). We're not building anything new, we're just making good use of the existing Docker tooling. I highly recommend that everyone take some time out to think about how they can apply similar practices to their projects (there are similar tactics for things like Maven). That half hour pays for itself after roughly 8 more builds of the base image above, after all: each build saves about 233s (266.6s down to 33.6s), and 8 of those is just over 30 minutes.
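On the Maven aside, the same idea of resolving dependencies before copying the source could be sketched as follows. This is only an illustration under assumptions: the maven:3-jdk-8 image tag, the /app directory and a standard pom.xml plus src layout are hypothetical choices, not something taken from a real project of mine.

```dockerfile
# assumption: an official Maven image with JDK 8
FROM maven:3-jdk-8 as build
WORKDIR /app

# copy only the manifest first, so this layer stays cached
# until pom.xml itself changes
COPY ./pom.xml ./pom.xml

# download dependencies and plugins into the image's local repository
RUN mvn -B dependency:go-offline

# copy your source tree and build the package
COPY ./src ./src
RUN mvn -B package
```

As with the Cargo version, a second stage on a slimmer runtime image could then copy across just the built artifact.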
Please reach out if you have any questions, or anything needs clarification. If you have actual build improvements, please also reach out (most of this stuff is things I've stumbled over so I'm sure there's other neat stuff).