6 Dockerfile Tips from the Official Images

Following on from my previous post on the Docker Official Images, in this post I’ll go through some tips and techniques for writing Dockerfiles that I learnt from the official images.

1. Prefer Debian

The majority of Dockerfiles for official images are based on Debian, either directly or through another image. The version is usually pegged to a given distribution, normally wheezy (at time of writing, the stable distribution), but several images use jessie (in testing) and even sid (unstable). The main advantage of the Debian image is its smaller size: it clocks in at 85.1 MB compared to around 200 MB for Ubuntu. Specifying the exact distribution guards against your build breaking when the distribution tagged latest is upgraded.

2. Establish Provenance

If your users need to be able to rely on and trust your image, you need to consider how to verify the authenticity of any software installed in that image. In the case of apt-getting from official Debian repositories, this is taken care of already. However, if you download files from the internet or install software from a third-party repository, you should verify the files by testing checksums or digital signatures. For example, the nginx Dockerfile does the following to verify the nginx package:

Notice nginx is pegged to a specific version. This is a good idea, as it helps to ensure the image tested by the maintainer is the same as the one built by the build system. It isn’t infallible, however, as nginx itself is likely to pull in dependencies which may change over time (consider dependencies specified as >= a given version).
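To make these two ideas concrete, here is a minimal sketch of my own; it is not the official nginx Dockerfile. It registers the repository's signing key so that apt can verify the packages, then installs one pinned version. The base tag, the keyring path and the version string are assumptions chosen for illustration, and the version shown may not match what the repository currently offers.

```dockerfile
FROM debian:bookworm-slim

# Placeholder version string (an assumption): pin to the exact package
# version you have tested.
ARG NGINX_VERSION=1.24.0-1~bookworm

RUN set -eux; \
    apt-get update; \
    apt-get install -y --no-install-recommends ca-certificates curl gnupg; \
    # Fetch the repository signing key over HTTPS and store it as a keyring
    # that only this repository is allowed to use.
    curl -fsSL https://nginx.org/keys/nginx_signing.key -o /tmp/nginx_signing.key; \
    gpg --dearmor -o /usr/share/keyrings/nginx-archive-keyring.gpg /tmp/nginx_signing.key; \
    rm /tmp/nginx_signing.key; \
    echo "deb [signed-by=/usr/share/keyrings/nginx-archive-keyring.gpg] https://nginx.org/packages/debian bookworm nginx" \
        > /etc/apt/sources.list.d/nginx.list; \
    apt-get update; \
    # apt verifies the package signature against the keyring above.
    apt-get install -y --no-install-recommends "nginx=${NGINX_VERSION}"; \
    rm -rf /var/lib/apt/lists/*
```

This sketch trusts whatever key is served over HTTPS; a stricter build would also compare the key's fingerprint against a known value before using it.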

You can do something similar for any downloaded file by computing a cryptographic checksum of the file and comparing it against a stored value. This is done in the Redis Dockerfile. Some downloads also come with a signature file which you can check with gpg, which again is commonly done in the official images.
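The gpg case might look roughly like the sketch below. The three build arguments (FILE_URL, SIG_URL and KEY_FINGERPRINT) are hypothetical names I have made up; you would supply the real values for your download with --build-arg. The keyserver is also just one possible choice.

```dockerfile
FROM debian:bookworm-slim

# Hypothetical build arguments: the file to download, its detached
# signature, and the full fingerprint of the key expected to have signed it.
ARG FILE_URL
ARG SIG_URL
ARG KEY_FINGERPRINT

RUN set -eux; \
    apt-get update; \
    apt-get install -y --no-install-recommends ca-certificates curl gnupg; \
    rm -rf /var/lib/apt/lists/*; \
    curl -fsSL "$FILE_URL" -o /tmp/download; \
    curl -fsSL "$SIG_URL" -o /tmp/download.asc; \
    # Use a throwaway keyring so no key material is left in the image.
    export GNUPGHOME="$(mktemp -d)"; \
    gpg --batch --keyserver hkps://keyserver.ubuntu.com --recv-keys "$KEY_FINGERPRINT"; \
    # A bad or missing signature makes gpg exit non-zero, failing the build.
    gpg --batch --verify /tmp/download.asc /tmp/download; \
    gpgconf --kill all; \
    rm -rf "$GNUPGHOME" /tmp/download.asc
```

Requesting the key by its full fingerprint, rather than a short key ID, is what ties the check to one specific key.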

Unfortunately, several of the official images currently fail to do this correctly, or only validate some files, so be aware of this when looking at the official Dockerfiles.

3. Remove Build Dependencies

If you compile code from source during your build, it is likely your image is much larger than it needs to be. If possible, try to install the build tools, build the software and remove the build tools all in the same RUN instruction. This is awkward and annoying, but can save hundreds of megabytes. There is no point in deleting files in a separate instruction, as they will already have been committed to an earlier image layer. For an example of how to do this, we can look at the Redis Dockerfile again:

gcc, libc and make are installed, used and deleted in this one instruction. Also note the author has deleted the no-longer-needed tar.gz file and source directory. Incidentally, this code also shows how to use sha1sum to verify the checksum of the Redis download.
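The same pattern can be sketched generically. This is my own illustration rather than a copy of the Redis Dockerfile: SRC_URL and SRC_SHA256 are hypothetical build arguments, I have swapped sha256sum in for sha1sum, and I am assuming the tarball contains a Makefile with an install target.

```dockerfile
FROM debian:bookworm-slim

# Hypothetical build arguments: pass the tarball URL and its published
# SHA-256 checksum with --build-arg.
ARG SRC_URL
ARG SRC_SHA256

RUN set -eux; \
    buildDeps='gcc libc6-dev make curl ca-certificates'; \
    apt-get update; \
    apt-get install -y --no-install-recommends $buildDeps; \
    curl -fsSL "$SRC_URL" -o /tmp/src.tar.gz; \
    # Abort the build if the download does not match the expected checksum.
    echo "$SRC_SHA256  /tmp/src.tar.gz" | sha256sum -c -; \
    mkdir -p /usr/src/app; \
    tar -xzf /tmp/src.tar.gz -C /usr/src/app --strip-components=1; \
    rm /tmp/src.tar.gz; \
    make -C /usr/src/app; \
    make -C /usr/src/app install; \
    # Remove the sources and the build tools before the layer is committed.
    rm -rf /usr/src/app; \
    apt-get purge -y --auto-remove $buildDeps; \
    rm -rf /var/lib/apt/lists/*
```

Because everything happens inside one RUN, the compiler, the tarball and the source tree never appear in any committed layer; only the installed binaries remain.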

4. Check Out gosu

The gosu utility is often used in scripts called from ENTRYPOINT instructions inside Dockerfiles for official images. It’s a very simple utility, similar to sudo, that runs a given command as a given user. The difference is that gosu avoids sudo’s “strange and often annoying TTY and signal-forwarding behavior”.

Also check out the official advice on writing entrypoint scripts, which is followed by most official images.
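A typical use is an entrypoint script that starts as root, fixes up file ownership, and then hands over to an unprivileged user. The sketch below is an assumption-laden illustration: the user "app", the /data directory and the script itself are hypothetical, gosu is installed from the Debian package archive, and the inline here-document needs BuildKit (hence the syntax line).

```dockerfile
# syntax=docker/dockerfile:1
FROM debian:bookworm-slim

RUN set -eux; \
    apt-get update; \
    apt-get install -y --no-install-recommends gosu; \
    rm -rf /var/lib/apt/lists/*; \
    # "app" is a hypothetical unprivileged user owning a hypothetical /data.
    groupadd --system app; \
    useradd --system --gid app --home-dir /data --shell /usr/sbin/nologin app; \
    mkdir -p /data

# Hypothetical entrypoint script, written inline for brevity.
COPY <<"EOF" /usr/local/bin/docker-entrypoint.sh
#!/bin/sh
set -e
# Still root here: fix ownership of /data, which may be a mounted volume.
chown -R app:app /data
# Replace this shell with the requested command, running as "app".
exec gosu app "$@"
EOF
RUN chmod +x /usr/local/bin/docker-entrypoint.sh

ENTRYPOINT ["docker-entrypoint.sh"]
# Default command; prints the identity the process ends up running as.
CMD ["id"]
```

The exec matters: the final process replaces the script instead of running as its child, so signals from docker stop reach it directly.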

5. Consider the buildpack-deps Base Image

Several of the Docker “language-stack” images are based on the buildpack-deps base image, which installs various commonly required development headers and tools (such as source code management tools). If you are building a language-stack image, you may well be able to save some time by using this base image. It has come in for some criticism for adding unnecessary bloat to images, which has led to several repositories such as Node offering alternative slim packages based directly on Debian (the full Node image is 728MB compared to just 291.4MB for slim). However, remember your users may require the same development libraries and are likely to have downloaded the base image already anyway.
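As a small sketch of what this buys you, the Dockerfile below compiles a project without installing any toolchain itself. I am assuming the build context contains a Makefile that produces a binary called app; both are hypothetical, as is the choice of tag.

```dockerfile
# buildpack-deps already ships compilers, make, common development headers
# and source control tools, so no apt-get step is needed here.
FROM buildpack-deps:bookworm

WORKDIR /usr/src/app

# Hypothetical project: a Makefile in the build context that builds ./app.
COPY . .
RUN make

CMD ["./app"]
```

The trade-off is exactly the one described above: a short Dockerfile and a shared base layer, at the cost of a much larger image than a slim variant.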

6. Use a Range of Descriptive Tags

All of the official repositories offer a range of tags. As well as a latest tag, it is good form to offer a versioned tag that users can use without fear of the base image changing and breaking their containers. The official images take this further, often offering minimal slim images as discussed above as well as onbuild images which take care of automatically importing and compiling code. By tagging these images onbuild, the user is much less surprised when code is moved about and compiled when creating a child image.
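An onbuild variant can be sketched as follows. The image name myorg/mylang and the make-based build are hypothetical; the point is only that the ONBUILD instructions do nothing when this image is built and fire later, when another Dockerfile uses it as a base.

```dockerfile
# Parent image, intended to be tagged with an explicit onbuild suffix,
# for example myorg/mylang:1.0-onbuild (a hypothetical name).
FROM buildpack-deps:bookworm

WORKDIR /usr/src/app

# Deferred instructions: these run during the build of any child image,
# immediately after its FROM line.
ONBUILD COPY . /usr/src/app
ONBUILD RUN make

CMD ["./app"]

# A child image then needs only a single line in its own Dockerfile:
#   FROM myorg/mylang:1.0-onbuild
```

You might build this once and tag it several ways, for instance with docker build -t myorg/mylang:1.0-onbuild . followed by docker tag for a shorter onbuild alias, while keeping the plain 1.0 and latest tags for the variant without ONBUILD triggers.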


Adrian Mouat

Adrian Mouat is Chief Scientist at Container Solutions and the author of the O'Reilly book "Using Docker". He has been a professional software developer for over 10 years, working on a wide range of projects from small webapps to large data mining platforms.



    • Thanks Bob.

      BusyBox and Alpine Linux are great if you don’t have many dependencies, but I think that’s the exception rather than the rule. Also, it’s worth remembering that most people will already have the Debian image and won’t need to download it again.

      Please don’t follow the advice given in that post on data containers; look at this instead: http://container42.com/2014/11/18/data-only-container-madness/. Similarly, the advice about ssh is slightly out of date, as we can now use docker exec.
