
Docker: The latest Confusion

One of the most misunderstood parts of Docker seems to be the latest tag. The confusion stems largely from its name, which doesn't really reflect what the tag implies. In this post we'll look at what the latest tag really does and what you should use it for.

There are two ways to tag images: using the docker tag command and passing the -t flag to docker build. In both cases, the argument is of the form repository_name:tag_name, e.g. myrepo:mytag. If the repository is to be uploaded to the Docker Hub, the repository name must be prefixed with the Docker Hub user name and a slash, e.g. amouat/myrepo:mytag. The trick is, of course, that if you leave the tag part out (e.g. docker tag myrepo:1.0 myrepo), Docker will automatically give it the tag latest. You probably knew all this already, but it's important to realise that this is about as far as it goes -- the latest tag doesn't have any magical powers.
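
To make that default concrete, here is a minimal sketch of the rule as a shell function. The name with_default_tag is my own invention, not a Docker command, and the registry address in the last call is an assumed example. The function only mimics the behaviour described above; it does not talk to Docker at all.

```shell
# Hypothetical helper (not part of Docker): make the implicit tag explicit.
# Prints the reference unchanged if it already carries a tag or digest,
# otherwise appends ":latest", mirroring the default described above.
with_default_tag() {
    ref=$1
    last=${ref##*/}    # final path component, e.g. "myrepo:1.0"
    case $last in
        *:*|*@*) printf '%s\n' "$ref" ;;          # tag or digest already present
        *)       printf '%s:latest\n' "$ref" ;;   # no tag given, so latest is assumed
    esac
}

with_default_tag myrepo                  # no tag
with_default_tag myrepo:1.0              # explicit tag
with_default_tag amouat/myrepo           # Docker Hub user prefix, no tag
with_default_tag localhost:5000/myrepo   # assumed example: registry host with a port
```

Only the last path component is inspected, so a colon that belongs to a registry port is not mistaken for a tag separator.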

Just because an image is tagged latest does not mean that it is the most up-to-date image within its repository. It may be, but only if the repository owner has chosen to use this convention. I can easily push an old image to the latest tag, e.g.:

  
docker images myrepo
(out) REPOSITORY          TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
(out) myrepo              1.0                 2e9f372f03a0        44 seconds ago      2.433 MB
(out) myrepo              latest              2e9f372f03a0        44 seconds ago      2.433 MB
(out) myrepo              0.9                 4986bf8c1536        2 weeks ago         2.433 MB
docker tag -f myrepo:0.9 myrepo:latest
docker images myrepo
(out) REPOSITORY          TAG                 IMAGE ID            CREATED              VIRTUAL SIZE
(out) myrepo              1.0                 2e9f372f03a0        About a minute ago   2.433 MB
(out) myrepo              0.9                 4986bf8c1536        2 weeks ago          2.433 MB
(out) myrepo              latest              4986bf8c1536        2 weeks ago          2.433 MB

And you can see latest is now the same as the 0.9 image from two weeks ago, rather than the 1.0 image from a minute ago.

It's easy to understand why this may surprise people -- consider the sentence "just pull the latest image"; does this mean the image tagged latest, or the newest image in the repository? Are they the same thing? What does it even mean to be the newest image in a repo; is it the newest stable or the newest development image?

More worryingly, some people seem to believe that the latest tag will be automatically updated -- that if I pull an image marked latest, Docker will take care of checking it is still the newest version before running it each time. This is emphatically not the case: just as with every other tag, you still need to run docker pull manually to get new versions.
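
If you do want "check for a newer version, then run", you have to script it yourself. The following is a rough sketch under my own assumptions: run_fresh is a hypothetical wrapper name, and the DOCKER variable exists only so the sketch can be dry-run without a Docker daemon. It is one possible approach, not something Docker provides.

```shell
# Hypothetical wrapper: refresh the local copy of a tag before running it.
# DOCKER defaults to the real CLI; it is a variable only so the sketch
# can be dry-run by substituting "echo docker".
DOCKER=${DOCKER:-docker}

run_fresh() {
    image=$1
    shift
    # docker run on its own reuses whatever is cached locally under this tag,
    # so pull explicitly first and only run if the pull succeeded.
    $DOCKER pull "$image" && $DOCKER run --rm "$image" "$@"
}

# Dry run in a subshell: print the commands instead of executing them
( DOCKER="echo docker"; run_fresh myrepo:latest )
```

The explicit pull applies to every tag, not just latest; the tag name plays no part in it.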

The confusion doesn't end there. What happens if I do a docker pull on a repository without specifying a tag? If you think you get all the images, you're wrong: you just get the one tagged latest. You have to use the -a flag to get all the images. So what happens if you do the pull on a repository with no latest tag? This:

  
docker pull amouat/myrepo
(out) Pulling repository amouat/myrepo
(out) 2015/01/21 12:04:06 Tag latest not found in repository amouat/myrepo

Unsurprisingly, you get an error message. But I bet you weren't sure what was going to happen.

A further annoyance is the way the latest tag hides other tags. Suppose you download the latest tag for debian. Which version is it?

  
docker images debian
(out) REPOSITORY          TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
(out) debian              latest              4d6ce913b130        4 days ago          84.98 MB

Err, no idea. Turns out it's 7.8, aka wheezy:

  
docker pull debian:7.8
(out) debian:7.8: The image you are pulling has been verified
(out) 511136ea3c5a: Already exists
(out) d0a18d3b84de: Already exists
(out) 4d6ce913b130: Already exists
(out) Status: Image is up to date for debian:7.8
docker pull debian:wheezy
(out) debian:wheezy: The image you are pulling has been verified
(out) 511136ea3c5a: Already exists
(out) d0a18d3b84de: Already exists
(out) 4d6ce913b130: Already exists
(out) Status: Image is up to date for debian:wheezy
docker images debian
(out) REPOSITORY          TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
(out) debian              7.8                 4d6ce913b130        4 days ago          84.98 MB
(out) debian              latest              4d6ce913b130        4 days ago          84.98 MB
(out) debian              wheezy              4d6ce913b130        4 days ago          84.98 MB

I would rather Docker set all the tags for the image when it is downloaded; I'm not sure what the rationale is for not doing this. The current method means users can have different versions of images locally for tags that are identical on the server (e.g. if wheezy and latest are updated on the Hub and I pull only debian:wheezy, my local wheezy tag will be ahead of my local latest tag even though the two are identical on the Hub).
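
As things stand, the only aliases you can discover locally are the ones you have already pulled. A minimal sketch of how to list them: aliases_of is a hypothetical helper of mine that reads docker images style output on standard input and assumes the column layout shown in the listings above (TAG second, IMAGE ID third).

```shell
# Hypothetical helper: list the local tags that share an image ID with a given tag.
# Reads `docker images <repo>` style output on stdin: one header line, then rows
# with TAG in column 2 and IMAGE ID in column 3.
aliases_of() {
    awk -v tag="$1" '
        NR == 1 { next }          # skip the header row
        { id[$2] = $3 }           # remember each tag and its image ID
        END {
            if (!(tag in id)) exit            # the requested tag is not present locally
            for (t in id)
                if (t != tag && id[t] == id[tag]) print t
        }
    ' | sort
}

# Sample rows copied from the debian listing above
aliases_of latest <<'EOF'
REPOSITORY          TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
debian              7.8                 4d6ce913b130        4 days ago          84.98 MB
debian              latest              4d6ce913b130        4 days ago          84.98 MB
debian              wheezy              4d6ce913b130        4 days ago          84.98 MB
EOF
```

Against a real daemon the same function could be fed with docker images debian | aliases_of latest. It says nothing about tags that exist only on the Hub, which is exactly the limitation described above.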

That just about covers most of the semantics of latest and how it is commonly misunderstood. So how could the situation be improved? Personally, I would like to see the latest tag deprecated and replaced with a name that more closely matches its semantics, such as "default". I would also like to see a few tweaks to the way tagging works, such as updating all the tags for an image at the same time. In the meantime, however, I would strongly advise repository owners to be wary of the latest tag and consider not using it at all.
