Tuesday, March 28, 2017

Docker From the Ground Up: Building Images

Docker containers are on the rise as a best practice for deploying and managing cloud-native distributed systems. Containers are instances of Docker images. It turns out that there is a lot to know and understand about images. 

In this two-part tutorial, I'm covering Docker images in depth. In part one I discussed the basic principles, design considerations, and inspecting image internals. In this part, I cover building your own images, troubleshooting, and working with image repositories. 

When you come out on the other side, you'll have a solid understanding of what Docker images are exactly and how to utilize them effectively in your own applications and systems.

Building Images

There are two ways to build images. You can modify an existing container and then commit it as a new image, or you can write a Dockerfile and build it into an image. We'll go over both and explain the pros and cons.

Manual Builds

With manual builds, you treat your container like a regular computer. You install packages, you write files, and when it's all said and done, you commit it and end up with a new image that you use as a template to create many more identical containers or even base other images on.

Let's start with the alpine image, which is a very small and spartan image based on Alpine Linux. We can run it in interactive mode to get into a shell. Our goal is to add a file called "yeah" that contains the text "it works!" to the root directory and then create a new image from it called "yeah-alpine". 

Here we go. Nice, we're already in the root dir. Let's see what's there.

What editor is available? No vim, no nano?

Oh, well. We just want to create a file:
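Roughly, the session so far looks like the sketch below. The docker run line appears as a comment because it is typed on the host; the remaining lines are what you would type at the container's shell prompt, and they behave the same in any POSIX shell.

```shell
# On the host, start an interactive shell in a fresh alpine container:
#   docker run -it alpine /bin/sh

# Inside the container:
ls                        # see what is in the current directory
echo 'it works!' > yeah   # no editor needed, just redirect echo into the file
cat yeah                  # prints: it works!
```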

I exited from the interactive shell, and I can see the container named "vibrant_spence" with docker ps --all. The --all flag is important because the container is not running anymore.

Here, I create a new image from the "vibrant_spence" container. I added the commit message "mine, mine, mine" for good measure.

Let's check it out. Yep, there is a new image, and in its history you can see a new layer with the "mine, mine, mine" comment.
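The commit and the follow-up check can be sketched as below. The container name is the one Docker generated in this walkthrough, so yours will differ. The docker info guard is an added convenience, not part of the original session: it makes the snippet a no-op on a machine without a reachable Docker daemon.

```shell
container=vibrant_spence   # name generated by Docker in this walkthrough
image=yeah-alpine

if docker info >/dev/null 2>&1; then
    docker ps --all                                          # stopped containers are listed too
    docker commit -m "mine, mine, mine" "$container" "$image"
    docker images "$image"                                   # the new image should show up here
    docker history "$image"                                  # the commit message appears as a layer comment
else
    echo "Docker is not available; skipping"
fi
```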

Now for the real test. Let's delete the container and create a new container from the image. The expected result is that the "yeah" file will be present in the new container.
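A non-interactive way to run that test is sketched here. It assumes the file was created as /yeah and that the names match the ones used above; the --rm flag simply cleans up the throwaway container afterwards.

```shell
container=vibrant_spence   # the old container from this walkthrough
image=yeah-alpine

# Only talk to Docker when a daemon is reachable
if docker info >/dev/null 2>&1; then
    docker rm "$container"               # delete the container the image was committed from
    docker run --rm "$image" cat /yeah   # a brand-new container should still have the file
else
    echo "Docker is not available; skipping"
fi
```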

What can I say? Yeah, it works!

Using a Dockerfile

Creating images out of modified containers is cool, but there is no accountability. It's hard to keep track of the changes and know what the specific modifications were. The disciplined way to create images is to build them using a Dockerfile.

The Dockerfile is a text file that resembles a shell script, but it uses its own set of instructions, such as FROM, COPY, RUN, and CMD. Every instruction that modifies the file system creates a new layer. In part one we discussed the importance of dividing your image into layers properly. The Dockerfile is a big topic in and of itself.

Here, I'll just demonstrate a couple of commands to create another image, "oh-yeah-alpine", based on a Dockerfile. In addition to creating the infamous "yeah" file, let's also install vim. The Alpine Linux distribution uses a package management system called "apk". Here is the Dockerfile:
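What follows is a minimal sketch reconstructed from the description in the next paragraph, so treat the exact instructions as assumptions. It is written as a shell heredoc so it can be pasted into a terminal, and the "oh-yeah" directory name is hypothetical.

```shell
mkdir -p oh-yeah
echo 'it works!' > oh-yeah/yeah   # the file the image will copy in

cat > oh-yeah/Dockerfile <<'EOF'
# Start from the small alpine base image
FROM alpine

# Copy the "yeah" file from the build context into the image root
COPY yeah /yeah

# Refresh the package index and install vim
RUN apk update && apk add vim

# Default command: print the content of the file
CMD ["cat", "/yeah"]
EOF
```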

The base image is alpine. It copies the "yeah" file from the same host directory where the Dockerfile is (the build context path). Then, it runs apk update and installs vim. Finally, it sets the command that is executed when the container runs. In this case it will print to the screen the content of the "yeah" file.

OK. Now that we know what we're getting into, let's build this thing. The "-t" option sets the repository. I didn't specify a tag, so it will be the default "latest".

Looks good. Let's verify the image was created:
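Put together, the build and the check might look like this sketch. The "oh-yeah" directory is a hypothetical build context that holds the Dockerfile and the "yeah" file; the guard keeps the snippet harmless when Docker or that directory is missing.

```shell
image=oh-yeah-alpine
context=oh-yeah   # hypothetical directory with the Dockerfile and the "yeah" file

if docker info >/dev/null 2>&1 && [ -f "$context/Dockerfile" ]; then
    docker build -t "$image" "$context"   # no tag given, so the image is tagged "latest"
    docker images "$image"                # lists the repository, tag, image ID, and size
else
    echo "Docker or the build context is not available; skipping"
fi
```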

Note how installing vim and its dependencies bloated the size of the image from the 4.8MB of the base alpine image to a massive 30.5MB!

It's all very nice. But does it work?
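One way to find out, assuming the image was built as oh-yeah-alpine, is to run it without naming a command so that the default command set in the Dockerfile takes over.

```shell
image=oh-yeah-alpine

# Only talk to Docker when a daemon is reachable
if docker info >/dev/null 2>&1; then
    docker run --rm "$image"   # no command given, so the image's CMD runs
else
    echo "Docker is not available; skipping"
fi
```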

Oh yeah, it works!

In case you're still suspicious, let's go into the container and examine the "yeah" file with our freshly installed vim.
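A sketch of that interactive check is below. Naming /bin/sh on the command line overrides the image's default command; the extra terminal test is an assumption added here because -it only makes sense when a terminal is attached.

```shell
image=oh-yeah-alpine

# -it needs a real terminal, so skip when stdin is not one
if [ -t 0 ] && docker info >/dev/null 2>&1; then
    docker run --rm -it "$image" /bin/sh
    # then, at the container prompt:  vim /yeah
else
    echo "No terminal or no Docker; skipping"
fi
```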

The Build Context and the .dockerignore file

I didn't tell you, but originally when I tried to build the oh-yeah-alpine image, it just hung for several minutes. The issue was that I just put the Dockerfile in my home directory. When Docker builds an image, it first packs the whole directory where the Dockerfile is (including sub-directories) and makes it available for COPY commands in the Dockerfile. 

Docker is not trying to be smart and analyze your COPY commands. It just packs the whole thing. Note that the build context will not end up in your image, but an unnecessarily large build context will slow down your build command.

In this case, I simply copied the Dockerfile and the "yeah" file into a sub-directory and ran the docker build command in that sub-directory. But sometimes you have a complicated directory tree from which you want to copy specific sub-directories and files and ignore others. Enter the .dockerignore file.

This file lets you control exactly what goes into the build context. My favorite trick is to first exclude everything and then start including the bits and pieces I need. For example, in this case I could create the following .dockerignore file and keep the Dockerfile and the "yeah" file in my home directory:
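A minimal sketch of such a file, again written through a shell heredoc. The "oh-yeah-context" directory is a hypothetical stand-in for the home directory; the patterns rely on .dockerignore evaluating rules in order, with "!" marking an exception to an earlier exclusion.

```shell
context=oh-yeah-context   # hypothetical stand-in for the home directory
mkdir -p "$context"

cat > "$context/.dockerignore" <<'EOF'
# Exclude everything from the build context...
*
# ...then bring back only the file the Dockerfile copies
!yeah
EOF

cat "$context/.dockerignore"
```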

There is no need to include the "Dockerfile" itself or the ".dockerignore" file in the build context.

Copying vs. Mounting

Copying files into the image is sometimes what you need, but in other cases you may want your containers to be more dynamic and work with files on the host. This is where volumes and mounts come into play. 

Mounting host directories is a different ball game. The data is owned by the host and not by the container. The data can be modified when the container is stopped. The same container can be started with different host directories mounted.
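As a small illustration, the sketch below bind-mounts a host directory into a stock alpine container. The "shared" directory and the /data mount point are made-up names for this example.

```shell
mkdir -p shared
echo 'it works!' > shared/yeah   # this file lives on the host, not in any image

# Only talk to Docker when a daemon is reachable
if docker info >/dev/null 2>&1; then
    # -v takes an absolute host path, then the path inside the container
    docker run --rm -v "$PWD/shared:/data" alpine cat /data/yeah
else
    echo "Docker is not available; skipping"
fi
```

Edit shared/yeah on the host and run the same command again, and the container sees the new content without any rebuild.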

Tagging Images

Tagging images is very important if you develop a microservices-based system and you generate a lot of images that must sometimes be associated with each other. You can add as many tags as you want to an image.

You've already seen the default "latest" tag. Sometimes, it makes sense to add other tags, like "tested", "release-1.4", or the git commit that corresponds to the image.

You can tag an image during a build or later. Here's how to add a tag to an existing image. Note that while it's called a tag, you can also assign a new repository.
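Both cases can be sketched with docker tag. The "tested" tag comes from the examples above, while the "oh-yeah-alpine-2" repository name is purely illustrative.

```shell
image=oh-yeah-alpine

# Only talk to Docker when a daemon is reachable
if docker info >/dev/null 2>&1; then
    docker tag "$image" "${image}:tested"   # add a second tag in the same repository
    docker tag "$image" "${image}-2"        # same image under a new repository name
    docker images                           # all three names share one image ID
else
    echo "Docker is not available; skipping"
fi
```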

You can also untag by removing an image by its tag name. This is a little scary because if you remove the last tag by accident, you lose the image. But if you build images from a Dockerfile, you can just rebuild the image.

If I try to remove the last remaining tagged image, I get an error because it is used by a container.

But if I remove the container...
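The sequence might look like the sketch below. Looking up the containers with the ancestor filter is an assumption made for this example; removing a container by the name or ID shown in docker ps --all works just as well.

```shell
image=oh-yeah-alpine

# Only talk to Docker when a daemon is reachable
if docker info >/dev/null 2>&1; then
    # Expected to fail while a container still references the image
    docker rmi "$image" || echo "could not remove $image yet"
    # Remove every container, running or not, that was created from the image
    docker rm $(docker ps --all --quiet --filter "ancestor=$image")
    docker rmi "$image"   # with no containers left, the removal goes through
else
    echo "Docker is not available; skipping"
fi
```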

Yep. It's gone. But don't worry. We can rebuild it:

Yay, it's back. Dockerfile for the win!

Working With Image Registries

Images are very similar in some respects to git repositories. They are also built from an ordered set of commits. You can think of two images that use the same base image as branches (although there is no merging or rebasing in Docker). An image registry is the equivalent of a central git hosting service like GitHub. Guess what the official Docker image registry is called? That's right, Docker Hub.

Pulling Images

When you run an image, if it doesn't exist locally, Docker will try to pull it from an image registry. By default it goes to Docker Hub; to pull from a different registry, prefix the image name with that registry's host name. Docker keeps its registry login settings in your "~/.docker/config.json" file. If you use a different registry, follow its instructions, which typically involve logging in with your credentials for that registry.

Let's delete the "hello-world" image and pull it again using the docker pull command.

It's gone. Let's pull now.
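The delete-and-pull round trip can be sketched as follows. It assumes no container still references hello-world, since otherwise the removal is refused.

```shell
image=hello-world

# Only talk to Docker when a daemon is reachable
if docker info >/dev/null 2>&1; then
    docker rmi "$image"      # drop the local copy
    docker images "$image"   # should list nothing now
    docker pull "$image"     # fetch it again from Docker Hub
else
    echo "Docker is not available; skipping"
fi
```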

The latest hello-world was replaced with a newer version.

Pushing Images

Pushing images is a little more involved. First you need to create an account on Docker Hub (or another registry). Next, you log in. Then you need to tag the image you want to push according to your account name ("g1g1" in my case).

Now, I can push the g1g1/hello-world tagged image.
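The three steps can be sketched like this. The account name is the one from the text, so replace it with your own; the push is rejected for any account you are not logged in to. The terminal test is an added assumption, because docker login prompts for credentials.

```shell
account=g1g1        # Docker Hub account from the text; replace with your own
image=hello-world

# docker login prompts for input, so require a terminal as well as a daemon
if [ -t 0 ] && docker info >/dev/null 2>&1; then
    docker login                                # asks for user name and password
    docker tag "$image" "${account}/${image}"   # name the image after the account
    docker push "${account}/${image}"           # upload it to Docker Hub
else
    echo "No terminal or no Docker; skipping"
fi
```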

Conclusion

Docker images are the templates for your containers. They are designed to be efficient and to offer maximum reuse through a layered file system storage driver.

Docker provides a lot of tools for listing, inspecting, building and tagging images. You can pull and push images to image registries like Docker Hub to easily manage and share your images.

