1. Process Isolation

In this demo, we’ll illustrate:

  • What containerized process IDs look like inside versus outside of a kernel PID namespace

  • How to impose control group limitations on CPU and memory consumption of a containerized process

Exploring the PID Kernel Namespace

Start a simple container we can explore:

[user@node ~]$ docker container run -d --name pinger centos:7 ping 8.8.8.8

Use docker container exec to launch a child process inside the container’s namespaces:

[user@node ~]$ docker container exec pinger ps -aux

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.1  0.0  24860  1884 ?        Ss   02:20   0:00 ping 8.8.8.8
root         5  0.0  0.0  51720  3504 ?        Rs   02:20   0:00 ps -aux

Run the same ps directly on the host, and search for your ping process:

[user@node ~]$ ps -aux | grep ping

USER      PID %CPU %MEM    VSZ   RSS TTY     STAT START   TIME COMMAND
root    11622  0.0  0.0  24860  1884 ?       Ss   02:20   0:00 ping 8.8.8.8
centos  11839  0.0  0.0 112656  2132 pts/0   S+   02:23   0:00 grep --color=auto ping

The ping process appears as PID 1 inside the container, but as some higher PID (11622 in this example) from outside the container.
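
For more direct evidence that these are two views of the same process, you can compare PID-namespace identifiers via /proc (an optional check; it needs sudo on the host, and the placeholder is the host PID you found above):

[user@node ~]$ docker container exec pinger readlink /proc/1/ns/pid
[user@node ~]$ sudo readlink /proc/[host PID of ping]/ns/pid

Both commands should print the same pid:[...] identifier, while sudo readlink /proc/1/ns/pid on the host prints a different one - the host's own PID namespace.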

List your containers to show this ping container is still running:

[user@node ~]$ docker container ls

CONTAINER ID  IMAGE     COMMAND         ...  STATUS        ...  NAMES
bb3a3b1cbb78  centos:7  "ping 8.8.8.8"  ...  Up 6 minutes       pinger

Kill the ping process by host PID, and show the container has stopped:

[user@node ~]$ sudo kill -9 [host PID of ping]
[user@node ~]$ docker container ls

CONTAINER ID  IMAGE     COMMAND         ...  STATUS        ...  NAMES

Killing the ping process on the host also kills the container: a running container is nothing more than its PID 1 process, plus the kernel tooling that isolates it from the host. Note that kill -9 is used here purely for demonstration; never stop containers this way.
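
If you want to repeat the experiment and shut a container down cleanly instead, use the normal lifecycle commands; a quick sketch (the name pinger2 is just an example, chosen to avoid clashing with the stopped pinger container):

[user@node ~]$ docker container run -d --name pinger2 centos:7 ping 8.8.8.8
[user@node ~]$ docker container stop pinger2    # sends SIGTERM, then SIGKILL after a grace period
[user@node ~]$ docker container rm pinger2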

Imposing Resource Limitations With Cgroups

Start a container that consumes two full CPUs:

[user@node ~]$ docker container run -d training/stress:3.0 --vm 2

Here the --vm flag starts 2 dummy processes that allocate and free memory as fast as they can, each consuming as many CPU cycles as possible.
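
To interpret the CPU numbers that follow, it helps to know how many CPUs this node has; nproc reports the count (an optional check on the host):

[user@node ~]$ nproc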

Check the CPU consumption of processes in the container:

[user@node ~]$ docker container top <container ID>

UID     PID     PPID    C   ...   CMD
root    5806    5789    0   ...   /usr/bin/stress --verbose --vm 2
root    5828    5806    99  ...   /usr/bin/stress --verbose --vm 2
root    5829    5806    99  ...   /usr/bin/stress --verbose --vm 2

The C column reports CPU consumption in percent; this container is hogging two full CPUs! See the same thing by running ps -aux both inside and outside this container, as we did above; the same processes and their CPU utilization are visible from either side:

[user@node ~]$ docker container exec <container ID> ps -aux

USER       PID %CPU %MEM   ...   COMMAND
root         1  0.0  0.0   ...   /usr/bin/stress --verbose --vm 2
root         5 98.9  6.4   ...   /usr/bin/stress --verbose --vm 2
root         6 99.0  0.4   ...   /usr/bin/stress --verbose --vm 2
root         7  2.0  0.0   ...   ps -aux

And on the host directly, via the PIDs we found from docker container top above:

[user@node ~]$ ps -aux | grep <PID>

USER       PID %CPU %MEM   ...   COMMAND
root      5828 99.3  4.9   ...   /usr/bin/stress --verbose --vm 2
centos    6327  0.0  0.0   ...   grep --color=auto 5828

Kill off this container:

[user@node ~]$ docker container rm -f <container ID>

This is the right way to kill and remove a running container (not kill -9).

Run the same container again, but this time with a cgroup limitation on its CPU consumption:

[user@node ~]$ docker container run -d --cpus="1" training/stress:3.0 --vm 2

Do docker container top and ps -aux again, just like above; you’ll see the processes taking up half a CPU each, for a total of 1 CPU consumed. The --cpus="1" flag has imposed a control group limitation on the processes in this container, constraining them to consume a total of no more than one CPU.
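
You can confirm the limit from the container's metadata: the --cpus="1" flag is recorded as NanoCpus (billionths of a CPU, so one full CPU appears as 1000000000). If your host uses cgroup v1 with Docker's default cgroupfs driver, the same limit is also visible as a CFS quota under /sys/fs/cgroup - note that this path is an assumption and may differ on your system:

[user@node ~]$ docker container inspect --format '{{.HostConfig.NanoCpus}}' <container ID>
[user@node ~]$ cat /sys/fs/cgroup/cpu/docker/<full container ID>/cpu.cfs_quota_us    # assumes cgroup v1 + cgroupfs driver

With the default CFS period of 100000 microseconds, a quota of 100000 corresponds to one full CPU.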

Find the host PID of a process running in this container using docker container top again, and then see what cgroups that process lives in on the host:

[user@node ~]$ cat /proc/<host PID of containerized process>/cgroup

12:memory:/docker/31d03...
11:freezer:/docker/31d03...
10:hugetlb:/docker/31d03...
9:perf_event:/docker/31d03...
8:net_cls,net_prio:/docker/31d03...
7:cpuset:/docker/31d03...
6:pids:/docker/31d03...
5:blkio:/docker/31d03...
4:rdma:/
3:devices:/docker/31d03...
2:cpu,cpuacct:/docker/31d03...
1:name=systemd:/docker/31d03...

Get a summary of resources consumed by processes in a control group via systemd-cgtop:

[user@node ~]$ systemd-cgtop

Path                Tasks   %CPU     Memory  Input/s    Output/s

/                   68      112.3    1.0G    -          -
/docker             -       99.3     301.0M  -          -
/docker/31d03...    3       99.3     300.9M  -          -
...

Here again we can see that the processes living in the container’s control group (/docker/31d03…​) are constrained to take up only about 1 CPU.

Remove this container, spin up a new one that creates a lot of memory pressure, and check its resource consumption with docker stats:

[user@node ~]$ docker container rm -f <container ID>
[user@node ~]$ docker container run -d training/stress:3.0 --vm 2 --vm-bytes 1024M
[user@node ~]$ docker stats

CONTAINER           CPU %               MEM USAGE / LIMIT     MEM %    ...
b29a6d877343        198.94%             937.2MiB / 3.854GiB   23.75%   ...

Kill this container off, start it again with a memory constraint, and list your containers:

[user@node ~]$ docker container rm -f <container ID>
[user@node ~]$ docker container run \
    -d -m 256M training/stress:3.0 --vm 2 --vm-bytes 1024M
[user@node ~]$ docker container ls -a

CONTAINER ID        IMAGE                 ...  STATUS
296c8f76af5c        training/stress:3.0   ...  Exited (1) 26 seconds ago

It exited immediately this time.

Inspect the metadata for this container, and look for the OOMKilled key:

[user@node ~]$ docker container inspect <container ID> | grep 'OOMKilled'

        "OOMKilled": true,

When the containerized process tries to exceed its memory limit, it gets killed with an out-of-memory (OOM) exception.
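
A compact way to check the OOM flag and the exit code together is an inspect format string (both fields live under .State in the container's metadata):

[user@node ~]$ docker container inspect \
    --format 'OOMKilled={{.State.OOMKilled}} ExitCode={{.State.ExitCode}}' <container ID>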

Conclusion

In this demo, we explored some of the most important technologies that make containerization possible: kernel namespaces and control groups. The core message here is that containerized processes are just processes running on their host, isolated and constrained by these technologies. All the tools and management strategies you would use for conventional processes apply just as well for containerized processes.

2. Creating Images

In this demo, we’ll illustrate:

  • How to read each step of the image build output

  • How intermediate image layers behave in the cache and as independent images

  • What 'dangling' and <missing> image layers are

Understanding Image Build Output

Make a folder demo for our image demo:

[user@node ~]$ mkdir demo ; cd demo

And create a Dockerfile therein with the following content:

FROM centos:7
RUN yum update -y
RUN yum install -y which
RUN yum install -y wget
RUN yum install -y vim

Build your image from your Dockerfile, just like we did in the last exercise:

[user@node demo]$ docker image build -t demo .

Examine the output from the build process. The very first line looks like:

Sending build context to Docker daemon  2.048kB

Here the Docker daemon is archiving everything at the path specified in the docker image build command (. or the current directory in this example). This is why we made a fresh directory demo to build in, so that nothing extra is included in this process.
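
In real projects the build directory often contains files you don't want shipped to the daemon (VCS metadata, build artifacts, secrets). A .dockerignore file placed at the root of the build context excludes them; a minimal illustrative example (these entries are just placeholders for whatever your project needs to exclude):

.git
*.log
secrets/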

The next lines look like:

Step 1/5 : FROM centos:7
 ---> 49f7960eb7e4

Do an image ls:

[user@node demo]$ docker image ls
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
demo                latest              59e595750dd5        10 seconds ago      645MB
centos              7                   49f7960eb7e4        2 months ago        200MB

Notice the Image ID for centos:7 matches that second line in the build output. The build starts from the base image defined in the FROM command.

The next few lines look like:

Step 2/5 : RUN yum update -y
 ---> Running in 8734b14cf011
Loaded plugins: fastestmirror, ovl
...

This is the output of the RUN command, yum update -y. The line Running in 8734b14cf011 specifies a container that this command is running in, which is spun up based on all previous image layers (just the centos:7 base at the moment). Scroll down a bit and you should see something like:

 ---> 433e56d735f6
Removing intermediate container 8734b14cf011

At the end of this first RUN command, the temporary container 8734b14cf011 is saved as an image layer 433e56d735f6, and the container is removed. This is the exact same process as when you used docker container commit to save a container as a new image layer, but now running automatically as part of a Dockerfile build.
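
For comparison, the manual equivalent of this one build step would look roughly like the sketch below (the tag mylayer:manual is made up for illustration); docker image build simply automates this run/commit/remove cycle for every instruction in the Dockerfile:

[user@node demo]$ docker container run -it centos:7 bash
[root@<container ID> /]# yum update -y
[root@<container ID> /]# exit
[user@node demo]$ docker container commit <container ID> mylayer:manual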

Look at the history of your image:

[user@node demo]$ docker image history demo

IMAGE          CREATED         CREATED BY                                      SIZE
59e595750dd5   2 minutes ago   /bin/sh -c yum install -y vim                   142MB
bba17f8df167   2 minutes ago   /bin/sh -c yum install -y wget                  87MB
b9f2efa616de   2 minutes ago   /bin/sh -c yum install -y which                 86.6MB
433e56d735f6   2 minutes ago   /bin/sh -c yum update -y                        129MB
49f7960eb7e4   2 months ago    /bin/sh -c #(nop)  CMD ["/bin/bash"]            0B
<missing>      2 months ago    /bin/sh -c #(nop)  LABEL org.label-schema....   0B
<missing>      2 months ago    /bin/sh -c #(nop) ADD file:8f4b3be0c1427b1...   200MB

As you can see, each layer of demo corresponds to a separate line in the Dockerfile, and each layer has its own ID. You can see the image layer 433e56d735f6 committed in the second build step in the list of layers for this image.

Look through your build output for where steps 3/5 (installing which), 4/5 (installing wget), and 5/5 (installing vim) occur. Each follows the same pattern: start a temporary container based on the previous image layers, run the RUN command, save the resulting container as a new image layer (visible in your docker image history output), and delete the temporary container.

Every layer can be used as you would use any image, which means we can inspect a single layer. Let’s inspect the wget layer, which in my case is bba17f8df167 (yours will be different, look at your docker image history output):

[user@node demo]$ docker image inspect bba17f8df167

Let’s look for the command associated with this image layer by using --format:

[user@node demo]$ docker image inspect \
    --format='{{.ContainerConfig.Cmd}}' bba17f8df167

[/bin/sh -c yum install -y wget]

We can even start containers based on intermediate image layers; start an interactive container based on the wget layer, and look for whether wget and vim are installed:

[user@node demo]$ docker container run -it bba17f8df167 bash

[root@a766a3d616b7 /]# which wget

/usr/bin/wget

[root@a766a3d616b7 /]# which vim

/usr/bin/which: no vim in
    (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin)

wget is installed in this layer, but since vim didn’t arrive until the next layer, it’s not available here.

Managing Image Layers

Change the last line in the Dockerfile from the last section to install nano instead of vim:

FROM centos:7
RUN yum update -y
RUN yum install -y which
RUN yum install -y wget
RUN yum install -y nano

Rebuild your image, and list your images again:

[user@node demo]$ docker image build -t demo .
[user@node demo]$ docker image ls

REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
demo                latest              5a6aedc1feab        8 seconds ago       590MB
<none>              <none>              59e595750dd5        23 minutes ago      645MB
centos              7                   49f7960eb7e4        2 months ago        200MB

What is that image named <none>? Notice the image ID is the same as the old image ID for demo:latest (see your history output above). The name and tag of an image is just a pointer to the stack of layers that make it up; reuse a name and tag, and you are effectively moving that pointer to a new stack of layers, leaving the old one (the one containing the vim install in this case) as an untagged or 'dangling' image.
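
You can list dangling images explicitly with a filter, and clean them up with docker image prune; hold off on pruning until you're done with this demo if you want your image listings to match the output below:

[user@node demo]$ docker image ls --filter dangling=true
[user@node demo]$ docker image prune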

Rewrite your Dockerfile one more time, to combine some of those install steps:

FROM centos:7
RUN yum update -y
RUN yum install -y which wget nano

Rebuild using a new tag this time, and list your images one more time:

[user@node demo]$ docker image build -t demo:new .
...
[user@node demo]$ docker image ls

REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
demo                new                 568b29a0dce9        20 seconds ago      416MB
demo                latest              5a6aedc1feab        5 minutes ago       590MB
<none>              <none>              59e595750dd5        28 minutes ago      645MB
centos              7                   49f7960eb7e4        2 months ago        200MB

Image demo:new is much smaller in size than demo:latest, even though it contains the exact same software - why?
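
To investigate, compare the layer histories of the two tags side by side; the per-layer sizes show where the difference comes from:

[user@node demo]$ docker image history demo:latest
[user@node demo]$ docker image history demo:new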

Conclusion

In this demo, we explored the layered structure of images; each layer is built as a distinct image and can be treated as such, on the host where it was built. This information is preserved on the build host for use in the build cache; build another image based on the same lower layers, and they will be reused to speed up the build process. Notice that the same is not true of downloaded images like centos:7; intermediate image caches are not downloaded, but rather only the final complete image.

3. Basic Volume Usage

In this demo, we’ll illustrate:

  • Creating, updating, destroying, and mounting docker named volumes

  • How volumes interact with a container’s layered filesystem

  • Use cases for mounting host directories into a container

Using Named Volumes

Create a volume, and inspect its metadata:

[user@node ~]$ docker volume create demovol
[user@node ~]$ docker volume inspect demovol

[
    {
        "CreatedAt": "2018-11-03T19:07:56Z",
        "Driver": "local",
        "Labels": {},
        "Mountpoint": "/var/lib/docker/volumes/demovol/_data",
        "Name": "demovol",
        "Options": {},
        "Scope": "local"
    }
]

We can see that by default, named volumes are created under /var/lib/docker/volumes/<name>/_data.

Run a container that mounts this volume, and list the filesystem therein:

[user@node ~]$ docker container run -it -v demovol:/demo centos:7 bash
[root@f4aca1b60965 /]# ls
anaconda-post.log  bin  demo  dev  etc  home ...

The demo directory is created as the mountpoint for our volume, as specified in the flag -v demovol:/demo. This should also appear in your container filesystem’s list of mountpoints:

[root@f4aca1b60965 /]# cat /proc/self/mountinfo | grep demo

1199 1180 202:1 /var/lib/docker/volumes/demovol/_data /demo
    rw,relatime - xfs /dev/xvda1 ...

Put a file in this volume:

[root@f4aca1b60965 /]# echo 'dummy file' > /demo/mydata.dat

Exit the container, and list the contents of your volume on the host:

[user@node ~]$ sudo ls /var/lib/docker/volumes/demovol/_data

You’ll see your mydata.dat file present at this point in the host’s filesystem. Delete the container:

[user@node ~]$ docker container rm -f <container ID>

The volume and its contents will still be present on the host.

Start a new container mounting the same volume, attach a bash shell to it, and show that the old data is present in your new container:

[user@node ~]$ docker container run -d -v demovol:/demo centos:7 ping 8.8.8.8
[user@node ~]$ docker container exec -it <container ID> bash
[root@11117d3de672 /]# cat /demo/mydata.dat

Exit this container, and inspect its mount metadata:

[user@node ~]$ docker container inspect <container ID>

    "Mounts": [
        {
            "Type": "volume",
            "Name": "demovol",
            "Source": "/var/lib/docker/volumes/demovol/_data",
            "Destination": "/demo",
            "Driver": "local",
            "Mode": "z",
            "RW": true,
            "Propagation": ""
        }
    ],

Here too we can see the volumes and host mountpoints for everything mounted into this container.

Build a new image out of this container using docker container commit, and start a new container based on that image:

[user@node ~]$ docker container commit <container ID> demo:snapshot
[user@node ~]$ docker container run -it demo:snapshot bash
[root@ad62f304ba18 /]# cat /demo/mydata.dat
cat: /demo/mydata.dat: No such file or directory

The information mounted into the original container is not part of the container’s layered filesystem, and therefore is not captured in the image creation process; volume mounts and the layered filesystem are completely separate.

Clean up by removing that volume:

[user@node ~]$ docker volume rm demovol

You will get an error saying the volume is in use - docker will not delete a volume mounted to any container (even a stopped container) in this way. Remove the offending container first, then remove the volume again.
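
A sketch of that cleanup sequence (the volume filter is a standard docker container ls filter):

[user@node ~]$ docker container ls -a --filter volume=demovol    # find containers using the volume
[user@node ~]$ docker container rm -f <container ID>
[user@node ~]$ docker volume rm demovol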

Mounting Host Paths

Make a directory with some source code in it for your new website:

[user@node ~]$ mkdir /home/centos/myweb
[user@node ~]$ cd /home/centos/myweb
[user@node myweb]$ echo "<h1>Hello Wrld</h1>" > index.html

Start up an nginx container that mounts this as a static website:

[user@node myweb]$ docker container run -d \
    -v /home/centos/myweb:/usr/share/nginx/html \
    -p 8000:80 nginx

Visit your website at the public IP of this node, port 8000.

Fix the spelling of 'world' in your HTML, and refresh the webpage; the content served by nginx gets updated without having to restart or replace the nginx container.
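
For example, you could make the fix and check the result entirely from the host (sed edits the mounted file in place, and curl hits the published port):

[user@node myweb]$ sed -i 's/Wrld/World/' index.html
[user@node myweb]$ curl localhost:8000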

Conclusion

In this demo, we saw two key points about volumes. First, they exist outside the container’s layered filesystem: not only are they not captured when an image is created, they also don’t participate in the usual copy-on-write procedure for files in the writable container layer. Second, manipulating files on the host that have been mounted into a container immediately propagates those changes to the running container. This is a popular technique for developers who containerize their running environment and mount in their in-development code: they can edit that code with the familiar tools on their host machine, and the changes are immediately available inside the running container without restarting or rebuilding anything.

4. Single Host Networks

In this demo, we’ll illustrate:

  • Creating docker bridge networks

  • Attaching containers to docker networks

  • Inspecting networking metadata from docker networks and containers

  • How network interfaces appear in different network namespaces

  • What network interfaces are created on the host by docker networking

  • What iptables rules are created by docker to isolate docker software-defined networks and forward network traffic to containers

Following Default Docker Networking

Switch to a fresh node you haven’t run any containers on yet, and list your networks:

[centos@node-1 ~]$ docker network ls

NETWORK ID          NAME                DRIVER              SCOPE
7c4e63830cbf        bridge              bridge              local
c87d2a849036        host                host                local
902af00d5511        none                null                local

Get some metadata about the bridge network, which is the default network containers attach to when doing docker container run:

[centos@node-1 ~]$ docker network inspect bridge

Notice the IPAM section:

"IPAM": {
    "Driver": "default",
    "Options": null,
    "Config": [
        {
            "Subnet": "172.17.0.0/16",
            "Gateway": "172.17.0.1"
        }
    ]
}

Docker’s IP address management driver assigns a subnet (172.17.0.0/16 in this case) to each bridge network, and uses the first IP in that range as the network’s gateway.
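
If the defaults don't suit you (for example, when 172.17.0.0/16 collides with an existing network), you can pass IPAM settings explicitly when creating a network; the subnet and name below are made-up examples, and the network is removed again immediately:

[centos@node-1 ~]$ docker network create \
    --driver bridge \
    --subnet 172.25.0.0/16 \
    --gateway 172.25.0.1 \
    ipam-demo
[centos@node-1 ~]$ docker network rm ipam-demo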

Also note the containers key:

"Containers": {}

So far, no containers have been plugged into this network.

Have a look at what network interfaces are present on this host:

[centos@node-1 ~]$ ip addr

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP qlen 1000
    link/ether 12:eb:dd:4e:07:ec brd ff:ff:ff:ff:ff:ff
    inet 10.10.17.74/20 brd 10.10.31.255 scope global dynamic eth0
       valid_lft 2444sec preferred_lft 2444sec
    inet6 fe80::10eb:ddff:fe4e:7ec/64 scope link
       valid_lft forever preferred_lft forever
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN
    link/ether 02:42:e2:c5:a4:6b brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever

We see the usual eth0 and loopback interfaces, but also the docker0 linux bridge, which corresponds to the docker software defined network we were inspecting in the previous step; note it has the same gateway IP as we found when doing docker network inspect.

Create a docker container without specifying any networking parameters, and do the same docker network inspect as above:

[centos@node-1 ~]$ docker container run -d centos:7 ping 8.8.8.8
[centos@node-1 ~]$ docker network inspect bridge

...
"Containers": {
    "f4e8f3f1b918900dd8c9b8867aa3c81e95cf34aba7e366379f2a9ade9987a40b": {
        "Name": "zealous_kirch",
        "EndpointID": "f9f246a...",
        "MacAddress": "02:42:ac:11:00:02",
        "IPv4Address": "172.17.0.2/16",
        "IPv6Address": ""
    }
}
...

The Containers key now contains the metadata for the container you just started; it received the next available IP address from the default network’s subnet. Also note that the last four octets of the container’s MAC address (ac:11:00:02) are the hex encoding of its IP on this network (172.17.0.2); this encoding ensures containers get a locally unique MAC address that linux bridges can route traffic to.

Look at your network interfaces again:

[centos@node-1 ~]$ ip addr

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP qlen 1000
    link/ether 12:eb:dd:4e:07:ec brd ff:ff:ff:ff:ff:ff
    inet 10.10.17.74/20 brd 10.10.31.255 scope global dynamic eth0
       valid_lft 2188sec preferred_lft 2188sec
    inet6 fe80::10eb:ddff:fe4e:7ec/64 scope link
       valid_lft forever preferred_lft forever
3: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether 02:42:e2:c5:a4:6b brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:e2ff:fec5:a46b/64 scope link
       valid_lft forever preferred_lft forever
5: vethfbd45f0@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue
    master docker0 state UP
    link/ether 6e:3c:e4:21:7b:e2 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::6c3c:e4ff:fe21:7be2/64 scope link
       valid_lft forever preferred_lft forever

A new interface has appeared: interface number 5 is the veth connection connecting the container’s network namespace to the host’s network namespace. But what happened to interface number 4? It’s been skipped in the list.

Look closely at interface number 5:

5: vethfbd45f0@if4

That @if4 indicates that interface number 5 is connected to interface 4. In fact, these are the two endpoints of the veth connection mentioned above; each end of the connection appears as a distinct interface, and ip addr only lists the interfaces in the current network namespace (the host in the above example).
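
On a busy host it can help to ask ip to list only the veth devices in the current namespace (optional; requires a reasonably recent iproute2):

[centos@node-1 ~]$ ip link show type veth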

Look at the interfaces in your container’s network namespace (you’ll first need to connect to the container and install iproute):

[centos@node-1 ~]$ docker container exec -it <container ID> bash
[root@f4e8f3f1b918 /]# yum install -y iproute
...
[root@f4e8f3f1b918 /]# ip addr

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue
    state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
4: eth0@if5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue
    state UP group default
    link/ether 02:42:ac:11:00:02 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.17.0.2/16 scope global eth0
       valid_lft forever preferred_lft forever

Not only does interface number 4 appear inside the container’s network namespace connected to interface 5, but we can see that this veth endpoint inside the container is getting treated as the eth0 interface inside the container.

Establishing Custom Docker Networks

Create a custom bridge network:

[centos@node-1 ~]$ docker network create my_bridge
[centos@node-1 ~]$ docker network ls

NETWORK ID          NAME                DRIVER              SCOPE
7c4e63830cbf        bridge              bridge              local
c87d2a849036        host                host                local
a04d46bb85b1        my_bridge           bridge              local
902af00d5511        none                null                local

my_bridge gets created as another linux bridge-based network by default.

Run a couple of containers named c2 and c3 attached to this new network:

[centos@node-1 ~]$ docker container run \
    --name c2 --network my_bridge -d centos:7 ping 8.8.8.8
[centos@node-1 ~]$ docker container run \
    --name c3 --network my_bridge -d centos:7 ping 8.8.8.8

Inspect your new bridge:

[centos@node-1 ~]$ docker network inspect my_bridge

...
"IPAM": {
    "Driver": "default",
    "Options": {},
    "Config": [
        {
            "Subnet": "172.18.0.0/16",
            "Gateway": "172.18.0.1"
        }
    ]
},
...
"Containers": {
    "084caf415784fb4d58dc6fb4601321114b93dc148793fd66c95fc2c9411b085e": {
        "Name": "c3",
        "EndpointID": "8046005...",
        "MacAddress": "02:42:ac:12:00:03",
        "IPv4Address": "172.18.0.3/16",
        "IPv6Address": ""
    },
    "23d2e307325ec022ce6b08406bfb0f7e307fa533a7a4957a6d476c170d8e8658": {
        "Name": "c2",
        "EndpointID": "730ac71...",
        "MacAddress": "02:42:ac:12:00:02",
        "IPv4Address": "172.18.0.2/16",
        "IPv6Address": ""
    }
},
...

The next subnet in sequence (172.18.0.0/16 in my case) has been assigned to my_bridge by the IPAM driver, and containers attached to this network get IPs from this range exactly as they did with the default bridge network.

Try to contact container c3 from c2:

[centos@node-1 ~]$ docker container exec c2 ping c3

It works - containers on the same custom network are able to resolve each other via DNS lookup of container names. This means that our application logic (c2 ping c3 in this simple case) doesn’t have to do any of its own service discovery; all we need to know are container names, and docker does the rest.
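
This resolution is handled by Docker's embedded DNS server, which containers on user-defined networks use as their resolver at 127.0.0.11; you can see it configured inside c2:

[centos@node-1 ~]$ docker container exec c2 cat /etc/resolv.conf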

Start another container on my_bridge, but don’t name it:

[centos@node-1 ~]$ docker container run --network my_bridge -d centos:7 ping 8.8.8.8
[centos@node-1 ~]$ docker container ls

CONTAINER ID        IMAGE     ... STATUS           PORTS   NAMES
625cb95b922d        centos:7  ... Up 2 seconds             competent_leavitt
084caf415784        centos:7  ... Up 5 minutes             c3
23d2e307325e        centos:7  ... Up 5 minutes             c2
f4e8f3f1b918        centos:7  ... Up 21 minutes            zealous_kirch

As usual, it got a default name generated for it (competent_leavitt in my case). Try resolving this name by DNS as above:

[centos@node-1 ~]$ docker container exec c2 ping competent_leavitt

ping: competent_leavitt: Name or service not known

DNS resolution fails. Containers must be explicitly named in order to appear in docker’s DNS tables.

Find the IP of your latest container (competent_leavitt in my case) via docker container inspect, and ping it from c2 directly by IP:

[centos@node-1 ~]$ docker network inspect my_bridge

...
"625cb95b922d2502fd016c6517c51652e84f902f69632d5d399dc38f3f7b2711": {
    "Name": "competent_leavitt",
    "EndpointID": "2fdb093d97b23da43023b07338a329180995fc0564ed0762147c8796380c51e7",
    "MacAddress": "02:42:ac:12:00:04",
    "IPv4Address": "172.18.0.4/16",
    "IPv6Address": ""
}
...

[centos@node-1 ~]$ docker container exec c2 ping 172.18.0.4

PING 172.18.0.4 (172.18.0.4) 56(84) bytes of data.
64 bytes from 172.18.0.4: icmp_seq=1 ttl=64 time=0.083 ms
64 bytes from 172.18.0.4: icmp_seq=2 ttl=64 time=0.060 ms

The ping succeeds. While the default-named container isn’t resolvable by DNS, it is still reachable on the my_bridge network.

Finally, create container c1 attached to the default network:

[centos@node-1 ~]$ docker container run --name c1 -d centos:7 ping 8.8.8.8

Attempt to ping it from c2 by name:

[centos@node-1 ~]$ docker container exec c2 ping c1
ping: c1: Name or service not known

DNS resolution is scoped to user-defined docker networks. Find c1's IP manually as above (mine is at 172.17.0.3), and ping this IP directly from c2:

[centos@node-1 ~]$ docker container exec c2 ping 172.17.0.3

The request hangs until it times out (press CTRL+C to give up early if you don’t want to wait for the timeout). Different docker networks are firewalled from each other by default; dump your iptables rules and look for lines similar to the following:

[centos@node-1 ~]$ sudo iptables-save

...
-A DOCKER-ISOLATION-STAGE-1 -i br-dfda80f70ea5
    ! -o br-dfda80f70ea5 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i docker0 ! -o docker0 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -j RETURN
-A DOCKER-ISOLATION-STAGE-2 -o br-dfda80f70ea5 -j DROP
-A DOCKER-ISOLATION-STAGE-2 -o docker0 -j DROP
-A DOCKER-ISOLATION-STAGE-2 -j RETURN
...

The first line above sends traffic that originates from br-dfda80f70ea5 (that’s your custom bridge) but is destined somewhere else to the stage 2 isolation chain; there, if it is headed for the docker0 bridge, it gets dropped. This is what prevents traffic from crossing from one bridge to another.
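
If you genuinely want containers on different networks to talk, the supported approach is to attach one container to the other's network rather than fighting these firewall rules; a sketch (this changes c1's networking, so disconnect afterwards if you want to keep the isolated setup):

[centos@node-1 ~]$ docker network connect my_bridge c1
[centos@node-1 ~]$ docker container exec c2 ping -c 3 c1
[centos@node-1 ~]$ docker network disconnect my_bridge c1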

Forwarding a Host Port to a Container

Start an nginx container with a port exposure:

[centos@node-1 ~]$ docker container run -d -p 8000:80 nginx

This syntax asks docker to forward all traffic arriving on port 8000 of the host’s network namespace to port 80 of the container’s network namespace. Visit the nginx landing page at <node-1 public IP>:8000.

Inspect your iptables rules again to see how docker forwarded this traffic:

[centos@node-1 ~]$ sudo iptables-save | grep 8000

-A DOCKER ! -i docker0 -p tcp -m tcp --dport 8000
    -j DNAT --to-destination 172.17.0.4:80

Inspect your default bridge network to find the IP of your nginx container; you should find that it matches the IP in the network address translation rule above. That rule states that any traffic arriving on port tcp/8000 on the host should be network address translated to 172.17.0.4:80 - the IP of our nginx container and the port we exposed with the -p 8000:80 flag when we created this container.
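
Two quick ways to cross-check this mapping from the CLI (the format string reads the container's IP on the default bridge):

[centos@node-1 ~]$ docker container port <container ID>
[centos@node-1 ~]$ docker container inspect \
    --format '{{.NetworkSettings.IPAddress}}' <container ID>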

Clean up your containers and networks:

[centos@node-1 ~]$ docker container rm -f $(docker container ls -aq)
[centos@node-1 ~]$ docker network rm my_bridge

Conclusion

In this demo, we stepped through the basic behavior of docker software defined bridge networks, and looked at the technology underpinning them such as linux bridges, veth connections, and iptables rules. From a practical standpoint, in order for containers to communicate they must be attached to the same docker software defined network (otherwise they’ll be firewalled from each other by the cross-network iptables rules we saw), and in order for containers to resolve each other’s name by DNS, they must also be explicitly named upon creation.

5. Docker Compose

In this demo, we’ll illustrate:

  • Starting an app defined in a docker compose file

  • Inter-service communication using DNS resolution of service names

Exploring the Compose File

Please download the DockerCoins app from GitHub and change directory to ~/orchestration-workshop/dockercoins.

[user@node ~]$ git clone -b ee3.0 \
    https://github.com/docker-training/orchestration-workshop.git
[user@node ~]$ cd ~/orchestration-workshop/dockercoins

Let’s take a quick look at our Compose file for Dockercoins:

version: "3.1"

services:
  rng:
    image: training/dockercoins-rng:1.0
    networks:
    - dockercoins
    ports:
    - "8001:80"

  hasher:
    image: training/dockercoins-hasher:1.0
    networks:
    - dockercoins
    ports:
    - "8002:80"

  webui:
    image: training/dockercoins-webui:1.0
    networks:
    - dockercoins
    ports:
    - "8000:80"

  redis:
    image: redis
    networks:
    - dockercoins

  worker:
    image: training/dockercoins-worker:1.0
    networks:
    - dockercoins

networks:
  dockercoins:

This Compose file contains 5 services, along with a bridge network.
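
Before starting anything, you can have Compose validate and print the fully resolved configuration, or just list the service names - handy for catching YAML indentation mistakes:

[user@node dockercoins]$ docker-compose config
[user@node dockercoins]$ docker-compose config --services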

When we start the app, we will see the service images getting downloaded one at a time:

[user@node dockercoins]$ docker-compose up -d

After starting, the images required for this app have been downloaded:

[user@node dockercoins]$ docker image ls | grep "dockercoins"

Make sure the services are up and running, as is the dedicated network:

[user@node dockercoins]$ docker-compose ps
[user@node dockercoins]$ docker network ls

If everything is up, visit your app at <node-0 public IP>:8000 to see Dockercoins in action.

Communicating Between Containers

In this section, we’ll demonstrate that containers created as part of a service in a Compose file are able to communicate with containers belonging to other services using just their service names. Let’s start by listing our DockerCoins containers:

[user@node dockercoins]$ docker container ls | grep 'dockercoins'

Now, connect into one container; let’s pick webui:

[user@node dockercoins]$ docker container exec -it <Container ID> bash

From within the container, ping rng by name:

[root@<Container ID>]# ping rng

You should see output resembling this:

PING rng (172.18.0.5) 56(84) bytes of data.
64 bytes from dockercoins_rng_1... (172.18.0.5): icmp_seq=1 ttl=64 time=0.108 ms
64 bytes from dockercoins_rng_1... (172.18.0.5): icmp_seq=2 ttl=64 time=0.049 ms
64 bytes from dockercoins_rng_1... (172.18.0.5): icmp_seq=3 ttl=64 time=0.073 ms
64 bytes from dockercoins_rng_1... (172.18.0.5): icmp_seq=4 ttl=64 time=0.067 ms
64 bytes from dockercoins_rng_1... (172.18.0.5): icmp_seq=5 ttl=64 time=0.057 ms
64 bytes from dockercoins_rng_1... (172.18.0.5): icmp_seq=6 ttl=64 time=0.074 ms
64 bytes from dockercoins_rng_1... (172.18.0.5): icmp_seq=7 ttl=64 time=0.052 ms
64 bytes from dockercoins_rng_1... (172.18.0.5): icmp_seq=8 ttl=64 time=0.057 ms
64 bytes from dockercoins_rng_1... (172.18.0.5): icmp_seq=9 ttl=64 time=0.080 ms

Use CTRL+C to terminate the ping. DNS lookup for the services in DockerCoins works because they are all attached to the user-defined dockercoins network.

After exiting this container, let’s navigate to the worker folder and take a look at a section of worker.py:

[user@node dockercoins]$ cd worker
[user@node worker]$ cat worker.py

import logging
import os
from redis import Redis
import requests
import time

DEBUG = os.environ.get("DEBUG", "").lower().startswith("y")

log = logging.getLogger(__name__)
if DEBUG:
    logging.basicConfig(level=logging.DEBUG)
else:
    logging.basicConfig(level=logging.INFO)
    logging.getLogger("requests").setLevel(logging.WARNING)

redis = Redis("redis")

def get_random_bytes():
    r = requests.get("http://rng/32")
    return r.content

def hash_bytes(data):
    r = requests.post("http://hasher/",
                    data=data,
                    headers={"Content-Type": "application/octet-stream"})
    hex_hash = r.text
    return hex_hash

As we can see in the last two functions, the code directs traffic to a service via a DNS name that exactly matches the service name defined in the Docker Compose file.
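
You can exercise the same resolution by hand with docker-compose exec, which runs a command inside one of a service's containers. This sketch assumes the worker image has python on its PATH (a reasonable guess, since it runs worker.py) and uses the standard library to resolve the redis service name:

[user@node dockercoins]$ docker-compose exec worker \
    python -c "import socket; print(socket.gethostbyname('redis'))"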

Shut down Dockercoins and clean up its resources:

[user@node dockercoins]$ docker-compose down

Conclusion

In this exercise, we stood up an application using Docker Compose. The most important new idea here is the notion of Docker Services, which are collections of identically configured containers. Docker Service names are resolvable by DNS, so that we can write application logic designed to communicate service to service; all service discovery and load balancing between your application’s services is abstracted away and handled by Docker.