How to provision a production ready Docker Swarm cluster from scratch

Docker Swarm Nov 15, 2019

In this article you'll learn how to bootstrap a secure, production-ready Docker Swarm cluster from scratch.

This guide uses Debian 10 (buster) as the operating system, but the tutorial should work on other Linux distributions too; you'll just need to adapt some commands.

In this tutorial we'll set up a 3-node Docker Swarm, with 1 manager and 2 workers.

What you'll need

To follow this tutorial you'll need the following:

  • Domain name: this is optional, but DNS names keep pointing at your nodes even if their IP addresses change (for example with a dynamic IP).
  • 3 VPS: each of them will be a swarm node.

Create DNS aliases

First of all we'll create DNS aliases to identify the servers. This is done by adding an A record for each server to your DNS zone, pointing at that server's IP address.

manager.example.com A 192.168.2.10
worker1.example.com A 192.168.2.11
worker2.example.com A 192.168.2.12

Once DNS propagation has occurred, you'll be able to log in to your servers using their hostnames. For example:

user@desktop:~$ ssh root@manager.example.com

Install the servers

Now we'll need to provision the servers, starting from a clean Debian 10 install.

Please note that the following steps are to be executed on each server (unless specified otherwise).

Setup server hostname

Now that the DNS records point to each of your servers, you can set each server's hostname using the following command:

root@vps6712032:~# hostnamectl set-hostname manager.example.com

Now you can log out and log in again, and you'll see that the prompt displays the server hostname.

Create management user

It is good practice to disallow SSH access for the root user. To do so, you must first create a management user that you will use to connect to each server.

root@manager:~# adduser myuser

Now grant sudo access to the user so they can become root.

root@manager:~# usermod -aG sudo myuser

Now you can login as myuser:

ssh myuser@manager.example.com

myuser@manager:~$

Secure ssh access

Now that we have a user to log in with, we can safely disable root SSH access. We will also change the SSH port, which cuts down on the noise from automated scans of port 22. Both changes are made by editing the file /etc/ssh/sshd_config.

You'll need to set the following lines:

PermitRootLogin no

And

Port 2001

Note that you can pick any port you like here (it should be within 1024-65535 to stay clear of the well-known ports, which are the most commonly scanned).
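Since the same two lines have to be set on all three servers, the edit can also be scripted. The helper below is only a sketch: set_option is a hypothetical name (it is not part of OpenSSH), and it assumes the usual "Key value" layout of sshd_config, where a default may be present as a commented-out line. It rewrites the first matching line, or appends the option if none is found.

```shell
#!/bin/sh
# set_option FILE KEY VALUE  (hypothetical helper, not an OpenSSH tool)
# Replaces the first "KEY ..." line in FILE, even if it is commented out,
# with "KEY VALUE". If no such line exists, the option is appended.
set_option() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    awk -v k="$key" -v v="$value" '
        # First line that starts with optional "#"/blanks followed by the key
        !done && $0 ~ ("^[# \t]*" k "[ \t]") { print k " " v; done = 1; next }
        { print }
        END { if (!done) print k " " v }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"   # cat keeps the file's owner and mode
    status=$?
    rm -f "$tmp"
    return $status
}
```

As root it could be used as set_option /etc/ssh/sshd_config PermitRootLogin no, followed by set_option /etc/ssh/sshd_config Port 2001. Running sshd -t afterwards checks the configuration for syntax errors before the daemon is restarted. If the file already contains another active Port line further down, sshd will listen on that port as well, so it is worth reading the result once.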

Now restart the sshd daemon

root@manager:~# systemctl restart sshd

Then try to log out and log in again, this time passing the new port (for example ssh -p 2001 myuser@manager.example.com). Does it work? Then you can proceed with the next steps!

Install a firewall (ufw)

To secure the servers we'll install ufw (a small front end for iptables) and use it as a local firewall.

First update all packages and then install ufw

root@manager:~# apt update && apt upgrade -y
root@manager:~# apt install -y ufw

Now we need to allow the SSH port, otherwise you'll lose access to your servers

root@manager:~# ufw allow 2001/tcp

Note: please replace 2001 with your SSH port!

Now we can safely enable ufw

root@manager:~# ufw enable

Still have access to your server? Now we can open the ports that Swarm needs:

Port 2377 (tcp only)

This port is used for cluster management communications. It needs to be open on manager nodes only.

root@manager:~# ufw allow 2377/tcp

Port 7946 (tcp & udp)

This port is used for communication between nodes; you need to open it on all nodes.

root@manager:~# ufw allow from <worker1-ip> to any port 7946
root@manager:~# ufw allow from <worker2-ip> to any port 7946
root@worker1:~# ufw allow from <manager-ip> to any port 7946
root@worker1:~# ufw allow from <worker2-ip> to any port 7946
root@worker2:~# ufw allow from <manager-ip> to any port 7946
root@worker2:~# ufw allow from <worker1-ip> to any port 7946

Port 4789 (udp only)

This port is used for overlay network traffic.

root@manager:~# ufw allow from <worker1-ip> to any port 4789 proto udp
root@manager:~# ufw allow from <worker2-ip> to any port 4789 proto udp
root@worker1:~# ufw allow from <manager-ip> to any port 4789 proto udp
root@worker1:~# ufw allow from <worker2-ip> to any port 4789 proto udp
root@worker2:~# ufw allow from <manager-ip> to any port 4789 proto udp
root@worker2:~# ufw allow from <worker1-ip> to any port 4789 proto udp
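The rules for ports 7946 and 4789 follow one pattern: every node allows traffic from each of the other nodes. As a sketch of that pattern, the function below prints the ufw commands for a list of peer addresses rather than running them, so they can be reviewed first. peer_rules is a hypothetical name, and the addresses in the usage note are the example ones from the DNS section.

```shell
#!/bin/sh
# peer_rules PEER_IP...  (hypothetical helper)
# Prints, without executing, the ufw commands that open the Swarm
# node-to-node ports to each peer given as an argument.
peer_rules() {
    for ip in "$@"; do
        # 7946 carries tcp and udp; omitting "proto" allows both
        echo "ufw allow from $ip to any port 7946"
        # 4789 is the overlay network port, udp only
        echo "ufw allow from $ip to any port 4789 proto udp"
    done
}
```

On the manager, peer_rules 192.168.2.11 192.168.2.12 prints four commands; once they look right, piping the output to sh as root applies them. On each worker, the arguments are the manager's address and the other worker's address.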

Install the docker runtime

Now that your servers are secured, we can proceed with the Docker installation.

root@manager:~# apt update && apt install -y apt-transport-https ca-certificates curl gnupg2 software-properties-common

Now you'll need to install Docker's GPG key

root@manager:~# curl -fsSL https://download.docker.com/linux/debian/gpg | apt-key add -

Add the Docker community repository

root@manager:~# add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/debian $(lsb_release -cs) stable"
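The $(lsb_release -cs) part of that command expands to the release codename, which is buster on Debian 10. As a rough illustration of what ends up in the APT sources, the sketch below builds the same repository line by hand, reading the codename from the VERSION_CODENAME field of /etc/os-release instead. Both function names are hypothetical.

```shell
#!/bin/sh
# docker_repo_line CODENAME  (hypothetical helper)
# Prints the APT source line for Docker's Debian repository.
docker_repo_line() {
    printf 'deb [arch=amd64] https://download.docker.com/linux/debian %s stable\n' "$1"
}

# codename_from_os_release [FILE]  (hypothetical helper)
# Prints VERSION_CODENAME from an os-release style file.
# The file is sourced in a subshell so its variables do not leak out.
codename_from_os_release() {
    ( . "${1:-/etc/os-release}" && printf '%s\n' "$VERSION_CODENAME" )
}
```

For example, docker_repo_line "$(codename_from_os_release)" prints the line that add-apt-repository would register. Redirecting it into a file under /etc/apt/sources.list.d/ (the file name docker.list is only a common convention, assumed here) has the same effect.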

Update the package index and install the Docker runtime

root@manager:~# apt update && apt install -y docker-ce docker-ce-cli containerd.io

Create the swarm

Now that each server is correctly configured, we can proceed with creating the swarm.

Execute the following command on the manager node only:

root@manager:~# docker swarm init --advertise-addr <MANAGER-IP>

The command will output something like this

docker swarm join --token SWMTKN-1-49nj1cmql0jkz5s954yi3oex3nedyz0fb0xx14ie39trti4wxv-8vxv8rssmk743ojnwacrr2e7c 192.168.99.100:2377

Now you just need to copy it and run it on each worker, and ta-da! Your swarm is ready!
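If the output of docker swarm init has scrolled away, the worker token can be read back on the manager with docker swarm join-token -q worker. The sketch below only assembles the join command from a token and a manager address, so it can be pasted on each worker. worker_join_command is a hypothetical name, and 2377 is the cluster management port opened earlier.

```shell
#!/bin/sh
# worker_join_command TOKEN MANAGER_ADDR  (hypothetical helper)
# Prints the command a worker has to run to join the swarm.
worker_join_command() {
    printf 'docker swarm join --token %s %s:2377\n' "$1" "$2"
}
```

On the manager, worker_join_command "$(docker swarm join-token -q worker)" 192.168.2.10 prints the full command, using the example manager address from the DNS section. Running docker swarm join-token worker without -q prints an equivalent command directly.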

You can display all nodes by running the following command on the manager:

root@manager:~# docker node ls

ID                           HOSTNAME  STATUS  AVAILABILITY  MANAGER STATUS
dxn1zf6l61qsb1josjja83ngz *  manager   Ready   Active        Leader
ix7yrpke9zpu1xpme3ji7gjsw    worker    Ready   Active
v7ybuc3vatydzehuet4ps2v6y    worker    Ready   Active
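For a scripted check, docker node ls accepts a --format template, which makes the status column easy to test. The function below is a minimal sketch that reads "HOSTNAME STATUS" pairs on standard input and succeeds only if every node reports Ready. all_ready is a hypothetical name.

```shell
#!/bin/sh
# all_ready  (hypothetical helper)
# Reads lines of the form "HOSTNAME STATUS" on stdin.
# Exit status is 0 if every line has status Ready, 1 otherwise.
all_ready() {
    awk 'BEGIN { bad = 0 } $2 != "Ready" { bad = 1 } END { exit bad }'
}
```

On the manager it could be fed with docker node ls --format '{{.Hostname}} {{.Status}}' | all_ready && echo "all nodes ready".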

Now you can, for example, install Traefik 2.0:

How to install Traefik 2.x on a Docker Swarm
In this article you'll learn how to securely deploy Traefik 2.0 on a Docker Swarm

Aloïs Micard

You can contact me on: alois@micard.lu. PGP fingerprint: F733 E871 0859 FCD2