The following procedure describes how to bootstrap (restart) a Distributed DDNS database cluster when all nodes in the cluster stop functioning.
First, on each DDNS node, do the following:
- View the contents of the file /var/lib/docker/volumes/mariadb-data/_data/grastate.dat as text. Locate the node whose file has the safe_to_bootstrap variable set to 1.
Distributed DDNS nodes write their last executed state to this file. If the cluster was able to shut down gracefully, this variable is set to 1 in the file of the node that is safe to start first when bootstrapping the cluster. It is possible that no node has this variable set to 1.
- If you find a node with safe_to_bootstrap equal to 1, follow the instructions in Restarting from the "safe to bootstrap" node.
- Otherwise, follow the instructions in Restarting from the node with the most advanced state.
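The check above can be scripted rather than done by eye. The following is a minimal sketch, assuming grastate.dat stores its settings as "key: value" lines; the function name grastate_value is a hypothetical helper, not part of the product.

```shell
#!/bin/sh
# Hypothetical helper: print the value stored for a key in a grastate.dat file.
# Assumes "key: value" lines, for example "safe_to_bootstrap: 1".
# Prints nothing if the key is not present.
grastate_value() {
    key=$1
    file=$2
    awk -F: -v key="$key" '
        $1 == key {
            gsub(/[ \t\r]/, "", $2)   # strip padding around the value
            print $2
        }
    ' "$file"
}

# Example usage on a node (path taken from the step above):
#   grastate_value safe_to_bootstrap /var/lib/docker/volumes/mariadb-data/_data/grastate.dat
#   grastate_value seqno /var/lib/docker/volumes/mariadb-data/_data/grastate.dat
```

A node is the "safe to bootstrap" node if the first command prints 1. If every node prints 0, use the procedure for the node with the most advanced state.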
Restarting from the "safe to bootstrap" node
On the node where the grastate.dat file has safe_to_bootstrap equal to 1, restart the container with the standard docker command:
docker start <Node container name>
Where <Node container name> is the name of the node's container.
Note: Make sure you wait for the node to completely finish its startup.
After the first node starts successfully, start the other nodes one by one with the same start command. Make sure you wait for each node to fully start and sync with the rest of the cluster before starting another node.
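One possible way to confirm that a node has synced before starting the next one is to read the Galera status variable wsrep_local_state_comment, which reports Synced once a node has caught up with the cluster. The sketch below only interprets the status output. It assumes that a mysql client exists inside the container and can log in without extra options, which this procedure does not state, and is_synced is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given status output reports "Synced".
# The argument is expected to contain the row returned by
#   SHOW STATUS LIKE 'wsrep_local_state_comment'
is_synced() {
    printf '%s\n' "$1" | grep -q 'wsrep_local_state_comment[[:space:]]*Synced'
}

# Example usage (assumption: a mysql client is available in the container
# and needs no additional credentials):
#   out=$(docker exec <node-container-name> mysql -N -e "SHOW STATUS LIKE 'wsrep_local_state_comment'")
#   if is_synced "$out"; then echo "node is synced"; else echo "still syncing"; fi
```

Polling this check every few seconds until it succeeds gives a simple gate between starting one node and the next.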
Restarting from the node with the most advanced state
Use the following procedure when all nodes have the following values in their grastate.dat files:
- safe_to_bootstrap is 0
- seqno is -1
To restart the cluster in this case:
- You will need to run commands within each database node's container, so restart any stopped nodes with the standard docker command:
  docker start <Node container name>
  Where <Node container name> is the name of the node's container. Here, the order of nodes does not matter.
- After all nodes are started, make sure the MariaDB process on each node is stopped. To do so, run the following command on every database node:
  docker exec -it <node-container-name> supervisorctl stop mariadb
- After stopping the MariaDB process on all nodes in the cluster, determine the recovered position of each node and make a note of it. To do so, run the following command on each node:
  docker exec -it <node-container-name> mysqld --defaults-extra-file=/etc/mysql/custom/my.cnf -u mysql --wsrep-recover
  Look for the following log entry in the output and write down the number after the colon at the end:
  Recovered position: 00000000-0000-0000-0000-000000000000:36
- Review your list to determine the node with the highest number. On that node, do the following:
  - Manually edit the file /var/lib/docker/volumes/mariadb-data/_data/grastate.dat to set the value of safe_to_bootstrap to 1. Save the file when you're done.
  - Manually edit the file /var/lib/docker/volumes/mariadb-config/_data/custom/my.cnf to set the value of wsrep_cluster_address to gcomm://. That is, remove all IP addresses from the entry, leaving only the gcomm:// prefix. Save the file when you're done.
  - Restart the database container on this node with the standard restart command:
    docker restart <Node container name>
    After this first node finishes its startup process, you will have a new 1-node cluster.
- After the first node starts successfully, restart the other nodes one by one with the restart command:
  docker restart <Node container name>
  Make sure you wait for each node to fully start and sync with the rest of the cluster before restarting another node.
Note: You must use the restart command (not the start command) so that nodes first completely stop before starting again.