In the event of a network failure, the Distributed DDNS database cluster might split into
several components. A component is a set of nodes that are connected to each other, but
not to the other nodes in the database cluster. If this occurs, only one component, known
as the Primary Component, can continue to modify the database state; this prevents
history divergence. When BlueCat Distributed DDNS is operating correctly, the
Primary Component consists of all nodes in the cluster. When the cluster is split
into multiple components, the database cluster uses a quorum algorithm to determine
which component becomes the Primary Component. This ensures that only one component
in the cluster can change the database state, avoiding data divergence between
components.
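The quorum rule can be illustrated with a minimal sketch. This is not BlueCat's actual algorithm, only an illustration of the majority principle: a component can become the Primary Component only if it contains a strict majority of the nodes from the last known cluster membership.

```python
def has_quorum(component_size: int, cluster_size: int) -> bool:
    """Illustrative only: a component can become the Primary Component
    only if it holds a strict majority of the cluster's nodes."""
    return component_size > cluster_size / 2

# A three-node cluster split into components of 2 and 1 nodes:
has_quorum(2, 3)  # True  -> this component becomes the Primary Component
has_quorum(1, 3)  # False -> this component stops accepting writes
```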
In certain situations, cluster failures can cause database nodes to operate
independently of the other nodes. This situation is known as split-brain, where
two database nodes can independently update the same row in the same table, resulting in
data corruption. The database cluster avoids split-brain by using Primary Components. If
a failure splits the cluster into two partitions of equal size, as in a two-node
cluster, neither partition becomes operational. If you configure a two-node
Distributed DDNS database cluster, automatic failover cannot occur, because the
failure of one node leaves the remaining node in a non-operational state. BlueCat
recommends configuring at least three Distributed DDNS Data Nodes to ensure that the
database cluster's quorum algorithm can determine a Primary Component and
avoid split-brain situations. You can configure the Distributed DDNS database cluster as
follows:
- Single switch clusters with a minimum of three nodes
- Clusters spanning multiple switches with a minimum of three switches
- Clusters spanning multiple networks with a minimum of three networks
- Clusters spanning multiple data centers with a minimum of three data centers
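The two-node limitation described above can be sketched as follows. This is a hypothetical illustration (not BlueCat's implementation) of why an even split leaves no operational partition, while a three-node cluster survives the loss of one node:

```python
def operational_partition(cluster_size, partition_sizes):
    """Return the size of the partition that retains quorum after a split,
    or None if no partition holds a strict majority (cluster halts)."""
    for size in partition_sizes:
        if size > cluster_size / 2:  # strict majority required
            return size
    return None

# Two-node cluster split evenly: no majority, so neither node stays operational.
operational_partition(2, [1, 1])  # None
# Three-node cluster with one node isolated: the two-node side keeps quorum.
operational_partition(3, [2, 1])  # 2
```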
WAN replication
If you are configuring a Distributed DDNS database cluster over a WAN, you might
experience transient network connectivity failures. To prevent partitioning of
the cluster, you can modify the configuration parameters to tolerate longer
connectivity outages. To tolerate connectivity outages of up to 30 seconds,
configure the following parameter in the
/var/lib/docker/volumes/mariadb-config/_data/my.cnf file:
wsrep_provider_options = "evs.keepalive_period = PT3S; evs.suspect_timeout = PT30S; evs.inactive_timeout = PT1M; evs.install_timeout = PT1M"
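The timeout values use ISO 8601 duration notation (PT3S is 3 seconds, PT30S is 30 seconds, PT1M is 1 minute). After editing my.cnf and restarting the database service, you can confirm that the options took effect and that the cluster is in the Primary Component. The commands below are an illustrative check (the connection credentials are placeholders for your environment):

```shell
# Confirm the provider options were applied (credentials are illustrative)
mysql -u root -p -e "SHOW GLOBAL VARIABLES LIKE 'wsrep_provider_options'\G"

# Confirm the node is part of the Primary Component;
# on a healthy cluster, wsrep_cluster_status reports "Primary"
mysql -u root -p -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status'"
```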