Database replication failover behavior - BlueCat Address Manager - 9.1.0

Address Manager Administration Guide


This section describes how Address Manager behaves when a database replication failover is performed.

Once a database replication failover has been initiated, a banner appears in the user interface of the current Primary server. The banner indicates that a failover is in progress and that services are temporarily disabled until the failover has completed successfully. It is visible to all users who are logged in to the user interface of the current Primary server.

Effects on services during a failover

During a replication failover, the following services are affected:
  • Notification processing: the current Primary server temporarily stops processing notification files. During the failover, existing notifications on the old Primary server are copied over to the new Primary server. Any new notifications created by DNS/DHCP Servers during the failover are queued on the DNS/DHCP Server. Once the failover has been performed and the new Primary server is online, the queued notifications on the DNS/DHCP Servers are sent to the new Primary server and notification processing resumes.
  • Migration: the current Primary server stops any ongoing migration tasks. Once the failover has been performed and the new Primary server is online, you must review the content that has been migrated to the new Primary server and reapply any migration content that was not applied.
  • IP discovery and reconciliation: the current Primary server temporarily disables any scheduled IP reconciliation policies. New scheduled reconciliation policies are not started, and any reconciliation policy tasks that are in progress continue to run until the graceful failover begins (when the latency is 0 seconds and no deployments are in progress) or a forced failover is initiated. Once the failover has been performed and the new Primary server is online, scheduled IP reconciliation policies are re-enabled on the new Primary server. You must re-run any reconciliation policies that were interrupted during the failover, or wait for the next run if the reconciliation policy is scheduled.
  • Scheduled deployments: the current Primary server temporarily disables any scheduled deployments. New deployments are not started, and any deployments that are in progress are allowed to finish before the failover begins, unless a forced failover is initiated. Once the failover has been performed and the new Primary server is online, scheduled deployments are re-enabled on the new Primary server.
  • Deployments: the current Primary server temporarily disables the ability to perform manual deployments. Any deployments that are in progress are allowed to finish before the failover begins, unless a forced failover is initiated. Once the failover has been performed and the new Primary server is online, the ability to perform manual deployments is re-enabled on the new Primary server.
  • API: the Address Manager API service continues to operate on the current Primary server; however, API methods that initiate deployment tasks cannot be performed. If you attempt to initiate a deployment task using a SOAP API call, the Address Manager server returns a 503 error with the message "Deployer service is not available." If you attempt to initiate a deployment task using a REST API call, the Address Manager server returns a 503 Service Unavailable error. Other API calls can be performed and continue to be processed until the graceful failover begins or a forced failover is initiated. Once the failover has been performed and the new Primary server is online, API services are re-enabled on the new Primary server.
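Automation that triggers deployments through the API may want to treat the 503 response described above as a temporary condition rather than a hard failure. The following is a minimal sketch of that idea, not part of the Address Manager API: deploy_with_retry, start_deployment, attempts, and delay_seconds are hypothetical names chosen for illustration. It assumes start_deployment is a callable you supply that sends the deployment request to whichever server is currently Primary and returns the HTTP status code.

```python
import time
from typing import Callable

SERVICE_UNAVAILABLE = 503  # status returned for deployment calls during a failover


def deploy_with_retry(
    start_deployment: Callable[[], int],
    attempts: int = 5,
    delay_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call start_deployment until it stops returning 503 or attempts run out.

    start_deployment is a hypothetical, caller-supplied function that issues
    the deployment request and returns the HTTP status code. The last status
    code received is returned to the caller.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    status = SERVICE_UNAVAILABLE
    for attempt in range(attempts):
        status = start_deployment()
        if status != SERVICE_UNAVAILABLE:
            # Any other status (success or a different error) is returned as-is.
            return status
        if attempt < attempts - 1:
            # Deployment service is disabled; wait before trying again.
            sleep(delay_seconds)
    return status
```

The sleep parameter exists only so that the wait can be replaced in tests. Suitable values for attempts and delay_seconds depend on how long a failover takes in your environment; the defaults here are placeholders.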

Failures during failover

If any of the Primary or Standby servers in replication is interrupted during a failover, one of the following scenarios applies:
  • Cannot fail over to the target Standby server due to problems on the target Standby server (one Standby server)

    If the failover has been initiated but the Primary server cannot fail over to the only Standby server due to problems on the Standby server, the Primary server stops the failover and breaks replication with the Standby server. The Primary server then becomes a Standalone server while the Standby server becomes a Removed Standby server. Services are re-enabled on the Standalone server.

  • Cannot fail over to the target Standby server due to problems on the target Standby server (two Standby servers)

    If the failover has been initiated but the Primary server cannot fail over to the selected Standby server due to problems on the Standby server, the Primary server stops the failover to the selected Standby server and removes the selected Standby server from the database replication cluster. Services are re-enabled on the Primary server and database replication resumes operating with a single Standby server.

  • Successful failover to the target Standby server but cannot connect to the secondary Standby server due to problems on the secondary Standby server (two Standby servers)

    If the failover has been initiated but the Primary server cannot replicate with the secondary Standby server due to problems on the secondary Standby server, the Primary server continues to fail over to the target Standby server. Once the Primary server has successfully failed over to the target Standby server, the new Primary server removes the secondary Standby server from the database replication cluster. Services are re-enabled on the new Primary server and database replication resumes operating with a single Standby server.

  • Successful failover to the target Standby server but cannot connect to the old Primary server due to problems on the old Primary server (two Standby servers)

    If the failover has been successfully performed and the new Primary server cannot connect to the old Primary server due to problems on the old Primary server, the new Primary server removes the old Primary server (now a Standby server) from the database replication cluster. The Address Manager services remain shut down on the old Primary server. Services are re-enabled on the new Primary server and database replication resumes operating with a single Standby server.

  • Successful failover to the target Standby server but cannot connect to the old Primary or secondary Standby server due to problems on the old Primary and secondary Standby servers (two Standby servers)

    If the failover has been successfully performed and the new Primary server cannot connect to either the old Primary or secondary Standby server due to problems on the old Primary and secondary Standby servers, the new Primary server breaks replication with the other two servers. The new Primary server then becomes a Standalone server while the old Primary (now a Standby server) and secondary Standby servers become Removed Standby servers. The Address Manager services remain shut down on the old Primary server. Services are re-enabled on the Standalone server.
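The scenarios above can be summarized as a lookup from "which servers had problems" to "which role each server ends up in". The following sketch is an illustration only and does not call any Address Manager interface: failover_outcome and its parameters are hypothetical names. It uses the label "Removed Standby" for every server that is removed from the cluster, which is an assumption for the scenarios where the text says only that the server is removed from the database replication cluster. Combinations that this section does not describe raise an error instead of guessing.

```python
from typing import Dict

PRIMARY = "Primary"
STANDBY = "Standby"
STANDALONE = "Standalone"
REMOVED = "Removed Standby"  # assumed label for any server removed from the cluster


def failover_outcome(
    standby_count: int,
    target_ok: bool,
    secondary_ok: bool = True,
    old_primary_ok: bool = True,
) -> Dict[str, str]:
    """Return the resulting role of each server after a failover attempt.

    Keys are "old_primary", "target", and (with two Standby servers)
    "secondary". Only the scenarios described in this section, plus the
    fully successful case, are covered.
    """
    if standby_count not in (1, 2):
        raise ValueError("standby_count must be 1 or 2")

    if standby_count == 1:
        if not target_ok:
            # Failover stops and replication is broken.
            return {"old_primary": STANDALONE, "target": REMOVED}
        if old_primary_ok:
            return {"old_primary": STANDBY, "target": PRIMARY}
        raise ValueError("combination not described in this section")

    if not target_ok:
        if not secondary_ok:
            raise ValueError("combination not described in this section")
        # Failover stops; the cluster continues with a single Standby server.
        return {"old_primary": PRIMARY, "target": REMOVED, "secondary": STANDBY}

    if secondary_ok and old_primary_ok:
        return {"old_primary": STANDBY, "target": PRIMARY, "secondary": STANDBY}
    if old_primary_ok:
        # Only the secondary Standby server had problems.
        return {"old_primary": STANDBY, "target": PRIMARY, "secondary": REMOVED}
    if secondary_ok:
        # Only the old Primary server had problems.
        return {"old_primary": REMOVED, "target": PRIMARY, "secondary": STANDBY}
    # Both other servers are unreachable: the new Primary runs alone.
    return {"old_primary": REMOVED, "target": STANDALONE, "secondary": REMOVED}
```

For example, failover_outcome(2, target_ok=True, old_primary_ok=False) corresponds to the fourth scenario: the target becomes the Primary server, the old Primary server is removed, and the secondary server remains the single Standby server.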

If you encounter any issues during the failover, contact BlueCat Customer Care for assistance.