Gateway Availability groups provide a mechanism for improving the reliability of workflow applications running on the Gateway platform. An Availability group has two Gateway instances: a Primary node and a Secondary node. The Primary node handles requests to Gateway. The Secondary node monitors the status of the Primary node and takes over when the Primary node fails or becomes unreachable, an event sometimes called a "failover".
However, Availability groups only make sure that an instance of Gateway is available to handle incoming requests. They do not monitor the workflow operations that run on that instance. When using Availability groups on systems with custom workflows, you might need to modify those workflows so that they handle failover events properly:
- If a workflow does not care which specific Gateway node responds to requests, then it might work as-is with Availability groups.
Important: Even if this is the case, we strongly recommend fully testing your workflows with Availability groups under failover conditions to make sure they continue functioning correctly.
- If a workflow requires a specific instance of Gateway, then you might need to modify it to check whether a failover occurs and take the appropriate action. For example, a workflow on a Secondary node might have to perform special actions to make sure that it can properly access data created by the corresponding workflow on the Primary node and continue its operations.
In general, to respond smoothly to failover events within an Availability group, a custom workflow should:
- Monitor the current role of the node that it runs on.
- Provide custom code for the Availability group to call when a failover or other event occurs. This code should perform any additional operations required to handle the event.
You can do both with the Gateway Availability group Python API. For more details, see the Availability group API reference.
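The two responsibilities listed above (tracking the node's role and supplying a handler for failover events) can be illustrated with a minimal sketch. It does not use the real Availability group API, whose names are not given here. Instead, `StubAvailabilityGroup`, `Role`, `current_role`, `on_role_change`, and `simulate_failover` are hypothetical stand-ins invented for this example, and the shared store is a plain dictionary standing in for whatever storage both nodes can reach. Consult the API reference for the actual calls.

```python
from enum import Enum
from typing import Callable, Dict, List


class Role(Enum):
    PRIMARY = "primary"
    SECONDARY = "secondary"


class StubAvailabilityGroup:
    """Hypothetical stand-in for the real Availability group API."""

    def __init__(self, role: Role) -> None:
        self._role = role
        self._callbacks: List[Callable[[Role, Role], None]] = []

    def current_role(self) -> Role:
        """Return the role of the node this code runs on."""
        return self._role

    def on_role_change(self, callback: Callable[[Role, Role], None]) -> None:
        """Register code to be called when the node's role changes."""
        self._callbacks.append(callback)

    def simulate_failover(self) -> None:
        """Flip the role and notify handlers (for demonstration only)."""
        old = self._role
        self._role = Role.PRIMARY if old is Role.SECONDARY else Role.SECONDARY
        for callback in self._callbacks:
            callback(old, self._role)


class FailoverAwareWorkflow:
    """Workflow that only does work on the Primary node."""

    def __init__(self, group: StubAvailabilityGroup,
                 shared_store: Dict[str, int]) -> None:
        self.group = group
        self.shared_store = shared_store  # assumed reachable from both nodes
        self.last_step = 0
        self.active = group.current_role() is Role.PRIMARY
        group.on_role_change(self.handle_role_change)

    def handle_role_change(self, old: Role, new: Role) -> None:
        if new is Role.PRIMARY:
            # Promoted: pick up where the former Primary left off.
            self.last_step = self.shared_store.get("last_step", 0)
            self.active = True
        else:
            # Demoted: stop doing work on this node.
            self.active = False

    def run_step(self) -> bool:
        """Run one unit of work; return False if this node is not Primary."""
        if self.group.current_role() is not Role.PRIMARY:
            return False
        self.last_step += 1
        self.shared_store["last_step"] = self.last_step
        return True
```

In this sketch, the workflow checks its role before every step and records its progress in the shared store, so a workflow on a newly promoted node can resume from the last recorded step instead of starting over. A real workflow would likely also need to handle work that was in progress when the Primary node failed.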