Split-brain
When a high-availability cluster is functioning normally, only one of the member servers should assume the role of active server. In this case, the passive server detects the presence of the active server via both the Heartbeat connection and data connection.
If all Heartbeat and data connections are lost, both servers might attempt to assume the role of active server. This situation is referred to as a "split-brain" error. In this case, connections to the IP addresses of the high-availability cluster may be directed to either of the two servers, and data might be updated or written inconsistently on the two servers.
When any one of the Heartbeat or data connections is reconnected, the system will detect the split-brain error and data inconsistency between the two servers, and will enter high-availability safe mode.
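The behavior described above can be pictured as a small state machine. The following is only an illustrative sketch, not the product's actual implementation: the names `ClusterState`, `LinkStatus`, and `next_state` are hypothetical, and the return to the normal state when a link comes back before both servers have become active is an assumption made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class ClusterState(Enum):
    NORMAL = "normal"                      # one active server, one passive server
    SPLIT_BRAIN_RISK = "split-brain risk"  # no link is up; both may become active
    SAFE_MODE = "safe mode"                # split-brain detected after reconnection


@dataclass
class LinkStatus:
    heartbeat_up: bool
    data_up: bool

    @property
    def any_up(self) -> bool:
        return self.heartbeat_up or self.data_up


def next_state(current: ClusterState, links: LinkStatus,
               both_were_active: bool) -> ClusterState:
    """Return the next (hypothetical) cluster state.

    both_were_active: True if both servers took the active role while
    every link was down (an assumed input for this sketch).
    """
    if current is ClusterState.SAFE_MODE:
        # Safe mode persists until the split-brain error is resolved.
        return ClusterState.SAFE_MODE
    if not links.any_up:
        # All Heartbeat and data connections are lost.
        return ClusterState.SPLIT_BRAIN_RISK
    if current is ClusterState.SPLIT_BRAIN_RISK and both_were_active:
        # A link is back and a split-brain error is detected.
        return ClusterState.SAFE_MODE
    # Assumption: otherwise the cluster operates normally.
    return ClusterState.NORMAL
```

Note that in this sketch a single restored link (Heartbeat or data) is enough to trigger detection, which mirrors the "any one of the connections" wording above.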
In the event a split-brain error occurs:
- The services on both servers and the IP addresses of the high-availability cluster will be unavailable until the split-brain error is resolved.
- Once both servers enter high-availability safe mode, a new tab named Split-brain will appear on the left panel. This tab lists the following information: the differences between the files in the shared folders on the two servers, the time each server became the active server, and the last iSCSI Target connection information. All the other tabs will remain read-only.
- In high-availability safe mode, File Station will be in read-only mode, but you will still be able to download or view files.
- In the Cluster tab, you are only allowed to either resolve split-brain errors or shut down the server you are currently logged in to. To resolve split-brain errors, do any of the following:
  - Choose one server as the active server of the high-availability cluster and the other as the passive server. Once both servers are rebooted, all data and settings that differ will be synced from the active server to the passive server. Please note that any data updated on the passive server during the split-brain error will be lost.
  - Choose one server as the active server of the high-availability cluster, and unbind the other. Once both servers are rebooted, the active server will remain in the high-availability cluster, and the unbound server will keep its data and return to Standalone status. Please note that a full replication will be required to bind a new passive server in the future.
  - Remove the entire cluster, keep the data on both hosts, and let them return to Standalone status.
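To compare the three resolution options at a glance, the outcome for each server can be summarized as in the sketch below. This is purely illustrative: `Resolution`, `resolve`, and the outcome strings are hypothetical names chosen for this example and do not correspond to any actual command or API of the product.

```python
from __future__ import annotations

from enum import Enum


class Resolution(Enum):
    KEEP_PAIR = "choose active, keep other as passive"
    UNBIND_OTHER = "choose active, unbind other"
    REMOVE_CLUSTER = "remove the entire cluster"


def resolve(option: Resolution, chosen: str, other: str) -> dict[str, str]:
    """Summarize what happens to each server for a given option.

    chosen: the server picked as active (ignored for REMOVE_CLUSTER).
    other:  the remaining server.
    """
    if option is Resolution.KEEP_PAIR:
        # Differing data and settings are synced from active to passive,
        # so changes made on the passive server during split-brain are lost.
        return {chosen: "active", other: "passive, split-brain changes lost"}
    if option is Resolution.UNBIND_OTHER:
        # The unbound server keeps its data; binding a new passive server
        # later requires a full replication.
        return {chosen: "active", other: "standalone, data kept"}
    # REMOVE_CLUSTER: both hosts keep their data and leave the cluster.
    return {chosen: "standalone, data kept", other: "standalone, data kept"}
```

For example, `resolve(Resolution.UNBIND_OTHER, "server-a", "server-b")` maps `server-a` to the active role and `server-b` to Standalone status with its data intact. Only the first option discards data, which is the main trade-off to weigh when choosing.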
Notes:
- The more files there are in your shared folders, the longer it will take to list the differences.
- Before choosing which server will be the active or passive server, please make sure both servers are powered on.