Post by account_disabled on Feb 22, 2024 2:16:41 GMT -5
All data written at any given moment is automatically copied to the other node, so there is real-time redundancy not only of the data but of the virtual machine's entire hard drive: operating system, installed software, updates and so on. Any change made on the virtualized system is immediately retransmitted to the corresponding disk on the other node.

3.- The partition holding the virtual hard disk is presented to the Linux system via iSCSI, together with an assigned fixed IP, and both are configured as cluster resources. When the node hosting a machine fails, the machine, the IP and the iSCSI device are all migrated to the node that is still "alive", so the machine can keep running with minimal downtime.
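To make this more concrete, here is a minimal sketch of how the floating IP and the iSCSI export could be declared as cluster resources with Pacemaker's crm shell. All the resource names, the IP address and the IQN are invented for the example; they are not the values of the cluster described in this series.

    # Floating IP that follows the virtual machine's storage (address is made up)
    primitive p_vm_ip ocf:heartbeat:IPaddr2 \
        params ip="192.168.1.50" cidr_netmask="24" \
        op monitor interval="30s"

    # iSCSI target exporting the DRBD-backed partition (IQN is made up)
    primitive p_vm_iscsi ocf:heartbeat:iSCSITarget \
        params iqn="iqn.2010-01.local.cluster:vmdisk" \
        op monitor interval="30s"

    # Keep the IP and the iSCSI target together so they migrate as a unit
    group g_vm_storage p_vm_ip p_vm_iscsi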
There is a short outage because, unlike ESX, the machine does not run on both nodes at the same time: it runs on one node and is started on the second node only in the event of a failure. The time needed to boot the machine is therefore lost, but in a virtualized environment with these characteristics we are talking about roughly 30-35 seconds, which is not a significant loss and is entirely acceptable.
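As an illustration of how the virtual machine itself could be tied to that storage so that it is restarted on the surviving node, a hypothetical definition might look like the following. The VirtualDomain agent, the config path and the constraint names are assumptions, not the exact configuration used here (the original setup may well use a different agent, for example one for Xen):

    # The virtual machine resource (agent and config path are placeholders)
    primitive p_vm ocf:heartbeat:VirtualDomain \
        params config="/etc/libvirt/qemu/vm01.xml" \
        op start timeout="120s" op stop timeout="120s" \
        op monitor interval="30s"

    # The VM may only run where its IP and iSCSI export are active,
    # and only after they have been started
    colocation col_vm_with_storage inf: p_vm g_vm_storage
    order ord_storage_before_vm inf: g_vm_storage p_vm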
4.- When the machine is restarted on the second node after the failure, its hard drive is still reachable through iSCSI and the IP we migrated along with the machine.

5.- DRBD replication is done over a dedicated 1 Gb network card. In addition, the connection between the two buildings runs over optical fibre, one strand of which is reserved exclusively for this task.

6.- The cluster "heartbeat" uses both the DRBD network cards and the machines' regular connection network cards, so there are two redundant paths and a loss of contact with the other node does not produce "false positives".
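For illustration, a DRBD resource replicating over a dedicated NIC, and a Heartbeat ha.cf using two communication paths, could look roughly like this. Hostnames, devices, interfaces and addresses are all placeholders, not the real values of this cluster:

    # /etc/drbd.d/vmdisk.res (sketch)
    resource vmdisk {
      protocol C;               # synchronous: a write returns only once it is on both nodes
      device    /dev/drbd0;
      disk      /dev/sdb1;
      meta-disk internal;

      on node1 {
        address 10.0.0.1:7788;  # IP on the dedicated 1 Gb replication NIC
      }
      on node2 {
        address 10.0.0.2:7788;
      }
    }

    # /etc/ha.d/ha.cf (excerpt) -- two heartbeat paths so a single link
    # failure is not mistaken for a dead node
    ucast eth1 10.0.0.2         # over the DRBD replication NIC
    ucast eth0 192.168.1.2      # over the regular LAN NIC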
Additionally, the cluster is configured with a STONITH resource ("Shoot The Other Node In The Head"), which means that both nodes constantly monitor each other. If one node stops seeing the other, the unresponsive node is forced to reboot. This prevents a node from, for example, "hanging" and silently stopping work: the forced reboot makes it rejoin the cluster and the DRBD device, resynchronize, and become 100% operational again.

That's it for part three. In the last post I will comment (although not in much depth) on how the cluster is configured and the constraints that must be enforced in it for the resources to come up and run properly.
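Purely as an example, a STONITH resource could be declared in the crm shell along these lines; the IPMI fencing agent, the credentials and the addresses are placeholders, not the actual fencing method used in this cluster:

    # Enable fencing cluster-wide
    property stonith-enabled="true"

    # Fencing device for node1 via IPMI (all values are made up)
    primitive p_stonith_node1 stonith:external/ipmi \
        params hostname="node1" ipaddr="10.0.1.1" userid="admin" passwd="secret" \
        op monitor interval="60s"

    # Never run a node's own fencing device on that node
    location loc_stonith_node1 p_stonith_node1 -inf: node1

A symmetric resource would be declared for node2 so that each node is able to fence the other.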