Implementing a BCDR solution with Hyper-V Replica

Scenario

An organization runs most of its compute environment on-premises on Windows Server, including virtualized workloads on Windows Server 2016 hosts.

We need to determine an appropriate Business Continuity and Disaster Recovery (BCDR) solution, so that operations can continue and data can be recovered if a natural disaster such as an earthquake, flood, or fire occurs.

As a Windows Server administrator, you need to determine what options are available to ensure that services can continue running and data remains available if extreme events occur.


Overview of Hyper-V Replica

Hyper-V failover clusters are used to make virtual machines (VMs) highly available, but they're typically limited to a single location. Multi-site clusters usually depend on specialized hardware and can be complicated and expensive to implement. If a natural disaster such as an earthquake or a flood occurred, all server infrastructure at the affected location could be lost.

One possible solution is to periodically copy the VM manually. You also can back up the VM and its storage. Although this solution achieves the desired result, it's resource intensive and time-consuming. In addition, because you perform backups only periodically, the backup is seldom as current as the running VM.

You can use Hyper-V Replica to implement an affordable BCDR solution for a virtual environment:

  • Hyper-V Replica can protect against data loss from a site outage by copying a live VM as a replica VM from one location to another. If the site that contains the primary VM becomes unavailable, you can bring the replica VM online to keep workloads available.

  • If necessary, you can use Hyper-V Replica to extend replication of the offline copy to a third location.

  • If your organization only has a single location available, you can still use Hyper-V Replica to replicate VMs to a partner organization in another location, to a hosting provider, or to Microsoft Azure.

Hyper-V Replica can have the following two instances of a single VM residing on different Hyper-V hosts:

  • The main, actively running VM, which is called a primary VM.

  • An offline copy of the primary VM, which is called a replica VM.

If failure occurs at the primary server site, you can use Hyper-V Replica to perform a failover of the VM(s) to the replica server at a secondary server site. This will incur minimal downtime.


Prerequisites for Hyper-V Replica implementation

Before implementing Hyper-V Replica, ensure that the virtualization infrastructure has the following prerequisites:

  • A supported version of Windows Server with the Hyper-V role installed at both the primary and replica locations.

  • Sufficient storage on both the primary and replica Hyper-V hosts to store and run all VMs, including both the local VMs and the replicated VMs. Replicated VMs are in a turned-off state and start only if you perform a failover.

  • Sufficient storage for the log files that contain the changes at the primary location. Although log files get purged after they are replicated, if there are issues with network connectivity, log files could fill up the storage.

  • Network connectivity between the locations that are hosting the primary and the replica Hyper-V hosts. Connectivity can be through a wide area network (WAN) or a local area network (LAN) link.

  • Firewall rules to allow replication between the primary and replica sites. When you install the Hyper-V role, Hyper-V Replica HTTP Listener (TCP-In) and Hyper-V Replica HTTPS Listener (TCP-In) rules are added to Windows Defender Firewall. Before you can use Hyper-V Replica, you must enable one or both rules on the replica Hyper-V host.

  • Authentication certificates or an Active Directory Domain Services (AD DS) infrastructure, depending on which type of authentication you plan to use:

    • If you plan to use certificate-based authentication, you need an X.509v3 certificate from a trusted certification authority to support mutual authentication at both Hyper-V hosts. When you use certificate-based authentication, Hyper-V hosts can be in different AD DS forests.

    • If you plan to use Kerberos authentication, both Hyper-V hosts need to be joined to the same AD DS forest.
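
The log-storage prerequisite can be made concrete with a rough estimate. The following Python sketch assumes a constant change rate on the primary VM, which real workloads rarely have. The function name and the sample figures are illustrative and are not taken from Hyper-V.

```python
def minutes_until_log_volume_full(free_space_mb: float,
                                  change_rate_mb_per_min: float) -> float:
    """Estimate how long replication can be interrupted before pending
    log files use up the free space on the primary host's log volume.

    Assumption (hypothetical): the change rate is constant and nothing
    else writes to the volume during the outage.
    """
    if free_space_mb < 0:
        raise ValueError("free_space_mb must not be negative")
    if change_rate_mb_per_min <= 0:
        raise ValueError("change_rate_mb_per_min must be positive")
    return free_space_mb / change_rate_mb_per_min


if __name__ == "__main__":
    # Illustrative numbers only: 50 GB free, 20 MB of changes per minute.
    minutes = minutes_until_log_volume_full(50 * 1024, 20)
    print(f"Log volume would fill after about {minutes / 60:.1f} hours")
```

An estimate like this only indicates how long a WAN outage the primary host could tolerate. Measure the actual change rate of your VMs before sizing storage.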

💡
Hyper-V Replica isn't a high-availability technology; it's a disaster-recovery technology. High availability primarily removes single points of failure, so that services are always, or nearly always, available. As a disaster-recovery technology, Hyper-V Replica comes into play when high availability fails. Hyper-V Replica doesn't have an automatic failover option.
💡
Hyper-V Replica works at the host level and is workload and application agnostic. This means that you can use Hyper-V Replica for any operating system that Hyper-V supports, and to protect all kinds of workloads in Windows or Linux environments, such as Microsoft SharePoint Server, Microsoft Dynamics CRM, Microsoft SQL Server, Internet Information Services (IIS), and third-party applications.

Hyper-V Replica host scenarios

You can set up Hyper-V Replica between Hyper-V hosts irrespective of whether they're nodes in a failover cluster. Also, you can set up Hyper-V Replica whether the Hyper-V hosts are members of the same AD DS forest or are in different AD DS forests without any trust between them.

You can use Hyper-V Replica in the following four configurations:

  1. Both Hyper-V hosts are standalone servers. Typically, this configuration isn't the preferred option, unless it's used in test or development scenarios, because it includes only disaster recovery and not high availability.

  2. The Hyper-V host at the primary location is a node in a failover cluster, and the Hyper-V host at the secondary location is on a standalone server. Many environments use this configuration. A failover cluster provides high availability for running VMs at the primary location. If a disaster occurs at the primary location, a replica of the VMs is still available at the secondary location.

  3. Each Hyper-V host is a node in a different failover cluster. With this configuration, if a disaster occurs at the primary location, you can perform a manual failover and continue operations from a secondary location.

  4. The Hyper-V host at the primary location is a standalone server, and the Hyper-V host at the secondary location is a node in a failover cluster. Although technically possible, this configuration is rare. Typically, you want VMs at the primary location to be highly available. Their replicas at the secondary location stay turned off and aren't used until a disaster occurs at the primary location.

Replication settings

Because you must individually configure replication for each VM, you must plan resources on replication hosts for each VM. In addition to resources, you also must plan how to configure the following replication settings:

  • Replica server. Specify the computer name or the fully qualified domain name (FQDN) of the replica server, because the IP address isn't allowed. If the Hyper-V host that you specify isn't configured to allow replication traffic, you can configure it here. If the replica server is a node in a failover cluster, you should enter the name or FQDN of the connection point for the Hyper-V Replica Broker.

  • Connection Parameters. If the replica server is accessible, the Enable Replication Wizard automatically populates the authentication type and replication port fields with appropriate values. If the replica server isn't accessible, you can manually configure these fields. Be aware that you won't be able to enable replication if you can't create a connection to the replica server. On the Connection Parameters page, you can also configure Hyper-V to compress replication data before transmitting it over a network.

  • Replication VHDs. By default, all VHDs are replicated. If some of the VHDs aren't required at the replica Hyper-V host (for example, a VHD that's dedicated to storing page files), exclude them from replication. Be aware that excluding VHDs that include operating systems or applications can result in that VM being unusable at the replica server.

  • Replication frequency. Replication frequency controls how frequently data replicates to the Hyper-V host at the recovery site. If a disaster occurs at the primary site, a shorter replication frequency means less data loss, because changes are replicated to the recovery site more frequently. You can set replication frequency to one of the following:

    • 30 seconds

    • 5 minutes

    • 15 minutes

  • Additional recovery points. You can configure the number and types of recovery points to send to a replica server. By default, the option to maintain only the latest point for recovery is selected, which means that only the parent VHD replicates and all changes merge into that VHD. You can create more hourly recovery points and set the number of additional recovery points to a maximum of 24. You can configure the Volume Shadow Copy Service snapshot frequency to save application-consistent replicas for the VM and not just the changes in the primary VM.

  • Initial replication method and schedule. VMs have large virtual disks, and initial replication can take a long time and cause a lot of network traffic. Although the default option is to immediately send the initial copy over the network, if you don't want immediate replication, you can schedule it to start at a specific time. If you want an initial replication but want to avoid network traffic, you can opt to send the initial copy to external media or use an existing VM on the replica server. Use this option if you restored a copy of the VM at the replica server and you want to use it as the initial copy.
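
The constraints in the list above can be collected into a small pre-flight check. This is a minimal Python sketch, not part of Hyper-V: the function name, parameter names, and messages are hypothetical, and it covers only the limits stated in this section (no IP address for the replica server, three allowed frequencies, at most 24 additional recovery points).

```python
import ipaddress
from typing import List

ALLOWED_FREQUENCIES_SEC = (30, 300, 900)   # 30 seconds, 5 minutes, 15 minutes
MAX_ADDITIONAL_RECOVERY_POINTS = 24


def is_ip_address(name: str) -> bool:
    """Return True if name parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(name)
    except ValueError:
        return False
    return True


def validate_replication_settings(replica_server: str,
                                  frequency_sec: int = 300,
                                  additional_recovery_points: int = 0) -> List[str]:
    """Return a list of problems; an empty list means the values look valid."""
    problems = []
    server = replica_server.strip()
    if not server:
        problems.append("replica server name is empty")
    elif is_ip_address(server):
        problems.append("replica server must be a computer name or FQDN, not an IP address")
    if frequency_sec not in ALLOWED_FREQUENCIES_SEC:
        problems.append("replication frequency must be 30, 300, or 900 seconds")
    if not 0 <= additional_recovery_points <= MAX_ADDITIONAL_RECOVERY_POINTS:
        problems.append("additional recovery points must be between 0 and 24")
    return problems


if __name__ == "__main__":
    # Hypothetical host name used for illustration.
    print(validate_replication_settings("replica01.contoso.com", 300, 4))
    print(validate_replication_settings("10.0.0.5", 60, 30))
```

Running a check like this for each VM before opening the wizard can catch planning mistakes early, because replication is configured per VM.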


Hyper-V Replica security considerations

You can set up Hyper-V Replica with a Hyper-V host regardless of its location and domain membership, provided that network connectivity exists between the primary and replica Hyper-V hosts. Hyper-V hosts are not required to be part of the same AD DS forest.

You can implement Hyper-V Replica when Hyper-V hosts are members of untrusted domains by configuring certificate-based authentication. Hyper-V Replica implements security at the following levels:

  • Hyper-V creates a local security group named Hyper-V Administrators. Members of this group and local administrators can configure and manage Hyper-V Replica.

  • You can configure a replica server to allow replication from any authenticated server or to limit replication to specific servers. Keep in mind that:

    • You must specify an FQDN for the primary server, or use a wildcard character with a domain suffix.

    • Using IP addresses isn't allowed.

    • If the replica server is in a failover cluster, replication is allowed at the cluster level.

    • When you limit replication to specific servers, you also must specify a trust group that's used to identify the servers within which a VM can move. For example, if you provide disaster recovery services to partner organizations, the trust group prevents one organization from gaining access to another organization's replica machines.

  • The replica Hyper-V host can authenticate a primary Hyper-V host by using Kerberos authentication or certificate-based authentication:

    • Kerberos authentication requires both Hyper-V hosts to be in the same AD DS forest, whereas you can use certificate-based authentication in any environment.

    • Kerberos authentication is used with HTTP traffic, which isn't encrypted, whereas certificate-based authentication is used with HTTPS traffic, which is encrypted.

  • You can establish Hyper-V Replica only if network connectivity exists between the Hyper-V hosts.

  • You should configure Windows Defender Firewall to allow HTTP or HTTPS Hyper-V Replica traffic as needed.
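
The authentication and authorization rules above can be illustrated with a short Python sketch. It is a model for reasoning only and does not call any Hyper-V API. The function names and host names are hypothetical, and the wildcard matching is an assumption: here `*` may span several DNS labels, which may differ from how Hyper-V itself evaluates wildcard entries.

```python
import ipaddress
from fnmatch import fnmatchcase
from typing import List


def usable_authentication(same_forest: bool, have_trusted_certificates: bool) -> List[str]:
    """List the authentication types that the stated requirements permit."""
    options = []
    if same_forest:
        options.append("Kerberos (HTTP, not encrypted)")
    if have_trusted_certificates:
        options.append("Certificate (HTTPS, encrypted)")
    return options


def _is_ip_address(name: str) -> bool:
    try:
        ipaddress.ip_address(name)
    except ValueError:
        return False
    return True


def is_primary_authorized(primary_fqdn: str, allowed_entries: List[str]) -> bool:
    """Check a primary server name against FQDN or wildcard entries.

    IP addresses are rejected outright, matching the rule that only
    FQDNs or wildcards with a domain suffix may be specified.
    """
    name = primary_fqdn.strip().lower()
    if not name or _is_ip_address(name):
        return False
    return any(fnmatchcase(name, entry.strip().lower()) for entry in allowed_entries)


if __name__ == "__main__":
    print(usable_authentication(same_forest=False, have_trusted_certificates=True))
    print(is_primary_authorized("hv01.contoso.com", ["*.contoso.com"]))
    print(is_primary_authorized("hv01.fabrikam.com", ["*.contoso.com"]))
```

In this model, hosts in different forests are left with certificate-based authentication as the only option, which is consistent with the requirement described above.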


Configure and implement Hyper-V Replica

Hyper-V Replica is available as part of the Hyper-V role. You can use it on standalone Hyper-V servers or on servers that are part of a failover cluster, in which case you should configure Hyper-V Replica Broker. Hyper-V, and thus Hyper-V Replica, has no dependencies on AD DS, except when Hyper-V Replica servers are part of the same failover cluster.

To enable Hyper-V Replica, complete the following two steps:

  1. Enable a Hyper-V host to act as a replica server:

    1. In Hyper-V Settings on the host server, in the Replication Configuration group of options, select the Enable this computer as a Replica server check box.

    2. Configure the Hyper-V server settings:

      1. Select and configure the Authentication and ports options.

      2. Select and configure the Authorization and storage options.

        • It's possible to allow replication from any server that successfully authenticates (which is convenient when all servers are part of the same domain) or allow replication only from specified servers. You also must configure a location to store the replica files as part of this configuration setting.

  2. Enable replication on each VM that needs to be replicated on the primary Hyper-V host:

    1. Select the VM you wish to replicate and choose Enable Replication.

    2. In the Enable Replication for <VM_Name> Wizard, specify the replica server.

    3. Specify the connection parameters such as Authentication type.

    4. Select the VHDs to replicate; you can choose more than one if needed.

    5. Select one Replication Frequency option:

      • 30 seconds

      • 5 minutes (the default value)

      • 15 minutes

    6. Configure the recovery points to Maintain only the latest recovery point or Create additional hourly recovery points.

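    7. Select an initial replication method and when the initial replication should start. The method options include:

      • Send initial copy over the network

      • Send initial copy using external media

      • Use an existing VM on the replica server as the initial copy

      The scheduling options include:

      • Start replication immediately

      • Schedule replication on (a specific date and time)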

    8. After configuring these options, start replication.

    9. After you establish the replication relationship, the Status column in Hyper-V Manager displays the replication progress as a percentage of the total replication for the configured VM. The VM replica is in a turned-off state and will start only when you perform a failover.

    10. When the initial replication is complete, the replica updates regularly with changes from the primary VM.
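
The same two steps can also be scripted. The Hyper-V PowerShell module provides the Set-VMReplicationServer, Enable-VMReplication, and Start-VMInitialReplication cmdlets. The Python sketch below only assembles the command lines as strings so that they can be reviewed before anyone runs them on the hosts. The VM name, host name, storage path, and the choice of Kerberos over port 80 are assumptions for illustration. Check the parameters against the cmdlet documentation for your Windows Server version.

```python
from typing import List

ALLOWED_FREQUENCIES_SEC = (30, 300, 900)


def ps_quote(value: str) -> str:
    """Wrap a value in PowerShell single quotes, doubling embedded quotes."""
    return "'" + value.replace("'", "''") + "'"


def replica_server_command(storage_path: str) -> str:
    """Step 1: command for the replica host (Kerberos over HTTP assumed)."""
    return (
        "Set-VMReplicationServer -ReplicationEnabled $true "
        "-AllowedAuthenticationType Kerberos "
        "-ReplicationAllowedFromAnyServer $true "
        f"-DefaultStorageLocation {ps_quote(storage_path)}"
    )


def enable_replication_commands(vm_name: str, replica_fqdn: str,
                                frequency_sec: int = 300,
                                port: int = 80) -> List[str]:
    """Step 2: commands for the primary host, one VM at a time."""
    if frequency_sec not in ALLOWED_FREQUENCIES_SEC:
        raise ValueError("frequency must be 30, 300, or 900 seconds")
    return [
        f"Enable-VMReplication -VMName {ps_quote(vm_name)} "
        f"-ReplicaServerName {ps_quote(replica_fqdn)} "
        f"-ReplicaServerPort {port} -AuthenticationType Kerberos "
        f"-ReplicationFrequencySec {frequency_sec}",
        f"Start-VMInitialReplication -VMName {ps_quote(vm_name)}",
    ]


if __name__ == "__main__":
    # Hypothetical names and path, for illustration only.
    print(replica_server_command(r"D:\Replica"))
    for line in enable_replication_commands("VM01", "replica01.contoso.com"):
        print(line)
```

Generating the commands as text keeps the sketch free of side effects. Certificate-based authentication would need additional parameters that this sketch doesn't cover.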


Replication health monitoring

When you enable replication for a VM, changes in the primary VM are written to a log file, which is periodically transferred to the replica server and then applied to the replica VM's VHD.
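
The log-and-apply cycle can be pictured with a toy model. The following Python sketch is a deliberate simplification and does not reflect Hyper-V's actual log format or internals: the class name and the block-level representation are invented for illustration.

```python
from typing import Dict, List, Tuple


class ToyReplicatedDisk:
    """Toy model: writes hit the primary and a log; the log is shipped
    to the replica and applied there on each replication cycle."""

    def __init__(self) -> None:
        self.primary: Dict[int, bytes] = {}
        self.replica: Dict[int, bytes] = {}
        self.log: List[Tuple[int, bytes]] = []

    def write(self, block: int, data: bytes) -> None:
        """Apply a write on the primary and record it in the pending log."""
        self.primary[block] = data
        self.log.append((block, data))

    def replicate(self) -> int:
        """Apply all pending log entries to the replica, then purge the log.

        Returns the number of entries that were applied.
        """
        applied = len(self.log)
        for block, data in self.log:
            self.replica[block] = data
        self.log.clear()
        return applied


if __name__ == "__main__":
    disk = ToyReplicatedDisk()
    disk.write(0, b"boot")
    disk.write(7, b"data")
    print("replica before cycle:", disk.replica)
    print("entries applied:", disk.replicate())
    print("replica matches primary:", disk.replica == disk.primary)
```

The model shows why the replica lags the primary by up to one replication interval, and why the pending log grows whenever a cycle can't complete.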

Replication health includes the following data:

  • Replication State, which indicates whether replication is enabled for a VM.

  • Replication Mode, which is either Primary or Replica.

  • Current Primary Server, which is the server name.

  • Current Replica Server, which is the server name.

  • Replication Health, which indicates replication status. Replication Health can have one of three values:

    • Normal

    • Warning

    • Critical

  • Statistics, which include data such as the following:

    • From time

    • To time

    • Average size

    • Maximum size

    • Average latency

    • Errors encountered

    • Last synchronized at

    • Successful replication cycles

  • Pending replication, which displays information about the size of data that still needs to replicate and when the replica was last synced with the primary VM.


Failover options

Three types of failover are possible with Hyper-V Replica: Test Failover, Planned Failover, and Failover.

Test Failover

Test Failover is a non-disruptive task that enables you to test a VM on a replica server while the primary VM is running, without interrupting replication. You can initiate it only on the replica server, on the replicated VM. Initiating a Test Failover on a replicated VM creates a new checkpoint, and you can use this checkpoint to select a recovery point from which to create a new test VM. This test VM has the same name as the replica, with " - Test" appended to the end. The test VM stays disconnected from the network by default to avoid potential conflicts with the running primary VM.

Planned Failover

You can start a Planned Failover to move the primary VM to a replica server, for example, before site maintenance or before an expected disaster. No data loss occurs, but the VM is unavailable for some time during startup. You can initiate a Planned Failover only from the primary site.

After the Planned Failover, the VM will run on the replica server, and it doesn't replicate its changes. If you want to set up replication again, you should reverse the replication and configure settings similar to when you enabled replication.

Failover

If a disruption occurs at the primary server and the primary VM has failed, you can perform a Failover. You can initiate a Failover only on the replica server on the replicated VM, and only when the primary VM is either unavailable or is turned off.
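
The rules for where and when each failover type can be initiated can be summarized in a small decision function. This Python sketch encodes only the conditions described above. The function name and the state labels are hypothetical, and allowing a Test Failover regardless of the primary VM's state is an assumption made by the sketch.

```python
from typing import List

VALID_SITES = ("primary", "replica")
VALID_PRIMARY_STATES = ("running", "off", "unavailable")


def allowed_failovers(initiated_on: str, primary_state: str) -> List[str]:
    """Return the failover types that can be started in this situation."""
    if initiated_on not in VALID_SITES:
        raise ValueError(f"initiated_on must be one of {VALID_SITES}")
    if primary_state not in VALID_PRIMARY_STATES:
        raise ValueError(f"primary_state must be one of {VALID_PRIMARY_STATES}")

    allowed = []
    if initiated_on == "replica":
        # Test Failover runs on the replica without disturbing the primary.
        allowed.append("Test Failover")
        # Failover requires the primary VM to be unavailable or turned off.
        if primary_state in ("off", "unavailable"):
            allowed.append("Failover")
    elif primary_state != "unavailable":
        # Planned Failover starts at the primary site, so the primary
        # must still be reachable.
        allowed.append("Planned Failover")
    return allowed


if __name__ == "__main__":
    print(allowed_failovers("replica", "running"))
    print(allowed_failovers("replica", "unavailable"))
    print(allowed_failovers("primary", "running"))
```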


Extended replication

To provide additional disaster recovery protection by preparing for the outage of both the primary and replica sites, Hyper-V Replica supports replication to a third server. This third server can be at a third site, separate from the locations of the primary and replica servers. With this setup, you can replicate a running VM to two independent servers, which could be in different geographic locations, providing additional options for recovering a failed VM. This configuration is known as Extended replication. The primary server doesn't replicate to both replica servers directly. Instead, the primary server replicates to the replica server, which in turn replicates to the extended replica server, in a daisy chain as depicted in the following image.

Image: A Hyper-V Replica topology in which a primary site containing storage and Hyper-V VMs is connected by a WAN link to a replica site containing storage and a replica of those VMs. The replica site is in turn connected by a WAN link to an Extended replica site that contains storage and another replica of the VMs from the primary site.


Limits of extended replication configuration

Extended replication has the following configuration limits:

  • Replication frequency can be 5 minutes or 15 minutes only.

  • The extended replication interval can't be shorter than the interval used between the primary and replica servers. For example, if the primary replication frequency is 15 minutes, you can't set the extended replication to 5 minutes.

  • You can't change the authentication type.
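
The frequency limits for extended replication can be expressed as a short check. This is a minimal Python sketch with a hypothetical function name. It validates only the two frequency rules listed above and represents frequencies in seconds.

```python
PRIMARY_FREQUENCIES_SEC = (30, 300, 900)   # 30 seconds, 5 minutes, 15 minutes
EXTENDED_FREQUENCIES_SEC = (300, 900)      # 5 minutes or 15 minutes only


def check_extended_frequency(primary_frequency_sec: int,
                             extended_frequency_sec: int) -> None:
    """Raise ValueError if the extended replication frequency is not allowed."""
    if primary_frequency_sec not in PRIMARY_FREQUENCIES_SEC:
        raise ValueError("primary frequency must be 30, 300, or 900 seconds")
    if extended_frequency_sec not in EXTENDED_FREQUENCIES_SEC:
        raise ValueError("extended frequency must be 300 or 900 seconds")
    if extended_frequency_sec < primary_frequency_sec:
        raise ValueError(
            "extended replication can't run more often than primary replication"
        )


if __name__ == "__main__":
    check_extended_frequency(300, 900)      # allowed: 5 minutes, then 15 minutes
    try:
        check_extended_frequency(900, 300)  # rejected: 15 minutes, then 5 minutes
    except ValueError as error:
        print("rejected:", error)
```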

