
VMware Replication and Disaster Recovery Solutions: vSphere Replication and Site Recovery Manager (SRM)
- Tutorial
Site Recovery Manager (SRM)
VMware Site Recovery Manager (SRM) is a business continuity and disaster recovery solution for planning, testing and performing the recovery of VMs (virtual machines) from a protected (primary) site to a recovery (backup) site.
SRM offers 3 approaches to VM protection (replication):
• Datastore groups. VMs are protected in groups of datastores by third-party replication mechanisms. Array-based replication is used.
• Individual VMs. Individual VMs are protected at the host level. SRM is used in combination with VMware vSphere Replication technology.
• Storage policies. VMs are protected based on special storage policies. Array-based replication is used.
SRM provides 2 options for recovering a site (data center):
• Planned migration. It requires that both the primary and backup sites are available and fully functional. It is a planned operation carried out under normal working conditions, with no emergency, and it avoids data loss.
• Disaster recovery. An unplanned operation designed for a sudden failure of the main site, with failover to the backup site.
SRM orchestrates data center recovery processes and replication mechanisms to minimize data loss and recovery time:
• SRM shuts down the VMs on the main site and synchronizes data between the sites, provided the main site is still operational.
• SRM runs replicated VMs on the backup site in the order determined by the recovery plan.
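The two steps above can be sketched as a small model. This is only an illustration of the ordering logic, not the SRM API: the `ProtectedVM` class, its `priority` field and the `run_recovery` function are hypothetical names, and representing a recovery plan as a numeric priority per VM is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class ProtectedVM:
    name: str
    priority: int  # hypothetical field: a lower number starts earlier


def run_recovery(vms, primary_site_up):
    """Return the ordered list of steps a recovery run would take."""
    steps = []
    if primary_site_up:
        # The main site is operational: stop the source VMs, then do a final sync.
        for vm in vms:
            steps.append(f"shutdown {vm.name} on primary site")
        steps.append("synchronize data between sites")
    # In both cases the replicas are started in recovery-plan order.
    for vm in sorted(vms, key=lambda v: v.priority):
        steps.append(f"power on {vm.name} on recovery site")
    return steps


plan = [ProtectedVM("app", 2), ProtectedVM("db", 1)]
disaster_steps = run_recovery(plan, primary_site_up=False)
migration_steps = run_recovery(plan, primary_site_up=True)
```

With the main site down, only the power-on steps remain, and `db` starts before `app` because of its lower priority value.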
SRM provides the ability to test recovery plans. Testing uses temporary copies of the replicated data, so it does not affect the production workloads of either site.
SRM provides 2 deployment options in the context of relationships between sites:
• Basic (unidirectional) option - the services of the main data center (protected site) can be migrated to a backup site (recovery site).
• Bidirectional option - provides VM protection in both directions. Each site in the pair is a main site for its own VMs and, at the same time, acts as the backup site for its neighbor.
Site configuration requirements for SRM to work:
• The versions of SRM, vCenter Server and vSphere Replication on both sites must be identical or compatible.
• In the case of array-based replication, the selected replication technology must be supported on both sites, and the arrays must form a pair.
• The infrastructure of the backup site (hosts, networks, storage) must meet the requirements of the protected VMs and be able to carry the load of the main site. During normal operation, a backup site can be loaded with non-production or non-critical VMs that can be stopped when the main site has to be recovered.
• Sites must be connected through a reliable IP network that provides the necessary bandwidth.
• The backup site must be connected to the same public and private networks that the main site can access.
SRM requires a Site Recovery Manager Server (SRM server) to be installed on both the primary and backup sites. For small data centers, it is permissible to install the SRM server on the same system as the vCenter server, in particular on the same VM. For large infrastructures, for reasons of load and availability, it is advisable to install the SRM server on a separate system (a separate VM).
SRM multi-site configurations
The standard configuration discussed above includes 2 sites: primary and backup. Each site has its own vCenter server, and the two communicate through the SRM servers installed on both sites. Thus, VMs managed by the vCenter server of the primary site can be recovered under the vCenter server of the backup site.
If the data center has more than 2 sites, SRM supports various multi-site configurations:
• Shared recovery site (many-to-one, N:1) - many protected sites replicate and recover their VMs to one common backup site;
• Shared protected site (one-to-many, 1:N) - the main site has several backup sites;
• Many-to-many (N:N).
The SRM instances (SRM servers) on the primary and backup sites must form a pair, and both members of a pair are assigned the same identifier (extension ID). Therefore, a shared site must run as many SRM instances as it has partner sites. For example, if a shared recovery site serves 5 protected sites, then 5 SRM servers must be deployed on it, each paired with one protected site. The SRM servers of the shared site must be installed on different host machines (VMs) and have unique identifiers. At the same time, all SRM instances of the shared site interact with the single vCenter server that manages this site.
You cannot install multiple SRM servers on the same host machine (VM). Each SRM server must have its own database. One recovery site can serve no more than 10 protected sites.
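The pairing rules for a shared recovery site can be expressed as a short planning sketch. It only models the constraints stated in the text (one SRM instance per protected site, each on its own VM with a unique identifier, at most 10 protected sites). The function name, the VM naming scheme and the identifier format are hypothetical and do not reflect how SRM names anything.

```python
MAX_PROTECTED_SITES = 10  # limit stated in the text


def plan_shared_recovery_site(protected_sites):
    """Map each protected site to its own SRM instance on the shared site."""
    if len(set(protected_sites)) != len(protected_sites):
        raise ValueError("protected site names must be unique")
    if len(protected_sites) > MAX_PROTECTED_SITES:
        raise ValueError("a recovery site can serve at most 10 protected sites")
    pairs = {}
    for index, site in enumerate(protected_sites, start=1):
        pairs[site] = {
            # One SRM server per VM, so every instance gets its own VM.
            "srm_vm": f"srm-shared-{index:02d}",  # hypothetical VM name
            # The same identifier would be used by both members of the pair.
            "extension_id": f"com.example.srm-{site}",  # hypothetical ID format
        }
    return pairs


pairs = plan_shared_recovery_site(["site-a", "site-b", "site-c"])
```

Three protected sites yield three SRM instances on the shared site, each on a separate VM and each with an identifier that no other instance uses.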

Array-based replication SRM
This approach replicates data between sites at the level of the storage arrays, using the replication mechanisms built into them. SRM integrates with the arrays through storage replication adapters (SRAs), software components developed by the array manufacturers. To support array-based replication, an SRA must be installed on the SRM server of each site for every array connected to it.

SRM using vSphere Replication
SRM can use vSphere Replication (a technology built into the VMware vSphere suite at no extra cost) to replicate data between sites at the VM level. vSphere Replication does not depend on the type or model of storage, does not require integration with the array (no SRA is needed), and supports any storage compatible with vSphere.
vSphere Replication allows you to create a chain of snapshots for replicated VMs on the backup site - many replicas of protected machines at different points in time. Thus, it becomes possible to select the optimal state of the VM for recovery among the many snapshots of the replica.

Mixed Replication
SRM supports a mixed mode of operation in which both replication mechanisms are used together: array-based replication and vSphere Replication. This mode requires both technologies to be deployed and configured on both sites. Configuring different replication mechanisms for the same VM is not supported. However, SRM allows a single recovery plan to include VMs protected by different replication mechanisms, as long as each VM uses only one of them.
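The mixed-mode rule (one plan may combine both mechanisms, but each VM may use only one) can be illustrated with a small validation sketch. The function `build_mixed_plan` and the mechanism labels are hypothetical names chosen for this example. SRM itself enforces the rule through its own configuration workflow.

```python
from collections import defaultdict

ALLOWED = {"array-based", "vsphere-replication"}  # labels assumed for this sketch


def build_mixed_plan(assignments):
    """Build a VM -> mechanism mapping from (vm_name, mechanism) tuples."""
    by_vm = defaultdict(set)
    for vm, mechanism in assignments:
        if mechanism not in ALLOWED:
            raise ValueError(f"unknown mechanism: {mechanism}")
        by_vm[vm].add(mechanism)
    # A VM that appears with two different mechanisms violates the rule.
    conflicts = sorted(vm for vm, mechs in by_vm.items() if len(mechs) > 1)
    if conflicts:
        raise ValueError(f"VMs with more than one mechanism: {conflicts}")
    return {vm: next(iter(mechs)) for vm, mechs in by_vm.items()}


mixed_plan = build_mixed_plan([
    ("db01", "array-based"),
    ("web01", "vsphere-replication"),
])
```

Adding a second entry such as `("db01", "vsphere-replication")` would make the function raise `ValueError`, which mirrors the unsupported configuration.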

vSphere Replication
vSphere Replication is an extension for vCenter that enables replication and recovery of VMs at the hypervisor level, as well as monitoring and managing these processes. This technology is an alternative to array-level replication. The solution supports the following VM replication options:
• between the source site and the target site (site-to-site);
• between clusters within the same site;
• between multiple source sites and a common destination site (many-to-one).
vSphere Replication is independent of array type and supports any vSphere-compatible storage. The solution is included in all editions of vSphere except the most basic one and does not require the purchase of separate licenses.
Replication is performed by transferring modified blocks from the source site or cluster to the target. It starts with an initial full synchronization of the source VM and its replica. When configuring a replication job, you can set the RPO (recovery point objective) and enable the retention of multiple point-in-time states of the replica (MPIT - multiple points in time), which are analogous to VM snapshots.
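The idea of an initial full synchronization followed by transfers of modified blocks, with a bounded number of retained points in time, can be shown with a toy model. This is a sketch under simplifying assumptions: a disk is a dictionary of block index to data, and `ReplicaSketch`, `changed_blocks` and `max_points` are hypothetical names. It does not describe how vSphere Replication tracks or ships blocks internally.

```python
from collections import deque


def changed_blocks(previous, current):
    """Return only the blocks whose content differs from the previous state."""
    return {i: data for i, data in current.items() if previous.get(i) != data}


class ReplicaSketch:
    def __init__(self, max_points):
        self.blocks = {}                        # block index -> data
        self.points = deque(maxlen=max_points)  # retained point-in-time states

    def apply(self, delta):
        """Apply transferred blocks and record a new point-in-time state."""
        self.blocks.update(delta)
        self.points.append(dict(self.blocks))
        return len(delta)  # number of blocks transferred


source_v1 = {0: "a", 1: "b", 2: "c"}
source_v2 = {0: "a", 1: "B", 2: "c"}

replica = ReplicaSketch(max_points=2)
initial = replica.apply(changed_blocks({}, source_v1))       # full sync
delta = replica.apply(changed_blocks(source_v1, source_v2))  # one changed block
```

The first cycle transfers all 3 blocks and the second only the 1 that changed. Because `max_points` is 2, the replica keeps both states, and a third cycle would discard the oldest one.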
It is possible to monitor and control the state of replication, obtain information about incoming and outgoing replications, the status of sites, replication results and errors.
The process of recovering a VM from a replica is not automated and requires manual intervention. In particular, you must choose manually whether to synchronize the VM state with the source site first or to recover the last state available in the replica. The recovered VM has no network connections, to avoid potential conflicts, so it has to be connected manually to the required virtual networks of the data center. With MPIT, a replicated VM is recovered together with its retained chain of snapshots, which makes it possible to select the desired state of the recovered VM.
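Choosing a state from the retained points in time amounts to picking the latest point that is not newer than the moment you want to return to. A minimal sketch of that selection follows. The function `pick_point_in_time` is a hypothetical helper, and timestamps are plain integers for simplicity. In practice the choice is made by reverting to a snapshot of the recovered VM.

```python
from bisect import bisect_right


def pick_point_in_time(points, wanted):
    """Return the latest retained point at or before `wanted`.

    `points` must be a list of timestamps sorted in ascending order.
    """
    pos = bisect_right(points, wanted)
    if pos == 0:
        raise ValueError("no retained point at or before the requested time")
    return points[pos - 1]


retained = [100, 200, 300]
chosen = pick_point_in_time(retained, 250)
```

Here the state taken at time 200 is selected, since the point at 300 is newer than the requested moment.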
The vSphere Replication appliance is the core component of the solution. It registers with the vCenter server and connects to it as an extension. vCenter allows only one vSphere Replication appliance (VR appliance) to be installed and connected. The VR appliance includes an integrated vSphere Replication server that manages all replication processes. To balance the load, additional vSphere Replication servers can be deployed. They are connected to the main VR appliance of the site (vCenter) and are themselves virtual appliances.
Site-to-site replication configuration example:

An example of replication configuration between clusters within one site, using 2 VR servers for load balancing (this is optional, a single VR appliance would suffice):

Many-to-one replication configuration example:
