Storage Replica on Windows Server 2016

    Author: Mikhail Komarov, Hyper-V MVP

    This article introduces Storage Replica, a new component in Windows Server vNext. Its arrival was expected, since Microsoft has been paying close attention to storage for the last few years. The first sign was SMB 3.0, the new implementation of the SMB protocol, which appeared with Windows Server 2012 and gained new features in Windows Server 2012 R2.



    Next came a new type of file server cluster, the Scale-Out File Server (SOFS).



    Other welcome additions include built-in NIC teaming, support for RDMA and InfiniBand adapters, Storage Spaces for combining disks into pools, and Storage Tiering, which lets you use a combination of SSD and HDD pools effectively. There are already JBOD disk shelves that can be connected directly to servers to build a storage system, and there are commercial solutions, such as Dell CPS, that use these technologies.



    With all this in place, volume replication of the kind found in other storage systems was the logical next step, and it arrived with the Windows Server vNext Technical Preview.

    Storage Replica is a block-level volume replication technology for Windows that uses the SMB protocol. Two replication scenarios are currently implemented: a stretch cluster and replication between standalone servers.





    Management is available from the Failover Cluster Manager snap-in (for a stretch cluster), as well as through Windows PowerShell and WMI. Note that only fixed (non-removable) drives are supported. I would like to emphasize that Storage Replica is not DFSR: replication is done at the block level. The illustration below shows that Storage Replica sits below the file system in the storage stack, so block replication does not depend on the file system type (NTFS, CSVFS, or ReFS).



    Consider the synchronous replication process in more detail. In the first step, the source server receives the data. In the second step, the data is written to the log on a separate volume and forwarded to the target server. In the third step, the data is written to the log on the target server. In the fourth step, the target server confirms the successful log write to the source server. In the fifth step, the application is notified that the data has been processed. Later, at time t1, the data is flushed from the log volume to the data volume on both servers.



    Now consider the asynchronous replication process. In the first step, the source server receives the data. In the second step, the data is written to the log on a separate volume. In the third step, the application is notified that the data has been processed. In the fourth step, the data is sent to the target server. In the fifth step, the data is written to the log on the target server. In the sixth step, the target server confirms the successful log write to the source server. Later, at time t1, the data is flushed from the log volume to the data volume on both servers.
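
    The only difference between the two modes is where the application acknowledgment falls in the sequence. The sketch below is a toy model of the steps described above, not Storage Replica code: the function name Get-ReplicationSteps and the step wording are made up for illustration, and the combined "write and forward" step is split in two for readability.

```powershell
# Toy model: returns the order of events for each replication mode.
# It does not call Storage Replica; all names here are illustrative.
function Get-ReplicationSteps {
    param(
        [Parameter(Mandatory)]
        [ValidateSet('Synchronous', 'Asynchronous')]
        [string]$Mode
    )

    $receive   = 'Source: application write received'
    $logSource = 'Source: write recorded in the log volume'
    $send      = 'Source: write sent to the target'
    $logTarget = 'Target: write recorded in the log volume'
    $confirm   = 'Target: confirmation returned to the source'
    $ack       = 'Source: application notified that the write is complete'
    $flush     = 'Both: log contents later flushed to the data volume (time t1)'

    if ($Mode -eq 'Synchronous') {
        # The application waits until the target has logged the write.
        return @($receive, $logSource, $send, $logTarget, $confirm, $ack, $flush)
    }
    else {
        # The application is released as soon as the source has logged the write.
        return @($receive, $logSource, $ack, $send, $logTarget, $confirm, $flush)
    }
}

foreach ($m in 'Synchronous', 'Asynchronous') {
    "--- $m ---"
    $i = 0
    Get-ReplicationSteps -Mode $m | ForEach-Object { $i++; '{0}. {1}' -f $i, $_ }
}
```

    In synchronous mode the acknowledgment comes after the round trip to the target, which is why network latency matters so much for it. In asynchronous mode the application never waits for the network, at the cost of possibly losing the most recent writes if the source fails.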



    That is enough theory; let's move on to practice.

    Let's start with the requirements.

    Windows Server edition: Datacenter. Both computers must be members of a domain. Disks must be GPT, not MBR. No removable media: external USB arrays, flash drives, tape drives, 5.25-inch floppy disks, etc. Identical disk geometry (log to log, data to data) and identical data partitions are also required. Free space for the logs is needed on an NTFS or ReFS volume (the log has a fixed size; it neither grows nor shrinks). %SystemRoot%, paging files, hibernation files, and DMP files cannot be replicated. You also need to open the SMB and WS-MAN ports on the firewall.

    Packet exchange delays

    The average round-trip latency should be ≤ 5 ms. In the ideal case, at the speed of light in a vacuum, 5 ms corresponds to about 1,500 km of total path there and back. In reality, fiber reduces the speed by about 35%, and switches, routers, firewalls, and so on add their own delays. The bottom line: most customers are limited to a distance of 30-50 km.
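
    The arithmetic behind these figures can be checked with a few lines of PowerShell. This is a back-of-the-envelope estimate only: the variable names are mine, and the calculation ignores the delay added by network equipment, which is exactly what shrinks the practical distance to tens of kilometers.

```powershell
# Rough propagation-only estimate; real links add switch, router and firewall delay.
$lightSpeedKmPerSec = 299792   # speed of light in a vacuum, km/s
$roundTripSeconds   = 0.005    # 5 ms budget for the full round trip
$fiberSlowdown      = 0.35     # about 35% slower in fiber, per the text above

$vacuumPathKm = $lightSpeedKmPerSec * $roundTripSeconds   # total path, there and back
$fiberPathKm  = $vacuumPathKm * (1 - $fiberSlowdown)

'Vacuum: {0:N0} km round trip, {1:N0} km between sites' -f $vacuumPathKm, ($vacuumPathKm / 2)
'Fiber:  {0:N0} km round trip, {1:N0} km between sites' -f $fiberPathKm,  ($fiberPathKm / 2)
```

    Note that the round-trip path is twice the distance between the sites, so even the theoretical limit for the site-to-site distance is half the headline figure.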

    Network bandwidth

    The baseline requirement is a network of ≥ 1 Gbit/s end to end between the servers (Windows Server requires 1 Gbit/s network cards). Everything depends on the volume of I/O and on how heavily the link is shared (SR may not be the only source of traffic to the disaster recovery site). Measure your I/O throughput first: 125 MB/s of I/O is roughly 1 Gbit/s of network load.
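
    The conversion is simple enough to keep as a snippet. This sketch assumes you have already measured the write throughput of the volume; the value 125 is just the example from the text, and protocol overhead is ignored.

```powershell
# Converts measured write throughput (MB/s) into the network bandwidth it needs (Gbit/s).
$ioMegabytesPerSec = 125                      # example value; substitute your own measurement
$megabitsPerSec    = $ioMegabytesPerSec * 8   # 8 bits per byte
$gigabitsPerSec    = $megabitsPerSec / 1000

'{0} MB/s of writes needs about {1} Gbit/s of network bandwidth' -f $ioMegabytesPerSec, $gigabitsPerSec
```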

    Performance and size of the log volume

    Use flash storage (SSD, NVMe, etc.) for the log volume. Larger logs let you recover faster after a major failure and fail over faster, but the price is disk space.

    The Test-SRTopology cmdlet checks the requirements and gives recommendations on network bandwidth, log size, the number of I/O operations per second, and so on. It runs for a specified time and produces a detailed HTML report with recommendations.
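
    A possible invocation is sketched below. The parameter names are those of the cmdlet as shipped in Windows Server 2016 and may differ in preview builds; the server names and volume letters are the ones used in the demonstration later in this article, while the 30-minute duration and the C:\Temp report folder are arbitrary choices of mine.

```powershell
# Assumes servers SR1 and SR2 with data volume Q: and log volume T: on each.
# Duration and report folder are example values.
$topology = @{
    SourceComputerName       = 'SR1'
    SourceVolumeName         = 'Q:'
    SourceLogVolumeName      = 'T:'
    DestinationComputerName  = 'SR2'
    DestinationVolumeName    = 'Q:'
    DestinationLogVolumeName = 'T:'
    DurationInMinutes        = 30
    ResultPath               = 'C:\Temp'
}
Test-SRTopology @topology
```

    For meaningful numbers, run the test while the source volume is under a typical production load.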

    Note that the target volume is always offline. Scenarios with a read-write or read-only target volume are not supported. Replication is one-to-one only. You can always combine it with other replication features (for example, Hyper-V Replica for A to B and SR for A to C). Resizing a volume interrupts replication.

    Consider a demonstration with two identical virtual machines named SR1 and SR2, both members of the domain. To begin, enable the firewall rules on each machine with the following command:

    Enable-NetFirewallRule -CimSession SR1,SR2 -DisplayGroup "Remote Desktop","File and Printer Sharing"

    The result of the command is shown below. You can also do this from the management console.



    Check server availability:

    ping SR2.contoso.com -4 -f -l 1472 -n 300

    In the next step, attach two disks to each server and select the GPT partition style during initialization. Then format them with NTFS and assign drive letters. For the demonstration I used dynamic disks and limited the log disk to 15 GB.
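
    The same disk preparation can be scripted with the standard Storage module cmdlets. This is a minimal sketch: the disk number, the drive letter Q, and the volume label are placeholders, so check the output of Get-Disk before running it, and repeat it for the log disk and on the second server.

```powershell
# Placeholders: disk number 1, drive letter Q, label 'Data'. Verify with Get-Disk first.
Get-Disk | Format-Table Number, FriendlyName, PartitionStyle, Size

Set-Disk -Number 1 -IsOffline $false              # bring the new disk online
Initialize-Disk -Number 1 -PartitionStyle GPT     # GPT is required for Storage Replica
New-Partition -DiskNumber 1 -UseMaximumSize -DriveLetter Q |
    Format-Volume -FileSystem NTFS -NewFileSystemLabel 'Data' -Confirm:$false
```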



    Enable the feature using the PowerShell command and reboot the hosts.
    
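    # In this preview build the feature is named WVR; in the Windows Server 2016 release it is Storage-Replica
    $Servers = 'SR1', 'SR2'   # list of servers
    $Servers | ForEach-Object { Install-WindowsFeature -ComputerName $_ -Name WVR -IncludeManagementTools -Restart }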
    

    Or using the graphical interface.



    Now enable replication using PowerShell; the wizard is available only in Failover Cluster Manager.
     
    New-SRPartnership -SourceComputerName SR1 -SourceRGName rg01 -SourceVolumeName Q: -SourceLogVolumeName T: -DestinationComputerName SR2 -DestinationRGName rg02 -DestinationVolumeName Q: -DestinationLogVolumeName T: -LogSizeInBytes 8gb
    

    This command enables replication between servers SR1 and SR2. It specifies Q: as the replicated data volume on each server, T: as the log volume, and sets the log size to 8 GB.
    The result of the command is shown below.
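
    The state of the partnership can also be queried afterwards. The sketch below uses the Get-SRGroup and Get-SRPartnership cmdlets as they exist in Windows Server 2016; the property names on the Replicas objects are taken from that release and may differ in preview builds.

```powershell
# Show the replication groups and the partnership as seen from the source server.
Get-SRGroup -ComputerName SR1
Get-SRPartnership -ComputerName SR1

# On the destination, check how much data is still waiting to be copied.
(Get-SRGroup -ComputerName SR2).Replicas |
    Select-Object DataVolume, ReplicationMode, ReplicationStatus, NumOfBytesRemaining
```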



    Note that after replication is enabled, an additional partition appears on the disk, along with a new event log that contains information about replication.
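
    The replication events can be read from PowerShell as well. The log name below is the one used by Windows Server 2016 and is an assumption for preview builds, so the first command lists whatever Storage Replica logs actually exist on the server.

```powershell
# Discover the Storage Replica event logs present on the server.
Get-WinEvent -ComputerName SR1 -ListLog '*StorageReplica*'

# Read the most recent events (log name as in Windows Server 2016).
Get-WinEvent -ComputerName SR1 -LogName 'Microsoft-Windows-StorageReplica/Admin' -MaxEvents 20 |
    Format-Table TimeCreated, Id, LevelDisplayName, Message -Wrap
```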







    This graph shows the initial synchronization that takes place after replication is enabled. Note the set of performance counters associated with replication.
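
    The available counters can be listed from PowerShell rather than browsed in Performance Monitor. The wildcard is deliberate: I am assuming the counter set names start with "Storage Replica", which is the case in Windows Server 2016, but the exact names should be taken from the output.

```powershell
# List the Storage Replica counter sets and the counter paths they contain.
Get-Counter -ListSet 'Storage Replica*' | Select-Object CounterSetName, Description
(Get-Counter -ListSet 'Storage Replica*').Paths
```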



    As a test, copy some data to the replicated volume: network traffic appears immediately.



    Although the log volume is partially filled, no data is visible on it, even when you list it using the dir command with additional switches.



    As mentioned earlier, the data disk is not accessible on the second server. It appears as RAW and becomes available after replication is removed or its direction is switched.
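
    Switching the direction is done with Set-SRPartnership. The sketch below uses the parameter names from Windows Server 2016 and the group names rg01 and rg02 from the New-SRPartnership command above; after it runs, SR2 becomes the source and its Q: volume comes online, while Q: on SR1 goes offline.

```powershell
# Reverse the replication direction: SR2 becomes the source, SR1 the destination.
Set-SRPartnership -NewSourceComputerName SR2 -SourceRGName rg02 `
    -DestinationComputerName SR1 -DestinationRGName rg01
```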



    If you need to dismantle replication, remember the additional partition on the disks of both servers and delete it.
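
    Before cleaning up partitions by hand, the partnership and the replication groups themselves would normally be removed. This is a sketch using the cmdlets from Windows Server 2016; I am assuming they behave the same way in the preview build, where the manual partition cleanup described here may still be needed afterwards.

```powershell
# Run on the source server: remove the partnership, then the local replication group.
Get-SRPartnership | Remove-SRPartnership
Get-SRGroup | Remove-SRGroup

# Remove the replication group on the other server as well.
Invoke-Command -ComputerName SR2 -ScriptBlock { Get-SRGroup | Remove-SRGroup }
```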

    Run DISKPART, select our disk (x, for example)
    
    DISKPART
    LIST DISK
    SELECT DISK X
    attribute disk clear readonly
    

    Find the partition (Y, of type "Unknown", 512 KB in size)
    
    LIST PARTITION
    SELECT PARTITION Y
    

    Verify the partition (GUID 558d43c5-a1ac-43c0-aac8-d1472b2923d1)
    DETAIL PARTITION

    Delete the partition
    DELETE PART OVERRIDE


    This concludes our brief overview of this technology, which appeared in Windows Server vNext.

    Resources

    Storage Replica in Windows Server Technical Preview