BranchCache on Windows 7

    Almost a year has passed since the release of the final versions of Windows 7 and Windows Server 2008 R2, which is a good reason to recall these operating systems once again. I would like to draw attention to the two features of the new Windows that I find most interesting: BranchCache and DirectAccess. This article is about BranchCache.

    What is BranchCache

    BranchCache is a caching technology built into Windows 7 and Windows Server 2008 R2 that is designed to reduce the network traffic sent over WAN links. Accordingly, BranchCache is aimed mainly at organizations with branches and remote offices that are connected to each other and to the central office by relatively slow links.
    BranchCache supports caching of HTTP and SMB traffic. Client computers must run Windows 7 (Ultimate or Enterprise edition; BranchCache does not work in other editions), and servers must run Windows Server 2008 R2. In other words, BranchCache works only with the combination of Windows 7 and Windows Server 2008 R2. If you still want to read on after that, let's discuss the main features of this technology.

    BranchCache Features

    What is the key difference between BranchCache and other caching technologies such as Offline Files or the ISA Server cache? Data is returned to the client application from the cache only if the original data has not changed. I will explain with an example. Suppose a user in a branch tries to open a document, say a leave request template, from a file server in the central office. The BranchCache module on the user's computer requests information about the file from the server and checks whether the requested file is in the local cache. If it is not, the file is, of course, downloaded from the central office server. If the file is already in the local cache, the central office server is still contacted to check whether the original file has changed. If it has, the file is downloaded from the server again. Only if the original file on the server and the file in the cache are absolutely identical is the data from the cache used. The real request processing algorithm is more complicated, but I think this is enough to convey the essence.
    Two important BranchCache features follow from this example.
    1. Data in BranchCache is always up to date. More precisely, if an application retrieves data from the cache, BranchCache guarantees that the data is current.
    2. No access to the server means no access to the cache. In other words, if the BranchCache module cannot verify that the original and cached files are identical (the server is turned off, the link is down, etc.), the data from the cache is not used.
    It is also worth adding that BranchCache is transparent to applications and users. Unlike the Offline Files mechanism, for example, the Windows interface does not indicate in any way that the document the user has just opened was taken from the cache.
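
    The two rules above can be illustrated with a minimal Python sketch. It is not the real BranchCache algorithm: the OriginServer class, the open_file function and the use of a single whole-file SHA-256 hash are assumptions made purely for illustration.

```python
import hashlib


class ServerUnavailable(Exception):
    """Raised when the central-office server cannot be reached."""


class OriginServer:
    """Hypothetical stand-in for the file server in the central office."""

    def __init__(self, files):
        self.files = dict(files)
        self.online = True
        self.downloads = 0  # how many times a file body crossed the WAN

    def current_hash(self, path):
        if not self.online:
            raise ServerUnavailable(path)
        return hashlib.sha256(self.files[path]).hexdigest()

    def download(self, path):
        if not self.online:
            raise ServerUnavailable(path)
        self.downloads += 1
        return self.files[path]


def open_file(path, server, cache):
    """Return file data, using the cache only after the server confirms it."""
    # Rule 2: if the server is unreachable this raises, and the cache is not used.
    server_hash = server.current_hash(path)
    cached = cache.get(path)
    if cached is not None and cached[0] == server_hash:
        return cached[1]  # Rule 1: the cached copy is identical to the original
    data = server.download(path)  # missing or stale: fetch over the WAN
    cache[path] = (server_hash, data)
    return data


server = OriginServer({"leave_request.docx": b"version 1"})
cache = {}
open_file("leave_request.docx", server, cache)  # first access: downloaded
open_file("leave_request.docx", server, cache)  # unchanged: served from cache
downloads_before_edit = server.downloads

server.files["leave_request.docx"] = b"version 2"
latest = open_file("leave_request.docx", server, cache)  # changed: downloaded again
```

    Setting server.online to False and calling open_file again raises ServerUnavailable even though the file is still in the cache, which mirrors the second rule.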

    Metadata

    Now for the fundamental question: how does BranchCache compare the original and cached information? It uses so-called metadata. The requested file (a document on a file server, an HTML page on a web server, etc.) is divided into segments of 32 MB. A file smaller than 32 MB by definition consists of one segment. Segments, in turn, are divided into blocks of 64 KB. A file smaller than 64 KB is always downloaded directly from the server, and BranchCache is not used. For each block and each segment, a hash is calculated using the SHA-256 algorithm. All these calculations take place on the BranchCache-enabled server where the requested file is located. Together, the hash values of the segments and blocks form a hash list, which serves as the basis for the file metadata. This metadata is transferred to the client computer, where it is compared with the hash list of the cached file. A hash is approximately 2000 times smaller than the data it describes, so the load on the WAN link during the transfer of metadata is minimal.
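
    As a rough illustration of the segment and block structure, here is a small Python sketch that builds a hash list. The build_hash_list function and the SegmentHashes class are hypothetical names, and computing the segment hash as SHA-256 over the concatenated block hashes is an assumption of this sketch; the article does not specify how the segment hash is derived. The sketch also ignores the 64 KB minimum file size.

```python
import hashlib
from dataclasses import dataclass
from typing import List

SEGMENT_SIZE = 32 * 1024 * 1024  # 32 MB, as described in the article
BLOCK_SIZE = 64 * 1024           # 64 KB, as described in the article


@dataclass
class SegmentHashes:
    segment_hash: bytes
    block_hashes: List[bytes]


def build_hash_list(data: bytes,
                    segment_size: int = SEGMENT_SIZE,
                    block_size: int = BLOCK_SIZE) -> List[SegmentHashes]:
    """Split data into segments and blocks and hash each of them."""
    result = []
    for seg_start in range(0, len(data), segment_size):
        segment = data[seg_start:seg_start + segment_size]
        block_hashes = [
            hashlib.sha256(segment[i:i + block_size]).digest()
            for i in range(0, len(segment), block_size)
        ]
        # Assumption: the segment hash is derived from its block hashes.
        segment_hash = hashlib.sha256(b"".join(block_hashes)).digest()
        result.append(SegmentHashes(segment_hash, block_hashes))
    return result


# Small sizes keep the demo fast: 200 KB of data, 128 KB segments, 64 KB blocks.
sample = bytes(200 * 1024)
hash_list = build_hash_list(sample, segment_size=128 * 1024, block_size=64 * 1024)

# One 32-byte SHA-256 hash describes one 64 KB block.
ratio = BLOCK_SIZE // hashlib.sha256().digest_size
```

    The last line shows where the "approximately 2000 times" figure comes from: 65536 bytes of data per 32-byte hash gives a ratio of 2048.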

    Dividing files into segments and blocks optimizes the search for and downloading of data. A segment hash is the unit of search. As already mentioned, when accessing a file on a remote server (at the central office or in another branch), the first thing the BranchCache module on the client computer does is request the file metadata from the server. Based on the resulting hash list, BranchCache checks whether the file segments are in the local cache. If so, the file is opened from the local cache. If not, the client computer sends a search query to the network: "Who has a segment with this hash?" Depending on the BranchCache operating mode (see below), this request is sent either to a specially configured server running Windows Server 2008 R2 in the same branch office or to Windows 7 computers on the same IP subnet. If the response is positive, the desired segment is downloaded from the "neighbor" block by block. In this sense, a block is the unit of download. Thus, segments reduce the number of search queries, while blocks allow the requested data to be delivered to the application more quickly.
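
    The difference between the unit of search and the unit of download can be sketched in Python as follows. The Neighbour class and the fetch_segment function are hypothetical; the sketch only counts queries and downloads and does not model the real network protocol.

```python
from typing import Dict, List, Optional


class Neighbour:
    """Hypothetical peer holding cached segments as lists of blocks."""

    def __init__(self, segments: Dict[str, List[bytes]]):
        self.segments = segments

    def has_segment(self, segment_hash: str) -> bool:
        return segment_hash in self.segments

    def get_block(self, segment_hash: str, index: int) -> bytes:
        return self.segments[segment_hash][index]


def fetch_segment(segment_hash: str, block_count: int,
                  neighbours: List[Neighbour], stats: Dict[str, int]) -> Optional[bytes]:
    """Search once per segment, then download the segment block by block."""
    stats["search_queries"] += 1  # one "who has this segment?" query
    holder = next((n for n in neighbours if n.has_segment(segment_hash)), None)
    if holder is None:
        return None  # nobody answered; the caller falls back to the server
    blocks = []
    for index in range(block_count):
        blocks.append(holder.get_block(segment_hash, index))
        stats["block_downloads"] += 1  # one transfer per block
    return b"".join(blocks)


neighbours = [Neighbour({}), Neighbour({"seg-1": [b"aa", b"bb", b"cc", b"dd"]})]
stats = {"search_queries": 0, "block_downloads": 0}
data = fetch_segment("seg-1", 4, neighbours, stats)
queries_after_hit = stats["search_queries"]
missing = fetch_segment("seg-2", 1, neighbours, stats)
```

    In this toy run, a segment of four blocks costs one search query and four block downloads, while a segment nobody holds costs one query and no downloads.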

    BranchCache Modes of Operation

    To use BranchCache, you must configure it on both the client and the server. BranchCache has two modes of operation: distributed cache and dedicated cache (called Hosted Cache in Microsoft's terminology).

    Distributed Cache

    In distributed mode, data is cached on the Windows 7 computer that was the first in the branch (more precisely, on the IP subnet) to download that data from a remote server. After that, the data becomes available to other computers in the branch. BranchCache works as follows:
    1. A user at a computer in the branch tries to open a document from a remote server. In this case, the computer establishes a connection with the server and requests the required file as if there were no BranchCache at all.
    2. The server authorizes the client and verifies that the client has the appropriate file permissions. If there are no rights, access to the file is denied.
    3. If BranchCache is configured on the server and client, the server returns metadata, including a hash list, instead of a file.
    4. If the file segments are not in the local cache and the link to the server is slow (latency exceeds the configured threshold, 80 ms by default), the client sends requests for the missing segments using the Web Services Dynamic Discovery (WS-Discovery) protocol. These are multicast requests that are distributed only within the subnet, unless the routers are configured otherwise.
    5. If someone has the requested segments, they are returned in blocks to the client computer. The computer checks the integrity of the received blocks, stores them in its cache, and transfers data to the application. The user sees the opened document.
    6. If none of the “neighbors” have the necessary data, they are downloaded from the server via the WAN channel and stored in the local cache.
    Distributed mode is recommended for small branches where all machines are located on the same subnet. BranchCache on client machines can be easily configured using group policies, and no Windows Server 2008 R2 server is required in the branch. However, remember that when a computer is turned off, its cache becomes inaccessible to the other clients in the branch.
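
    The decision flow of distributed mode can be summarized in a short Python sketch for a single block. The get_block function and the dictionaries standing in for caches are hypothetical, the multicast WS-Discovery exchange is replaced by a simple loop over neighbours, and caching every server download is an assumption of the sketch.

```python
import hashlib

LATENCY_THRESHOLD_MS = 80  # default threshold mentioned in the article


def get_block(block_hash, local_cache, neighbour_caches, download_from_server, latency_ms):
    """Return (data, source) for one block in distributed cache mode."""
    if block_hash in local_cache:
        return local_cache[block_hash], "local cache"
    if latency_ms > LATENCY_THRESHOLD_MS:
        # Stand-in for the multicast search on the local subnet.
        for neighbour_cache in neighbour_caches:
            data = neighbour_cache.get(block_hash)
            # Integrity check: the block must match the hash sent by the server.
            if data is not None and hashlib.sha256(data).digest() == block_hash:
                local_cache[block_hash] = data
                return data, "neighbour"
    # Nobody nearby has a valid copy (or the link is fast): use the WAN.
    data = download_from_server()
    local_cache[block_hash] = data  # assumption: every download is cached
    return data, "server"


block = b"quarterly report"
block_hash = hashlib.sha256(block).digest()
pc1, pc2, pc3 = {}, {}, {}

_, source1 = get_block(block_hash, pc1, [pc2, pc3], lambda: block, latency_ms=120)
_, source2 = get_block(block_hash, pc2, [pc1, pc3], lambda: block, latency_ms=120)
_, source3 = get_block(block_hash, pc2, [pc1, pc3], lambda: block, latency_ms=120)

# A neighbour returning tampered data is ignored and the server is used instead.
pc4, pc5 = {block_hash: b"tampered"}, {}
_, source4 = get_block(block_hash, pc5, [pc4], lambda: block, latency_ms=120)
```

    The first computer pays for the WAN download, the second is served by its neighbour, and a repeated request on the same computer is answered from its own cache.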

    Dedicated Cache

    In this mode, the cache is concentrated on a suitably configured branch server running Windows Server 2008 R2. Every Windows 7 computer sends its search queries to the dedicated cache server, and only to it. The sequence is as follows:
    1. A user at a computer in the branch tries to open a document from a remote server. In this case, the computer establishes a connection with the server and requests the required file as if there were no BranchCache at all.
    2. The server authorizes the client and verifies that the client has the appropriate file permissions. If there are no rights, access to the file is denied.
    3. If BranchCache is configured on the server and client, the server returns metadata, including a hash list, instead of a file.
    4. If the file segments are not in the local cache and the link to the server is slow (latency exceeds the configured threshold, 80 ms by default), the client contacts the local dedicated cache server directly. The IP address or FQDN of the dedicated cache server must be entered in the client settings manually or through group policies. As should be clear by now, the client requests one or more segments.
    5. If the data is in the dedicated cache, it is returned to the client computer in blocks. The computer checks the integrity of the received blocks, stores them in its cache, and passes the data to the application. The user sees the opened document.
    6. If the requested data is not in the dedicated cache, the client downloads it from the remote server and saves it in its local cache.
    7. After that, the client sends a notification packet to the server with the dedicated cache about the availability of new data for the dedicated cache.
    8. The server sends a request to the client to receive new data.
    9. The client copies the blocks to the dedicated cache server so that other machines in the branch can use this data.
    A dedicated cache provides a higher level of data availability because, unlike client computers, the server runs constantly and is not turned off. Usually, at least, although anything can happen in a small office. In addition, there are no restrictions on network topology: a request to the dedicated cache is a unicast request that is routed in the usual way. However, this mode requires a server running Windows Server 2008 R2 in the branch.
    To conclude the review of BranchCache operating modes, note that the two modes are mutually exclusive. A given Windows 7 client cannot work in both modes at the same time.
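
    The dedicated cache exchange, including the offer and upload steps 7 to 9, can be sketched in Python as follows. HostedCacheServer, get_segment and their methods are hypothetical names chosen for this illustration, and the latency check and the network protocol are left out.

```python
class HostedCacheServer:
    """Hypothetical model of the branch server holding the dedicated cache."""

    def __init__(self):
        self.segments = {}

    def lookup(self, segment_hash):
        return self.segments.get(segment_hash)

    def offer(self, segment_hash):
        """Steps 7 and 8: the client announces new data; the server asks for it if missing."""
        return segment_hash not in self.segments

    def upload(self, segment_hash, blocks):
        """Step 9: the client copies the blocks to the dedicated cache."""
        self.segments[segment_hash] = list(blocks)


def get_segment(segment_hash, local_cache, hosted, download_from_server):
    """Return (blocks, source) for one segment in dedicated cache mode."""
    if segment_hash in local_cache:
        return local_cache[segment_hash], "local cache"
    blocks = hosted.lookup(segment_hash)  # unicast request to the branch server
    if blocks is not None:
        local_cache[segment_hash] = blocks
        return blocks, "dedicated cache"
    blocks = download_from_server()  # step 6: fall back to the WAN
    local_cache[segment_hash] = blocks
    if hosted.offer(segment_hash):  # steps 7 and 8
        hosted.upload(segment_hash, blocks)  # step 9
    return blocks, "server"


hosted = HostedCacheServer()
remote_blocks = [b"block-0", b"block-1"]
client_a, client_b = {}, {}

_, source_a = get_segment("seg-1", client_a, hosted, lambda: remote_blocks)
_, source_b = get_segment("seg-1", client_b, hosted, lambda: remote_blocks)
```

    After the first client has populated the dedicated cache, the second client no longer depends on the first one being switched on.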

    HTTP Support

    BranchCache supports caching of HTTP and SMB traffic, and each protocol has its own caching specifics.
    Let's start with HTTP. Since BranchCache is integrated only in Windows Server 2008 R2 and Windows 7, it is probably already clear that BranchCache for HTTP is applicable only if the web server is IIS 7.5 from Windows Server 2008 R2.
    The second specific is related to the generation of hash lists for website files. A hash list for any website file (html, jpg, etc.) is generated after the first access to that file. As a result, the body of the file can be obtained from BranchCache only on the third access. Suppose a branch client accesses a web page for the first time. IIS sends the page to the client over HTTP or HTTPS and generates metadata for it. The client therefore receives the page it requested but does not receive a hash list, and so cannot put the page in its own cache or in the dedicated cache. The second time the client accesses the same page, IIS returns not the data but the metadata that is now available. However, since the data was not cached after the first request, the client has no choice but to download the entire page again. This time it can be cached, and the third request for the page can be served from BranchCache.
    Finally, because of the level at which BranchCache operates, caching does not interfere with SSL and vice versa. That is, BranchCache works equally well with HTTP and HTTPS. By the way, the same applies to IPsec for the same reason. In this video, I demonstrated the configuration and operation of BranchCache for HTTP.
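
    The "third access" behaviour of HTTP caching described above can be reproduced with a toy Python simulation. WebServer and BranchClient are hypothetical classes, and a single whole-page SHA-256 hash stands in for the real hash list.

```python
import hashlib


class WebServer:
    """Hypothetical web server that creates a hash list after the first request."""

    def __init__(self, pages):
        self.pages = pages
        self.hashes = {}

    def get(self, url):
        if url in self.hashes:
            return "metadata", self.hashes[url]  # hash list already exists
        body = self.pages[url]
        self.hashes[url] = hashlib.sha256(body).hexdigest()  # generated for later
        return "data", body  # first request: plain data, no hash list

    def download(self, url):
        return self.pages[url]


class BranchClient:
    def __init__(self):
        self.cache = {}  # content hash -> page body

    def request(self, url, server):
        kind, payload = server.get(url)
        if kind == "data":
            return payload, "server"  # cannot be cached: no hash list received
        if payload in self.cache:
            return self.cache[payload], "cache"
        body = server.download(url)  # metadata received, but nothing cached yet
        self.cache[payload] = body
        return body, "server"


server = WebServer({"/index.html": b"<html>hello</html>"})
client = BranchClient()
sources = [client.request("/index.html", server)[1] for _ in range(3)]
```

    In this run the first two requests are served by the server and only the third comes from the cache, which matches the sequence described in the text.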

    SMB Support

    The amount of data on file servers can be very large, and computing hashes on a heavily loaded server is a very expensive operation. For this reason, in the case of SMB the metadata is generated in advance. In response to the first request, the client therefore already receives a hash list, and the second access to the file can be served from the cache. After the initial generation, the metadata is updated automatically every time the file is modified. In addition, the administrator can update the hash list for a given file or folder using the hashgen command-line utility.
    On the client side, BranchCache for SMB also relies on the Offline Files service. If you disable this service, caching of SMB traffic stops working. This does not affect the caching of HTTP traffic.
    You can look at the settings and features of BranchCache for SMB.

    Applications and data

    From an architectural point of view, BranchCache is located below the SMB and HTTP drivers. The operation of this module is transparent to applications. In other words, caching will work when using any application that uses the Windows built-in SMB or HTTP stack.
    However, I would note that the effect of BranchCache depends largely on the nature of the data. Let me explain. Recall the example already considered, in which a user in a branch opens a document from a remote server. The client receives a hash list from the server and downloads the file body either from the remote server or from BranchCache (its own or a neighbor's). What happens if the user changes the contents of the document and closes it, saving the changes? The entire file is saved to the remote server! Once the file has changed, the hash list must be recalculated, and that is done on the server side, so the modified file cannot be saved directly to the cache. If the user immediately tries to open the file again, then according to the algorithm described above the client computer will receive updated metadata from the server and will have no choice but to download the entire file body from the server. The conclusion is simple: BranchCache has a tangible effect on relatively static data.
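
    To make the point about static data more tangible, here is a deliberately simplified Python model. It assumes, as described above, that any change to a file invalidates the cached copy and forces a full download on the next read; the wan_downloads function and the event names are hypothetical.

```python
def wan_downloads(events):
    """Count full-file WAN downloads for a sequence of 'read' and 'write' events."""
    version = 0            # current version of the file on the server
    cached_version = None  # version held in the branch cache, if any
    downloads = 0
    for event in events:
        if event == "write":
            version += 1  # new content means new hashes on the server
        elif event == "read":
            if cached_version != version:
                downloads += 1  # cache is empty or stale
                cached_version = version
        else:
            raise ValueError(f"unknown event: {event}")
    return downloads


static_data = ["read"] * 10            # a file that is only read
changing_data = ["read", "write"] * 5  # a file edited after every read

static_downloads = wan_downloads(static_data)
changing_downloads = wan_downloads(changing_data)
```

    With ten reads of an unchanged file, only one download crosses the WAN, whereas a file that is edited after every read is downloaded on every one of its five reads.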

    Security

    BranchCache security deserves a separate article, so if Habr readers are interested, I am ready to write more about it. In the meantime, I would like to mention only a few important points.
    First, BranchCache does not provide any special protective measures when transferring data from a remote server to a branch. If, for example, the file is downloaded via HTTP rather than HTTPS, then the body of the file is transmitted in clear text, and BranchCache, for its part, does not add any encryption for the data.
    Secondly, the cache itself, that is, the file on the hard disk inside which the cached blocks are stored, is not encrypted. If you need additional security measures, you can use the appropriate tools, for example, built-in Windows EFS or BitLocker.
    Finally, BranchCache security mechanisms take effect when information is exchanged with "neighboring" computers or with a dedicated cache server. All requests and responses in such an exchange are encrypted to prevent a malicious user from deliberately substituting incorrect data.

    Summing up, I would like to emphasize once again that BranchCache:
    - reduces the load on the WAN channels connecting the enterprise branches and reduces associated costs;
    - increases the response speed of applications in branches;
    - is a built-in feature of Windows 7 and Windows Server 2008 R2 and is managed by standard tools.
