Creating a first-level in-memory cache for StackExchange.Redis .NET clients

Original author: Jonathan Cardy
Jonathan Cardy wrote the open-source .NET library StackRedis.L1, which lets you create a first-level cache in front of Redis. In other words, if you use the StackExchange.Redis library in a .NET application, you can plug StackRedis.L1 in to speed things up by caching data locally in RAM, avoiding unnecessary calls to Redis when the data has not changed. The library is available on GitHub and NuGet.
This article talks about how and why it was created.



Background

For the past couple of years, I have been working on Repsor custodian, an application that runs on SharePoint. If you are familiar with SharePoint, you know that its app model requires applications to run on a separate server. This improves stability and simplifies the architecture, but it hurts performance: you have to go to the SharePoint server for data every time, and network latency is not in your favor.

For this reason, we decided to add Redis caching to the application using StackExchange.Redis as a .NET client. To fully cache all application data models, we use sets, ordered sets, and hash tables.
As expected, this greatly accelerated the application: pages were now returned in about 500 ms instead of the previous 2 s. But profiling showed that even of those 500 ms, a considerable share was spent sending data to and receiving it from Redis. Moreover, most of the data fetched on each page request had not changed between requests.

Caching the cache

Redis is not only an incredibly fast cache, but also a rich set of tools for managing cached data. It is mature and very widely supported. The StackExchange.Redis library is free and open source, and its community is growing rapidly. Stack Exchange, one of the busiest sites on the Internet, aggressively caches all of its data in Redis. But its trump card is that it also caches all available data in local memory in front of Redis, so it usually does not even have to go to Redis at all.

The following quote explains, to some extent, how the caching mechanism works in Stack Exchange:
http://meta.stackexchange.com/questions/69164/does-stack-exchange-use-caching-and-if-so-how

“Undoubtedly, the fastest amount of data you can send over the network is 0 bytes.”


When you cache data in front of another cache, you create multiple cache levels. If you cache data in memory before caching it in Redis, the in-memory cache is called the first-level, or L1, cache.

So if Redis is the slowest part of your application, you are in good shape, and you can still make it faster.

In-memory caching

In the simplest situation when using Redis, your code will look something like this:

//Try and retrieve from Redis
RedisValue redisValue = _cacheDatabase.StringGet(key);
if(redisValue.HasValue)
{
    return redisValue; //It's in Redis - return it
}
else
{
    string strValue = GetValueFromDataSource(); //Get the value from e.g. SharePoint or a database
    _cacheDatabase.StringSet(key, strValue); //Add to Redis
    return strValue;
}


And if you decide to apply in-memory caching (i.e., L1 cache), the code will get a little more complicated:

//Try and retrieve from memory
if (_memoryCache.ContainsKey(key))
{
    return _memoryCache[key];
}
else
{
    //It isn't in memory. Try and retrieve from Redis
    RedisValue redisValue = _cacheDatabase.StringGet(key);
    if (redisValue.HasValue)
    {
        //Add to memory cache
        _memoryCache.Add(key, redisValue);
        return redisValue;  //It's in redis - return it
    }
    else
    {
        string strValue = GetValueFromDataSource(); //Get the value from e.g. SharePoint or a database
        _cacheDatabase.StringSet(key, strValue); //Add to Redis
        _memoryCache.Add(key, strValue); //Add to memory
        return strValue;
    }
}


This is not hard to implement for strings, but it gets much more complicated when you try to do the same for the other Redis data models. In addition, the following problems arise:

• Redis lets us modify data in place with operations like StringAppend. When such an operation runs, the in-memory copy must be updated or invalidated.
• If you delete a key with KeyDelete, it must also be deleted from the in-memory cache.
• If another client changes or deletes a value, the copy in our client’s in-memory cache becomes stale.
• When a key expires, it must be removed from the in-memory cache.

The StackExchange.Redis methods for reading and updating data are defined on the IDatabase interface. It turns out that we can wrap IDatabase in our own implementation that solves all of the problems above. Here is how:

• StringAppend - append the data to the in-memory string, then forward the same operation to Redis. For more complex data operations, the in-memory key has to be invalidated instead.
• KeyDelete, KeyExpire, etc. - delete the in-memory data as well.
• Operations performed by another client - use Redis keyspace notifications to detect the change and invalidate the affected data.

The beauty of this approach is that you keep using the same interface as before, so introducing the L1 cache requires no changes to the calling code.
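Since the wrapper exposes the same interface as the database it wraps, calling code cannot tell the two apart. The pattern itself can be sketched briefly; the following is an illustrative Python sketch with invented names (RemoteCache standing in for Redis, L1Cache for the wrapper), not the library's actual C# code:

```python
class RemoteCache:
    """Stands in for Redis: the authoritative, but slower, store."""
    def __init__(self):
        self.data = {}
        self.calls = 0  # count round-trips to show what L1 saves

    def get(self, key):
        self.calls += 1
        return self.data.get(key)

    def set(self, key, value):
        self.calls += 1
        self.data[key] = value

    def delete(self, key):
        self.calls += 1
        self.data.pop(key, None)


class L1Cache:
    """Decorator exposing the same get/set/delete interface,
    but answering repeat reads from local memory."""
    def __init__(self, inner):
        self.inner = inner
        self.mem = {}

    def get(self, key):
        if key in self.mem:
            return self.mem[key]          # served locally, no round-trip
        value = self.inner.get(key)
        if value is not None:
            self.mem[key] = value         # remember for next time
        return value

    def set(self, key, value):
        self.mem[key] = value
        self.inner.set(key, value)        # forward to the real store

    def delete(self, key):
        self.mem.pop(key, None)           # keep L1 consistent
        self.inner.delete(key)


redis = RemoteCache()
db = L1Cache(redis)
db.set("greeting", "hello")
db.get("greeting")   # answered from memory
db.get("greeting")   # answered from memory
print(redis.calls)   # 1 - only the initial set reached the remote store
```

The caller only ever sees the get/set/delete interface, which is exactly why swapping L1Cache in requires no changes to calling code.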

Architecture



I chose the following solution with these basic elements:

Static MemoryCache register

You can create a new RedisL1Database by passing it an existing Redis IDatabase instance, and it will continue to use any in-memory cache it previously created for that database.

Notification Level

NotificationDatabase - publishes custom keyspace events needed to keep the in-memory cache up to date. The built-in Redis keyspace notifications are not enough on their own, because they do not carry the information needed to invalidate the right part of the cache. For example, if you delete a hash entry, the HDEL notification tells you which hash it was deleted from, but not which field was removed. The custom events include the hash field as well.
NotificationListener - subscribes to the custom keyspace events and accesses the static cache to invalidate the affected keys. It also subscribes to the built-in Redis expire keyspace event, which lets expired Redis keys be promptly removed from memory.
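One practical note: Redis keyspace notifications are disabled by default, so the listener only works if the server is configured to publish them. A configuration along these lines is assumed (flag letters per the Redis notify-keyspace-events documentation; pick the event classes you actually need):

```
# In redis.conf, or at runtime via redis-cli:
# K = keyspace channel, g = generic commands (DEL, EXPIRE, ...), x = expired events
CONFIG SET notify-keyspace-events Kgx
```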

Next, we'll look at ways to cache various data models in Redis.

String

Working with a string is relatively simple. The IDatabase StringSet method looks like this:

public bool StringSet(RedisKey key, RedisValue value, TimeSpan? expiry = default(TimeSpan?), When when = When.Always, CommandFlags flags = CommandFlags.None)

• Key and Value - the names speak for themselves.
• Expiry is an optional time span, so the in-memory cache must support expiration as well.
• When - lets you make the set conditional: only set the value if the key already exists, or only if it does not.
• Flags - lets you specify Redis cluster details (not relevant here).

To store data, we use System.Runtime.Caching.MemoryCache, which allows automatic key expiration. The StringSet method looks like this:

public bool StringSet(RedisKey key, RedisValue value, TimeSpan? expiry = default(TimeSpan?), When when = When.Always, CommandFlags flags = CommandFlags.None)
{
    if (when == When.Exists && !_memCache.Contains(key))
    {
        //We can't tell locally whether the key exists, so skip the memory
        //cache and let Redis apply the condition.
        return _redisDb.StringSet(key, value, expiry, when, flags);
    }
    if (when == When.NotExists && _memCache.Contains(key))
    {
        //The key already exists, so the conditional set must fail.
        return false;
    }
    //Remove it from the memory cache before re-adding it (the expiry may have changed)
    _memCache.Remove(key);
    CacheItemPolicy policy = new CacheItemPolicy()
    {
        AbsoluteExpiration = expiry.HasValue ? DateTime.UtcNow.Add(expiry.Value)
                                             : ObjectCache.InfiniteAbsoluteExpiration
    };
    _memCache.Add(key, value, policy);
    //Forward the request on to set the string in Redis
    return _redisDb.StringSet(key, value, expiry, when, flags);
}


StringGet can then read the in-memory cache before trying to access Redis:

public RedisValue StringGet(RedisKey key, CommandFlags flags = CommandFlags.None)
{
    object cachedValue = _memCache.Get(key);
    if (cachedValue != null)
    {
        return (RedisValue)cachedValue;
    }
    else
    {
        RedisValue redisResult = _redisDb.StringGet(key, flags);
        //Cache this key for next time
        _memCache.Add(key, redisResult, new CacheItemPolicy());
        return redisResult;
    }
}


Redis supports many string operations. In each case, supporting one means either updating the in-memory value or invalidating it. In general, invalidation is the safer choice, because otherwise you risk needlessly re-implementing much of Redis.

• StringAppend is a very simple operation: the data is appended to the in-memory string, if one exists and has not been invalidated, and the operation is forwarded to Redis.
• StringBitCount, StringBitOperation, StringBitPosition - the operation is performed in Redis; nothing needs to happen in memory.
• StringIncrement, StringDecrement, StringSetBit, StringSetRange - the in-memory string is invalidated before the operation is forwarded to Redis.
• StringLength - returns the length of the string if it is in the in-memory cache. If not, it gets it from Redis.
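To make the update-vs-invalidate distinction above concrete, here is a small Python sketch with invented names (the library itself does this in C# behind IDatabase):

```python
class L1Strings:
    """In-memory copies of Redis string values."""
    def __init__(self):
        self.mem = {}

    def append(self, key, extra):
        # Simple enough to mirror locally: update the in-memory copy;
        # the real operation would then also be forwarded to Redis.
        if key in self.mem:
            self.mem[key] += extra

    def increment(self, key, by=1):
        # Rather than re-implement Redis INCR semantics (type checks,
        # 64-bit overflow, ...), just invalidate the local copy and let
        # the next read repopulate it from Redis.
        self.mem.pop(key, None)


s = L1Strings()
s.mem["greeting"] = "hello"
s.append("greeting", " world")
print(s.mem["greeting"])   # hello world
s.mem["counter"] = "41"
s.increment("counter")
print("counter" in s.mem)  # False
```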

Sets

Sets are a little harder to handle. The SetAdd method works as follows:

1. Check the MemoryCache for a HashSet under this key; if there is none, create it.
2. Add each Redis value to that set.

Adding and removing values from sets is quite simple. The SetMove method is simply a SetRemove followed by a SetAdd.
Most other set queries can also be cached. For example:

• SetMembers - returns all elements of the set, and the result is cached in memory.
• SetContains, SetLength - check the in-memory set before going to Redis.
• SetPop - pops an element from the set in Redis, then removes that element from the in-memory set, if present.
• SetRandomMember - gets a random element of the set from Redis, caches it in memory, and returns it.
• SetCombine, SetCombineAndStore - nothing needs to happen in memory.
• SetMove - removes the element from one in-memory set, adds it to the other, and forwards the operation to Redis.
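As an illustration of the SetMove-as-SetRemove-plus-SetAdd composition above, here is a minimal Python sketch of the in-memory side (names invented for illustration; in the real library the same pair of calls is also forwarded to Redis):

```python
class L1Sets:
    """In-memory mirror of Redis sets, keyed by set name."""
    def __init__(self):
        self.sets = {}

    def set_add(self, key, value):
        self.sets.setdefault(key, set()).add(value)

    def set_remove(self, key, value):
        self.sets.get(key, set()).discard(value)

    def set_contains(self, key, value):
        return value in self.sets.get(key, set())

    def set_move(self, source, dest, value):
        # SetMove is just SetRemove on the source followed by SetAdd on
        # the destination.
        self.set_remove(source, value)
        self.set_add(dest, value)


s = L1Sets()
s.set_add("inbox", "msg1")
s.set_move("inbox", "archive", "msg1")
print(s.set_contains("archive", "msg1"))  # True
print(s.set_contains("inbox", "msg1"))    # False
```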

Hash Tables

Hash tables are also relatively simple: the in-memory implementation is just a Dictionary, which makes them behave much like strings in practice.

The basic operation is:

1. If the hash table is not yet in memory, create a Dictionary for it and store it.
2. Perform the operation on the in-memory dictionary, if possible.
3. Forward the request to Redis if necessary, and cache the result.

The following operations are available for hash tables:

• HashSet - stores the values in the dictionary held under the key, then forwards the request to Redis.
• HashValues, HashKeys, HashLength - nothing happens in memory.
• HashDecrement, HashIncrement, HashDelete - remove the value from the dictionary and forward the request to Redis.
• HashExists - returns true if the value is in the in-memory cache; otherwise forwards the request to Redis.
• HashGet - reads the data from the in-memory cache; otherwise forwards the request to Redis.
• HashScan - retrieves results from Redis and adds them to the in-memory cache.
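The three-step basic operation above can be sketched in a few lines of Python (invented names; the fallback fetch function stands in for a real HGET round-trip):

```python
class L1Hashes:
    """In-memory mirror of Redis hashes: one dict per hash key."""
    def __init__(self, redis_hget):
        self.hashes = {}
        self.redis_hget = redis_hget  # fallback fetch, e.g. a real HGET call

    def hash_set(self, key, field, value):
        # Step 1: create the dictionary for this hash if it is missing.
        self.hashes.setdefault(key, {})[field] = value
        # (A real implementation would also forward HSET to Redis here.)

    def hash_get(self, key, field):
        # Step 2: serve from memory when possible.
        cached = self.hashes.get(key, {})
        if field in cached:
            return cached[field]
        # Step 3: otherwise ask Redis and cache the result.
        value = self.redis_hget(key, field)
        if value is not None:
            self.hashes.setdefault(key, {})[field] = value
        return value


def fake_redis_hget(key, field):
    # Stands in for a real HGET round-trip to Redis.
    fake_redis_hget.calls += 1
    return {"user:1": {"name": "alice"}}.get(key, {}).get(field)
fake_redis_hget.calls = 0

h = L1Hashes(fake_redis_hget)
print(h.hash_get("user:1", "name"))  # alice - fetched once, then cached
print(h.hash_get("user:1", "name"))  # alice - now served from memory
print(fake_redis_hget.calls)         # 1
```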

Ordered sets

Ordered sets are without doubt the most complex data model to cache in memory. Here the caching process relies on so-called “disjoint ordered sets”: every time the local cache sees a small fragment of a Redis ordered set, that fragment is added to a “disjoint” ordered set. If a subsection of the ordered set is requested later, the disjoint ordered set is checked first. If it entirely contains the requested section, it can be returned with full confidence that nothing is missing.

Ordered sets in memory are sorted not by value but by score. In theory the disjoint ordered set implementation could be extended to support sorting by value as well, but that has not been implemented yet.
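The disjoint-range idea can be illustrated with a simplified Python sketch (invented names, not the library's actual implementation, which handles many more cases): each cached fragment records the score interval it is known to cover completely, overlapping fragments are merged, and a range query is answered locally only when one cached interval fully covers it.

```python
class DisjointScoreRanges:
    """Caches continuous score ranges of a sorted set. A range query is
    answered locally only when one cached range fully covers it, so we
    can be sure no element is missing."""
    def __init__(self):
        self.ranges = []  # list of (lo, hi, {member: score})

    def cache_range(self, lo, hi, entries):
        # entries: member -> score, known to be the complete contents
        # of [lo, hi] at the time of the Redis query.
        merged = dict(entries)
        new_lo, new_hi, keep = lo, hi, []
        for rlo, rhi, rentries in self.ranges:
            if rhi < new_lo or rlo > new_hi:
                keep.append((rlo, rhi, rentries))   # no overlap: keep as-is
            else:                                   # overlap: merge into one
                new_lo, new_hi = min(new_lo, rlo), max(new_hi, rhi)
                merged.update(rentries)
        keep.append((new_lo, new_hi, merged))
        self.ranges = keep

    def range_by_score(self, lo, hi):
        # Return members in [lo, hi] only if a cached range covers it fully.
        for rlo, rhi, rentries in self.ranges:
            if rlo <= lo and hi <= rhi:
                return sorted(m for m, s in rentries.items() if lo <= s <= hi)
        return None  # cache miss: the caller must go to Redis


cache = DisjointScoreRanges()
cache.cache_range(0, 10, {"a": 1, "b": 5})
cache.cache_range(10, 20, {"c": 15})   # touches the first range: merged
print(cache.range_by_score(3, 18))     # ['b', 'c'] (fully covered)
print(cache.range_by_score(18, 30))    # None (not fully covered)
```

Returning None on partial coverage is the crucial design choice: a partially covered range cannot be served from memory, because elements inside the gap may exist in Redis but not locally.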

The operations use the disjoint sets as follows:

• SortedSetAdd - the values are added to the in-memory set as discrete entries, since we do not know how they relate to each other by score.
• SortedSetRemove - the value is removed both from memory and from Redis.
• SortedSetRemoveRangeByRank - the whole in-memory set is invalidated.
• SortedSetCombineAndStore, SortedSetLength, SortedSetLengthByValue, SortedSetRangeByRank, SortedSetRangeByValue, SortedSetRank - the request goes straight to Redis.
• SortedSetRangeByRankWithScores, SortedSetScan - the data is fetched from Redis and then cached as discrete entries.
• SortedSetRangeByScoreWithScores - the most cacheable operation, since the results come back with their scores, in order. The cache is checked first, and if it can satisfy the request the result is returned from memory. Otherwise Redis is queried and the returned range is cached as one continuous block.
• SortedSetRangeByScore - the data is served from the cache if possible. Otherwise it is fetched from Redis and not cached, since the scores are not returned.
• SortedSetIncrement, SortedSetDecrement - the in-memory data is updated and the request is forwarded to Redis.
• SortedSetScore - the value is taken from memory if possible; otherwise the request goes to Redis.

The complexity of ordered sets comes from two sources: first, the inherent difficulty of building an in-memory representation of partial views of an ordered set (that is, the disjoint sets); and second, the sheer number of Redis operations that have to be supported. The complexity is offset somewhat by the fact that score-based queries can be cached very effectively. Either way, thorough unit testing of every component is essential.

Lists

Lists are not so easy to cache in memory. The reason is that operations on them usually work on the head or the tail of the list, and we have no reliable way to be sure that the head and tail of the in-memory list still match the list in Redis. This could probably be solved in part with keyspace notifications, but it has not been implemented yet.

Updates from other clients

So far we have assumed that our Redis database has a single client. In reality there can be many. In that case it is entirely possible for one client to hold data in its in-memory cache while another client updates it, leaving the first client's copy stale. The only way to solve this is to have the clients communicate, and fortunately Redis provides a publish/subscribe messaging mechanism. Two kinds of notifications are involved.

Keyspace notifications

These notifications are published by Redis automatically whenever a key changes.

Custom published notifications

These are published by the notification layer, implemented on the client as part of the caching mechanism. Their purpose is to supplement the keyspace notifications with the extra information needed to invalidate individual small parts of the cache.
Working with multiple clients is the main source of caching problems.

Risks and problems

Lost notifications

Redis does not guarantee delivery of keyspace notifications, so a notification may be lost and stale data will remain in the cache. This is the most significant risk but, fortunately, such situations are rare. Still, with two or more clients the danger has to be taken into account.

Solution: use a reliable inter-client messaging mechanism with guaranteed delivery. Unfortunately, no such mechanism exists yet, so see the alternative below.
Alternative: keep in-memory entries only for a short time, say an hour. Everything accessed within that hour still loads faster, and if a notification is lost, the stale entry is evicted soon anyway. You can also call Dispose on the database and instantiate it again to flush the in-memory cache entirely.
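The short-lifetime fallback can be sketched like this in Python (invented names; a real implementation would use the cache's own expiry support, as the library does with MemoryCache):

```python
import time

class ShortLivedCache:
    """Entries expire after max_age seconds, so an entry made stale by a
    lost notification is re-fetched soon anyway."""
    def __init__(self, max_age_seconds, clock=time.monotonic):
        self.max_age = max_age_seconds
        self.clock = clock
        self.entries = {}  # key -> (value, stored_at)

    def set(self, key, value):
        self.entries[key] = (value, self.clock())

    def get(self, key):
        item = self.entries.get(key)
        if item is None:
            return None
        value, stored_at = item
        if self.clock() - stored_at > self.max_age:
            del self.entries[key]  # expired: force a fresh fetch
            return None
        return value


# A fake clock makes the expiry behaviour easy to see (and test).
now = [0.0]
cache = ShortLivedCache(max_age_seconds=3600, clock=lambda: now[0])
cache.set("k", "v")
print(cache.get("k"))  # v
now[0] += 3601         # an hour passes
print(cache.get("k"))  # None
```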

If one client uses it, all clients must use it
If one client uses this caching level (L1), all the other clients must use it too, so that they can notify each other of data changes.

No limit on in-memory data
At the moment, everything that can be cached is cached, so memory usage can grow quickly. This was not a problem in my case, but keep it in mind.

Conclusion
They say there are only 10 hard problems in programming: cache invalidation, naming things, and binary... I am sure this project can be useful in many situations. The library's source code is available on GitHub. Follow the project and contribute!
