Making a Simple Cache-Based Circuit Breaker in Spring

This article is for those who already use a cache effectively in their application and want to add stability not only to the application but to the whole environment, simply by adding one class to the project.

If you recognize yourself, read on.

What is a Circuit Breaker?

The topic is as old as the world, so I will not bore you by increasing entropy and repeating what has been said many times. In my view, Martin Fowler explained it best, but I will try to fit the definition into one sentence:
functionality that prevents knowingly doomed requests to an unavailable service, giving it a chance to get back on its feet and resume normal work.

Ideally, while preventing doomed requests, a Circuit Breaker (hereinafter CB) should not break your application. Instead, it is good practice to return data that is not the most current but still relevant (not yet stale) or, if that is not possible, some default value.

Goals

The main goals are:

Let the data source recover by stopping requests to it for some time.
While requests to the target service are stopped, return data that is not the latest but still relevant.
If the target service is unavailable and there is no relevant data, apply a fallback strategy (return a default value or whatever suits the particular case).

Implementation mechanism

Case: service is available (first request)

1. Go to the cache and look up the CRT by its key (CRT is explained below). The cache is empty.
2. Go to the target service and get the value.
3. Store the value in the cache with a TTL that covers the longest expected outage of the target service but does not exceed the period during which you are willing to serve this data to the client if the connection to the target service is lost.
4. Store a Cache Refresh Time (CRT) in the cache for the value from step 3: the time after which you should try the target service again and update the value.
5. Return the value from step 2 to the user.

Case: CRT did not expire

1. Go to the cache and find the CRT by its key. It has not been reached yet.
2. Get the corresponding value from the cache.
3. Return the value to the user.

Case: CRT expired, target service is available

1. Go to the cache and find the CRT by its key. It has expired.
2. Go to the target service and get the value.
3. Update the value in the cache, together with its TTL.
4. Update its CRT by adding the Cache Refresh Period (CRP): the value that is added to the current CRT to get the next CRT.
5. Return the value to the user.

Case: CRT expired, target service unavailable

1. Go to the cache and find the CRT by its key. It has expired.
2. Go to the target service. It is unavailable.
3. Get the value from the cache. It is not the freshest (its CRT has expired), but it is still relevant, because its TTL has not expired yet.
4. Return it to the user.

Case: CRT expired, target service unavailable, nothing in cache

1. Go to the cache and find the CRT by its key. It has expired.
2. Go to the target service. It is unavailable.
3. Try to get the value from the cache. It is not there.
4. Apply a special strategy for this case: for example, return a default value for the field, or a special value such as “This information is not currently available”. In general, if it is possible, it is better to return something than to break the application. If it is not possible, use a strategy that throws an exception, so that the user gets a fast error response.
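
The five cases above can be condensed into one small in-memory sketch. It is only an illustration under stated assumptions, not the implementation shown later in the article: the class name, the `Supplier`-based disaster strategy, the explicit `now` argument and the choice to schedule the next CRT as `now + CRP` are all hypothetical, and any `RuntimeException` from the target service is treated as "unavailable".

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** In-memory sketch of the five cases; every name here is hypothetical. */
class CacheBackedBreakerSketch<V> {

    /** A cached value together with the moment its TTL runs out. */
    private static final class Entry<V> {
        final V value;
        final Instant expiresAt;

        Entry(V value, Instant expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Entry<V>> values = new ConcurrentHashMap<>();
    private final Map<String, Instant> refreshTimes = new ConcurrentHashMap<>(); // CRT per key
    private final Duration ttl;
    private final Duration crp;

    CacheBackedBreakerSketch(Duration ttl, Duration crp) {
        this.ttl = ttl;
        this.crp = crp;
    }

    V get(String key, Instant now, Supplier<V> targetService, Supplier<V> disasterStrategy) {
        Instant crt = refreshTimes.get(key);
        V cached = readIfAlive(key, now);
        // Case "CRT did not expire": serve from the cache and leave the target service alone.
        if (crt != null && now.isBefore(crt) && cached != null) {
            return cached;
        }
        try {
            // Cases "first request" and "CRT expired, service available".
            V fresh = targetService.get();
            values.put(key, new Entry<>(fresh, now.plus(ttl)));
            refreshTimes.put(key, now.plus(crp));
            return fresh;
        } catch (RuntimeException serviceDown) {
            // Cases "service unavailable": a stale value whose TTL is still alive,
            // otherwise the disaster strategy. If the strategy throws, the CRT is not moved.
            V fallback = cached != null ? cached : disasterStrategy.get();
            refreshTimes.put(key, now.plus(crp));
            return fallback;
        }
    }

    /** Returns the cached value, or null if it is missing or its TTL has expired. */
    private V readIfAlive(String key, Instant now) {
        Entry<V> entry = values.get(key);
        return (entry != null && now.isBefore(entry.expiresAt)) ? entry.value : null;
    }
}
```

In this sketch a stale value served during an outage keeps its original expiry, so an outage longer than the TTL eventually ends in the disaster strategy rather than in serving ever-older data.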

What we will use

My project uses Spring Boot 1.5; I still have not found the time to upgrade to the second version.

To keep the article from becoming twice as long, I will use Lombok.

As the key-value storage (hereinafter KV) I use Redis 5.0.3, but I am sure Hazelcast or an equivalent will do. The main thing is that there is an implementation of the CacheManager interface. In my case it is RedisCacheManager from spring-boot-starter-data-redis.

Implementation

Above, in the section “Implementation mechanism”, two important terms were introduced: CRT and CRP. I will describe them again in more detail, because they are essential for understanding the code that follows:

Cache Refresh Time (CRT) is a separate entry in KV (key + postfix “_crt”) that holds the time at which it is worth going to the target service for fresh data. Unlike TTL, reaching the CRT does not mean your data has gone stale, only that fresher data is probably available in the target service. If you get fresh data, good; if not, the current data will do.

Cache Refresh Period (CRP) is the value added to the CRT after polling the target service (whether the poll succeeded or not). Thanks to it, the remote service gets a chance to catch its breath and recover after a failure.
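
To make the two definitions concrete, here is a minimal sketch of the arithmetic behind them. The helper class and its method names are hypothetical and are not part of the implementation below; only the “_crt” postfix and the "CRT plus CRP" rule come from the text.

```java
import java.time.LocalDateTime;

/** Hypothetical helper showing how CRT and CRP relate; not part of the article's code. */
final class RefreshSchedule {
    static final String CRT_POSTFIX = "_crt";

    private RefreshSchedule() { }

    /** Key of the separate KV entry that stores the CRT for a given value key. */
    static String crtKey(String valueKey) {
        return valueKey + CRT_POSTFIX;
    }

    /** True when it is time to ask the target service again; a missing CRT counts as due. */
    static boolean refreshDue(LocalDateTime crt, LocalDateTime now) {
        return crt == null || !now.isBefore(crt);
    }

    /** Next CRT: the previous one moved forward by the Cache Refresh Period. */
    static LocalDateTime nextCrt(LocalDateTime crt, long crpInSeconds) {
        return crt.plusSeconds(crpInSeconds);
    }
}
```

Note that `refreshDue` says nothing about whether the cached value is still usable; that is decided by the TTL of the value entry itself.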

As usual, let's start by designing the main interface. You work with the cache through it whenever you need CB logic. It should be as simple as possible:

public interface CircuitBreakerService {
   <T> T getStableValue(StableValueParameter parameter);
   void evictValue(EvictValueParameter parameter);
}

Interface Parameters:

@Getter
@AllArgsConstructor
public class StableValueParameter<T> {
   private String cachePrefix; // prevents key collisions
   private String objectCacheKey;
   private long crpInSeconds; // Cache Refresh Period
   private Supplier<T> targetServiceAction; // fetches the data from the target service
   private DisasterStrategy disasterStrategy; // logic for the case: CRT expired, target service unavailable, nothing in the cache

   public StableValueParameter(
      String cachePrefix,
      String objectCacheKey,
      long crpInSeconds,
      Supplier<T> targetServiceAction
   ) {
      this.cachePrefix = cachePrefix;
      this.objectCacheKey = objectCacheKey;
      this.crpInSeconds = crpInSeconds;
      this.targetServiceAction = targetServiceAction;
      // by default, fail fast when there is nothing to return
      this.disasterStrategy = new ThrowExceptionDisasterStrategy();
   }
}

@Getter
@AllArgsConstructor
public class EvictValueParameter {
   private String cachePrefix;
   private String objectCacheKey;
}

This is how we will use it:

public AccountDataResponse findAccount(String accountId){
   final StableValueParameter<?> parameter = new StableValueParameter<>(
       ACCOUNT_CACHE_PREFIX,
       accountId,
       properties.getCrpInSeconds(),
       () -> bankClient.findById(accountId)
   );
   return circuitBreakerService.getStableValue(parameter);
}

If you need to clear the cache, then:

public void evictAccount(String accountId){
   final EvictValueParameter parameter = new EvictValueParameter(
       ACCOUNT_CACHE_PREFIX,
       accountId
   );
   circuitBreakerService.evictValue(parameter);
}

Now the most interesting thing is the implementation (explained in the comments in the code):

@Override
public <T> T getStableValue(StableValueParameter parameter){
       final Cache cache = cacheManager.getCache(parameter.getCachePrefix());
       if (cache == null) {
           return logAndThrowUnexpectedCacheMissing(parameter.getCachePrefix(), parameter.getObjectCacheKey());
       }
       // Go to the cache using the CRT key
       final String crtKey = parameter.getObjectCacheKey() + CRT_CACHE_POSTFIX;
       // Read the CRT from the cache, or fall back to one that has certainly expired
       final LocalDateTime crt = Optional.ofNullable(cache.get(crtKey, LocalDateTime.class))
           .orElseGet(() -> DateTimeUtils.now().minusSeconds(1));
       if (DateTimeUtils.now().isBefore(crt)) {
           // If the CRT has not been reached yet, return the value from the cache
           final Optional<T> valueFromCache = getFromCache(parameter, cache);
           if (valueFromCache.isPresent()) {
               return valueFromCache.get();
           }
       }
       // If the CRT has been reached, try to refresh the cache with a value from the target service
       return getFromTargetServiceAndUpdateCache(parameter, cache, crtKey, crt);
   }

private static <T> Optional<T> getFromCache(StableValueParameter parameter, Cache cache){
       return (Optional<T>) Optional.ofNullable(cache.get(parameter.getObjectCacheKey()))
           .map(Cache.ValueWrapper::get);
   }

If the target service is unavailable, try to get the still relevant data from the cache:

private <T> T getFromTargetServiceAndUpdateCache(
       StableValueParameter parameter,
       Cache cache,
       String crtKey,
       LocalDateTime crt
   ){
       T result;
       try {
           result = getFromTargetService(parameter);
       }
       /* Circuit breaker exceptions */
       catch (WebServiceIOException ex) {
           log.warn(
               "[CircuitBreaker] Service responded with error: {}. Try get from cache {}: {}",
               ex.getMessage(),
               parameter.getCachePrefix(),
               parameter.getObjectCacheKey());
           result = getFromCacheOrDisasterStrategy(parameter, cache);
       }
       cache.put(parameter.getObjectCacheKey(), result);
       cache.put(crtKey, crt.plusSeconds(parameter.getCrpInSeconds()));
       return result;
   }

private static <T> T getFromTargetService(StableValueParameter parameter){
       return (T) parameter.getTargetServiceAction().get();
   }

If there is no relevant data in the cache (it was removed when its TTL expired, and the target service is still unavailable), we use DisasterStrategy:

private <T> T getFromCacheOrDisasterStrategy(StableValueParameter parameter, Cache cache){
       return (T) getFromCache(parameter, cache).orElseGet(() -> parameter.getDisasterStrategy().getValue());
   }

Eviction from the cache is not particularly interesting; I include it only for completeness:

@Override
public void evictValue(EvictValueParameter parameter){
       final Cache cache = cacheManager.getCache(parameter.getCachePrefix());
       if (cache == null) {
           logAndThrowUnexpectedCacheMissing(parameter.getCachePrefix(), parameter.getObjectCacheKey());
           return;
       }
       final String crtKey = parameter.getObjectCacheKey() + CRT_CACHE_POSTFIX;
       cache.evict(crtKey);
   }
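
Note that evictValue above removes only the CRT entry, so the next read retries the target service while the cached value itself stays available as a fallback. The sketch below contrasts that behavior with a variant that removes the value as well. It uses a plain Map as a stand-in for the cache, and the class and both method names are hypothetical.

```java
import java.util.Map;

/** Hypothetical illustration on a plain Map; the article's code works with Spring's Cache instead. */
final class EvictionVariants {
    static final String CRT_POSTFIX = "_crt";

    private EvictionVariants() { }

    /**
     * Soft eviction: forget only the CRT, so the next read retries the target service
     * while the old value remains available if the service turns out to be down.
     */
    static void forceRefresh(Map<String, Object> cache, String key) {
        cache.remove(key + CRT_POSTFIX);
    }

    /** Hard eviction: forget both the CRT and the value itself. */
    static void evictCompletely(Map<String, Object> cache, String key) {
        cache.remove(key + CRT_POSTFIX);
        cache.remove(key);
    }
}
```

Which variant fits depends on whether serving the old value during an outage is acceptable after an explicit eviction.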

Disaster strategy

This is the logic that runs when the CRT has expired, the target service is unavailable, and there is nothing in the cache.

I wanted to describe this logic separately, because many people never get around to thinking about how to implement it. Yet this is exactly what makes a system truly stable.

Wouldn't you like to feel proud of your brainchild when everything that could fail has failed and your system still works? Even if, for example, the “price” field shows “currently being specified” instead of the actual price of the goods, this is much better than responding with “500 service is unavailable”. After all, you still returned the remaining 10 fields: the product description and so on. How much does the quality of such a service change? My appeal is to pay more attention to the details and make them better.

That ends the lyrical digression. The strategy interface looks like this:

public interface DisasterStrategy<T> {
   T getValue();
}

Choose the implementation according to the specific case. For example, if you can return a default value, you can do something like this:

public class DefaultValueDisasterStrategy implements DisasterStrategy<String> {
   @Override
   public String getValue(){
       return "currently being specified";
   }
}

Or, if in a particular case there is nothing sensible to return, you can throw an exception:

public class ThrowExceptionDisasterStrategy implements DisasterStrategy<Object> {
   @Override
   public Object getValue(){
       throw new CircuitBreakerNullValueException("Ops! Service is down and there's null value in cache");
   }
}

In this case the CRT is not incremented, and the next request goes to the target service again.
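
If you need default values for more than one field type, the default-value strategy can be made generic. The following is a self-contained sketch under that assumption: it repeats the DisasterStrategy interface so that it compiles on its own, while ConstantDisasterStrategy, FailFastDisasterStrategy and the cachedOrFallback helper are hypothetical names, and IllegalStateException stands in for the article's own exception class.

```java
/** Same shape as the strategy interface above, repeated so the sketch compiles on its own. */
interface DisasterStrategy<T> {
    T getValue();
}

/** Hypothetical generic variant of the default-value strategy: returns a fixed fallback. */
final class ConstantDisasterStrategy<T> implements DisasterStrategy<T> {
    private final T fallback;

    ConstantDisasterStrategy(T fallback) {
        this.fallback = fallback;
    }

    @Override
    public T getValue() {
        return fallback;
    }
}

/** Hypothetical fail-fast variant: there is nothing sensible to return, so throw. */
final class FailFastDisasterStrategy<T> implements DisasterStrategy<T> {
    @Override
    public T getValue() {
        throw new IllegalStateException("Service is down and the cache is empty");
    }
}

final class DisasterStrategies {
    private DisasterStrategies() { }

    /** Returns the cached value if there is one, otherwise whatever the strategy decides. */
    static <T> T cachedOrFallback(T cachedOrNull, DisasterStrategy<T> strategy) {
        return cachedOrNull != null ? cachedOrNull : strategy.getValue();
    }
}
```

The strategy is consulted only when the cache has nothing to offer, so a value that is stale but still within its TTL always takes precedence over the fallback.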

Conclusion

My view is this: if you can use a ready-made solution instead of fussing with what is, although simple, still a reinvented wheel like the one in this article, do so. Use this article to understand how it works, not as a guide to action.

There are plenty of ready-made solutions, such as Hystrix, especially if you use Spring Boot 2.

The most important thing to understand is that this solution is built on the cache, so it is only as effective as the cache itself. If the cache is ineffective (few hits, many misses), this Circuit Breaker will be just as ineffective: every cache miss means a trip to the target service, which may be in its death throes at that very moment, struggling to get back up.

Before using this approach, be sure to measure the effectiveness of your cache. You can do this with the cache hit rate = hits / (hits + misses); it should tend to 1, not to 0.
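
As a minimal sketch of that measurement, the counter below computes the hit rate from recorded hits and misses. The class is hypothetical; in a real project you would more likely read these numbers from the statistics your cache or monitoring already exposes.

```java
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical counter for the hit rate formula: hits / (hits + misses). */
final class CacheHitRateMeter {
    private final LongAdder hits = new LongAdder();
    private final LongAdder misses = new LongAdder();

    void recordHit() {
        hits.increment();
    }

    void recordMiss() {
        misses.increment();
    }

    /** Returns a value between 0 and 1; 0 when nothing has been recorded yet. */
    double hitRate() {
        long h = hits.sum();
        long total = h + misses.sum();
        return total == 0 ? 0.0 : (double) h / total;
    }
}
```

For example, three hits and one miss give 3 / (3 + 1) = 0.75.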

And yes, nothing stops you from keeping several kinds of CB in your project at once and using whichever best solves a specific problem.