Exception Safety Guarantees
Errors in error handling are the most common source of errors.
A crazy thought that occurred to me while writing this article.
The great debates over whether exceptions or return codes are the better way to handle errors in C# are long behind us (*), but battles of another kind continue: fine, we have settled on exceptions, but how do we handle them “correctly”?
There are many views on what is “right”, and most of them boil down to this: catch only the exceptions you can handle and let the rest propagate to the calling code. And if some unexpected exception brazenly makes it all the way to the top level, shoot down the whole application, since it is no longer clear whether the poor thing is in a consistent state.
This approach to catching and handling exceptions has its pros and cons, but today I want to look at a slightly different topic: keeping the application in a consistent state when an exception occurs, that is, the three levels of exception safety.
Three types of guarantees
In the late 1990s, Dave Abrahams proposed three levels of exception safety: the basic guarantee, the strong guarantee, and the no-throw guarantee. The C++ community welcomed the idea, and after Herb Sutter popularized (and somewhat modified) it, exception safety guarantees came into wide use in Boost, in the C++ standard library, and in application development.
Abrahams originally proposed these guarantees for the C++ implementation of the STLport library, but the idea of exception safety is not tied to any particular language and applies equally to other languages that use exceptions as their main error handling mechanism, such as Java or C#. In addition, two versions of the definitions are now in circulation: (1) the original one proposed by Dave Abrahams and (2) a modified one popularized by Sutter and Stroustrup, which suits not only libraries but applications as well.
Basic guarantee
Original definition: “if an exception occurs, no resources are leaked.”
Modern definition: “if an exception occurs in a method, the state of the program must remain consistent.” This means not only the absence of resource leaks but also the preservation of class invariants, which is a more general criterion than the original definition.
The two formulations differ because the guarantee was originally proposed for a C++ library implementation and had nothing to do with applications. In the more general case (an application rather than just a library), a resource leak is only one source of bugs, and far from the only one. Preserving invariants at every stable point in time (**) guarantees that no external code can “see” the application in an inconsistent state, which, you will agree, matters no less than the absence of resource leaks. Few users of a banking application will care about memory leaks if, during a transfer, money can “leave” one account but never “arrive” in the other.
Strong guarantee
For the strong guarantee, the original and modern definitions are similar and boil down to the following: “if an exception occurs during an operation, it must have no effect on the state of the application.”
In other words, the strong guarantee makes operations transactional: we get either everything or nothing. When an exception occurs, the application must be rolled back to the state it was in before the operation, and it moves to a new state only if the entire operation completes successfully.
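The all-or-nothing behavior described above can be sketched with the banking example. This is a minimal illustration, not a real banking API: the Account class and its Transfer method are hypothetical names invented for the sketch. The idea is simply that everything that may throw runs before the first field is modified.

```csharp
using System;

// Hypothetical account type, used only for this illustration.
class Account
{
    public decimal Balance { get; private set; }

    public Account(decimal balance) { Balance = balance; }

    // All-or-nothing transfer: every check and computation that can
    // throw runs before either balance is modified.
    public static void Transfer(Account from, Account to, decimal amount)
    {
        if (from == null) throw new ArgumentNullException("from");
        if (to == null) throw new ArgumentNullException("to");
        if (amount <= 0) throw new ArgumentOutOfRangeException("amount");
        if (from.Balance < amount)
            throw new InvalidOperationException("Insufficient funds.");
        if (ReferenceEquals(from, to)) return; // nothing to move

        // Compute the new values first (decimal arithmetic may overflow)...
        decimal newFrom = from.Balance - amount;
        decimal newTo = to.Balance + amount;

        // ...then commit with plain assignments.
        from.Balance = newFrom;
        to.Balance = newTo;
    }
}
```

If any check fails, both balances are exactly what they were before the call, so the caller can retry or report the error without repairing anything.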
No-throw guarantee
The no-throw guarantee boils down to the following: “under no circumstances will the function throw an exception.”
By definition this is the simplest guarantee, yet it is not as simple as it seems. First, it is almost impossible to provide in the general case, especially in .NET, where an exception can occur at almost any point in the application. In practice only a handful of operations satisfy it, and the guarantees of the previous levels are built on top of exactly these operations. In C#, one of the few operations that provide this guarantee is reference assignment; in C++, it is the swap function that exchanges two values. The strong guarantee is often implemented on top of these operations: all the "dirty work" is done in a temporary object, which is then assigned to the target.
Second, some functions cannot work correctly unless certain operations they rely on provide the no-throw guarantee. For example, for C++ containers to provide even the basic guarantee (or rather, the absence of resource leaks), the destructors of user-defined types must not throw.
The three guarantees discussed above run from the weakest to the strongest, and each one includes the previous one: code that meets the strong guarantee automatically meets the basic guarantee, and code that meets the no-throw guarantee automatically meets the strong guarantee. Code that does not meet even the basic guarantee is a time bomb in your application: sooner or later it will go off and wreck the application's state.
Now let's look at a few examples.
Examples of basic guarantee violations
The main way to prevent memory and resource leaks in C++ is the RAII (Resource Acquisition Is Initialization) idiom: an object acquires a resource in its constructor and releases it in its destructor. Since the destructor is called automatically when the object leaves scope for any reason, including an exception, it is no surprise that the same idiom is used to ensure exception safety.
In C#, this idiom took the form of the IDisposable interface and the using statement. Unlike in C++, however, it only works for controlling the lifetime of a resource within a scope and is not suited to managing several resources acquired in a constructor.
Let's look at an example:
// A class that holds managed resources
class DisposableA : IDisposable
{
    public void Dispose() {}
}

// Another class with managed resources
class DisposableB : IDisposable
{
    public DisposableB()
    {
        disposableA = new DisposableA();
        throw new Exception("OOPS!");
    }

    public void Dispose() {}

    private DisposableA disposableA;
}

// Somewhere in the application
using (var disposable = new DisposableB())
{
    // Oops! Dispose will be called neither for
    // DisposableB nor for DisposableA
}
So, we have two disposable classes, DisposableA and DisposableB, each of which acquires some managed resource in its constructor and releases it in its Dispose method. Let us leave the finalizer aside for now, since it does nothing to guarantee deterministic release of resources, which in some cases is vital.
Here, when the DisposableB constructor throws, Dispose is never called, because the disposable variable never receives a value. In this respect most mainstream languages behave more or less alike, but there are differences. The similarity is that if a constructor fails, the calling code cannot obtain a reference to the not-yet-constructed object and therefore cannot release its resources explicitly. However, unlike C++, where the destructors of fully constructed fields are called automatically, nothing of the kind happens in "managed" C# (***). If the DisposableB constructor throws and does not itself release the resources acquired so far, we get a resource leak (or, at the very least, non-deterministic release).
The same problem can show up in a subtler form. In the previous case it is plain to see that a disposable object is created and an exception is then thrown. But sometimes the lack of the basic guarantee is a little harder to spot.
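One way to restore the basic guarantee in such a constructor is to release by hand whatever has already been acquired before letting the exception escape. The sketch below is illustrative only: TrackedResource, SafeOwner, the LiveCount counter and the simulateFailure flag are all hypothetical names introduced to make the effect observable.

```csharp
using System;

// Hypothetical resource that counts how many instances are still alive.
class TrackedResource : IDisposable
{
    public static int LiveCount;   // acquired but not yet released
    private bool disposed;

    public TrackedResource() { LiveCount++; }

    public void Dispose()
    {
        if (disposed) return;      // Dispose must be safe to call twice
        disposed = true;
        LiveCount--;
    }
}

class SafeOwner : IDisposable
{
    private readonly TrackedResource resource;

    // 'simulateFailure' exists only to trigger the error path in this sketch.
    public SafeOwner(bool simulateFailure)
    {
        resource = new TrackedResource();
        try
        {
            // Any further initialization that might throw goes here.
            if (simulateFailure)
                throw new InvalidOperationException("OOPS!");
        }
        catch
        {
            // The caller never receives a reference to this object,
            // so this is the last chance for a deterministic release.
            resource.Dispose();
            throw; // rethrow, preserving the original stack trace
        }
    }

    public void Dispose() { resource.Dispose(); }
}
```

With this shape, a using block around new SafeOwner(true) still never runs Dispose on the owner, but the inner resource has already been released by the time the exception reaches the caller.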
class Base : IDisposable
{
    public Base()
    {
        // Acquire some resource
    }

    public void Dispose() {}
}

class Derived : Base, IDisposable
{
    public Derived(object data)
    {
        if (data == null)
            throw new ArgumentNullException("data");
        // OOPS!!
    }
}

// And again, somewhere in the application
using (var derived = new Derived(null))
{}
An exception thrown by the Derived constructor violates the basic guarantee and leaks a resource, because the Dispose method of the Base class is never called (****). Again, since the compiler knows about IDisposable only through the prism of the using statement, whenever a disposable object is a field of another class, calling Dispose is solely the programmer's responsibility.
Besides a base class, field initializers can play the same trick when the constructor of one of the fields throws:
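A possible repair for the inheritance case, under the assumption that the base class owns the resource and exposes an idempotent Dispose, is to guard the body of the derived constructor. ResourceBase, CheckedDerived and OpenCount are hypothetical names used only in this sketch.

```csharp
using System;

// Hypothetical base class that "acquires" a resource in its constructor.
class ResourceBase : IDisposable
{
    public static int OpenCount;   // resources acquired but not released
    private bool disposed;

    public ResourceBase() { OpenCount++; }

    public void Dispose()
    {
        if (disposed) return;
        disposed = true;
        OpenCount--;
    }
}

class CheckedDerived : ResourceBase
{
    public object Data { get; private set; }

    public CheckedDerived(object data)
    {
        // The base constructor has already run at this point.
        try
        {
            if (data == null)
                throw new ArgumentNullException("data");
            Data = data;
        }
        catch
        {
            Dispose();   // release what the base constructor acquired
            throw;
        }
    }
}
```

An alternative design is to validate arguments before any resource is acquired at all, which avoids the try / catch but is not always possible when the base class has its own constructor logic.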
class ComposedDisposable : IDisposable
{
    public void Dispose() {}

    private readonly DisposableA disposableA = new DisposableA();
    // And what if the DisposableB constructor fails? OOPS!!
    private readonly DisposableB disposableB = new DisposableB();
}
In this case, if the DisposableB constructor throws while the disposableB field is being initialized, there is no way to catch the exception and release the resources that disposableA has already acquired. C++ lets you catch exceptions thrown from the member initializer list (see Exception and Member Initialization), but C# has no such facility, so there is only one way out: do not let the situation arise.
As in all the previous cases, providing the basic guarantee falls entirely on the developer's shoulders, since C# offers no "sugar" for this purpose. All that remains is either to create the subobjects in a suitable order, creating the disposable field at the very end of the constructor, or to wrap their creation in a try / catch block and release all resources if an exception occurs.
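The try / catch option for several disposable fields might look like the sketch below. It moves the initialization out of the field initializers and into the constructor body, where it can be guarded. Part, ComposedSafely, Live and failOnSecond are hypothetical names made up for the illustration.

```csharp
using System;

// Hypothetical disposable part; its constructor can be told to fail.
class Part : IDisposable
{
    public static int Live;        // parts acquired but not released
    private bool disposed;

    public Part(bool simulateFailure)
    {
        if (simulateFailure)
            throw new InvalidOperationException("OOPS!");
        Live++;
    }

    public void Dispose()
    {
        if (disposed) return;
        disposed = true;
        Live--;
    }
}

class ComposedSafely : IDisposable
{
    private readonly Part first;
    private readonly Part second;

    public ComposedSafely(bool failOnSecond)
    {
        try
        {
            first = new Part(false);
            second = new Part(failOnSecond);
        }
        catch
        {
            Dispose();   // releases whatever was created so far
            throw;
        }
    }

    public void Dispose()
    {
        // Fields that were never assigned are still null.
        if (second != null) second.Dispose();
        if (first != null) first.Dispose();
    }
}
```

The null checks in Dispose are what make it safe to call on a half-built object; releasing in reverse order of acquisition is a convention borrowed from C++ rather than a requirement of C#.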
An example of the strong guarantee. Object initializers and collection initializers
The violations of the basic guarantee shown above are not contrived, but they are not that common either. And while the C# compiler cannot help us when we create objects that hold several managed resources, it can help in some other cases, for example when objects and collections are created.
Object initializers and collection initializers make creating and initializing an object, or filling a collection with a list of elements, atomic. Consider the following example.
class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int Age { get; set; }
}

var person = new Person
{
    FirstName = "Bill",
    LastName = "Gates",
    Age = 55,
};
At first glance, it might seem like it's just syntactic sugar for the following:
var person = new Person();
person.FirstName = "Bill";
person.LastName = "Gates";
person.Age = 55;
In fact, however, when an object initializer is used, a temporary variable is created, the properties are set on that temporary, and only then is it assigned to the target variable:
var tmpPerson = new Person();
tmpPerson.FirstName = "Bill";
tmpPerson.LastName = "Gates";
tmpPerson.Age = 55;
var person = tmpPerson;
This makes creating the object atomic and makes it impossible to use a partially initialized object if one of the setters throws. Collection initializers work on the same principle: elements are added to a temporary collection, and only once it has been filled is the temporary assigned to the target variable.
The principle behind these two features is easy to apply when implementing the strong guarantee in your own code. It is enough to make all changes to the object's internal state in a temporary variable and, only after they have completed, atomically replace the real state.
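The effect is easiest to see when the target already holds a value. In the sketch below, Holder and its Replace method are hypothetical names; the element expression passed in as lastElement may throw, and the point is that the field then keeps referring to the old list.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical holder used to observe collection initializer behavior.
class Holder
{
    public List<int> Numbers = new List<int> { 0 };

    public void Replace(Func<int> lastElement)
    {
        // The new list is built and filled first; the field is assigned
        // only after every Add has succeeded. If lastElement() throws,
        // Numbers still refers to the old list.
        Numbers = new List<int> { 1, 2, lastElement() };
    }
}
```

Calling Replace with a delegate that throws leaves Numbers as the original single-element list, while a delegate that returns normally swaps in the complete three-element list.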
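As a rough sketch of this temporary-variable approach in hand-written code, consider a class that must add either all items of a batch or none. Playlist and AddRange are hypothetical names chosen for the example, and the validation rule (no empty titles) is an assumption made only so that something can fail midway.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class whose AddRange is all-or-nothing.
class Playlist
{
    private List<string> tracks = new List<string>();

    public int Count { get { return tracks.Count; } }

    // Adds all titles or none of them.
    public void AddRange(IEnumerable<string> titles)
    {
        if (titles == null) throw new ArgumentNullException("titles");

        // 1. Do the "dirty work" on a private copy.
        var copy = new List<string>(tracks);
        foreach (var title in titles)
        {
            if (string.IsNullOrEmpty(title))
                throw new ArgumentException("Empty title.", "titles");
            copy.Add(title);
        }

        // 2. Commit with a single reference assignment.
        tracks = copy;
    }
}
```

The price is an extra copy of the list on every call, which is one concrete example of why the strongest guarantee is not always worth pursuing.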
Conclusion
Handling exceptions correctly is no simple matter, and, as some of the examples have shown, sometimes even the basic guarantee is hard to provide. Nevertheless, providing such guarantees is vital in application development, since it is far easier to hide the complexity of working with resources in one place than to smear it in a thin layer across the whole application. The golden rule formulated ten years ago by Scott Meyers still holds: create classes that are easy to use correctly and hard to use incorrectly, and exception safety guarantees play an important role in that.
As for applying these guarantees in practice, keep a few points in mind. First, code that does not provide the basic guarantee is incorrect; it is simply impossible to build on it an application whose state will not break as the code is used or changed (*****). Second, do not be paranoid and chase the strongest guarantee everywhere. A 100% no-throw guarantee is almost unattainable because of asynchronous exceptions, and even the strong guarantee can be unreasonably expensive to implement in many cases.
In conclusion: exception safety guarantees are not a panacea, but they are an excellent foundation for building reliable applications.
----------------------
(*) In fact, there never was a really “heated” debate, for one simple reason: you cannot program for the .NET platform without handling exceptions. Such debates are still relevant in C++, for example, especially in low-level programming.
(**) Generally speaking, nobody requires an invariant to hold at all times; usually it must hold “before” and “after” a call to a public method, but it need not hold after a call to a private method that performs only “part” of the work.
(***) It may seem funny that a more sophisticated language such as C# cannot do something that good old C++ can, but that is indeed the case. As an example, let us rewrite the code discussed earlier from C# to C++:
#include <stdexcept>

class Resource1
{
public:
    Resource1()
    {
        // Acquire some resource, whether by allocating heap memory
        // or by creating an OS handle
    }
    ~Resource1()
    {
        // Release the acquired resource
    }
};

class Resource2
{
public:
    Resource2()
    {
        // At this point resource1_ has already been initialized
        throw std::runtime_error("Yahoo!");
    }
private:
    Resource1 resource1_;
};

// Somewhere in the application, create an instance of Resource2
Resource2 resource2;
As mentioned earlier, in C++ (unlike C#), when a constructor throws, the destructors of the already constructed fields (i.e., subobjects) are called automatically. Here this means that the destructor of the Resource1 subobject runs automatically and no resources are leaked.
This difference between C# and C++ is easy to explain. In C++ everything is a resource, including dynamically allocated memory, so its resource management tools are more advanced. An application programmer working in C# far more often uses resources inside a using block than acquires them in a constructor. And if they do run into this problem, they have to solve it themselves, without help from the compiler.
By the way, Herb Sutter once wrote about this in his article “Constructor Exceptions in C++, C#, and Java”.
(****) Perhaps I am overdoing it with these footnotes, but this one is quite important and, it seems, the penultimate one. Interviewers are fond of this example, so now my readers know the correct answer to it!
(*****) Everything said in this article applies only to synchronous exceptions, since it is almost impossible to guarantee a consistent state in the face of “asynchronous” exceptions such as OutOfMemoryException or ThreadAbortException. For proof, see “On the dangers of the Thread.Abort method.”