Struct and readonly: how to avoid performance degradation
Using structs together with the readonly modifier can sometimes cause a performance hit. Today we will look at how to avoid it with an open-source code analyzer, ErrorProne.NET.
As you may know from my previous posts, "The 'in'-modifier and the readonly structs in C#" and "Performance traps of ref locals and ref returns in C#", working with structs is trickier than it might seem. Leaving mutability aside, the behavior of readonly and non-readonly structs in readonly contexts is very different.
Structs are meant for high-performance scenarios, and to use them effectively you need to know something about the hidden operations the compiler generates to guarantee that a struct stays unchanged.
Here is a short list of caveats to keep in mind:
- Large structs that are passed or returned by value can cause performance problems on hot paths.
- x.Y causes a defensive copy of x if:
  - x is a readonly field;
  - the type of x is a struct without the readonly modifier;
  - Y is not a field.
The same rules apply if x is an in parameter, a ref readonly local variable, or the result of a method call that returns a value by readonly reference.
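The rules above are easier to trust once the hidden copy becomes visible. The following is a minimal sketch, not code from the analyzer: Counter and CounterHolder are hypothetical names, and the struct is made mutable on purpose so that the defensive copy changes the observable result.

```csharp
// Hypothetical mutable struct, used only to make the hidden copy observable.
public struct Counter
{
    private int _count;

    // Not a field access and not a readonly member, so a call through
    // a readonly field runs on a defensive copy of the struct.
    public int Increment() => ++_count;

    public int Count => _count;
}

public sealed class CounterHolder
{
    private readonly Counter _readonlyCounter = new Counter();
    private Counter _mutableCounter = new Counter();

    public int IncrementReadonlyTwice()
    {
        _readonlyCounter.Increment(); // mutates a temporary copy
        _readonlyCounter.Increment(); // mutates another temporary copy
        return _readonlyCounter.Count; // the field itself never changed
    }

    public int IncrementMutableTwice()
    {
        _mutableCounter.Increment(); // mutates the field in place
        _mutableCounter.Increment();
        return _mutableCounter.Count;
    }
}
```

With the readonly field the increments are lost, while the non-readonly field keeps them. For an immutable struct the copy is just as real, only invisible apart from its cost.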
There are only a few rules to remember, but code that relies on them is very fragile: a small, innocent-looking change can silently alter performance elsewhere. How many people will notice that replacing public readonly int X; with public int X { get; } in a frequently used struct without the readonly modifier significantly affects performance? Or how easy is it to see that passing a parameter with the in modifier instead of by value can reduce performance? This really can happen when a property of an in parameter is read in a loop, because a defensive copy is created on every iteration.
Properties like these practically beg for analyzers. And the call was heard. Meet ErrorProne.NET, a set of analyzers that point out where the code can be changed to improve design and performance when working with structs.
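The loop case mentioned above can be sketched as follows. Measurement and LoopSamples are hypothetical names introduced for illustration. The getter is hand-written on purpose, on the assumption that C# 8.0 and later treat auto-implemented getters as readonly, which would hide the effect for an auto-property.

```csharp
// Hypothetical non-readonly struct with one public field and one property.
public struct Measurement
{
    public readonly long Raw;       // public readonly field
    private readonly long _scaled;  // backing field for the property below

    public Measurement(long raw, long scaled)
    {
        Raw = raw;
        _scaled = scaled;
    }

    // Hand-written getter on a non-readonly struct: the compiler cannot
    // assume that calling it leaves the struct unchanged.
    public long Scaled { get { return _scaled; } }
}

public static class LoopSamples
{
    // Field access through 'in': no defensive copy.
    public static long SumRaw(in Measurement m, int iterations)
    {
        long sum = 0;
        for (int i = 0; i < iterations; i++)
        {
            sum += m.Raw;
        }
        return sum;
    }

    // Property access through 'in': a defensive copy of m on every iteration.
    public static long SumScaled(in Measurement m, int iterations)
    {
        long sum = 0;
        for (int i = 0; i < iterations; i++)
        {
            sum += m.Scaled;
        }
        return sum;
    }
}
```

Both methods return correct results; the difference only shows up in the generated code and, for large structs on hot paths, in the timings.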
Code analysis with the message "Make the structure X readonly"
The best way to avoid subtle bugs and performance penalties when using structs is to make them readonly whenever possible. The readonly modifier on a struct declaration clearly expresses the developer's intent (the struct is immutable) and helps the compiler avoid generating defensive copies in many of the contexts mentioned above.
Making a struct readonly is not a breaking change. You can safely run the code fix in batch mode and make all the flagged structs in the entire solution readonly.
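As a small illustration of what the fix produces, here is a sketch of a struct after the readonly modifier has been added. Money and MoneySamples are hypothetical names, not types from ErrorProne.NET or from the projects discussed later.

```csharp
// 'readonly' makes the compiler enforce immutability: every field must be
// readonly and every property get-only.
public readonly struct Money
{
    public Money(decimal amount, string currency)
    {
        Amount = amount;
        Currency = currency;
    }

    public decimal Amount { get; }
    public string Currency { get; }

    // "Mutation" returns a new value instead of changing this one.
    public Money Add(decimal delta) => new Money(Amount + delta, Currency);
}

public static class MoneySamples
{
    private static readonly Money Price = new Money(10m, "USD");

    // Method and property access on a readonly field: because the struct
    // itself is readonly, the compiler needs no defensive copy here.
    public static decimal PriceWithFee() => Price.Add(2.5m).Amount;
}
```

Existing callers keep compiling after such a change, since only the struct's own members were ever allowed to modify its state.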
Ref readonly friendliness
The next step is to check whether it is safe to use the new features (the in modifier, ref readonly locals, and so on) with a given struct. Here "safe" means that the compiler will not create hidden defensive copies that can hurt performance.
Three categories of types can be considered:
- ref readonly friendly structs, which never cause defensive copies;
- ref readonly unfriendly structs, which always cause defensive copies when used in a readonly context;
- neutral structs, which may or may not cause defensive copies depending on the member used in the readonly context.
The first category includes readonly structs and POCO structs. The compiler never generates a defensive copy if the struct is readonly. POCO structs are also safe to use in readonly contexts: field access is considered safe and causes no defensive copies.
The second category is non-readonly structs that expose no fields. In this case, any access to a public member in a readonly context causes a defensive copy.
The last category is structs that have public or internal fields as well as public or internal properties or methods. In this case, the compiler creates defensive copies depending on the member used.
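The three categories can be sketched with a few declarations. All type names below are hypothetical, and the comments reflect the classification as described in this section rather than the analyzer's actual implementation.

```csharp
// Friendly: a readonly struct never needs a defensive copy.
public readonly struct FriendlyReadonly
{
    public FriendlyReadonly(long value) { Value = value; }
    public long Value { get; }
}

// Friendly: a POCO struct that exposes only fields.
public struct FriendlyPoco
{
    public long A;
    public long B;
}

// Unfriendly: non-readonly struct with no exposed fields.
public struct Unfriendly
{
    private readonly long _value;
    public Unfriendly(long value) { _value = value; }
    public long Value { get { return _value; } }
}

// Neutral: non-readonly struct with both a public field and a property.
public struct Neutral
{
    public readonly long Field;
    private readonly long _property;
    public Neutral(long field, long property) { Field = field; _property = property; }
    public long Property { get { return _property; } }
}

public static class CategorySamples
{
    public static long ReadAll(in FriendlyReadonly a, in FriendlyPoco b, in Unfriendly c, in Neutral d)
    {
        return a.Value      // no copy: readonly struct
             + b.A + b.B    // no copy: field access
             + c.Value      // defensive copy: property of a non-readonly struct
             + d.Field      // no copy: field access
             + d.Property;  // defensive copy: property of a non-readonly struct
    }
}
```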
This classification makes it possible to warn immediately when an "unfriendly" struct is passed with the in modifier, stored in a ref readonly local, and so on.
The analyzer does not warn when an "unfriendly" struct is used as a readonly field, because there is no alternative. The in and ref readonly modifiers exist for optimization purposes, specifically to avoid redundant copies. If a struct is "unfriendly" to these modifiers, you have other options: pass the argument by value or store a copy in a local variable. Readonly fields are different: if you want to make a type immutable, you have to use them. Remember: code should be clear and elegant first, and fast only second.
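The two alternatives for an "unfriendly" struct, passing by value or copying once into a local, might look like this. LegacyRange and RangeSamples are hypothetical names used only for this sketch.

```csharp
// Hypothetical "unfriendly" struct: non-readonly, no exposed fields.
public struct LegacyRange
{
    private readonly long _start;
    private readonly long _end;

    public LegacyRange(long start, long end)
    {
        _start = start;
        _end = end;
    }

    public long Length { get { return _end - _start; } }
}

public static class RangeSamples
{
    // Option 1: pass by value. One visible copy at the call site; member
    // access on a by-value parameter needs no defensive copies.
    public static long TotalByValue(LegacyRange range, int repeat)
    {
        long total = 0;
        for (int i = 0; i < repeat; i++)
        {
            total += range.Length;
        }
        return total;
    }

    // Option 2: keep 'in', but make one explicit copy before the loop
    // instead of one hidden copy per iteration.
    public static long TotalWithLocalCopy(in LegacyRange range, int repeat)
    {
        LegacyRange local = range;
        long total = 0;
        for (int i = 0; i < repeat; i++)
        {
            total += local.Length;
        }
        return total;
    }
}
```

In both variants the copy is explicit in the source, so a reader does not need to know the compiler's rules to see where it happens.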
Hidden copies analysis
The compiler does a lot behind the scenes. As the previous post showed, it is quite hard to see when a defensive copy is created.
The analyzer detects the following hidden copies:
- Hidden copy of a readonly field.
- Hidden copy of an in parameter.
- Hidden copy of a ref readonly local variable.
- Hidden copy of a ref readonly return value.
- Hidden copy when an extension method that takes its this parameter by value is called on a struct instance.
public struct NonReadOnlyStruct
{
    public readonly long PublicField;
    public long PublicProperty { get; }
    public void PublicMethod() { }

    private static readonly NonReadOnlyStruct _ros;

    public static void Samples(in NonReadOnlyStruct nrs)
    {
        // Ok. Public field access causes no hidden copies
        var x = nrs.PublicField;
        // Ok. No hidden copies.
        x = _ros.PublicField;
        // Hidden copy: Property access on 'in'-parameter
        x = nrs.PublicProperty;
        // Hidden copy: Method call on readonly field
        _ros.PublicMethod();

        ref readonly var local = ref nrs;
        // Hidden copy: method call on ref readonly local
        local.PublicMethod();

        // Hidden copy: method call on ref readonly return
        Local().PublicMethod();

        ref readonly NonReadOnlyStruct Local() => ref _ros;
    }
}
Note that analyzers display diagnostic messages only if the structure size is ≥16 bytes.
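The last item in the list of hidden copies, extension methods, is not covered by the sample above. A minimal sketch, assuming a hypothetical 24-byte Vec3 struct and hypothetical extension methods:

```csharp
// Hypothetical 24-byte readonly struct (three doubles).
public readonly struct Vec3
{
    public readonly double X, Y, Z;

    public Vec3(double x, double y, double z)
    {
        X = x;
        Y = y;
        Z = z;
    }
}

public static class Vec3Extensions
{
    // 'this' by value: the whole struct is copied on every call.
    public static double LengthSquaredByValue(this Vec3 v) =>
        v.X * v.X + v.Y * v.Y + v.Z * v.Z;

    // 'this in': the struct is passed by readonly reference instead.
    public static double LengthSquaredByRef(this in Vec3 v) =>
        v.X * v.X + v.Y * v.Y + v.Z * v.Z;
}

public static class ExtensionSamples
{
    private static readonly Vec3 Direction = new Vec3(1, 2, 2);

    // The call looks like an instance call, but the receiver is copied.
    public static double ByValue() => Direction.LengthSquaredByValue();

    // Same call shape, no copy of the receiver.
    public static double ByRef() => Direction.LengthSquaredByRef();
}
```

The two call sites are indistinguishable in the source, which is why this kind of copy is easy to overlook.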
Using analyzers in real projects
Passing large structs by value and, as a consequence, the defensive copies the compiler creates have a significant effect on performance. At least, that is what benchmarks show. But how do they affect the end-to-end time of real-world applications?
To test the analyzers on real code, I ran them on two projects: Roslyn and an internal project I am currently working on at Microsoft (a standalone application with stringent performance requirements); for clarity, let's call it "Project D".
Here are the results:
- Projects with high performance requirements tend to contain a lot of structs, and most of them can be made readonly. For example, in Roslyn the analyzer found about 400 structs that can be made readonly, and in Project D about 300.
- In projects with high performance requirements, hidden copies should occur only in exceptional situations. I found only a few such cases in Roslyn, because most of its structs have public fields instead of public properties. This avoids defensive copies when the structs are stored in readonly fields. Project D had more hidden copies, because at least half of its structs had get-only properties.
- Passing even fairly large structs with the in modifier is likely to have a very small (almost imperceptible) effect on the end-to-end time of the program.
I made all 300 structs in Project D readonly and then changed hundreds of their usages so that they are passed with the in modifier. Then I measured the end-to-end time for various performance scenarios. The differences were statistically insignificant.
Does this mean that the features described above are useless? Not at all.
Working on a project with high performance requirements (for example, Roslyn or Project D) means that many people spend a lot of time on all kinds of optimization. In fact, in some places the structs in our code were already passed with the ref modifier, and some fields were declared without the readonly modifier to avoid defensive copies. The lack of a performance gain from passing structs with the in modifier may simply mean that the code was already well optimized and has no redundant struct copying on its hot paths.
What should I do with these features?
I believe that using the readonly modifier for structs is a no-brainer. If the struct is immutable, the readonly modifier simply makes the compiler enforce that design decision. And the absence of defensive copies for such structs is just a bonus.
Today my recommendation is this: if a struct can be made readonly, then by all means make it so.
The other features discussed here are more nuanced.
Premature optimization vs. premature pessimization?
Herb Sutter, in his amazing book "C++ Coding Standards: 101 Rules, Guidelines, and Best Practices", introduces the notion of "premature pessimization".
"All other things being equal, notably code complexity and readability, certain efficient design patterns and coding idioms should just flow naturally from your fingertips and are no harder to write than the pessimized alternatives. This is not premature optimization; it is avoiding gratuitous pessimization."
From my point of view, an in parameter is exactly such a case. If you know that a struct is relatively large (40 bytes or more), you can always pass it with the in modifier. The cost of using in is relatively small, because the call sites do not need to change, and the benefits can be real.
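To illustrate why the cost is low, here is a sketch of a 40-byte struct whose consumer switches from by-value to in. Transform, TransformMath and TransformCallers are hypothetical names; the point is only that the call site stays exactly as it was.

```csharp
// Hypothetical readonly struct: five doubles, 40 bytes.
public readonly struct Transform
{
    public readonly double ScaleX, ScaleY, OffsetX, OffsetY, Rotation;

    public Transform(double scaleX, double scaleY, double offsetX, double offsetY, double rotation)
    {
        ScaleX = scaleX;
        ScaleY = scaleY;
        OffsetX = offsetX;
        OffsetY = offsetY;
        Rotation = rotation;
    }
}

public static class TransformMath
{
    // Before the change the signature was: ApplyX(Transform t, double x).
    // With 'in', the 40-byte struct is passed by readonly reference.
    public static double ApplyX(in Transform t, double x) => t.ScaleX * x + t.OffsetX;
}

public static class TransformCallers
{
    public static double Use()
    {
        var t = new Transform(2, 3, 10, 20, 0);
        // The call site is unchanged: 'in' is optional for the caller.
        return TransformMath.ApplyX(t, 4);
    }
}
```

Because Transform is a readonly struct, member access through the in parameter needs no defensive copies.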
For ref readonly locals and ref readonly return values the situation is different. I would say these features should be used when writing libraries and avoided in application code, unless profiling shows that copying is really a problem. Using them requires extra effort from you, and the code becomes harder for the reader to understand.
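For comparison, here is what the library-style usage might look like. This is a sketch under assumptions: Sample, SampleBuffer and BufferSamples are hypothetical, and whether the extra ceremony pays off should be decided by profiling.

```csharp
// Hypothetical 40-byte readonly struct stored in a buffer.
public readonly struct Sample
{
    public readonly double A, B, C, D, E;

    public Sample(double value)
    {
        A = value;
        B = value;
        C = value;
        D = value;
        E = value;
    }

    public double Sum => A + B + C + D + E;
}

public sealed class SampleBuffer
{
    private readonly Sample[] _items;

    public SampleBuffer(int count)
    {
        _items = new Sample[count];
        for (int i = 0; i < count; i++)
        {
            _items[i] = new Sample(i + 1);
        }
    }

    // Returns a readonly reference to the element instead of a copy.
    public ref readonly Sample GetRef(int index) => ref _items[index];
}

public static class BufferSamples
{
    public static double SumAt(SampleBuffer buffer, int index)
    {
        // Both 'ref readonly' on the local and 'ref' on the call are needed;
        // leaving either out silently falls back to copying the struct.
        ref readonly Sample sample = ref buffer.GetRef(index);
        return sample.Sum;
    }
}
```

The extra syntax at every use is the readability cost mentioned above, which is easier to justify in a library than in application code.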
Conclusion
- Use the readonly modifier for structures where possible.
- Consider using the in modifier for large structures.
- Consider using ref readonly locals and ref readonly return values when writing libraries, or when profiling suggests they might help.
- Use ErrorProne.NET to find problems with the code and share the results.