Stack Implementation Details - Part Two
- Transfer
Several people asked me, in the context of my previous post about meaningful types, why, after all, meaningful types are located on the stack, but reference types don't.
In short, "because they can." And since stack “cheap” we place them on the stack whenever possible.
The long answer is ... long.
To begin with, in general terms, we define what we call a heap and what we call a stack. First a bunch.
The CLR heap * is a marvel of engineering with a huge amount of detail. The description below is not how the heap actually works, but it is enough to get a general idea.
The idea is that there are large blocks of memory allocated for instances of reference types. These memory blocks can have “holes” because some memory blocks are occupied by “living” objects, and some are ready for use under new objects. In the ideal case, we would like all the occupied memory to be located in one place as a continuous block, and the rest of the address space to be free.
In such a situation, when allocating a new portion of memory, we would move the pointer to the upper limit of the occupied memory up by the required amount and “eat away” part of the previously free memory. This newly reserved memory would be used for the newly created object. Such an operation is extremely cheap: just move the pointer and fill in a piece of memory with zeros if necessary.
If we have "holes", then we must keep a "free sheet" - a list of free sections. Then we can look in this list for a free place of a suitable size and fill it. This operation is a bit more expensive because a list is searched. I would like to avoid such a situation because it is not optimal.
Garbage collection takes place in 3 stages: marking, collection and compression **. In the “marking” phase, we assume that all objects are “dead” (unattainable from the root of the approx. Per.). The CLR knows which of the objects are guaranteed to be alive at the time the assembly begins and marks them alive. All objects to which they refer are also marked as live, etc. until all transitive closures of living objects are marked. In the assembly phase, all “dead” objects turn into “holes”. In the compression phase, the blocks are reorganized so that living objects make up a continuous memory block without "holes".
The described model is complicated by the fact that there are three such areas: the CLR collector implements generations. At first, the objects are in a heap with a "short lifetime". If they survive *** then over time they are transferred to the heap with average life time and if there they survive long enough, then they are transferred to the heap with long life time. GC very often runs on a heap with a short lifetime and very rarely on a heap with a long lifetime; the idea is to save on the constant check of long-lived objects whether they are still alive or not. But we also want short-lived objects to quickly free up memory. The GC has a huge variety of finely tuned policies for high performance. They strike a balance between a state where memory is similar to Swiss cheese and time spent in the compression phase. Very large objects are stored in a special heap with a completely different compression policy. Etc. etc. I do not know all the details and fortunately I do not need this. (And of course I didn’t complicate things with details that weren’t relevant to this article, such as “binding objects” ****, finalization, weak links, etc.)
Now compare this to the stack. A stack, like a bunch, is a large chunk of memory with a pointer to the upper bound. But what really makes this piece of memory a stack is that the memory at the bottom of the stack always lives longer than the memory at the top of the stack; the stack is strictly ordered. The object that must die first is above, the object that must die last is below. Based on this, we know that the stack will never have holes and it will never need compression. We also know that the memory on the stack is always freed from the top and we do not need to maintain a list of free sectors. We know that everything that is in the stack below is guaranteed to be alive and we don’t need to mark and collect anything.
Allocating memory on the stack is just moving the pointer - exactly the same as in the best (and fairly typical) case when allocating memory on the heap. But because of all these properties of the stack, freeing memory is also just a pointer move! And this is exactly where we save a lot of time. I got the opinion that a lot of people think that the allocation on the stack is cheap, and on the heap - the roads. But in fact, these are almost the same time operations, usually. But the process of freeing up memory is freeing up memory, defragmenting and moving objects from generation to generation, all together these are very significant movements of memory blocks in comparison with what we see on the stack.
Obviously, it's better to use a stack than a bunch if you can. But when can you? Only when all the conditions for the stack to work are met. Local variables and parameters of significant types are the “sweetest” cases when all conditions are met. The local data of the calling functions located at the bottom of the stack is guaranteed to live longer than the local data located at the top of the stack of called functions. Local variables of significant types are passed by value, and not by reference, this ensures that only a local variable points to a given piece of memory and does not need to be calculated to determine the lifetime of the object. And there is only one way to pass a reference to a significant local variable - it is ref or out, which are passed to the functions located above on the stack. Local variables
A few additions: The
paragraph above explains why we cannot create a field of type ref int. If you could keep a reference to an object with a short lifetime inside the field of a long-living object, if it suddenly happened, the stack would lose its advantages and meaningful types would become just another kind of reference types that need garbage collection.
Closures of anonymous functions and closures of blocks of operators, the compiler implements through the fields of hidden classes. Now, I think, you understand why it is forbidden to close ref and out variables.
Of course, we didn’t want to create an ugly and strange rule like “you can use any local variable in the closure except the function parameters passed through ref and out”. But since Since we wanted to use optimization, placing the values on the stacks, we were forced to add such a seemingly strange restriction to the language. This, however, as always, is the art of compromise.
By the way, the CLR allows you to return ref types. Theoretically, you could create a method “ref int Foo () {...}” that returns a reference to an integer variable. If for some strange reason we decided to allow this in C #, then we would have to adjust the compiler and check that the returned link is assigned either to a variable on the heap or to a variable located on the stack below.
Let's get back to our sheep. Local variables are located on the stack because they can. They can because 1 - "normal" local variables have a strictly defined lifetime and 2 - significant types are always copied by value and 3 - you can store a link to local variables only in some container whose lifetime is longer than the local lifetime variable. In contrast, the lifetime of reference types is determined by the number of live links, copied by reference and these links can be stored anywhere. This is the additional freedom that reference types give you when asking for time in exchange for a more complex and expensive garbage collection strategy.
But then again, these are implementation details. Using the stack for local variables is just an optimization that the CLR does for you. The main feature of significant types is that objects of such types are copied by value, and not that their memory can be optimized by the runtime.
In short, "because they can." And since stack “cheap” we place them on the stack whenever possible.
The long answer is ... long.
To begin with, in general terms, we define what we call a heap and what we call a stack. First a bunch.
The CLR heap * is a marvel of engineering with a huge amount of detail. The description below is not how the heap actually works, but it is enough to get a general idea.
The idea is that there are large blocks of memory allocated for instances of reference types. These memory blocks can have “holes” because some memory blocks are occupied by “living” objects, and some are ready for use under new objects. In the ideal case, we would like all the occupied memory to be located in one place as a continuous block, and the rest of the address space to be free.
In such a situation, when allocating a new portion of memory, we would move the pointer to the upper limit of the occupied memory up by the required amount and “eat away” part of the previously free memory. This newly reserved memory would be used for the newly created object. Such an operation is extremely cheap: just move the pointer and fill in a piece of memory with zeros if necessary.
If we have "holes", then we must keep a "free sheet" - a list of free sections. Then we can look in this list for a free place of a suitable size and fill it. This operation is a bit more expensive because a list is searched. I would like to avoid such a situation because it is not optimal.
Garbage collection takes place in 3 stages: marking, collection and compression **. In the “marking” phase, we assume that all objects are “dead” (unattainable from the root of the approx. Per.). The CLR knows which of the objects are guaranteed to be alive at the time the assembly begins and marks them alive. All objects to which they refer are also marked as live, etc. until all transitive closures of living objects are marked. In the assembly phase, all “dead” objects turn into “holes”. In the compression phase, the blocks are reorganized so that living objects make up a continuous memory block without "holes".
The described model is complicated by the fact that there are three such areas: the CLR collector implements generations. At first, the objects are in a heap with a "short lifetime". If they survive *** then over time they are transferred to the heap with average life time and if there they survive long enough, then they are transferred to the heap with long life time. GC very often runs on a heap with a short lifetime and very rarely on a heap with a long lifetime; the idea is to save on the constant check of long-lived objects whether they are still alive or not. But we also want short-lived objects to quickly free up memory. The GC has a huge variety of finely tuned policies for high performance. They strike a balance between a state where memory is similar to Swiss cheese and time spent in the compression phase. Very large objects are stored in a special heap with a completely different compression policy. Etc. etc. I do not know all the details and fortunately I do not need this. (And of course I didn’t complicate things with details that weren’t relevant to this article, such as “binding objects” ****, finalization, weak links, etc.)
Now compare this to the stack. A stack, like a bunch, is a large chunk of memory with a pointer to the upper bound. But what really makes this piece of memory a stack is that the memory at the bottom of the stack always lives longer than the memory at the top of the stack; the stack is strictly ordered. The object that must die first is above, the object that must die last is below. Based on this, we know that the stack will never have holes and it will never need compression. We also know that the memory on the stack is always freed from the top and we do not need to maintain a list of free sectors. We know that everything that is in the stack below is guaranteed to be alive and we don’t need to mark and collect anything.
Allocating memory on the stack is just moving the pointer - exactly the same as in the best (and fairly typical) case when allocating memory on the heap. But because of all these properties of the stack, freeing memory is also just a pointer move! And this is exactly where we save a lot of time. I got the opinion that a lot of people think that the allocation on the stack is cheap, and on the heap - the roads. But in fact, these are almost the same time operations, usually. But the process of freeing up memory is freeing up memory, defragmenting and moving objects from generation to generation, all together these are very significant movements of memory blocks in comparison with what we see on the stack.
Obviously, it's better to use a stack than a bunch if you can. But when can you? Only when all the conditions for the stack to work are met. Local variables and parameters of significant types are the “sweetest” cases when all conditions are met. The local data of the calling functions located at the bottom of the stack is guaranteed to live longer than the local data located at the top of the stack of called functions. Local variables of significant types are passed by value, and not by reference, this ensures that only a local variable points to a given piece of memory and does not need to be calculated to determine the lifetime of the object. And there is only one way to pass a reference to a significant local variable - it is ref or out, which are passed to the functions located above on the stack. Local variables
A few additions: The
paragraph above explains why we cannot create a field of type ref int. If you could keep a reference to an object with a short lifetime inside the field of a long-living object, if it suddenly happened, the stack would lose its advantages and meaningful types would become just another kind of reference types that need garbage collection.
Closures of anonymous functions and closures of blocks of operators, the compiler implements through the fields of hidden classes. Now, I think, you understand why it is forbidden to close ref and out variables.
Of course, we didn’t want to create an ugly and strange rule like “you can use any local variable in the closure except the function parameters passed through ref and out”. But since Since we wanted to use optimization, placing the values on the stacks, we were forced to add such a seemingly strange restriction to the language. This, however, as always, is the art of compromise.
By the way, the CLR allows you to return ref types. Theoretically, you could create a method “ref int Foo () {...}” that returns a reference to an integer variable. If for some strange reason we decided to allow this in C #, then we would have to adjust the compiler and check that the returned link is assigned either to a variable on the heap or to a variable located on the stack below.
Let's get back to our sheep. Local variables are located on the stack because they can. They can because 1 - "normal" local variables have a strictly defined lifetime and 2 - significant types are always copied by value and 3 - you can store a link to local variables only in some container whose lifetime is longer than the local lifetime variable. In contrast, the lifetime of reference types is determined by the number of live links, copied by reference and these links can be stored anywhere. This is the additional freedom that reference types give you when asking for time in exchange for a more complex and expensive garbage collection strategy.
But then again, these are implementation details. Using the stack for local variables is just an optimization that the CLR does for you. The main feature of significant types is that objects of such types are copied by value, and not that their memory can be optimized by the runtime.
(*) From the translator: in .net there is still a bunch for internal CLR objects, but they usually don’t consider it, so in this case it means the bunch that GC collects and in which instances of objects created by the user are stored.
(**) From the translator: compression in this context is equivalent to defragmentation
(***) From the translator: not collected during garbage collection
(****) From the translator: in the original pinning - objects that the GC does not move in memory. More details here: msdn.microsoft.com/en-us/library/f58wzh21%28VS.80%29.aspx .