Garbage collection and lifetime of objects

Original author: SergeyT
It seems like a simple question: can the CLR call an object's finalizer while one of its instance methods is still executing?

In other words, is it possible in the following case to see “Finalizing instance.” before “Finished doing something.”?

internal class GcIsWeird
{
    ~GcIsWeird()
    {
        Console.WriteLine("Finalizing instance.");
    }
    public int data = 42;
    public void DoSomething()
    {
        Console.WriteLine("Doing something. The answer is ... " + data);
        // Some other code...
        Console.WriteLine("Finished doing something.");
    }
}


Answer: it depends.

In Debug builds this never happens, but in Release builds it is possible. To simplify the discussion, consider the following static method:

static void SomeWeirdAndVeryLongRunningStaticMethod()
{
    var heavyWeightInstance = new int[42_000_000];
    // The very last reference to 'heavyWeightInstance'
    Console.WriteLine(heavyWeightInstance.Length);
    for (int i = 0; i < 10_000; i++)
    {
        // Doing some useful stuff.
        Thread.Sleep(42);
    }
}

The local variable 'heavyWeightInstance' is used only in the first two statements and could theoretically be collected by the GC after that. One could explicitly assign null to the variable to release the reference, but this is not required. The CLR has an optimization that allows objects to be collected once they are no longer used. The JIT compiler emits a special table called the "Pointer Table" or GCInfo (see gcinfo.cpp in the coreclr repo), which gives the garbage collector enough information to decide at which points a variable is reachable and at which it is not.

An instance method is just a static method with the 'this' pointer passed as the first argument. This means that the same optimizations apply to both instance methods and static methods.
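The "'this' as the first argument" view can be observed from C# itself. The following is a minimal sketch and not part of the original program: the Greeter class and the ThisAsFirstArgument helper are hypothetical names. It binds an instance method to a delegate whose first parameter is the receiver (an "open instance" delegate created with Delegate.CreateDelegate), so the instance is passed like any other argument.

```csharp
using System;
using System.Reflection;

// Hypothetical type used only for this illustration.
class Greeter
{
    private readonly string _name;
    public Greeter(string name) { _name = name; }

    // An ordinary instance method: 'this' is an implicit argument.
    public string Greet(string greeting) => greeting + ", " + _name;
}

static class ThisAsFirstArgument
{
    // Binds Greeter.Greet to a delegate that takes the receiver explicitly.
    public static Func<Greeter, string, string> OpenGreet()
    {
        MethodInfo method = typeof(Greeter).GetMethod(nameof(Greeter.Greet))
            ?? throw new InvalidOperationException("Greet method not found.");
        return (Func<Greeter, string, string>)Delegate.CreateDelegate(
            typeof(Func<Greeter, string, string>), method);
    }

    static void Main()
    {
        var greet = OpenGreet();
        // The instance is supplied as the first argument of the call.
        Console.WriteLine(greet(new Greeter("CLR"), "Hello")); // prints "Hello, CLR"
    }
}
```

Because the receiver is just another argument, the JIT can track its liveness the same way it tracks any other local or parameter.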

To prove that this is true, we can run the following program and look at the result.

using System;

class Program
{
    internal class GcIsWeird
    {
        ~GcIsWeird()
        {
            Console.WriteLine("Finalizing instance.");
        }
        public int data = 42;
        public void DoSomething()
        {
            Console.WriteLine("Doing something. The answer is ... " + data);
            CheckReachability(this);
            Console.WriteLine("Finished doing something.");
        }
    }
    static void CheckReachability(object d)
    {
        var weakRef = new WeakReference(d);
        Console.WriteLine("Calling GC.Collect...");
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
        string message = weakRef.IsAlive ? "alive" : "dead";
        Console.WriteLine("Object is " + message);
    }
    static void Main(string[] args)
    {
        new GcIsWeird().DoSomething();
    }
}

As expected, running this program in Release mode produces the following output:

Doing something. The answer is ... 42
Calling GC.Collect...
Finalizing instance.
Object is dead
Finished doing something.

The output shows that the object was collected while its instance method was still executing. There are two ways to see how this happens:

  • First, we can use WinDbg with the SOS extension and run the !GCInfo command on the method.
  • Second, we can build CoreCLR and run the application with JIT tracing enabled.

I chose the second option. To do this, follow the instructions in the JIT Dumps section and complete these steps:

  1. Build the CoreCLR repo (do not forget to install all the prerequisites, such as the Visual Studio C++ components, CMake, and Python).
  2. Install the dotnet CLI.
  3. Create a .NET Core application.
  4. Build and publish the application.
  5. Copy the newly built CoreCLR binaries into the folder with the published application.
  6. Set several environment variables, such as COMPlus_JitDump=YourMethodName.
  7. Launch the application.

And here is the result:

*************** After end code gen, before unwindEmit()
IN0002: 000012 call     CORINFO_HELP_NEWSFAST
IN0003: 000017 mov      rcx, 0x1FE90003070
// Console.WriteLine("Doing something. The answer is ... " + data);
IN0004: 000021 mov      rcx, gword ptr [rcx]
IN0005: 000024 mov      edx, dword ptr [rsi+8]
IN0006: 000027 mov      dword ptr [rax+8], edx
IN0007: 00002A mov      rdx, rax
IN0008: 00002D call     System.String:Concat(ref,ref):ref
IN0009: 000032 mov      rcx, rax
IN000a: 000035 call     System.Console:WriteLine(ref)
// CheckReachability(this);
IN000b: 00003A mov      rcx, rsi
// After this point the 'this' pointer is eligible for collection by the GC
IN000c: 00003D call     Reachability.Program:CheckReachability(ref)
// Console.WriteLine
IN000d: 000042 mov      rcx, 0x1FE90003078
IN000e: 00004C mov      rcx, gword ptr [rcx]
IN000f: 00004F mov      rax, 0x7FFB6C6B0160
*************** Variable debug info
2 vars
  0(   UNKNOWN) : From 00000000h to 00000008h, in rcx
  0(   UNKNOWN) : From 00000008h to 0000003Ah, in rsi
*************** In gcInfoBlockHdrSave()
Register slot id for reg rsi = 0.
Set state of slot 0 at instr offset 0x12 to Live.
Set state of slot 0 at instr offset 0x17 to Dead.
Set state of slot 0 at instr offset 0x2d to Live.
Set state of slot 0 at instr offset 0x32 to Dead.
Set state of slot 0 at instr offset 0x35 to Live.
Set state of slot 0 at instr offset 0x3a to Dead.

The JIT dump differs slightly from what you see in WinDbg or in the 'Disassembly' window in Visual Studio. The main difference is that it shows much more information, including the local variables (with the ranges of instruction offsets in which they are used) and the GCInfo. It also shows the offset of each instruction, which helps in reading the GCInfo table.

In this case, it is clear that the “this” pointer is no longer needed after the instruction at offset 0x3A, that is, right before the call to CheckReachability. This is why the object was collected (and finalized) when the GC ran inside the CheckReachability method.
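One way to read the "Set state of slot 0 ..." lines in the dump is as a step function over instruction offsets. The sketch below is only a toy model for reading the dump, not how the runtime decodes GCInfo. The GcInfoToyModel class and the IsLiveAt method are hypothetical names, and the transitions are copied by hand from the dump above.

```csharp
using System;

static class GcInfoToyModel
{
    // (offset, live) transitions for register slot 0 (rsi), copied from the dump.
    static readonly (int Offset, bool Live)[] Transitions =
    {
        (0x12, true), (0x17, false),
        (0x2d, true), (0x32, false),
        (0x35, true), (0x3a, false),
    };

    // Returns the state set by the last transition at or before 'offset'.
    public static bool IsLiveAt(int offset)
    {
        bool live = false; // before the first transition the slot is not reported
        foreach (var (at, state) in Transitions)
        {
            if (at > offset) break;
            live = state;
        }
        return live;
    }

    static void Main()
    {
        Console.WriteLine(IsLiveAt(0x35)); // call System.Console:WriteLine -> True
        Console.WriteLine(IsLiveAt(0x3D)); // call CheckReachability       -> False
    }
}
```

In this model, rsi (which holds 'this') is reported as live at the first WriteLine call but not at the call to CheckReachability, which matches the observed output.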

Conclusion


The JIT and the GC work together: the JIT records auxiliary information that lets the GC collect objects as soon as the application no longer uses them.
The C# language specification says that this optimization is permitted but not required: if a local variable in the current scope is the only reference to an object, and that local variable is no longer used on any execution path of the procedure, the garbage collector may (but is not required to) treat the object as no longer in use and eligible for collection. So you should not rely on this behavior in production code.
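When code does need an object to stay reachable up to a specific point, the documented mechanism is GC.KeepAlive rather than whatever the JIT happens to do. The following is a minimal sketch under that assumption. The Resource type and the KeepAliveDemo names are hypothetical and do not come from the original program.

```csharp
using System;

static class KeepAliveDemo
{
    // Hypothetical type used only for this illustration.
    sealed class Resource
    {
        public int Data = 42;
    }

    public static bool IsAliveAfterCollect()
    {
        var resource = new Resource();
        var weakRef = new WeakReference(resource);

        // Same collection sequence as CheckReachability in the article.
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        bool alive = weakRef.IsAlive;

        // 'resource' is treated as referenced up to this call, so the
        // collections above cannot reclaim it, in Debug or Release.
        GC.KeepAlive(resource);
        return alive;
    }

    static void Main()
    {
        Console.WriteLine(IsAliveAfterCollect() ? "alive" : "dead"); // prints "alive"
    }
}
```

GC.KeepAlive does nothing at run time beyond being a use of its argument, which is enough to extend the lifetime the JIT reports to the GC up to that call.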
