When this == null: a true story from the CLR world

    I once had to debug the following C# code, which "out of the blue" crashed with a NullReferenceException:

    	public class Tester {
    		public string Property { get; set; }
    		public void Foo() {
    			this.Property = "Some string"; // NullReferenceException
    		}
    	}
    

    Yes, the NullReferenceException was thrown on this very line, the one with the property assignment. What is going on, I thought: has the runtime stopped checking that an instance exists before calling instance methods?

    As it turned out, in a sense, yes, it had. To be precise, the compiler turned out not to be the one I had assumed, and the runtime does not guarantee these checks at all. Details follow.

    For those who are not familiar with the specifics of C#, I will explain my train of thought. The class Tester has an instance method Foo and an instance property Property. Someone calls Foo, but the access to this.Property brings a surprise: the runtime throws a NullReferenceException.

    Normally this exception would mean that this == null on that line, so the statement this.Property = smth cannot access the property. But to a C# programmer this sounds completely impossible: if the method Foo was called at all, then an instance of the class exists and this cannot be null! How could a method be called on null?

    And yet there it is, the exception pointing at this very line! We start doubting everything, including our own sanity, and write the following test program in C#:

    static class Program {
        static void Main() {
            Tester t = null;
            t.Foo();
        }
    }
    

    Compile, run: yes, the program crashes with a NullReferenceException on the line t.Foo();, but it never enters the method Foo. So does the runtime forget to perform the null check under some conditions?

    Not really. (The runtime itself does not perform this check at all.) The compiler is to blame for everything, of course. Only it is not the C# compiler (which obviously obeys the rules and does not allow a method to be called on null), but the C++/CLI compiler, which compiled the code that called Foo in the original case. Yes, mentioning C++/CLI would have raised a lot of suspicion right away, so I deliberately kept quiet about it at first to make things more interesting :)

    Well, let's continue our experiments and write the same program in C++/CLI (for this you need to add a reference to the assembly containing the class Tester):

    int main() {
       Tester ^t = nullptr;
       t->Foo();
    }
    

    Compile, run, and bam! A NullReferenceException is thrown inside the method Foo, just like in the original case. That is, the instance method Foo was somehow called on a null reference, bypassing any checks.

    What is going on? We have two completely identical programs in different languages. We assume that they should compile to almost the same (or at least similar) bytecode if the compilers of both languages comply with the CLI specification. So we start examining the generated bytecode. We take ildasm and disassemble the C# program. Here is the complete listing of the method Program.Main (in the comments I quote the source lines corresponding to the bytecode):

    .method private hidebysig static void  Main() cil managed
    {
      .entrypoint
      // Code size       11 (0xb)
      .maxstack  1
      .locals init ([0] class [Shared]ThisIsNull.Tester t)
      IL_0000:  nop
      IL_0001:  ldnull
      IL_0002:  stloc.0 // Tester t = null;
      IL_0003:  ldloc.0
      IL_0004:  callvirt   instance void [Shared]ThisIsNull.Tester::Foo() // t.Foo()
      IL_0009:  nop
      IL_000a:  ret
    }
    

    The most interesting line here is IL_0004. We see that the compiler called the method Foo using the callvirt instruction. Now compare with the corresponding C++/CLI code:

    .method assembly static int32 modopt([mscorlib]System.Runtime.CompilerServices.CallConvCdecl) 
            main() cil managed
    {
      .vtentry 1 : 1
      // Code size       12 (0xc)
      .maxstack  1
      .locals ([0] class [Shared]ThisIsNull.Tester t)
      IL_0000:  ldnull
      IL_0001:  stloc.0 // Tester ^t = nullptr;
      IL_0002:  ldnull
      IL_0003:  stloc.0 // t = nullptr;
      IL_0004:  ldloc.0
      IL_0005:  call       instance void [Shared]ThisIsNull.Tester::Foo() // t->Foo();
      IL_000a:  ldc.i4.0
      IL_000b:  ret
    }
    

    Of the differences that interest us, apart from the variable being set to null twice, the method is called here with call rather than callvirt.

    The CIL instruction callvirt is indeed intended for virtual calls. However, it has one more small feature: since virtual calls in the CLI are usually made through the virtual method table, callvirt is also responsible for checking the reference for null and throwing a NullReferenceException if the check fails.

    The instruction call simply calls the method without checking the reference (and without using the virtual dispatch mechanism).
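
    The difference between the two instructions can also be reproduced from C# alone by emitting the IL by hand. The following is a minimal sketch, not the code from the original investigation: the method IsThisNull and the helper BuildCaller are hypothetical names introduced for illustration, and the sketch assumes a runtime where System.Reflection.Emit.DynamicMethod is available (.NET Framework or modern .NET).

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public class Tester
{
    // Hypothetical non-virtual instance method: it never dereferences 'this',
    // it only reports whether it was entered with a null 'this'.
    public bool IsThisNull() { return this == null; }
}

public static class Program
{
    // Builds a delegate whose body is: ldnull; <opcode> Tester::IsThisNull(); ret
    static Func<bool> BuildCaller(OpCode opcode)
    {
        MethodInfo target = typeof(Tester).GetMethod("IsThisNull");
        var method = new DynamicMethod(
            "Caller", typeof(bool), Type.EmptyTypes, typeof(Program).Module);

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldnull);   // push a null reference as 'this'
        il.Emit(opcode, target);   // either call or callvirt
        il.Emit(OpCodes.Ret);

        return (Func<bool>)method.CreateDelegate(typeof(Func<bool>));
    }

    public static void Main()
    {
        // call: no null check, so the method body runs with this == null.
        Console.WriteLine("call: this == null is " + BuildCaller(OpCodes.Call)());

        // callvirt: the null check fails before the method body is entered.
        try
        {
            BuildCaller(OpCodes.Callvirt)();
        }
        catch (NullReferenceException)
        {
            Console.WriteLine("callvirt: NullReferenceException at the call site");
        }
    }
}
```

    With call, the method body should run and report that this == null is True. With callvirt, the exception should be thrown at the call site, the same pattern as in the C++/CLI and C# programs above.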

    It turns out that the C# compiler simply exploits this feature of callvirt and therefore generates it for practically all calls (except static calls and explicit calls to base class methods through base.), just because it protects the code from calling a method on a null reference. The C++/CLI compiler, meanwhile, follows the good old Wild West laws of undefined behavior: if the reference is invalid, then the behavior of the program is undefined as well. If the compiler knows that the method is not virtual, it will not bother to generate a virtual call.

    Whether this behavior of the C# compiler affects performance, and if so, to what extent, is an open question. In principle, in most cases the JIT should be able to optimize and inline such code when the called methods are not actually virtual. In this respect the C# compiler relies entirely on the JIT and makes no optimization attempts of its own.

    In light of these facts, the following fragment of the published source code of System.String, which once raised questions on StackOverflow, is also interesting:

            public bool Equals(String value) { 
                if (this == null)                        //this is necessary to guard against reverse-pinvokes and
                    throw new NullReferenceException();  //other callers who do not use the callvirt instruction
                if (value == null) 
                    return false;
                if (Object.ReferenceEquals(this, value)) 
                    return true;
                return EqualsHelper(this, value);
            }
    

    Now it is clear what the comment is about (these comments were not always there, though), and under what conditions this check can fire.

    In several methods the framework developers had to guard against method calls on null in this way. The point is that string comparison in the method EqualsHelper is implemented with unsafe code, which may well try to access memory at address zero, and that will certainly lead to all kinds of bad consequences.
    UPD: The user a553 correctly notes in the comments that, with this code, the developers also fixed a potential bug in which the call ((string)null).Equals(null) could return false instead of failing with a NullReferenceException as it should.

    Conclusions:


    1. The CLI does not guarantee that this != null, even when instance methods and properties are invoked.
    2. The C# compiler upholds this rule when generating bytecode for C# code, but your code can also be called from other languages.
    3. In particular, the C++/CLI compiler does not follow this rule and may well pass control to an instance method without any instance behind it.
    4. It follows that your code can sometimes be called with this == null for various reasons (code generation, reflection, compilers of other languages), and you need to be prepared for this. If you are developing a library intended for wide use in an interop environment, it may even be worth adding null checks on this to the public methods of externally accessible classes.

    PS:


    All the code used in the article is available on GitHub.
