Objective-C Runtime for C Programmers. Part 2



    Hello again. This series of articles is for programmers who have moved from C to Objective-C and want answers to the questions “how exactly is Objective-C built on top of C?” and “how does it all work on the inside?”.

    Many thanks to everyone for the feedback: the interest you showed is what motivates me to continue this thorough study of the Objective-C Runtime. I am opening this part with the scope of the series because I want to make a couple of clarifications:

    1. These articles are not a guide to Objective-C. We study the Objective-C Runtime at a level low enough to be understood in terms of the C language.
    2. These articles are not a guide to the C language or to debuggers. We go down to the level of C, but no lower. So I do not cover topics such as how data is represented in memory; I assume you already know all that.


    Of course, the articles may interest other kinds of programmers as well. Just keep these two points in mind.

    If you have not read the first article, I strongly recommend reading it first: http://habrahabr.ru/post/250955/ . If you already have, read on below the cut.

    We “call methods”, pedants “send messages”



    In the previous article we looked at a “method call” or, as it is also known, “sending a message”:

    [myObj someMethod];


    We concluded that at run time such a construct ultimately boils down to a call to the objc_msgSend() function, and we took a good look at selectors.

    Now let's examine objc_msgSend() itself in detail, to understand the principles behind this famous sending of messages to an object.

    This function is called every time you call a method on an object. It is reasonable to assume that its speed strongly affects the speed of the whole application. Accordingly, if you look at its source code, you will find that it is implemented in assembly language for each platform.

    Before digging into the source code, I suggest reading the documentation:

    ...

    The messaging function does everything necessary for dynamic binding:

    • It first finds the procedure (method implementation) that the selector refers to. Since the same method can be implemented differently by separate classes, the exact procedure that it (the objc_msgSend function, author's note) finds depends on the class of the receiver (the object we send the message to, author's note).
    • It then calls that procedure, passing it the receiving object (a pointer to it) and all the arguments that were specified in the method call.
    • Finally, it returns the result of the procedure as its own result.

    ...
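    The three steps from the documentation can be modelled in plain C. This is only a minimal sketch under loud assumptions: the types Object, Selector, Implementation and MethodEntry and the function send_message() are hypothetical stand-ins invented for illustration, selectors are modelled as C strings, and every "method" takes one int and returns an int. The real objc_msgSend() is written in assembly and works very differently inside.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the runtime's types; NOT the real definitions. */
typedef struct Object Object;
typedef const char *Selector;   /* a selector modelled as a C string */
typedef int (*Implementation)(Object *self, Selector cmd, int arg);

typedef struct {
    Selector       name;
    Implementation imp;
} MethodEntry;

struct Object {
    const MethodEntry *methods;  /* methods of the receiver's class */
    size_t             count;
    int                value;
};

/* Step 1: find the procedure that the selector refers to for this receiver. */
static Implementation find_implementation(const Object *receiver, Selector sel) {
    for (size_t i = 0; i < receiver->count; i++) {
        if (strcmp(receiver->methods[i].name, sel) == 0) {
            return receiver->methods[i].imp;
        }
    }
    return NULL;
}

/* Steps 2 and 3: call the procedure with the receiver and the arguments,
   then hand its result back as our own result. */
static int send_message(Object *receiver, Selector sel, int arg) {
    Implementation imp = find_implementation(receiver, sel);
    if (imp == NULL) {
        return -1;  /* sketch only: the real runtime does not return a sentinel */
    }
    return imp(receiver, sel, arg);
}

/* Two example "methods" of a hypothetical class. */
static int add_to_value(Object *self, Selector cmd, int arg) {
    (void)cmd;
    return self->value + arg;
}

static int multiply_value(Object *self, Selector cmd, int arg) {
    (void)cmd;
    return self->value * arg;
}

static const MethodEntry counter_methods[] = {
    { "add:",      add_to_value   },
    { "multiply:", multiply_value },
};
```

    With an object declared as Object obj = { counter_methods, 2, 10 }, the call send_message(&obj, "add:", 5) finds add_to_value(), calls it with the receiver and the argument, and passes its result back to the caller.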


    A lyrical digression
    From the documentation alone we can already see that the phrase “call a method” is perfectly correct when applied to Objective-C. So if some wise guy corrects you, insisting that the right thing to say is “send a message” and not “call a method”, feel free to send him off with a well-known two-word phrase: read the documentation.


    Well, the second and third points are clear enough. But the first needs a closer look: how does a completely abstract selector turn into a very specific function?

    Talking to a class's methods in C



    Since the objc_msgSend() function we already know first of all looks up the function that implements the called method, we can find that function and call it ourselves.

    Let's write a small test program that lets us take a closer look at a method call:

    #import <Foundation/Foundation.h>
    #import <objc/runtime.h>
    @interface TestClass : NSObject
    - (void)someMethod;
    - (void)callSomeMethod;
    - (void)methodWithParam:(const char *)param;
    @end
    @implementation TestClass
    - (void)someMethod {
      NSLog(@"Hello from %p.%s!", self, _cmd);
    }
    - (void)callSomeMethod {
      NSLog(@"Hello from %p.%s!", self, _cmd);
      [self someMethod];
    }
    - (void)methodWithParam:(const char *)param {
      NSLog(@"Hello from %p.%s! My parameter is: <%s>", self, _cmd, param);
    }
    @end
    int main(int argc, const char * argv[]) {
      TestClass * myObj = [[TestClass alloc] init];
      [myObj someMethod];
      [myObj callSomeMethod];
      [myObj methodWithParam:"I'm a parameter"];
      return 0;
    }


    From the documentation we learn that, when calling the target function, objc_msgSend() passes it parameters in the following order:

    1. A pointer to the object whose method we called
    2. The selector by which we called the method
    3. The remaining arguments that we passed to the method


    That is why our test program looks the way it does: each method logs self and _cmd, which hold a pointer to the object itself and the selector, respectively.

    If you run this program, the output will look roughly like this:
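    Seen from C, a method is then just a function whose first two parameters are the hidden self and _cmd. Here is a minimal sketch of that shape, assuming a hypothetical Receiver struct in place of a real object and a C string in place of a real SEL; the function name method_with_param() is invented for illustration and returns its message instead of logging it.

```c
#include <stdio.h>

/* Hypothetical stand-ins; NOT the real id and SEL types. */
typedef struct { int id; } Receiver;
typedef const char *Selector;

/* Models - (void)methodWithParam:(const char *)param as a C function.
   Parameter order: 1) the receiver, 2) the selector, 3) declared arguments. */
static const char *method_with_param(Receiver *self, Selector cmd, const char *param) {
    static char message[128];
    snprintf(message, sizeof message,
             "Hello from %d.%s! My parameter is: <%s>", self->id, cmd, param);
    return message;
}

/* The matching function pointer type, analogous to a typed IMP. */
typedef const char *(*MethodWithParamModel)(Receiver *, Selector, const char *);
```

    A caller that holds a MethodWithParamModel pointer has to supply all three arguments itself, which is exactly what objc_msgSend() does on our behalf.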

    2015-02-21 12:43:18.817 ObjCRuntimeTest[7092:2454834] Hello from 0x1002061f0.someMethod!
    2015-02-21 12:43:18.818 ObjCRuntimeTest[7092:2454834] Hello from 0x1002061f0.callSomeMethod!
    2015-02-21 12:43:18.819 ObjCRuntimeTest[7092:2454834] Hello from 0x1002061f0.someMethod!
    2015-02-21 12:43:18.819 ObjCRuntimeTest[7092:2454834] Hello from 0x1002061f0.methodWithParam:! My parameter is: <I'm a parameter>


    Now let's try to call these methods using plain C. To do that, we ask the object for a pointer to the function that implements a method of our class. Since we are working at the level of C, we need to define types that let us work with pointers to these functions. With all that in mind, main() becomes:

    int main(int argc, const char * argv[]) {
      // C function pointer types matching the methods' actual signatures:
      // the hidden self and _cmd arguments come first.
      typedef void (*MethodWithoutParams)(id, SEL);
      typedef void (*MethodWithParam)(id, SEL, const char *);
      TestClass * myObj = [[TestClass alloc] init];
      // methodForSelector: returns a generic IMP, so cast it to the concrete type.
      MethodWithoutParams someMethodImplementation = (MethodWithoutParams)[myObj methodForSelector:@selector(someMethod)];
      MethodWithoutParams callSomeMethodImplementation = (MethodWithoutParams)[myObj methodForSelector:@selector(callSomeMethod)];
      MethodWithParam methodWithParamImplementation = (MethodWithParam)[myObj methodForSelector:@selector(methodWithParam:)];
      // Call the implementations directly, passing self and _cmd by hand.
      someMethodImplementation(myObj, @selector(someMethod));
      callSomeMethodImplementation(myObj, @selector(callSomeMethod));
      methodWithParamImplementation(myObj, @selector(methodWithParam:), "I'm a parameter");
      return 0;
    }


    Well, we have now called methods purely by means of C. The only exception was the selectors, which we already covered fairly thoroughly in the previous article. The only black box left for us is the methodForSelector: method.

    The messaging mechanism in the Objective-C Runtime



    The key to how messaging is implemented in the Objective-C Runtime lies in how the compiler represents your classes and objects.

    In C++ terms, objects are created in memory not only for each instance of your classes but also for each class itself. That is, if you declare a class that inherits from the base class NSObject and create two instances of it, at run time you get the two objects you created plus one object for your class.

    This class object contains a pointer to the parent class's object and a table that maps selectors to function addresses, called the dispatch table. It is this table that objc_msgSend() uses to find the function that must be called for the selector passed to it.

    Every object whose class inherits from NSObject or NSProxy has an isa field, which is precisely a pointer to its class object. When you call a method on an object, objc_msgSend() follows the isa pointer to the class object and looks there for the address of the function that implements the method. If it does not find one, it moves on to the parent class's object and looks there. This continues until the function is found. If it is not found anywhere, including in the NSObject class object, the well-known exception is thrown:

    unrecognized selector sent to instance ...
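
    The walk from isa up through the parent classes can be sketched in C as follows. This is a simplified model, not the runtime's actual data layout: ClassObject, Instance, DispatchEntry and lookup() are hypothetical names, selectors are C strings, and a failed search simply returns NULL instead of raising an exception.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model types; NOT the real runtime structures. */
typedef struct ClassObject ClassObject;
typedef struct Instance Instance;
typedef const char *Selector;
typedef const char *(*Implementation)(Instance *self, Selector cmd);

typedef struct {
    Selector       selector;
    Implementation imp;
} DispatchEntry;

struct ClassObject {
    const ClassObject   *superclass;      /* NULL for the root class */
    const DispatchEntry *dispatch_table;  /* selector -> function address */
    size_t               entry_count;
};

struct Instance {
    const ClassObject *isa;               /* pointer to the class object */
};

/* Follow isa to the class object, then climb the superclass chain
   until some dispatch table contains the selector. */
static Implementation lookup(const Instance *obj, Selector sel) {
    for (const ClassObject *cls = obj->isa; cls != NULL; cls = cls->superclass) {
        for (size_t i = 0; i < cls->entry_count; i++) {
            if (strcmp(cls->dispatch_table[i].selector, sel) == 0) {
                return cls->dispatch_table[i].imp;
            }
        }
    }
    return NULL;  /* in the real runtime: forwarding, then "unrecognized selector" */
}

static const char *root_description(Instance *self, Selector cmd) {
    (void)self; (void)cmd;
    return "root description";
}

static const char *child_some_method(Instance *self, Selector cmd) {
    (void)self; (void)cmd;
    return "child someMethod";
}

static const DispatchEntry root_table[]  = { { "description", root_description } };
static const DispatchEntry child_table[] = { { "someMethod", child_some_method } };

static const ClassObject root_class  = { NULL, root_table, 1 };
static const ClassObject child_class = { &root_class, child_table, 1 };
```

    For an Instance whose isa is &child_class, looking up "someMethod" succeeds in the first table, while "description" is only found after the search moves up to root_class.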


    And in reality...
    These days the fairly slow lookup process is somewhat optimized. Once the implementation of a method you call has been found, it is placed in a cache table. So if you call methodForSelector: on an object, the first call searches for the function; once it is found in the NSObject class object, it is cached in the table of your own class, and the next lookup takes very little time.

    In addition, the exception is not raised immediately when no implementation is found. There is a mechanism called Message Forwarding.
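
    The caching idea can be sketched in C as well. Everything here is an assumption made for illustration: the CachedClass type, the fixed-size linear cache and the slow_lookups counter are hypothetical, and the real runtime's cache is a hash table with its own layout and eviction rules.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model; NOT the real runtime cache. */
typedef const char *Selector;
typedef int (*Implementation)(void);

typedef struct {
    Selector       selector;
    Implementation imp;
} Entry;

#define CACHE_SIZE 4

typedef struct CachedClass CachedClass;
struct CachedClass {
    CachedClass *superclass;
    const Entry *methods;
    size_t       method_count;
    Entry        cache[CACHE_SIZE];  /* per-class cache of resolved methods */
    size_t       cache_count;
    unsigned     slow_lookups;       /* counts full searches, for demonstration */
};

/* Full search through the class and all of its ancestors. */
static Implementation slow_lookup(const CachedClass *cls, Selector sel) {
    for (; cls != NULL; cls = cls->superclass) {
        for (size_t i = 0; i < cls->method_count; i++) {
            if (strcmp(cls->methods[i].selector, sel) == 0) {
                return cls->methods[i].imp;
            }
        }
    }
    return NULL;
}

/* Check the receiver class's cache first; on a miss, search the hierarchy
   and remember the result in the cache of the class we started from. */
static Implementation cached_lookup(CachedClass *cls, Selector sel) {
    for (size_t i = 0; i < cls->cache_count; i++) {
        if (strcmp(cls->cache[i].selector, sel) == 0) {
            return cls->cache[i].imp;  /* cache hit */
        }
    }
    cls->slow_lookups++;
    Implementation imp = slow_lookup(cls, sel);
    if (imp != NULL && cls->cache_count < CACHE_SIZE) {
        cls->cache[cls->cache_count].selector = sel;
        cls->cache[cls->cache_count].imp = imp;
        cls->cache_count++;
    }
    return imp;
}

static int root_method(void) { return 42; }

static const Entry root_methods[] = { { "rootMethod", root_method } };
```

    If a child class with no methods of its own inherits from a root class that uses root_methods, the first cached_lookup() on the child walks up to the root, and the second one is answered from the child's own cache without another walk.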


    Let's confirm this with some actual research based on the source code of the Objective-C Runtime and the NSObject class.

    As we already know, NSObject has a methodForSelector: method, whose source code looks like this:

    + (IMP)methodForSelector:(SEL)sel {
        if (!sel) [self doesNotRecognizeSelector:sel];
        return object_getMethodImplementation((id)self, sel); // self is a pointer to the class object
    }
    - (IMP)methodForSelector:(SEL)sel {
        if (!sel) [self doesNotRecognizeSelector:sel];
        return object_getMethodImplementation(self, sel); // self is a pointer to our object
    }


    As we can see, this method is implemented both for the class itself and for instances of the class. In both cases the same object_getMethodImplementation() function is used:

    IMP object_getMethodImplementation(id obj, SEL name)
    {
        Class cls = (obj ? obj->getIsa() : nil);
        return class_getMethodImplementation(cls, name);
    }


    Stop! What is this "(obj ? obj->getIsa() : nil)" construct!? After all, every article tells us ...



    It all starts with the build settings in the Objective-C Runtime project file:

    CLANG_CXX_LANGUAGE_STANDARD = "gnu++0x";
    CLANG_CXX_LIBRARY = "libc++";


    And here is the implementation of the thoroughly C++ getIsa() method:

    inline Class 
    objc_object::getIsa() 
    {
        if (isTaggedPointer()) {
            uintptr_t slot = ((uintptr_t)this >> TAG_SLOT_SHIFT) & TAG_SLOT_MASK;
            return objc_tag_classes[slot];
        }
        return ISA();
    }


    In general, it just so happens that any object in Objective-C must contain an isa field. The class object is no exception.

    This whole mess is rather confusing. The methodForSelector: method has an absolutely identical implementation as an instance method and as a class method. The only difference is that in the first case self points to our object, and in the second to the class object.

    What the hell!? How can we call obj->getIsa() on a class object? What is going on here?

    The thing is that the class object really does have an isa field, which points to the "class object for this class". Properly speaking, it points to a metaclass. When you call an instance method (one that starts with "-"), its implementation is looked up in the object's class. When you call a class method (one that starts with "+"), its implementation is looked up in the class's metaclass.

    I fibbed a little earlier in this article when I said that creating two objects of your class gives you three objects at run time: two instances of your class and a class object. In fact, a class object is always created together with a metaclass object. So in the end you get 4 objects.
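
    The instance, class and metaclass relationship can be modelled in C so that one lookup routine serves both instance methods and class methods. The sketch below rests on assumptions: ModelObject, ModelClass and lookup() are hypothetical names, the metaclass's own isa is left NULL for brevity (in the real runtime it is not), and inheritance is omitted.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model types; NOT the real runtime structures. */
typedef const char *Selector;
typedef struct ModelClass ModelClass;

/* Anything that can receive a message starts with an isa pointer. */
typedef struct {
    const ModelClass *isa;
} ModelObject;

typedef const char *(*Implementation)(const ModelObject *self, Selector cmd);

typedef struct {
    Selector       selector;
    Implementation imp;
} Entry;

struct ModelClass {
    ModelObject       base;        /* a class is an object too: isa -> metaclass */
    const ModelClass *superclass;
    const Entry      *methods;
    size_t            method_count;
};

/* The same routine for every receiver: follow isa, then search upwards. */
static Implementation lookup(const ModelObject *receiver, Selector sel) {
    for (const ModelClass *cls = receiver->isa; cls != NULL; cls = cls->superclass) {
        for (size_t i = 0; i < cls->method_count; i++) {
            if (strcmp(cls->methods[i].selector, sel) == 0) {
                return cls->methods[i].imp;
            }
        }
    }
    return NULL;
}

static const char *instance_method(const ModelObject *self, Selector cmd) {
    (void)self; (void)cmd;
    return "instance method";
}

static const char *class_method(const ModelObject *self, Selector cmd) {
    (void)self; (void)cmd;
    return "class method";
}

static const Entry instance_methods[] = { { "someInstanceMethod", instance_method } };
static const Entry class_methods[]    = { { "someClassMethod", class_method } };

/* "+" methods live in the metaclass, "-" methods in the class. */
static const ModelClass test_metaclass = { { NULL }, NULL, class_methods, 1 };
static const ModelClass test_class     = { { &test_metaclass }, NULL, instance_methods, 1 };
```

    Passing an instance (whose isa is &test_class) to lookup() searches the class, while passing &test_class.base searches the metaclass. Neither receiver sees the other's methods, which mirrors the "-" and "+" split described above.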

    To help picture this whole mess, I will insert the illustration from this article here:



    Let's return to our case, where the class_getMethodImplementation() function is ultimately called for the class obtained from self:

    IMP class_getMethodImplementation(Class cls, SEL sel)
    {
        IMP imp;
        if (!cls  ||  !sel) return nil;
        imp = lookUpImpOrNil(cls, sel, nil, 
                             YES/*initialize*/, YES/*cache*/, YES/*resolver*/);
        // Translate forwarding function to C-callable external version
        if (!imp) {
            return _objc_msgForward;
        }
        return imp;
    }


    The curious can trace how lookUpImpOrNil() uses the lookUpImpOrForward() function, whose implementation is again available on the Apple website. That function is written in C, so you can verify for yourself that everything works exactly as the documentation says.
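
    The fallback visible in class_getMethodImplementation() above, where a failed lookup yields the forwarding entry point rather than nil, can be sketched in C like this. The names look_up_or_null(), get_implementation() and forward_stub() are hypothetical stand-ins for lookUpImpOrNil(), class_getMethodImplementation() and _objc_msgForward; the single flat method table is an assumption made to keep the sketch short.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model; NOT the real runtime functions. */
typedef const char *Selector;
typedef const char *(*Implementation)(void);

typedef struct {
    Selector       selector;
    Implementation imp;
} Entry;

static const char *some_method(void)  { return "someMethod called"; }

/* Stand-in for _objc_msgForward: the entry point into message forwarding. */
static const char *forward_stub(void) { return "forwarding started"; }

static const Entry methods[] = { { "someMethod", some_method } };

/* Plays the role of lookUpImpOrNil(): may come back empty-handed. */
static Implementation look_up_or_null(Selector sel) {
    for (size_t i = 0; i < sizeof methods / sizeof methods[0]; i++) {
        if (strcmp(methods[i].selector, sel) == 0) {
            return methods[i].imp;
        }
    }
    return NULL;
}

/* Plays the role of class_getMethodImplementation(): for a valid selector
   it always returns something callable, substituting the forwarding stub
   when no implementation exists. */
static Implementation get_implementation(Selector sel) {
    if (sel == NULL) {
        return NULL;
    }
    Implementation imp = look_up_or_null(sel);
    return imp != NULL ? imp : forward_stub;
}
```

    This is why a caller holding the returned pointer can always invoke it: an unknown selector leads into forwarding instead of a jump through a null pointer.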

    Summing up



    And finally, as last time, let's call methods purely by means of C:

    #import <Foundation/Foundation.h>
    #import <objc/runtime.h>
    @interface TestClass : NSObject
    @end
    @implementation TestClass
    + (void)someClassMethod {
      NSLog(@"Hello from some class method!");
    }
    - (void)someInstanceMethod {
      NSLog(@"Hello from some instance method!");
    }
    @end
    int main(int argc, const char * argv[]) {
      typedef void (*MyMethodType)(id, SEL);
      TestClass * myObj = [[TestClass alloc] init];
      Class myObjClassObject = object_getClass(myObj);
      Class myObjMetaclassObject = object_getClass(myObjClassObject);
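      // class_getMethodImplementation() returns a generic IMP, so cast it to the concrete type.
      MyMethodType instanceMethod = (MyMethodType)class_getMethodImplementation(myObjClassObject, @selector(someInstanceMethod));
      MyMethodType classMethod = (MyMethodType)class_getMethodImplementation(myObjMetaclassObject, @selector(someClassMethod));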
      instanceMethod(myObj, @selector(someInstanceMethod));
      classMethod(myObjClassObject, @selector(someClassMethod));
      return 0;
    }


    In fact, we are still far from fully understanding the messaging mechanism in Objective-C. For example, we have not yet figured out how results are returned from the called methods. But you can read about that in the following parts :).
