Calling a function with an "unknown" name in C++. Part 1: cdecl

    Problem statement


    What do I mean by an "unknown" function name? I mean that the name of the function, its parameters and, finally, its calling convention become known only at run time. Let's take up the challenge! =)

    In this part we will call a function that follows the cdecl calling convention.
    Wikipedia excerpt:
    The cdecl calling convention is used by many C systems for the x86 architecture. In cdecl, function parameters are pushed on the stack in a right-to-left order. Function return values are returned in the EAX register (except for floating point values, which are returned in the x87 register ST0). Registers EAX, ECX, and EDX are available for use in the function.

    In short: parameters are passed on the stack in reverse order, and the result is returned in EAX, except for floating-point numbers, which are returned on the x87 register stack.

    Let's draw up a work plan:
    1) Build a buffer in memory that can be copied to the stack unchanged, word by word (a word is 4 bytes)
    2) Find out the address of the function that we will call
    3) Push the buffer onto the stack word by word
    4) Call the function
    5) Fetch the result

    Go!



    What we have:
    1) char * sName - the name of the function
    2) int N - the number of parameters
    3) enum CParamType {cptNone = 0, cptPointer, cptInt, cptDouble} - the possible data types; let's restrict ourselves to these
    4) CParamType Params [] - the list of parameter types
    5) void * ParamList [] - pointers to the variables that hold the parameter values
    6) CParamType RetType - the data type of the result
    7) void * Ret - pointer to the memory where the result should be stored
    8) enum CCallConvention {cccNone = 0, cccCDecl, cccStdCall, cccFastCall} - the kinds of calling conventions
    9) CCallConvention conv - the calling convention. To begin with, we will call only cdecl functions.

    These declarations are necessary and sufficient to make the call.
    C and C++ offer no built-in way to perform this operation, so we have to turn to assembler.

    1. Create a buffer


    First, count the number of words. It is simple: void * and int take 4 bytes (1 word), double takes 8 bytes (2 words).
    int WordCount = 0;
    for (int i = 0; i < N; i++)
    {
      switch (Params[i])
      {
        case cptPointer:
        case cptInt:
            WordCount++;       // 4 bytes, one word
            break;
        case cptDouble:
            WordCount += 2;    // 8 bytes, two words
            break;
        default:
            break;             // cptNone takes no space
      }
    }
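
    As a worked example of the count above, here is a minimal self-contained sketch. The helper names WordsFor and CountWords are hypothetical (they do not appear in the original code), and the sizes assume 32-bit x86, where a pointer and an int are both 4 bytes.

```cpp
#include <cassert>

enum CParamType { cptNone = 0, cptPointer, cptInt, cptDouble };

// Number of 4-byte stack words one parameter occupies (32-bit x86 assumed).
int WordsFor(CParamType type) {
    switch (type) {
        case cptPointer:
        case cptInt:    return 1;  // 4 bytes
        case cptDouble: return 2;  // 8 bytes
        default:        return 0;  // cptNone: nothing is pushed
    }
}

// Total number of words needed for a parameter list of length n.
int CountWords(const CParamType* params, int n) {
    assert(n >= 0);
    int words = 0;
    for (int i = 0; i < n; ++i) {
        words += WordsFor(params[i]);
    }
    return words;
}
```

    For a signature such as (int, double, pointer) this gives 1 + 2 + 1 = 4 words, that is, a 16-byte buffer.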


    Counted. We allocate memory:
    void* Buffer = new char[4*WordCount];

    We fill the buffer: void * and int are copied without changes, and in a double we swap the two words.
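    // memcpy needs #include <string.h>
    int offset = 0;
    double x;
    // The push loop in step 3 pushes the first word of the buffer first, so that
    // word ends up at the highest address. The buffer is therefore filled from
    // the last parameter to the first.
    for (int i = N - 1; i >= 0; i--)
    {
      char* dst = (char*)Buffer + offset;
      switch (Params[i])
      {
        case cptPointer:
        case cptInt:
            *(int*)dst = *((int*)(ParamList[i]));
            offset += 4;
            break;
        case cptDouble:
            x = *((double*)(ParamList[i]));
            memcpy(dst + 4, &x, 4);           // low word is pushed second
            memcpy(dst, (char*)&x + 4, 4);    // high word is pushed first
            offset += 8;
            break;
        default:
            break;
      }
    }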


    There is little to comment on here: offset is the current write position in the buffer.
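
    The buffer layout can also be sketched in a portable form that is easy to check on any machine. This is an illustration under stated assumptions, not the author's code: BuildPushBuffer is a hypothetical helper, pointers are assumed to be 4 bytes wide as on 32-bit x86, and the words are assumed to be pushed starting from the first element of the buffer, as the loop in step 3 does.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

enum CParamType { cptNone = 0, cptPointer, cptInt, cptDouble };

// Returns the words in the order in which they must be pushed:
// last parameter first, and for a double the high word before the low word.
std::vector<std::uint32_t> BuildPushBuffer(const CParamType* params,
                                           void* const* paramList, int n) {
    assert(n >= 0);
    std::vector<std::uint32_t> words;
    for (int i = n - 1; i >= 0; --i) {
        switch (params[i]) {
            case cptPointer:  // assumption: 4 bytes wide, as on 32-bit x86
            case cptInt: {
                std::uint32_t w;
                std::memcpy(&w, paramList[i], sizeof w);
                words.push_back(w);
                break;
            }
            case cptDouble: {
                std::uint64_t bits;
                std::memcpy(&bits, paramList[i], sizeof bits);
                words.push_back(static_cast<std::uint32_t>(bits >> 32));  // high word
                words.push_back(static_cast<std::uint32_t>(bits));        // low word
                break;
            }
            default:
                break;  // cptNone contributes nothing
        }
    }
    return words;
}
```

    For the parameters (int 7, double 2.0) the result is the three words 0x40000000, 0x00000000, 7: the double comes first with its halves swapped, and the int, which is the first parameter, comes last.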

    2. Find out the address of the function


    Everything is quite simple here.
    void* addr = dlsym(NULL,sName);
    The first parameter is the library handle; NULL means search in the current context.
    We include dlfcn.h and do not forget to add -ldl to the linker options.
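
    A minimal lookup sketch follows. FindFunction is a hypothetical helper, and instead of passing NULL to dlsym directly it obtains a handle for the running program with dlopen(nullptr, ...), which is the form described by POSIX. The example assumes a dynamically linked program on a system that provides dlfcn.h.

```cpp
#include <cassert>
#include <cstddef>
#include <dlfcn.h>

// Looks a symbol up in the global scope of the running program.
// Returns nullptr when the name is not found.
void* FindFunction(const char* name) {
    assert(name != nullptr);
    // A null file name asks for a handle to the main program; lookups through
    // it also see the shared libraries that were loaded with the program.
    void* self = dlopen(nullptr, RTLD_LAZY);
    if (self == nullptr) {
        return nullptr;
    }
    void* addr = dlsym(self, name);
    dlclose(self);  // only drops the extra reference taken by dlopen above
    return addr;
}
```

    Looking up "strlen" this way should return the address of the C library function. Functions defined in the executable itself are usually found only if they are exported to the dynamic symbol table, for example by linking with -rdynamic on a GNU toolchain, and C++ functions need extern "C" to keep an unmangled name.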

    3. Push the buffer onto the stack word by word


    Phew. Now for the most interesting part.
    To work with the stack we naturally need assembler. I use the GNU compiler, so the assembler uses AT&T syntax. No complaints, please: I don't really like it myself, but I have no choice.
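    // The string below is continued by the instructions from steps 4 and 5
    // (label l2, the call and the move of the result).
    // Clobbers: the loop overwrites eax, ebx and ecx; the called function
    // may also change edx, the flags and memory.
    asm("\
          movl $0, %%eax;\
          movl %2,%%ebx; \
          movl %3,%%ecx; \
          l1: cmpl %%ecx, %%eax; \
          je l2;\
          pushl (%%ebx,%%eax,4); \
          addl $1,%%eax;\
          jmp l1;"
    :"=r"(b)
    : "r"(addr), "r"(Buffer), "g"(WordCount)
    : "%eax", "%ebx", "%ecx", "%edx", "memory", "cc"
    );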


    We make a loop: eax counts from 0 up to ecx (WordCount), and on each iteration we push the next word of the buffer onto the stack.
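
    To see what the called function finds on the stack after this loop, the pushes can be simulated in plain C++. This is only a model for illustration: SimulatePushLoop is a hypothetical helper, and the vector stands in for a stack that grows toward lower addresses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simulates the push loop on a downward-growing stack and returns the words
// as the callee finds them: index 0 is the word at the lowest address,
// that is, the first word after the return address.
std::vector<std::uint32_t> SimulatePushLoop(const std::vector<std::uint32_t>& buffer) {
    std::vector<std::uint32_t> memory(buffer.size());
    std::size_t esp = memory.size();  // empty stack: esp is at the top
    for (std::size_t eax = 0; eax != buffer.size(); ++eax) {
        --esp;                        // pushl first moves esp down...
        memory[esp] = buffer[eax];    // ...then stores the word there
    }
    return memory;
}
```

    Pushing the buffer 0x40000000, 0, 7 leaves 7 at the lowest address, followed by 0 and 0x40000000. The callee therefore reads the int 7 as its first argument and the little-endian double 2.0 as its second, which is why the words of a double are swapped when the buffer is filled.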

    4. Call the function



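    We do this with
    l2: call *%1;
    after filling the stack. %1 is the pointer to the function (addr). Because cdecl leaves stack cleanup to the caller, the pushed words also have to be removed after the call, for example by adding 4*WordCount to %esp.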

    5. Return the result



    There are 2 options: an integer result or a floating-point one. According to the convention, the result is in %eax by default, but a floating-point result is on the x87 register stack.
    1) An integer result
    movl %%eax, %0;
    where %0 is the result variable.

    2) A floating-point result
    In theory, the answer has to be fetched from ST(0). So far I have not managed to do this. I would like to see possible solutions in the comments. Thanks in advance.
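
    One possible approach, offered as a sketch and not as the author's solution, is to pop ST(0) into memory with the x87 instruction fstpl. In the example below fldl stands in for the called function: it leaves a double in ST(0), just as a cdecl function returning double would. RoundTripThroughSt0 is a hypothetical name, and the inline assembly assumes GCC or Clang on x86.

```cpp
#include <cassert>

// Loads `value` into ST(0) and then pops it back into memory.
double RoundTripThroughSt0(double value) {
#if defined(__i386__) || defined(__x86_64__)
    double result;
    asm volatile(
        "fldl %1\n\t"   // stand-in for the call: leaves a double in ST(0)
        "fstpl %0"      // pop ST(0) into the result variable
        : "=m"(result)
        : "m"(value));
    return result;
#else
    return value;       // no x87 unit on this target, nothing to demonstrate
#endif
}
```

    In the asm block of this article the fstpl would go right after the call. By then the stack pointer has moved, so a memory operand addressed relative to %esp could point at the wrong place; storing through a pointer held in a register, or restoring %esp first, avoids that. GCC also documents a "t" constraint that names ST(0) directly as an output operand.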

    That's it! The task really was not trivial. I hope someone finds this post useful.

    P.S. We need all this to write an interpreter.
