memset: the dark side



    After reading the article "The most dangerous function in the C/C++ world", I decided to dig into the evil lurking in memset's dark cellar and write an addendum that covers the problem more broadly.

    In C, memset() is widely used, and it comes with many traps. An excerpt from the C++ Reference:
    void * memset (void * ptr, int value, size_t num);
    Fill block of memory
    Sets the first num bytes of the block of memory pointed by ptr to the specified value (interpreted as an unsigned char).
    Parameters
    ptr - Pointer to the block of memory to fill.
    value - Value to be set. The value is passed as an int, but the function fills the block of memory using the unsigned char conversion of this value.
    num - Number of bytes to be set to the value. size_t is an unsigned integral type.
    Return Value
    ptr is returned.
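
    To make the unsigned char conversion concrete, here is a minimal sketch. The helper name first_byte_after_fill is made up for this illustration, and the sketch assumes the usual 8-bit bytes: only the low byte of the int argument ends up in memory.

```c
#include <string.h>

/* Hypothetical helper: fills a small buffer with `value` and returns
   the first byte that memset actually stored. */
unsigned char first_byte_after_fill(int value)
{
    unsigned char buf[4];

    /* value is passed as an int but converted to unsigned char */
    memset(buf, value, sizeof(buf));
    return buf[0];
}
```

    Under that assumption, first_byte_after_fill(0x141) gives 0x41 and first_byte_after_fill(-1) gives 0xFF.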

    As has been noted many times, there are plenty of rakes here that even experienced developers step on. Andrey2008's article gives a brief summary of the typical errors:

    No. 1. When calculating the size of an array or structure, do not apply sizeof() to a pointer to it: the result is the size of the pointer (4 or 8 bytes), not the size of the array or structure.

    No. 2. The third argument of memset() is the number of bytes, not the number of elements, regardless of the data type. I would add that the size of a type such as int depends on the platform (typically 4 bytes, but other sizes exist), so compute the length with sizeof(int) instead of hard-coding it.

    No. 3. Do not confuse the arguments. The correct sequence is a pointer, value, length in bytes.

    No. 4. Do not use memset when working with class objects.
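
    Errors No. 1 to No. 3 can be illustrated with a small sketch. The helper names zero_ints and local_array_bytes are hypothetical and exist only for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: zeroes `count` ints starting at dst.
   memset takes a length in bytes, hence count * sizeof *dst. */
void zero_ints(int *dst, size_t count)
{
    /* sizeof(dst) would be the size of the pointer here (4 or 8 bytes),
       not the size of the array the caller owns */
    memset(dst, 0, count * sizeof *dst);
}

/* In the scope where the array is declared, sizeof gives its full size. */
size_t local_array_bytes(void)
{
    int values[10];

    /* argument order: pointer, value, length in bytes */
    memset(values, 0, sizeof(values));
    return sizeof(values);
}
```

    Once an array is passed to a function it decays to a pointer, so the element count has to travel as a separate parameter, as in zero_ints.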

    But this is just the tip of the iceberg.

    An alternative to memset


    memset is a low-level function that requires the developer to take the features of the machine architecture into account, so its use should be justified. Let's start with the alternative, = {0}. It is said that, unlike memset (or ZeroMemory), which initializes data at run time, this form lets the compiler initialize an array or string at compile time, which should speed up the program. I decided to check.

    #include <stddef.h>  // wchar_t

    void doInitialize()
    {
       char p0[25] = {0} ;           // sets all 25 characters to 0
       char p1[25] = "" ;            // sets all 25 characters to 0
       wchar_t p2[25] = {0} ;        // sets all 25 characters to 0
       wchar_t p3[25] = L"" ;        // sets all 25 characters to 0
       short        p4[62] = {0} ;   // sets all 62 values to 0
       int          p5[37] = {-1} ;  // sets the first element to -1, the rest to 0
       unsigned int p6[10] = {89} ;  // sets the first element to 89, the rest to 0
    }
    

    C99 [§6.7.8/21]
    If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in a string literal used to initialize an array of known size than there are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.
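
    The effect of this rule can be checked with a short sketch. The function name count_zero_elements is hypothetical: only the first element takes the value from the initializer list, and the remaining 36 are implicitly zeroed.

```c
#include <stddef.h>

/* Hypothetical check: counts how many elements of a partially
   initialized array are zero. */
size_t count_zero_elements(void)
{
    int p5[37] = {-1};   /* p5[0] is -1, p5[1] through p5[36] are 0 */
    size_t zeros = 0;

    for (size_t i = 0; i < sizeof(p5) / sizeof(p5[0]); i++) {
        if (p5[i] == 0)
            zeros++;
    }
    return zeros;
}
```

    So = {-1} does not fill the array with -1; it sets one element and zeroes the rest.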

    At the same time, this form of initialization eliminates problems No. 1, No. 2 and No. 3: there are no arguments to mix up and no size to pass. Let's see how compilers translate such code. I can't check every compiler, but I had two at hand: the gcc shipped with android-ndk-r10c and the gcc in Ubuntu 14.04.

    gcc -v
    1) gcc version 4.9 20140827 (prerelease) (GCC)
    2) gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1)

    Let's see how the compiler behaves on such a piece of code:

    void empty_string(){
        int i;
        char p1[25] = {0};
        printf("\np1: ");
        for (i = 0; i < 25; i++)
            printf("%x,",p1[i]);
    }
    

    So, without optimization (-O0), the array initialization compiles into the following assembly code (we inspect the binaries with objdump):

    gcc -O0, ELF 32-bit, ARM, EABI5
        83d8:       e3a03000        mov     r3, #0
        83dc:       e50b3024        str     r3, [fp, #-36]  ; 0x24
        83e0:       e24b3020        sub     r3, fp, #32
        83e4:       e3a02000        mov     r2, #0
        83e8:       e5832000        str     r2, [r3]
        83ec:       e2833004        add     r3, r3, #4
        83f0:       e3a02000        mov     r2, #0
        83f4:       e5832000        str     r2, [r3]
        83f8:       e2833004        add     r3, r3, #4
        83fc:       e3a02000        mov     r2, #0
        8400:       e5832000        str     r2, [r3]
        8404:       e2833004        add     r3, r3, #4
        8408:       e3a02000        mov     r2, #0
        840c:       e5832000        str     r2, [r3]
        8410:       e2833004        add     r3, r3, #4
        8414:       e3a02000        mov     r2, #0
        8418:       e5832000        str     r2, [r3]
        841c:       e2833004        add     r3, r3, #4
        8420:       e3a02000        mov     r2, #0
        8424:       e5c32000        strb    r2, [r3]
        8428:       e2833001        add     r3, r3, #1
    


    gcc -O0, ELF 64-bit, x86-64
      400700:       48 c7 45 d0 00 00 00 00    movq   $0x0,-0x30(%rbp)
      400708:       48 c7 45 d8 00 00 00 00    movq   $0x0,-0x28(%rbp)
      400710:       48 c7 45 e0 00 00 00 00    movq   $0x0,-0x20(%rbp)
      400718:       c6 45 e8 00                movb   $0x0,-0x18(%rbp)
    


    As expected, without optimization we get run-time code that costs O(n) processor time (where n is the buffer length). What the compiler can do with optimization (-O3) is shown below.

    gcc -O3, 32-bit, ARM

    000083ac :
        83ac:       e59f002c        ldr     r0, [pc, #44]   ; 83e0 
        83b0:       e92d4038        push    {r3, r4, r5, lr}
        83b4:       e08f0000        add     r0, pc, r0
        83b8:       ebffffb2        bl      8288 
        83bc:       e59f5020        ldr     r5, [pc, #32]   ; 83e4 
        83c0:       e3a04019        mov     r4, #25
        83c4:       e08f5005        add     r5, pc, r5
        83c8:       e1a00005        mov     r0, r5
        83cc:       e3a01000        mov     r1, #0
        83d0:       ebffffac        bl      8288 
        83d4:       e2544001        subs    r4, r4, #1
        83d8:       1afffffa        bne     83c8 
        83dc:       e8bd8038        pop     {r3, r4, r5, pc}

    gcc -O3, 64-bit, x86-64
    00000000004006d0 :
      4006d0:       53                      push   %rbx
      4006d1:       be a4 08 40 00          mov    $0x4008a4,%esi
      4006d6:       bf 01 00 00 00          mov    $0x1,%edi
      4006db:       31 c0                   xor    %eax,%eax
      4006dd:       bb 32 00 00 00          mov    $0x32,%ebx
      4006e2:       e8 d9 fd ff ff          callq  4004c0 <__printf_chk@plt>
      4006e7:       66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
      4006ee:       00 00 
      4006f0:       31 d2                   xor    %edx,%edx
      4006f2:       31 c0                   xor    %eax,%eax
      4006f4:       be aa 08 40 00          mov    $0x4008aa,%esi
      4006f9:       bf 01 00 00 00          mov    $0x1,%edi
      4006fe:       e8 bd fd ff ff          callq  4004c0 <__printf_chk@plt>
      400703:       83 eb 01                sub    $0x1,%ebx
      400706:       75 e8                   jne    4006f0 
      400708:       5b                      pop    %rbx
      400709:       c3                      retq   
    


    We see that the run-time zeroing code has simply disappeared: we got the promised O(1) behavior. So where does printf get its values from? We are interested in this fragment:

    83bc:           ldr     r5, [pc, #32]
    83c0:           mov     r4, #25     ;// r4 holds the number of for-loop iterations; this is our loop counter
    83c4:           add     r5, pc, r5  ;// r5 gets the address of the "%x," string constant (stored in memory as 002c7825)
    83c8:           mov     r0, r5      ;// r5 is copied to r0 unchanged on every iteration; it is the first argument of printf()
    83cc:           mov     r1, #0      ;// the constant 0 (instead of the actual p1[i]) is passed as the second argument of printf()
    83d0:           bl      8288 
    83d4:           subs    r4, r4, #1  ;// decrement the loop counter
    83d8:           bne     83c8   ;// if it has not reached 0, jump back to the start of the loop at 83c8
    

    That is, the compiler simply threw the array away and uses 0 in place of its elements, as a constant fixed at compile time. Fine, but what happens if we use memset? Let's look at a few pieces of objdump output, starting with ARM:
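
    The article does not show the source of the memset variant, so the following is only a guess at what it may have looked like: the same test function with = {0} replaced by an explicit memset call. The name empty_string_memset and the returned count of non-zero bytes are additions made for this sketch.

```c
#include <stdio.h>
#include <string.h>

/* Assumed memset counterpart of the empty_string() test function.
   Returns the number of non-zero bytes seen while printing. */
int empty_string_memset(void)
{
    int i;
    int nonzero = 0;
    char p1[25];

    memset(p1, 0, sizeof(p1));   /* explicit run-time zeroing */
    printf("\np1: ");
    for (i = 0; i < 25; i++) {
        printf("%x,", p1[i]);
        nonzero += (p1[i] != 0);
    }
    return nonzero;
}
```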

    Without optimization, -O0 :

        83d8:       e24b3024        sub     r3, fp, #36     ; 0x24
        83dc:       e1a00003        mov     r0, r3
        83e0:       e3a01000        mov     r1, #0
        83e4:       e3a02019        mov     r2, #25
        83e8:       ebffffa3        bl      827c 

    With optimization -O3 :

        83c0:       e58d3004        str     r3, [sp, #4]
        83c4:       e58d3008        str     r3, [sp, #8]
        83c8:       e58d300c        str     r3, [sp, #12]
        83cc:       e58d3010        str     r3, [sp, #16]
        83d0:       e58d3014        str     r3, [sp, #20]
        83d4:       e58d3018        str     r3, [sp, #24]
        83d8:       e5cd301c        strb    r3, [sp, #28]
    

    x86-64
    Without optimization -O0:
      400816:       ba 19 00 00 00          mov    $0x19,%edx
      40081b:       be 00 00 00 00          mov    $0x0,%esi
      400820:       48 89 c7                mov    %rax,%rdi
      400823:       e8 a8 fc ff ff          callq  4004d0 

    With optimization -O3:
      4007f4:       48 c7 04 24 00 00 00 00    movq   $0x0,(%rsp)
      4007fc:       48 c7 44 24 08 00 00 00    movq   $0x0,0x8(%rsp)
      400805:       48 c7 44 24 10 00 00 00    movq   $0x0,0x10(%rsp)
      40080e:       c6 44 24 18 00             movb   $0x0,0x18(%rsp)
    


    That is, the optimizer simply removes the memset call and inlines the stores. In this case memset still costs O(n) time, whereas initialization with = {0} under optimization costs constant time; in our example it takes no processor cycles at all, because the compiler brazenly discards the array and replaces all of its elements with zeros. But is this always the case? What happens if we write a non-zero value after initialization? The test function then looks like this:

    void empty_string(){
        int i;
        char p1[25] = {0};
        p1[0] = 65;
        printf("\np1: ");
        for (i = 0; i < 25; i++)
            printf("%x,",p1[i]);
    }
    

    After compilation, we get an already familiar code block:

        8404:       e3a02041        mov     r2, #65 ; 0x41
        8408:       e08f0000        add     r0, pc, r0
        840c:       e58d3004        str     r3, [sp, #4]
        8410:       e58d3008        str     r3, [sp, #8]
        8414:       e58d300c        str     r3, [sp, #12]
        8418:       e58d3010        str     r3, [sp, #16]
        841c:       e58d3014        str     r3, [sp, #20]
        8420:       e58d3018        str     r3, [sp, #24]
        8424:       e5cd301c        strb    r3, [sp, #28]
        8428:       e5cd2004        strb    r2, [sp, #4]
    

    x86-64
      4006f8:       48 c7 04 24 00 00 00    movq   $0x0,(%rsp)
      4006ff:       00 
      400700:       48 c7 44 24 08 00 00    movq   $0x0,0x8(%rsp)
      400707:       00 00 
      400709:       48 c7 44 24 10 00 00    movq   $0x0,0x10(%rsp)
      400710:       00 00 
      400712:       c6 44 24 18 00          movb   $0x0,0x18(%rsp)
      400717:       c6 04 24 41             movb   $0x41,(%rsp)
    


    It looks as if the compiler gave us an optimized inline version of memset. And what happens if the array grows significantly? Say, not 25 bytes, but 25 kilobytes!

        83fc:       e24ddc61        sub     sp, sp, #24832  ; 0x6100
        8400:       e24dd0a8        sub     sp, sp, #168    ; 0xa8
        8404:       e3a01000        mov     r1, #0
        8408:       e59f2054        ldr     r2, [pc, #84]   ; 8464 
        840c:       e1a0000d        mov     r0, sp
        8410:       ebffff99        bl      827c 

    x86-64
      400720:       55                      push   %rbp
      400721:       ba a8 61 00 00          mov    $0x61a8,%edx
      400726:       31 f6                   xor    %esi,%esi
      400728:       53                      push   %rbx
      400729:       48 81 ec b8 61 00 00    sub    $0x61b8,%rsp
      400730:       48 89 e7                mov    %rsp,%rdi
      400733:       48 8d ac 24 a8 61 00    lea    0x61a8(%rsp),%rbp
      40073a:       00 
      40073b:       48 89 e3                mov    %rsp,%rbx
      40073e:       64 48 8b 04 25 28 00    mov    %fs:0x28,%rax
      400745:       00 00 
      400747:       48 89 84 24 a8 61 00    mov    %rax,0x61a8(%rsp)
      40074e:       00 
      40074f:       31 c0                   xor    %eax,%eax
      400751:       e8 8a fd ff ff          callq  4004e0 
      400756:       be 54 09 40 00          mov    $0x400954,%esi
      40075b:       bf 01 00 00 00          mov    $0x1,%edi
      400760:       31 c0                   xor    %eax,%eax
      400762:       c6 04 24 41             movb   $0x41,(%rsp)
      400766:       e8 a5 fd ff ff          callq  400510 <__printf_chk@plt>
      40076b:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
      400770:       0f be 13                movsbl (%rbx),%edx
      400773:       31 c0                   xor    %eax,%eax
      400775:       be 5a 09 40 00          mov    $0x40095a,%esi
      40077a:       bf 01 00 00 00          mov    $0x1,%edi
      40077f:       48 83 c3 01             add    $0x1,%rbx
      400783:       e8 88 fd ff ff          callq  400510 <__printf_chk@plt>
      400788:       48 39 eb                cmp    %rbp,%rbx
      40078b:       75 e3                   jne    400770 
      40078d:       48 8b 84 24 a8 61 00    mov    0x61a8(%rsp),%rax
      400794:       00 
      400795:       64 48 33 04 25 28 00    xor    %fs:0x28,%rax
      40079c:       00 00 
      40079e:       75 0a                   jne    4007aa 
      4007a0:       48 81 c4 b8 61 00 00    add    $0x61b8,%rsp
      4007a7:       5b                      pop    %rbx
      4007a8:       5d                      pop    %rbp
      4007a9:       c3                      retq
    


    Wow!

    The = {0} form goes over to the dark side, and memset rejoices!

    Still, all is not lost: we did solve the problem with the parameters, since there are no arguments left to mix up.

    String initialization


    It is also worth considering array initialization with = "". C uses null-terminated strings, that is, the first byte with the value 0x00 marks the end of the string. So to initialize an empty string there is no need to zero every element; resetting the first one is enough. Here are some ways to initialize an empty string:

    #include <stdlib.h>  // calloc
    #include <string.h>  // memset, strcpy

    void doInitializeCString()
    {
       char p0[25] = {0} ;           // sets all characters to 0
       char p1[25] = "" ;            // sets all characters to 0
       char p2[25] ;
       p2[0] = 0 ;                   // sets the first character to 0
       char p3[25] ;
       memset(p3, 0, sizeof(p3)) ;   // sets all 25 characters to 0
       char p4[25] ;
       strcpy(p4, "") ;              // sets the first character to 0
       char *p5 = (char *) calloc(25, sizeof(char)) ;  // sets all characters to 0
    }
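
    That resetting the first byte is enough can be shown with a tiny sketch. The helper names clear_cstring and length_after_clear are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: makes s an empty C string by resetting only
   its first byte; the remaining bytes are left untouched. */
void clear_cstring(char *s)
{
    s[0] = '\0';
}

/* Clears the string and reports the length that strlen then sees. */
size_t length_after_clear(char *s)
{
    clear_cstring(s);
    return strlen(s);
}
```

    After the call, string functions treat the buffer as empty even though the old characters past the first byte are still in memory.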
    

    The most reliable way to find out how initialization with = "" works is to look at the objdump output again. Without optimization there is nothing special to see, everything is the same as with = {0}, so we go straight to the -O3 option. We compile the following function for ARM:
    void empty_string(){
        int i;
        char p1[25] = "";
        printf("\np1: ");
        for (i = 0; i < 25; i++)
            printf("%x,",p1[i]);
    }
    


    And, unexpectedly, all elements of the array get zeroed.

        83c0:       e58d3004        str     r3, [sp, #4]
        83c4:       e58d3008        str     r3, [sp, #8]
        83c8:       e58d300c        str     r3, [sp, #12]
        83cc:       e58d3010        str     r3, [sp, #16]
        83d0:       e58d3014        str     r3, [sp, #20]
        83d4:       e58d3018        str     r3, [sp, #24]
        83d8:       e5cd301c        strb    r3, [sp, #28]
    

    x86-64
      400768:       48 c7 04 24 00 00 00 00    movq   $0x0,(%rsp)
      400770:       48 c7 44 24 08 00 00 00    movq   $0x0,0x8(%rsp)
      400779:       48 c7 44 24 10 00 00 00    movq   $0x0,0x10(%rsp)
      400782:       c6 44 24 18 00          movb   $0x0,0x18(%rsp)
    


    Oh well! Why zero all the unused characters of a null-terminated string?! Resetting a single byte would be enough. Hmm, and what will it do with 25 thousand bytes? Here is what:

        8474:       e24ddc61        sub     sp, sp, #24832  ; 0x6100
        8478:       e24dd0a8        sub     sp, sp, #168    ; 0xa8
        847c:       e3a0c000        mov     ip, #0
        8480:       e28d3f6a        add     r3, sp, #424    ; 0x1a8
        8484:       e1a0100c        mov     r1, ip
        8488:       e59f204c        ldr     r2, [pc, #76]   ; 84dc 
        848c:       e28d0004        add     r0, sp, #4
        8490:       e503c1a8        str     ip, [r3, #-424] ; 0x1a8
        8494:       ebffff78        bl      827c 

    x86-64
    00000000004007b0 :
      4007b0:       55                      push   %rbp
      4007b1:       ba a0 61 00 00          mov    $0x61a0,%edx
      4007b6:       31 f6                   xor    %esi,%esi
      4007b8:       53                      push   %rbx
      4007b9:       48 81 ec b8 61 00 00    sub    $0x61b8,%rsp
      4007c0:       48 8d 7c 24 08          lea    0x8(%rsp),%rdi
      4007c5:       48 8d ac 24 a8 61 00    lea    0x61a8(%rsp),%rbp
      4007cc:       00 
      4007cd:       48 c7 04 24 00 00 00    movq   $0x0,(%rsp)
      4007d4:       00 
      4007d5:       64 48 8b 04 25 28 00    mov    %fs:0x28,%rax
      4007dc:       00 00 
      4007de:       48 89 84 24 a8 61 00    mov    %rax,0x61a8(%rsp)
      4007e5:       00 
      4007e6:       31 c0                   xor    %eax,%eax
      4007e8:       48 89 e3                mov    %rsp,%rbx
      4007eb:       e8 f0 fc ff ff          callq  4004e0 
      4007f0:       be 54 09 40 00          mov    $0x400954,%esi
      4007f5:       bf 01 00 00 00          mov    $0x1,%edi
      4007fa:       31 c0                   xor    %eax,%eax
      4007fc:       e8 0f fd ff ff          callq  400510 <__printf_chk@plt>
      400801:       0f 1f 80 00 00 00 00    nopl   0x0(%rax)
      400808:       0f be 13                movsbl (%rbx),%edx
      40080b:       31 c0                   xor    %eax,%eax
      40080d:       be 5a 09 40 00          mov    $0x40095a,%esi
      400812:       bf 01 00 00 00          mov    $0x1,%edi
      400817:       48 83 c3 01             add    $0x1,%rbx
      40081b:       e8 f0 fc ff ff          callq  400510 <__printf_chk@plt>
      400820:       48 39 eb                cmp    %rbp,%rbx
      400823:       75 e3                   jne    400808 
      400825:       48 8b 84 24 a8 61 00    mov    0x61a8(%rsp),%rax
      40082c:       00 
      40082d:       64 48 33 04 25 28 00    xor    %fs:0x28,%rax
      400834:       00 00 
      400836:       75 0a                   jne    400842 
      400838:       48 81 c4 b8 61 00 00    add    $0x61b8,%rsp
      40083f:       5b                      pop    %rbx
      400840:       5d                      pop    %rbp
      400841:       c3                      retq
    


    It looks like the dark memset is chasing us. If you still want to fight the darkness, it is worth mentioning the other traps that await you.



    memset may initialize numbers with incorrect values


    If you want to fill an array of integers with a non-zero value, keep in mind that memset fills memory byte by byte.

    void doInitializeToMistakenValues()
    {
       char           pChar[25] ;
       unsigned char  pUChar[25] ;
       short          pShort[25] ;
       unsigned short pUShort[25] ;
       int            pInt[25] ;
       unsigned int   pUInt[25] ;
       // The values of the 2-byte and 4-byte elements will not be equal to one
       memset(pChar,   1,  sizeof(pChar)) ;   // 1
       memset(pUChar,  1,  sizeof(pUChar)) ;  // 1
       memset(pShort,  1,  sizeof(pShort)) ;  // 257
       memset(pUShort, 1,  sizeof(pUShort)) ; // 257
       memset(pInt,    1,  sizeof(pInt)) ;    // 16843009
       memset(pUInt,   1,  sizeof(pUInt)) ;   // 16843009
       // Unsigned arrays end up filled with 0xFF bytes (their maximum values)
       memset(pChar,   -1, sizeof(pChar)) ;   // -1
       memset(pUChar,  -1, sizeof(pUChar)) ;  // 255
       memset(pShort,  -1, sizeof(pShort)) ;  // -1
       memset(pUShort, -1, sizeof(pUShort)) ; // 65535
       memset(pInt,    -1, sizeof(pInt)) ;    // -1
       memset(pUInt,   -1, sizeof(pUInt)) ;   // 4294967295
    }
    

    Let's see how this comes about. Say we have an int array and pass 1 as the second argument. What happens?

    Here is what:

    0x01010101 - in hexadecimal notation every byte is set to one, so the intended value
    0x00000001 cannot be set with memset. Strictly speaking, this is not a bug, it is a feature.

    It is just that ignorance of such features leads to unpredictable errors.
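
    A short sketch of the byte-wise fill and of a per-element replacement. The helper names bytewise_one and fill_ints are hypothetical, and uint32_t is used so that the element is exactly four bytes wide.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns the value a 32-bit integer holds after every one of its
   bytes has been set to 1. */
uint32_t bytewise_one(void)
{
    uint32_t x;

    memset(&x, 1, sizeof(x));   /* bytes in memory: 01 01 01 01 */
    return x;                   /* 0x01010101, that is 16843009 */
}

/* Hypothetical helper: assigns `value` to every element, which is
   what a non-zero fill of an int array actually needs. */
void fill_ints(int *dst, size_t count, int value)
{
    for (size_t i = 0; i < count; i++)
        dst[i] = value;
}
```

    Because all four bytes are identical, the result of bytewise_one does not depend on byte order.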

    memset may set an invalid value


    If we fill the bytes of double elements with -1, we get Not-a-Number (NaN), and every subsequent operation involving NaN returns NaN as well, breaking the entire chain of calculations.
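
    A minimal sketch of this effect, assuming IEEE 754 doubles as found on mainstream platforms. The function names are hypothetical.

```c
#include <math.h>
#include <string.h>

/* Returns 1 if a double whose bytes are all 0xFF is NaN.
   Assumes the IEEE 754 binary64 representation. */
int all_ones_double_is_nan(void)
{
    double d;

    memset(&d, -1, sizeof(d));   /* every byte becomes 0xFF */
    return isnan(d) != 0;
}

/* Returns 1 if arithmetic on such a value yields NaN again. */
int nan_propagates(void)
{
    double d;

    memset(&d, -1, sizeof(d));
    return isnan(d * 2.0 + 1.0) != 0;
}
```

    With all bits set, the exponent field is all ones and the fraction is non-zero, which is how IEEE 754 encodes NaN.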

    Likewise, filling a bool with -1 is incorrect: formally the result is neither true nor false, although in most cases it will behave as true. In most cases ...

    And lastly, memset is designed to work only with simple data structures. Never use memset on managed data structures; the function is intended only for low-level operations.


    This article draws on material from "memset is evil".

    Also read about printf function vulnerabilities.
