Which is faster: while (true) or for (;;)?

    In source code by different authors I have seen different spellings of the infinite loop. The two I met most often are:
    while (true) {
    ...
    }
    

    and
    for (;;) {
    ...
    }
    

    Since everyone defended “their” infinite loop as if it were family, I decided to find out who writes the more optimal code.

    I wrote two source files:

    while.c:
    #include <stdio.h>
    int main (int argc, char* argv[])
    {
        while(1){
           printf("1\n");
        }
    }
    


    for.c:
    #include <stdio.h>
    int main (int argc, char* argv[])
    {
        for(;;){
            printf("1\n");
        }
    }
    


    Compiled them:
    $ gcc -O3 while.c -o while.o3
    $ gcc -O2 while.c -o while.o2
    $ gcc -O1 while.c -o while.o1
    $ gcc -O3 for.c -o for.o3
    $ gcc -O2 for.c -o for.o2
    $ gcc -O1 for.c -o for.o1
    


    Then I disassembled them. If you are too lazy to read assembly listings, you can scroll down the page. Here are the listings:
    
    $ objdump -d ./while.o3
    ...
    0000000000400430 <main>:
      400430:       48 83 ec 08             sub    $0x8,%rsp
      400434:       0f 1f 40 00             nopl   0x0(%rax)
      400438:       bf d4 05 40 00          mov    $0x4005d4,%edi
      40043d:       e8 be ff ff ff          callq  400400
      400442:       eb f4                   jmp    400438
    ...
    $ objdump -d ./while.o2
    ...
    0000000000400430 <main>:
      400430:       48 83 ec 08             sub    $0x8,%rsp
      400434:       0f 1f 40 00             nopl   0x0(%rax)
      400438:       bf d4 05 40 00          mov    $0x4005d4,%edi
      40043d:       e8 be ff ff ff          callq  400400
      400442:       eb f4                   jmp    400438
    ...
    $ objdump -d ./while.o1
    ...
    000000000040051c <main>:
      40051c:       48 83 ec 08             sub    $0x8,%rsp
      400520:       bf d4 05 40 00          mov    $0x4005d4,%edi
      400525:       e8 d6 fe ff ff          callq  400400
      40052a:       eb f4                   jmp    400520
    ...
    $ objdump -d ./for.o1
    ...
    000000000040051c <main>:
      40051c:       48 83 ec 08             sub    $0x8,%rsp
      400520:       bf d4 05 40 00          mov    $0x4005d4,%edi
      400525:       e8 d6 fe ff ff          callq  400400
      40052a:       eb f4                   jmp    400520
    ...
    $ objdump -d ./for.o2
    ...
    0000000000400430 <main>:
      400430:       48 83 ec 08             sub    $0x8,%rsp
      400434:       0f 1f 40 00             nopl   0x0(%rax)
      400438:       bf d4 05 40 00          mov    $0x4005d4,%edi
      40043d:       e8 be ff ff ff          callq  400400
      400442:       eb f4                   jmp    400438
    ...
    $ objdump -d ./for.o3
    0000000000400430 <main>:
      400430:       48 83 ec 08             sub    $0x8,%rsp
      400434:       0f 1f 40 00             nopl   0x0(%rax)
      400438:       bf d4 05 40 00          mov    $0x4005d4,%edi
      40043d:       e8 be ff ff ff          callq  400400
      400442:       eb f4                   jmp    400438


    Breaking it down in plain terms



    The optimization level did not affect how the while (true) loop was compiled: the loop always consists of 3 instructions, mov, callq and jmp. Nor did it affect for (;;), which also always consists of the same 3 instructions. The mov, callq and jmp do not differ between the two loops, and the instruction lengths in bytes are the same in all 6 cases.

    There is only one small difference between -O1 and -O2/-O3: at -O1 the jmp goes back to main+4 rather than main+8, because -O2 and -O3 place a 4-byte nopl in front of the loop head. Since the target is a static address (as the assembly shows), this should not affect performance either... Although... what if the targets land on different memory pages? As far as I know, on x86 (and amd64) a jump between different memory pages takes extra effort!

    Let's check (the addresses are hexadecimal, and a page is 4096 bytes):
    0x400438 / 4096 = 1024.263671875
    0x400520 / 4096 = 1024.3203125

    Phew, we got away with it. It is the same memory page: page 1024 (0x400) of the process's virtual address space. That is all we needed to know.

    Summary


    while (true) and for (;;) perform identically, both compared with each other and at every -Ox optimization level. So if you are asked which one is faster, feel free to answer “for (;;)”: at 8 characters it is faster to type than “while (true)” at 12.

    For those who doubt that the result is the same without -Ox:
    $ gcc while.c -o while.noO
    $ objdump -d while.noO
    ...
     40052b:       bf e4 05 40 00          mov    $0x4005e4,%edi
     400530:       e8 cb fe ff ff          callq  400400 
     400535:       eb f4                   jmp    40052b 
    ...
    $ gcc for.c -o for.noO
    $ objdump -d for.noO
    ...
     40052b:       bf e4 05 40 00          mov    $0x4005e4,%edi
     400530:       e8 cb fe ff ff          callq  400400 
     400535:       eb f4                   jmp    40052b 
    ...
    


    P.S. Of course, all of this holds for the compiler “gcc version 4.7.2 (Debian 4.7.2-5)”; other compilers and versions may behave differently.
