Catching bug #52001 in PHP 5.3: pointers and uninitialized variables

    Following up on the bug recently found by tvv.

    When the following code is executed in PHP versions 5.3.0-5.3.2, the result exceeds all expectations: it prints '2'.

    <?php
    f(0, $$var);            // $var is undefined, so $$var is read as an uninitialized variable
    $x = 1;
    $y = 2;
    echo $x;                // expected 1, but PHP 5.3.0-5.3.2 prints 2
    function f($a, $b) {};  // note: declared after the call

    I managed to track the bug down and fix it: #52001. Briefly: the pointer to the special placeholder variable for uninitialized variables, through which all CVs (compiled variables) in PHP are created, was being overwritten.

    Seeing the PHP source code for the first time, I began the search by checking whether the scanner and the parser were at fault. Enabling the parser's debugging mode showed that compilation was working correctly. It also helped me learn the names of the variables and which structures they belong to, and in particular to understand what the various zend_do_* compiler functions are responsible for.

    Then it became clear that there are two different modes of calling functions: by name and by address. The first is used when the function is not yet known to the compiler at compile time, as with f in the example above, which is declared after the call. In this mode the arguments are passed slightly differently, since the compiler does not know the prototype.

    By sprinkling printouts of variable addresses more or less at random, I found that two variables ($x and $y) really did have the same address inside PHP, which was clearly a bug. At first I suspected that the variables were being looked up incorrectly in the symbol table. Debugging output that printed the entire symbol-table hash on every variable lookup dispelled that doubt.

    It turned out that a call by name causes the passed variables to be specially marked, since they may be references (the prototype is unknown).

    The $$var variable, like every variable that is only read, is created as the special uninitialized variable. The opcode handler that fetches the variable for the function call made sure that the passed value could be used as a reference, which required copying this variable. In doing so it wrote through a pointer to a pointer and overwrote the pointer to that very special uninitialized variable. The pointer now refers to newly allocated memory with reference count = 1.
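
    The overwrite described above can be reproduced in a small C model. This is only a sketch under assumptions: zval_model, uninit_storage, uninit_ptr, separate_to_make_ref and placeholder_survives are hypothetical names invented for illustration, not Zend Engine code, and the "private slot" variant is just one way to avoid the problem, not necessarily what the actual patch does.

```c
#include <stdlib.h>

/* Hypothetical, heavily simplified stand-in for a PHP value. */
typedef struct {
    long value;
    int  refcount;
    int  is_ref;
} zval_model;

/* The shared placeholder for uninitialized variables and the global
 * pointer through which new variables are created (model only). */
static zval_model  uninit_storage = {0, 2, 0};
static zval_model *uninit_ptr     = &uninit_storage;

/* Make *pp usable as a reference. If the value is shared, a private
 * copy is allocated and stored THROUGH pp, so whatever slot pp points
 * at gets rewritten. */
static void separate_to_make_ref(zval_model **pp)
{
    zval_model *old = *pp;
    if (!old->is_ref) {
        if (old->refcount > 1) {
            zval_model *copy = malloc(sizeof *copy);
            if (copy == NULL) abort();
            *copy = *old;
            old->refcount--;
            copy->refcount = 1;   /* fresh memory, reference count = 1 */
            *pp = copy;
        }
        (*pp)->is_ref = 1;
    }
}

/* Returns 1 if the global placeholder pointer is untouched afterwards.
 * use_private_slot == 0 models the bug: the handler receives the
 * address of the global pointer itself. */
static int placeholder_survives(int use_private_slot)
{
    uninit_storage = (zval_model){0, 2, 0};
    uninit_ptr     = &uninit_storage;

    zval_model  *private_slot = uninit_ptr;
    zval_model **pp = use_private_slot ? &private_slot : &uninit_ptr;

    separate_to_make_ref(pp);
    int intact = (uninit_ptr == &uninit_storage);

    /* Clean up the copy and restore the model's global state. */
    if (*pp != &uninit_storage) free(*pp);
    uninit_ptr = &uninit_storage;
    return intact;
}
```

    In this model placeholder_survives(0) returns 0, because the separation rewrites the global pointer, while placeholder_survives(1) returns 1, because only the private slot is rewritten.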

    After that, every newly initialized variable receives this memory. Because of the "incorrect" reference count, writes to these variables do not trigger a copy (copy-on-write), and the same piece of memory is shared by all new variables. As a result they all hold the same data, much like the old Fortran bug in which the assignment 1 = 2 could effectively be made.
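
    The effect on copy-on-write can also be shown with a short C model. Again, this is a sketch under assumptions rather than the engine's code: zval_model, create_variable, assign_long and run_script are hypothetical names. The model further assumes that the reference count is bumped on the original placeholder while the new variable points wherever the global pointer currently points.

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a PHP value. */
typedef struct {
    long value;
    int  refcount;
} zval_model;

static zval_model  placeholder_storage = {0, 1};
static zval_model *placeholder_ptr     = &placeholder_storage;

/* A new variable starts out pointing at the placeholder. */
static zval_model *create_variable(void)
{
    placeholder_storage.refcount++;  /* bookkeeping on the original */
    return placeholder_ptr;          /* but the pointer may be stale */
}

/* Copy-on-write assignment: copy only if the value looks shared. */
static void assign_long(zval_model **slot, long v)
{
    if ((*slot)->refcount > 1) {
        zval_model *copy = malloc(sizeof *copy);
        if (copy == NULL) abort();
        *copy = **slot;
        (*slot)->refcount--;
        copy->refcount = 1;
        *slot = copy;
    }
    (*slot)->value = v;
}

/* Models "$x = 1; $y = 2; echo $x;" and returns what echo would see.
 * corrupt_placeholder != 0 simulates the overwritten global pointer. */
static long run_script(int corrupt_placeholder)
{
    placeholder_storage = (zval_model){0, 1};
    placeholder_ptr     = &placeholder_storage;

    if (corrupt_placeholder) {
        zval_model *stray = malloc(sizeof *stray);
        if (stray == NULL) abort();
        stray->value    = 0;
        stray->refcount = 1;         /* looks unshared to assign_long */
        placeholder_ptr = stray;
    }

    zval_model *x = create_variable();
    zval_model *y = create_variable();
    assign_long(&x, 1);
    assign_long(&y, 2);
    long seen = x->value;

    /* Free the heap copies; x and y may alias in the corrupt case. */
    if (x != &placeholder_storage) free(x);
    if (y != x && y != &placeholder_storage) free(y);
    placeholder_ptr = &placeholder_storage;
    return seen;
}
```

    With an intact placeholder, run_script(0) returns 1: each assignment sees a reference count above 1 and gets its own copy. With the overwritten pointer, run_script(1) returns 2: both variables share memory whose reference count is 1, so no copy is ever made.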

    That's all. It was an interesting experience; I really like bugs like this.

    You can read more about the technical details in this comment.
