PHP under a C debugger: digging inside the Zend Engine
One day I ran into a problem: a PHP web crawler runs fine for a while, and then suddenly (after 3-6 hours of work) it stops doing anything useful and starts eating 100% CPU. How do you hunt down a problem like that? How do you find out where it is looping? What if you attach a debugger to PHP and get everything you need from there?
What can be done in this situation?
There are not many options. You can scatter log entries throughout the script and see which one it stopped after; from that you can roughly guess where and how it hangs. But this takes a very long time: add log entries, catch a hang, look, find there is not enough information, add even more entries, and so on. So I left this option for later, in case nothing else worked.
Xdebug will not work for this: as far as I understand, it cannot attach to an already running PHP script. And if you start the script under Xdebug, you still cannot click "run" and then, when it hangs, click "pause"; in Xdebug you can only move between breakpoints (correct me if I'm wrong here).
Idea: try GDB!
My main work is in PHP, but I often have to write C++ under GCC (which, I must say, I really like). I have experience debugging C++ programs directly on the server with gdb; it is not very difficult, and gdb is actually quite convenient for a console program. So why not try debugging our PHP script with it? Along the way we get to dig into PHP internals on a live process.
What we need
SSH access to the server is needed. Root is not required: everything can be done locally. So:
GDB
You can ask the administrator to install it, or compile and install it locally. I asked the admin.
PHP compiled with debugging information
In fact, a debug build is not needed. All that is needed is for PHP to be compiled with the -g switch. For some reason my non-debug build of PHP 5.2.17 was compiled with this flag anyway, which made things much easier: I could use the same extensions that the regular version uses. As far as I understand, if I had built PHP as a debug build, I would not have been able to use those extensions and would have had to use the ones built together with PHP.
Looking ahead, I'll say that I also needed libxml2 to build PHP. It also turned out that the problem was in libcurl, so I built libcurl as a debug build too, to be able to get inside it.
So, let's build (I'm writing from memory, so there may be inaccuracies):
$ wget
$ tar -xzf libxml2-2.7.8.tar.gz
$ cd libxml2-2.7.8
$ ./configure --prefix=$HOME/libs
$ make && make install
$ wget
$ tar -xzf curl-7.18.2.tar.gz
$ cd curl-7.18.2
$ ./configure --prefix=$HOME/libs --enable-debug
$ make && make install
Building PHP is a bit more complicated: you also need to specify the Debian paths to the php.ini files, the path to the compiled libxml2 and the path to the compiled libcurl:
$ wget
$ tar -xzf php-5.2.17.tar.gz
$ cd php-5.2.17
$ ./configure --disable-debug --with-config-file-path=/etc/php5/cli \
    --with-config-file-scan-dir=/etc/php5/cli/conf.d \
    --with-libxml-dir=$HOME/libs --disable-pdo --with-curl=$HOME/libs
$ make
To repeat: I compiled PHP with --disable-debug (the -g compiler option was specified anyway), and using the existing ready-made extension modules was easier than installing PHP with all its modules locally. That is why I did not run make install. It might be better to configure with --prefix=$HOME/libs and run make install, but what I did above was enough for my purposes.
Everything is compiled, so let's run PHP. Here, too, things are not entirely smooth: I did not immediately find an option to tell it where the extensions are, so I had to specify this directory every time I started PHP:
$ php/php-5.2.17/sapi/cli/php -d extension_dir=/usr/lib/php5/20060613
PHP Warning: Module 'curl' already loaded in Unknown on line 0
The curl warning is understandable: we compiled PHP with the curl module built in, so this message pops up when an external curl.so is loaded as well. It is harmless.
That's it for the build; now we can run the script and catch the bug.
Running and debugging
So as not to overload the reader with unnecessary detail, I wrote a small PHP script that demonstrates debugging through gdb:
<?php

class A {
    protected $_a = NULL;
    public function __construct($a) {
        $this->_a = $a;
    }
    public function run() {
        while (true) {
            sleep(1);
        }
    }
}

class B {
    protected $_a = NULL;
    protected $_b = NULL;
    public function __construct() {
        $this->_b = rand(1000, 9999);
        $this->_a = new A(rand(1000, 9999));
    }
    public function run() {
        $this->_a->run();
    }
}

$b = new B;
$b->run();
So, run the script:
$ php/php-5.2.17/sapi/cli/php -d extension_dir=/usr/lib/php5/20060613 test/test.php
Look up the PID of our process and start GDB in another terminal:
$ ps auwx | grep test.php
$ gdb
GNU gdb 6.8-debian
Copyright (C) 2008 Free Software Foundation, Inc.
......
This GDB was configured as "x86_64-linux-gnu".
(gdb)
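A side note on the ps auwx | grep test.php step: the listing it produces usually includes the grep process itself, so the PID has to be picked out by eye. A minimal sketch of a filter that avoids this is shown here. The function name pids_for_script is made up for illustration, and it assumes the usual Linux ps auwx layout, where the PID is the second column and the command starts at the eleventh.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original session).
# Reads `ps auwx` output on stdin and prints the PID (column 2) of every
# line that mentions the given script name, skipping lines whose command
# (column 11, an assumption about the ps layout) is grep or awk itself.
pids_for_script() {
    awk -v name="$1" '
        index($0, name) && $11 !~ /(^|\/)(grep|awk)$/ { print $2 }
    '
}
```

It would be used as ps auwx | pids_for_script test.php. Where pgrep is installed, pgrep -f test.php is probably simpler.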
Attach to our process:
(gdb) attach 7455
Attaching to process 7455
Reading symbols from //php/php-5.2.17/sapi/cli/php...done.
.....
Reading symbols from //libs/lib/libcurl.so.4...done.
Loaded symbols for //libs/lib/libcurl.so.4
.....
0x00007fd9e6c22040 in nanosleep () from /lib/libc.so.6
(gdb)
If we see lines like Reading symbols from [lib]...done, then everything went well and we can safely debug this binary.
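If the attach output is long, it can be saved to a file and scanned for libraries that came without debug information. The sketch below assumes the message format that GDB 6.x typically prints for such files, "Reading symbols from /path...(no debugging symbols found)...done."; newer GDB versions word this differently. The function name files_without_symbols is invented for this example.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original session).
# Reads saved GDB attach output on stdin and prints each file for which
# GDB reported missing debug info. Assumed GDB 6.x line format:
#   Reading symbols from /path/lib.so...(no debugging symbols found)...done.
files_without_symbols() {
    sed -n 's/^Reading symbols from \(.*\)\.\.\.(no debugging symbols found).*$/\1/p'
}
```

Frames from any file it lists will show only function names in the backtrace, as with nanosleep() and sleep() from libc here. That is fine as long as the PHP binary itself (and libcurl, in my case) is not on the list.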
Let's look at the backtrace. First of all we are interested in the frames inside execute() [Zend/zend_vm_execute.h:92]: these are calls to PHP functions. After the backtrace we find out where we currently are in the PHP script. A few explanations: f [number] switches to the given frame, and print [expression] prints a symbol that is in scope in that frame.
(gdb) bt
#0 0x00007fd9e6c22040 in nanosleep () from /lib/libc.so.6
#1 0x00007fd9e6c21e97 in sleep () from /lib/libc.so.6
#2 0x0000000000587277 in zif_sleep (ht=1, return_value=0x278c010, return_value_ptr=0x0, this_ptr=0x0,
return_value_used=0) at /[homedir]/php/php-5.2.17/ext/standard/basic_functions.c:4794
#3 0x000000000068a733 in zend_do_fcall_common_helper_SPEC (execute_data=0x7fff0b7d6310)
at /[homedir]/php/php-5.2.17/Zend/zend_vm_execute.h:200
#4 0x0000000000690204 in ZEND_DO_FCALL_SPEC_CONST_HANDLER (execute_data=0x7fff0b7d6310)
at /[homedir]/php/php-5.2.17/Zend/zend_vm_execute.h:1740
#5 0x000000000068a221 in execute (op_array=0x278ad38)
at /[homedir]/php/php-5.2.17/Zend/zend_vm_execute.h:92
#6 0x00007fd9e655b90f in zend_oe () from /usr/lib/php5/20060613/ZendOptimizer.so
#7 0x000000000068a886 in zend_do_fcall_common_helper_SPEC (execute_data=0x7fff0b7d6570)
at /[homedir]/php/php-5.2.17/Zend/zend_vm_execute.h:234
#8 0x000000000068b3af in ZEND_DO_FCALL_BY_NAME_SPEC_HANDLER (execute_data=0x7fff0b7d6570)
at /[homedir]/php/php-5.2.17/Zend/zend_vm_execute.h:322
#9 0x000000000068a221 in execute (op_array=0x278b8c0)
at /[homedir]/php/php-5.2.17/Zend/zend_vm_execute.h:92
#10 0x00007fd9e655b90f in zend_oe () from /usr/lib/php5/20060613/ZendOptimizer.so
#11 0x000000000068a886 in zend_do_fcall_common_helper_SPEC (execute_data=0x7fff0b7d68a0)
at /[homedir]/php/php-5.2.17/Zend/zend_vm_execute.h:234
#12 0x000000000068b3af in ZEND_DO_FCALL_BY_NAME_SPEC_HANDLER (execute_data=0x7fff0b7d68a0)
at /[homedir]/php/php-5.2.17/Zend/zend_vm_execute.h:322
#13 0x000000000068a221 in execute (op_array=0x2787b88)
at /[homedir]/php/php-5.2.17/Zend/zend_vm_execute.h:92
#14 0x00007fd9e655b90f in zend_oe () from /usr/lib/php5/20060613/ZendOptimizer.so
#15 0x0000000000665598 in zend_execute_scripts (type=8, retval=0x0, file_count=3)
at /[homedir]/php/php-5.2.17/Zend/zend.c:1134
#16 0x0000000000615608 in php_execute_script (primary_file=0x7fff0b7d8ee0)
at /[homedir]/php/php-5.2.17/main/main.c:2036
#17 0x00000000006dfa82 in main (argc=4, argv=0x7fff0b7d90f8)
at /[homedir]/php/php-5.2.17/sapi/cli/php_cli.c:1165
(gdb)
(gdb) f 13
#13 0x000000000068a221 in execute (op_array=0x2d3fb88)
at /[homedir]/php/php-5.2.17/Zend/zend_vm_execute.h:92
92 if (EX(opline)->handler(&execute_data TSRMLS_CC) > 0) {
(gdb) print execute_data.function_state.function->common.scope->name
$20 = 0x2d423a0 "B"
(gdb) print execute_data.function_state.function->common.function_name
$21 = 0x2d43790 "run"
(gdb) print execute_data.opline->lineno
$22 = 28
(gdb) f 9
#9 0x000000000068a221 in execute (op_array=0x2d438c0)
at /[homedir]/php/php-5.2.17/Zend/zend_vm_execute.h:92
92 if (EX(opline)->handler(&execute_data TSRMLS_CC) > 0) {
(gdb) print execute_data.function_state.function->common.scope->name
$23 = 0x2d42380 "A"
(gdb) print execute_data.function_state.function->common.function_name
$24 = 0x2d44c48 "run"
(gdb) print execute_data.opline->lineno
$25 = 23
(gdb) f 5
#5 0x000000000068a221 in execute (op_array=0x2d42d38)
at /[homedir]/php/php-5.2.17/Zend/zend_vm_execute.h:92
92 if (EX(opline)->handler(&execute_data TSRMLS_CC) > 0) {
(gdb) print execute_data.function_state.function->common.function_name
$26 = 0x770781 "sleep"
(gdb) print execute_data.opline->lineno
$27 = 10
(gdb)
In the example above we got the class name, the method/function name and the line number where it is called (in frame 5 the class name is not defined, because sleep() is a built-in function). In effect, we got a backtrace of the PHP script. This information is already enough to understand where the elusive bug described at the beginning of the article comes from.
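Typing the same three print commands for every execute() frame gets tedious, so here is a rough sketch of how the commands could be generated. The expressions are the ones from the session above. The function name php_frame_cmds and the if guard are my additions for illustration and were not part of the original session. The guard is there because built-in functions such as sleep() have no class scope, and an error from dereferencing it would abort the rest of a GDB command file.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original session).
# Prints GDB commands that show class, function and line for each frame
# number given as an argument. The numbers must be execute() frames taken
# from a previous `bt` (5, 9 and 13 in the session above).
php_frame_cmds() {
    for n in "$@"; do
        printf 'frame %s\n' "$n"
        # Only dereference the scope when the function belongs to a class.
        printf '%s\n' \
            'if execute_data.function_state.function->common.scope != 0' \
            '  print execute_data.function_state.function->common.scope->name' \
            'end' \
            'print execute_data.function_state.function->common.function_name' \
            'print execute_data.opline->lineno'
    done
}
```

One possible way to use it: php_frame_cmds 5 9 13 > php-bt.gdb, then gdb -p 7455 -batch -x php-bt.gdb. The frame numbers depend on the call depth, so they have to be re-read from bt for every hang.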
That's all for today. If there is interest in the topic, next time I will show how to look at the contents of variables and how arrays are arranged in PHP. Thanks for your attention; I hope someone found the material interesting.