Apache 2.x under surveillance, or how to find out even more

    Introduction



    This article continues the series "Apache 2.x under supervision: monitoring web server load". Once again the subject is the Apache module covered earlier [2], mod_performance. At the time of writing, a new version of the module, 0.2, had been published on the module's site [1]. The rest of the article is organized in question-and-answer form.


    What's new in mod_performance 0.2?



    I want to focus once more on what the module is for:
    • the module collects and accumulates statistics on resource usage (CPU, memory, script execution time, and process I/O) by the Apache 2.2 web server;
    • the module allows you to analyze the collected data.


    Briefly, the list of innovations looks like this:
    • The information collected by the module can be saved to a MySQL database.
    • The information collected by the module can be saved to a separate text log in a user-defined format.
    • The amount of data collected has been expanded: statistics are no longer limited to percentages, but are also stored in seconds and megabytes.
    • The information collected by the module can be saved to a PostgreSQL database.
    • A number of bugs affecting stability in version 0.1 have been fixed.
    • Compatibility has been added with Apache-itk configurations, as well as Apache + mod_ruid2.
    • Collection of I/O statistics for the monitored process has been added.
    • The module's impact on the server has been reduced: it no longer returns a 503 error if it cannot connect to the daemon.


    What are the operating principles of the module?



    For those who have not read the previous article [2], I will repeat the basics and add some new details.
    The module lets you track how many resources are consumed by each request the web server receives, saving a portion of data for every completed request.
    Note right away that the data about a request is saved only after the request has completed, i.e. the data is accumulated for history and later analysis. If you are interested in the current server load, use mod_status instead.
    Resource usage is measured not through the scoreboard, as in mod_status and the Perl extensions, but through glibtop.
    The module can track either absolutely all requests or only specific ones filtered by a rule based on regular expressions. To be precise, the module ALWAYS processes only those requests that match a filter containing a regular expression.
    Now let me describe how request statistics are collected.
    When the Apache web server starts, the mod_performance daemon is launched. The running daemon opens a unix socket and listens for connections. While processing a request, the server checks whether statistics should be saved for it. If the check succeeds, the server process that accepted the connection sends the daemon its information together with the PID (TID) of the process/thread that will handle the request. The daemon starts two threads: 1) the first waits for the final transfer of data from the process handling the request; 2) the second periodically polls the memory usage of that process and keeps the maximum value. When the process handling the request finishes, the daemon writes the data to the statistics database.
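
    To make this flow easier to picture, below is a minimal sketch of such a daemon loop in Python. It is only my illustration of the description above, not the module's actual code (the real daemon is part of the Apache module and is written in C); the socket path, message layout and helper names are assumptions.

    import os
    import socket
    import threading

    SOCKET_PATH = "/statistics/apache/perfsock"   # example path, see PerformanceSocket below

    def monitor(pid, finished):
        # Placeholder for the polling thread: sample the memory usage of `pid`
        # every 10 ms until the request finishes (see the memory sketch further down).
        while not finished.wait(0.01):
            pass

    def handle(conn):
        # The worker first announces the PID/TID that will handle the request ...
        pid = int(conn.recv(64).decode().strip())
        finished = threading.Event()
        threading.Thread(target=monitor, args=(pid, finished), daemon=True).start()
        # ... and when the request ends it sends the final statistics,
        # which the daemon would then write to the statistics database.
        final_stats = conn.recv(4096)
        finished.set()
        conn.close()

    def serve():
        if os.path.exists(SOCKET_PATH):
            os.unlink(SOCKET_PATH)
        srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        srv.bind(SOCKET_PATH)
        srv.listen(16)
        while True:
            conn, _ = srv.accept()
            threading.Thread(target=handle, args=(conn,), daemon=True).start()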

    How are statistics about CPU usage collected?



    The CPU usage figure is calculated as follows. When a request arrives, the module takes jiffies readings (for the system as a whole and for the current process); when the request ends, the measurement is repeated and the data is sent to the daemon. CPU usage is then derived from those readings. In other words, if while watching top you saw loads of 0%, 10%, 100% and 20%, do not expect the module to save 100%: what is saved is the share of CPU time that was allocated specifically to this process over the lifetime of the request. Estimated "by eye" for the example above, that figure would be about 32% ((0 + 10 + 100 + 20) / 4 = 32.5%).
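
    As an illustration of this calculation, here is a rough Python sketch of how such a jiffies-based share can be computed from /proc. This is my reading of the description above, not the module's code (the module itself obtains its readings through glibtop):

    def proc_jiffies(pid):
        """utime + stime of the process, in jiffies (fields 14 and 15 of /proc/[pid]/stat)."""
        with open("/proc/%d/stat" % pid) as f:
            stat = f.read()
        # The command name (field 2) is parenthesised and may contain spaces,
        # so split only after the closing parenthesis.
        fields = stat[stat.rfind(")") + 2:].split()
        return int(fields[11]) + int(fields[12])      # utime + stime

    def total_jiffies():
        """Sum of all time columns of the aggregate 'cpu' line in /proc/stat."""
        with open("/proc/stat") as f:
            return sum(int(v) for v in f.readline().split()[1:])

    # At the start of the request:  p0, t0 = proc_jiffies(pid), total_jiffies()
    # At the end of the request:    p1, t1 = proc_jiffies(pid), total_jiffies()
    #                               cpu_percent = 100.0 * (p1 - p0) / (t1 - t0)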

    How are statistics about memory usage collected?



    The memory figure is collected on a different principle. While the request is being processed, the daemon measures the memory usage of the handling process every 10 milliseconds, and at the end saves the maximum value.
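
    A minimal sketch of that polling loop, again only as an illustration (the module itself reads the values through glibtop; here /proc/[pid]/status is read directly):

    import time

    def rss_mb(pid):
        """Resident set size of the process in megabytes, from VmRSS in /proc/[pid]/status."""
        with open("/proc/%d/status" % pid) as f:
            for line in f:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1]) / 1024.0   # the value is reported in kB
        return 0.0

    def peak_memory(pid, request_finished):
        """Poll until the request ends (request_finished is a threading.Event); return the peak seen."""
        peak = 0.0
        while not request_finished.is_set():
            peak = max(peak, rss_mb(pid))
            time.sleep(0.01)                               # 10 ms sampling interval
        return peak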

    How are statistics about I/O usage collected?



    This indicator is monitored in the same way as the CPU one: the process's read and written byte counters are taken at the beginning and at the end of the request. The difference between these values is converted to kilobytes and stored in the database; in effect, the indicator tracks the number of bytes read/written over the lifetime of the request. I should say right away that the source of the indicator is /proc/[pid]/io, namely the read_bytes, write_bytes and cancelled_write_bytes fields.
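
    For illustration, here is a small sketch of reading those counters and taking the difference. How exactly the module factors in cancelled_write_bytes is not specified above, so the sketch simply returns all three deltas:

    def io_counters(pid):
        """read_bytes, write_bytes and cancelled_write_bytes (and the other fields) from /proc/[pid]/io."""
        counters = {}
        with open("/proc/%d/io" % pid) as f:
            for line in f:
                key, value = line.split(":")
                counters[key.strip()] = int(value)
        return counters

    def io_delta_kb(start, end):
        """Per-counter difference over the lifetime of the request, in kilobytes."""
        return {key: (end[key] - start[key]) / 1024.0
                for key in ("read_bytes", "write_bytes", "cancelled_write_bytes")}

    # At the start of the request:  start = io_counters(pid)
    # At the end of the request:    print(io_delta_kb(start, io_counters(pid)))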

    Launch Recommendations



    By default, the module keeps all of its files (the socket, the SQLite database and the global log) in the /etc/httpd/log folder. But as practice has shown, this folder cannot always be used: the daemon often has no rights to it, because it runs under the apache user (I am writing about CentOS, hence apache).
    On the machine where you plan to use the module, I recommend creating a dedicated folder right away, for example /statistics/apache. Make the apache user its owner and allow it to read and write (be careful with the itk and mod_ruid2 modes, so that the module running under the switched user can still write to the socket in this folder).

    PerformanceSocket /statistics/apache/perfsock


    Where to save request statistics?



    This is an important question: where to save the collected statistics. To make it simpler, the new version of the module has built-in support for the SQLite, MySQL and PostgreSQL databases, as well as a more exotic option, saving to a log file. You no longer need to adapt to the module; it adapts to you. For the module to work (in any mode other than saving to a log), at least one of the following libraries must be present on the machine:
    • libsqlite3.so;
    • libmysqlclient_r.so;
    • libpq.so.

    When building the module, mysql-devel, sqlite-devel and postgresql-devel are not required: the libraries are loaded dynamically while the module is running. More precisely, only the library needed for the selected mode is loaded.
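
    The idea is the same as dlopen() in C: the client library is located and loaded only when the corresponding mode is selected. A tiny sketch of that idea in Python, purely as an illustration (the library names come from the list above):

    import ctypes
    import ctypes.util

    def try_load(short_name):
        """Return a handle to the shared library if it is installed, otherwise None."""
        path = ctypes.util.find_library(short_name)   # e.g. "libsqlite3.so.0"
        if path is None:
            return None
        try:
            return ctypes.CDLL(path)
        except OSError:
            return None

    # Which storage backends could be used on this machine?
    backends = {"SQLite": "sqlite3", "MySQL": "mysqlclient_r", "Postgres": "pq"}
    print({mode: try_load(lib) is not None for mode, lib in backends.items()})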

    Example 1. Working with SQLite. The simplest option: the database and the table are created automatically, no users are needed, and so on. There is one very important point for those who already used version 0.1 of the module: the table structures of the old and the new database differ, so for correct operation it is better to delete the old database, because the module does not recreate an existing table.

    To work with SQLite you need:
    PerformanceDB /statistics/apache/perfdb
    PerformanceLogType SQLite
    


    Example 2. Working with MySQL. A more involved option.
    Create a database, for example perf, and a user perf with rights to this database:
    mysql> create database perf;
    mysql> CREATE USER 'perf'@'localhost' IDENTIFIED BY 'perf';
    mysql> GRANT ALL PRIVILEGES ON *.* TO 'perf'@'localhost' WITH GRANT OPTION;
    


    The module will create the table itself. Now, in the module settings:
    PerformanceLogType MySQL
    PerformanceDbUserName perf
    PerformanceDBPassword perf
    PerformanceDBName perf
    


    And again, a very important remark for those who have used builds of version 0.2 earlier than 0.2-8: the table structures of the old and the new database differ, so it is better to delete the old database for correct operation, because the module does not recreate an existing table.

    Example 3. Working with PostgreSQL. Also a more involved option.
    Again, you need to create a database and a user for access:
    postgres=# CREATE USER perf WITH PASSWORD 'perf';
    postgres=# CREATE DATABASE perf;
    postgres=# GRANT ALL PRIVILEGES ON DATABASE perf to perf;
    


    then, in the file /var/lib/pgsql/data/pg_hba.conf:

    local all all trust
    host all all 0.0.0.0/0 trust
    host all all ::1/128 trust
    


    and finally, the module settings:
    PerformanceLogType Postgres
    PerformanceDbUserName perf
    PerformanceDBPassword perf
    PerformanceDBName perf
    


    Example 4. Working with a text log.
    In this mode no additional libraries are required; it is enough to specify a file into which the statistics will be written.

    PerformanceLogType Log
    PerformanceLog /statistics/apache/perf.log
    


    By default, data is written to this file in the following format:

    [%DATE%] from %HOST% (%URI%) script %SCRIPT%: cpu %CPU% (%CPUS%), memory %MEM% (%MEMMB%), execution time %EXCTIME%, IO: R - %BYTES_R% W - %BYTES_W%

    which expands to:
    [2011-06-05 19:28:28] from example.com (/index.php) script /var/www/example.com/index.php: cpu 0.093897 (0.010000), memory 0.558202 (5.597656), execution time 10.298639, IO: R - 104.000000 W - 248.000000
    [2011-06-05 19:28:39] from example.com (/index2.php) script /var/www/example.com/index2.php: cpu 0.000000 (0.000000), memory 0.558202 (5.597656), execution time 10.159158, IO: R - 0.000000 W - 0.000000


    Now in more detail. For this mode you can define the format of the line written to the log. The following predefined macro names are available:

    • %DATE% - replaced with the start date of the request;
    • %CPU% - CPU usage in percent;
    • %MEM% - memory usage in percent;
    • %URI% - request URI;
    • %HOST% - name of the virtual host the request was addressed to;
    • %SCRIPT% - script name;
    • %EXCTIME% - script execution time in seconds;
    • %CPUS% - CPU time spent specifically on this process, in seconds;
    • %MEMMB% - memory usage in megabytes;
    • %BYTES_W% - kilobytes written;
    • %BYTES_R% - kilobytes read;
    • %% - a literal percent sign.

    For example:
    Hello from %HOST% I use %CPU% %% cpu today %DATE%
    will expand to
    Hello from example.com I use 0.23% cpu today 2011-06-05 19:28:28

    Such a log can be global, or each virtual host can have its own; likewise, each host can have its own unique log line format.

    Another important point is that the accumulated-data analysis screen is not available in this mode, i.e. the module's handlers do not work. Log-analysis utilities therefore have to be written separately (a minimal parsing sketch is shown below).
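
    As a starting point, here is a minimal sketch of such a utility in Python. It assumes the log was written in the default format shown above; the field names simply mirror the macro names:

    import re

    LINE_RE = re.compile(
        r"\[(?P<date>[^\]]+)\] from (?P<host>\S+) \((?P<uri>[^)]+)\) "
        r"script (?P<script>\S+): cpu (?P<cpu>[\d.]+) \((?P<cpus>[\d.]+)\), "
        r"memory (?P<mem>[\d.]+) \((?P<memmb>[\d.]+)\), "
        r"execution time (?P<exctime>[\d.]+), "
        r"IO: R - (?P<bytes_r>[\d.]+) W - (?P<bytes_w>[\d.]+)"
    )

    def parse_log(path):
        """Yield one dict per parsable line of the mod_performance text log."""
        with open(path) as f:
            for line in f:
                m = LINE_RE.match(line.strip())
                if m:
                    yield m.groupdict()

    # Example: the ten slowest requests.
    # slowest = sorted(parse_log("/statistics/apache/perf.log"),
    #                  key=lambda r: float(r["exctime"]), reverse=True)[:10]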

    What reports are available in the new version of the module by default?



    The same reports that were available in version 0.1 are available in the new module:
    • Show output without analytics - display the collected information without analysis, filtered by host, script and URI (graphical and text mode);
    • Maximal %CPU - display only records with the maximum %CPU value (taking filtering into account);
    • Maximal memory % - display only records with the maximum memory % value (taking filtering into account);
    • Maximal execution request time - output the longest-running scripts;
    • Host requests statistics - display statistics of requests to hosts sorted in descending order (as a % of the total, taking filters into account);
    • Number of requests per domain - display statistics of requests to hosts sorted in descending order (as a count rather than a percentage);
    • Average usage per host - display the average server load produced by each host (the sum of %CPU, the sum of %MEMORY, the total script execution time, the average %CPU for the period, the average memory usage %, the average script execution time);
    • Show current daemon threads - show the list of requests currently monitored by the daemon (displayed only for the performance-status handler and only with the PerformanceExtended parameter enabled).


    Displayed fields in reports:
    • ID - record identifier;
    • DATE ADD - when the request was processed;
    • HOSTNAME - virtual host name;
    • URI - request URI;
    • SCRIPT - the executed script;
    • CPU (%) - CPU usage in %;
    • MEM (%) - memory usage in %;
    • TIME EXEC (sec) - request execution time;
    • CPU TM (sec) - processor time in seconds;
    • MEM USE (Mb) - memory usage in megabytes;
    • IO READ (Kb) - kilobytes read by the process;
    • IO WRITE (Kb) - kilobytes written by the process.


    Reports are available in the SQLite, MySQL and Postgres modes.

    How to build the module?



    I will repeat this, because there are changes compared with the previous version (for installation under Debian see [4]).
    All actions must be performed as root:
    1) install the packages needed for the build:
    yum install httpd-devel apr-devel libgtop2-devel gd-devel

    2) create a temporary folder for the source code:
    mkdir ~/my_tmp
    cd ~/my_tmp

    3) download and unpack the source code:
    wget http://lexvit.dn.ua/utils/getfile.php?file_name=mod_performance-0.2.tar.gz -O mod_performance-0.2.tar.gz
    tar zxvf mod_performance-0.2.tar.gz
    cd mod_performance-0.2/

    4) build the module:
    make

    5) ignore the warnings; the main thing is that there are no errors. If everything went fine, then:
    make install

    or
    cp .libs/mod_performance.so <path to copy to>


    A reference for the module parameters is available at the link in [3].

    How does the module affect request processing speed?



    From a theoretical point of view the module should have almost no effect on request processing speed, because during the request itself only the CPU information is read; everything else is read by the daemon, and the main load falls on the daemon. The load on the server may grow somewhat, since the daemon has to access the database to record the information; you should also not forget about the memory required by the daemon's threads.
    To study the practical side of this question, a small experiment was run with the ab utility (ApacheBench).

    1st test. A PHP script that puts load on the file subsystem:

    Without the mod_performance module:
    Time taken for tests: 205.952423 seconds
    Complete requests: 100
    Failed requests: 0
    Requests per second: 0.49 [#/sec] (mean)
    Time per request: 10297.621 [ms] (mean)
    Time per request: 2059.524 [ms] (mean, across all concurrent requests)
    


    With mod_performance module:
    Time taken for tests: 206.386260 seconds
    Complete requests: 100
    Failed requests: 0
    Requests per second: 0.48 [#/sec] (mean)
    Time per request: 10319.313 [ms] (mean)
    Time per request: 2063.863 [ms] (mean, across all concurrent requests)
    


    2nd test. A PHP script that puts load on the CPU.

    Without mod_performance module:
    Time taken for tests: 60.333852 seconds
    Complete requests: 100
    Failed requests: 0
    Requests per second: 1.66 [#/sec] (mean)
    Time per request: 3016.692 [ms] (mean)
    Time per request: 603.339 [ms] (mean, across all concurrent requests)
    


    With mod_performance module:
    Time taken for tests: 60.714260 seconds
    Complete requests: 100
    Failed requests: 0
    Requests per second: 1.65 [#/sec] (mean)
    Time per request: 3035.713 [ms] (mean)
    Time per request: 607.143 [ms] (mean, across all concurrent requests)
    


    3rd test. A PHP script that runs quickly and creates no load.

    Without mod_performance module:
    Time taken for tests: 0.075594 seconds
    Complete requests: 100
    Failed requests: 0
    Requests per second: 1322.86 [#/sec] (mean)
    Time per request: 3.780 [ms] (mean)
    Time per request: 0.756 [ms] (mean, across all concurrent requests)
    


    With mod_performance module:
    Time taken for tests: 0.109116 seconds
    Complete requests: 100
    Failed requests: 0
    Requests per second: 916.46 [#/sec] (mean)
    Time per request: 5.456 [ms] (mean)
    Time per request: 1.091 [ms] (mean, across all concurrent requests)
    


    The test machine: a virtual machine with 1 GB of RAM, an AMD Phenom(tm) 8650 Triple-Core processor, running CentOS 5.5.

    The longer the request, the less noticeable the module's influence. The first two tests showed that the module's impact is negligible; the last test showed an increase in request processing time of roughly one and a half times. But given the absolute request execution times involved, this is an acceptable price.

    References



    1. Mod_performance module site - http://lexvit.dn.ua/files/
    2. Previous article about the module - http://habrahabr.ru/blogs/server_side_optimization/119011/
    3. Instructions on module parameters - http://lexvit.dn.ua/articles/?art_id=mod_performance0_2_mht201105267239
    4. Building the module for Debian 6.0 (version 0.1, thanks to Maxim for the article) - http://linuxwork.org.ua/debian/ustanovka-i-nastrojka-modulya-mod_performance-dlya-apache-na-debian-6-0-squeeze/
