WordPress Page Caching

    Recently, many posts on this topic have appeared on Habr, but in essence they all boil down to: "Look, I installed Varnish / W3 Total Cache and can handle a million requests on a 'Hello world' page." This article is aimed more at geeks who want to learn how it all works and write their own page caching plugin.

    Why?


    This is the standard question every developer faces before reinventing the wheel. Indeed, there are many ready-made plugins, and many of them are of quite high quality, but you need to understand that they are designed primarily for static blogs. What should you do if your WordPress site is not a standard one?

    Let's get started


    What tools does WordPress provide us with?


    As everyone knows, this CMS makes it easy to extend its functionality with plugins, but not everyone knows that there are several types of plugins:
    • regular plugins
      are located in wp-content/plugins;
      the administrator can freely install, activate and deactivate them;
    • must-use plugins
      are located in wp-content/mu-plugins;
      these plugins are loaded automatically and cannot be deactivated;
    • system plugins (drop-ins)
      are located in wp-content;
      they allow you to override core classes or add your own functionality to them.
      These include:
      • sunrise.php
        Loaded at the very beginning of core initialization. Most often used for domain mapping;
      • db.php
        Allows you to override the standard class for working with the database;
      • object-cache.php
        Allows you to override the standard object caching class, for example if you want to use Memcached or Redis;
      • advanced-cache.php
        Allows you to implement page caching, which is what we need!


    advanced-cache.php


    For this plugin to work, it must be placed in the wp-content directory, and the following line must be added to wp-config.php:
    define('WP_CACHE', true);

    If you look at the WordPress code, you can see that this script is loaded at an early stage of the platform loading.
    // wp-settings.php:63
    // For an advanced caching plugin to use. Uses a static drop-in because you would only want one.
    if ( WP_CACHE )
    	WP_DEBUG ? include( WP_CONTENT_DIR . '/advanced-cache.php' ) : @include( WP_CONTENT_DIR . '/advanced-cache.php' );

    Also, after the core is loaded, WordPress will try to call the wp_cache_postload() function, but more on that later.
    // wp-settings.php:226
    if ( WP_CACHE && function_exists( 'wp_cache_postload' ) )
    	wp_cache_postload();


    Storage


    For storing the cache, it is best to use fast storage, since the speed of serving content from the cache depends directly on it. I would not recommend MySQL or the file system; Memcached, Redis, or other in-memory stores will handle this much better.

    I personally like Redis because it is quite simple to use, has good read/write speed, and as a nice bonus persists a copy of the data to disk, so we do not lose information when the server reboots.
    $redis = new Redis();
    // connect to the server
    $redis->connect( 'localhost' );
    // store $value under the key $key for $timeout seconds
    $redis->set( $key, $value, $timeout );
    // get the data stored under the key $key
    $redis->get( $key );
    // delete the data stored under the key $key
    $redis->del( $key );

    Of course, this is not a complete list of methods. The full API can be studied on the official website, but for most tasks this is enough.

    If the site uses a persistent object cache (object-cache.php), then it makes sense to use its API:
    wp_cache_set( $key, $value, $group, $timeout );
    wp_cache_get( $key, $group );
    wp_cache_delete( $key, $group );


    The simplest page caching


    The code has been deliberately simplified: many checks have been removed so as not to confuse the reader with unnecessary constructs and to keep the focus on the caching logic itself. In the advanced-cache.php file, write:
    // if the object cache is used as storage, it must be initialized manually,
    // because it has not been loaded yet at this stage
    wp_start_object_cache();
    // build the key
    // most often this is the page URL
    $key = 'host:' . md5( $_SERVER['HTTP_HOST'] ) . ':uri:' . md5( $_SERVER['REQUEST_URI'] );
    // fetch the data from the cache by key
    if( $data = wp_cache_get( $key, 'advanced-cache' ) ) {
        // if the data exists, output it and stop execution
        $html = $data['html'];
        die($html);
    }
    // if there is no data, continue execution
    // do not cache admin panel requests
    if( ! is_admin() ) {
        // capture the output buffer
        ob_start( function( $html ) use( $key ) {
            $data = [
                'html' => $html,
                'created' => current_time('mysql'),
                'execute_time' => timer_stop(),
            ];
            // once the page is generated, store the data in the cache for 10 minutes
            wp_cache_set($key, $data, 'advanced-cache', MINUTE_IN_SECONDS * 10);
            return $html;
        });
    }


    That's all: you now have the simplest working page cache. Now let's look at each part in more detail.

    Key creation
    $key = 'host:' . md5( $_SERVER['HTTP_HOST'] ) . ':uri:' . md5( $_SERVER['REQUEST_URI'] );
    In this case, the key is the URL of the page. Using the $_SERVER superglobal directly and hashing it is not good practice, but it is fine for a simple example. I advise adding separator prefixes such as "host:" and "uri:" to the key, since they are convenient to match with patterns. For example, to get all the keys for a specific host:
    $keys = $redis->keys( 'host:' . md5( 'site.com' ) . ':*' );
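
    The key format and the matching per-host pattern can be wrapped in two small helpers. This is only a sketch: the function names page_cache_key and page_cache_host_pattern are hypothetical and not part of WordPress or Redis.

```php
<?php
// Hypothetical helper (not a WordPress function): builds the page-cache key
// in the same "host:<md5>:uri:<md5>" format used in this article.
function page_cache_key( string $host, string $uri ): string {
    return 'host:' . md5( $host ) . ':uri:' . md5( $uri );
}

// Hypothetical helper: glob-style pattern that matches every key of one host.
// Redis key patterns are glob-style, not regular expressions.
function page_cache_host_pattern( string $host ): string {
    return 'host:' . md5( $host ) . ':*';
}

// Possible usage inside advanced-cache.php:
//     $key  = page_cache_key( $_SERVER['HTTP_HOST'], $_SERVER['REQUEST_URI'] );
//     $keys = $redis->keys( page_cache_host_pattern( 'site.com' ) );
```

    Keep in mind that the Redis KEYS command walks the whole keyspace, so on a large production database the incremental SCAN command is usually the safer choice for this kind of lookup.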

    Serving from the cache
    // fetch the data from the cache by key
    if( $data = wp_cache_get( $key, 'advanced-cache' ) ) {
        // if the data exists, output it and stop execution
        $html = $data['html'];
        die($html);
    }
    Everything is simple here: if the cache entry already exists, we serve it to the user and stop execution.

    Saving to the cache
    The PHP ob_start function captures all subsequent output into a buffer and allows it to be processed when the script ends. Simply put, we get the entire site output in the $html variable.
    ob_start( function( $html ) {
        // $html is the HTML code of the finished page
        return $html;
    });

    Next, save the data in the cache:
    $data = [
        'html' => $html,
        'created' => current_time('mysql'),
        'execute_time' => timer_stop(),
    ];
    wp_cache_set($key, $data, 'advanced-cache', MINUTE_IN_SECONDS * 10);

    It makes sense to save not only the HTML but also other useful information, such as the cache creation time. I highly recommend saving the HTTP headers as well, at least Content-Type, and sending them when serving from the cache.
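
    One possible way to store and replay headers is sketched below. The helper page_cache_filter_headers and its allow-list are assumptions made for illustration; headers_list() and header() are standard PHP functions.

```php
<?php
// Hypothetical helper: keeps only the response headers worth replaying
// from the cache. The default allow-list (Content-Type only) is an
// assumption; extend it to suit your site.
function page_cache_filter_headers( array $headers, array $allowed = [ 'content-type' ] ): array {
    $kept = [];
    foreach ( $headers as $line ) {
        // Each line looks like "Name: value"; compare names case-insensitively.
        $name = strtolower( trim( explode( ':', $line, 2 )[0] ) );
        if ( in_array( $name, $allowed, true ) ) {
            $kept[] = $line;
        }
    }
    return $kept;
}

// Possible usage when saving (inside the ob_start callback):
//     $data['headers'] = page_cache_filter_headers( headers_list() );
// Possible usage when serving a hit, before printing $data['html']:
//     foreach ( $data['headers'] as $line ) { header( $line ); }
```

    An allow-list is used here rather than a block-list so that headers such as Set-Cookie are never replayed to another visitor by accident.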

    Improvements


    In the example above, we used the is_admin() function to exclude the admin panel from caching, but this method is not very practical for two reasons:
    • requests to admin-ajax.php are never cached;
    • if an administrator is the first to visit a page, the admin bar and other things ordinary users should not see will end up in the cache.

    The best solution for a simple site is not to use the cache for logged-in users (administrators) at all. Since advanced-cache.php is executed before the core is fully loaded, we cannot use the is_user_logged_in() function, but we can check whether an authentication cookie is present (as you know, WordPress does not use sessions).
    // check for a wordpress_logged_in_* cookie
    $is_logged = count( preg_grep( '/wordpress_logged_in_/', array_keys( $_COOKIE ) ) ) > 0;
    // save the cache only for users who are not logged in
    if( ! $is_logged ) {
        ob_start( function( $html ) use( $key ) {
            // ....
            return $html;
        });
    }
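
    The same cookie check can be written as a reusable function that also covers other cookies that personalize a page. This is a sketch under assumptions: the function name page_cache_should_bypass is hypothetical, and the default prefix list (login, comment author and password-protected post cookies) is one reasonable choice rather than a complete one. Unlike the regular expression above, it matches prefixes only.

```php
<?php
// Hypothetical helper: returns true when any cookie name starts with one
// of the given prefixes, meaning the page cache should be bypassed.
// The default prefixes are an assumption; adjust them for your site.
function page_cache_should_bypass(
    array $cookies,
    array $prefixes = [ 'wordpress_logged_in_', 'comment_author_', 'wp-postpass_' ]
): bool {
    foreach ( array_keys( $cookies ) as $name ) {
        foreach ( $prefixes as $prefix ) {
            // Cookie names may be numeric array keys, so cast to string first.
            if ( strncmp( (string) $name, $prefix, strlen( $prefix ) ) === 0 ) {
                return true;
            }
        }
    }
    return false;
}

// Possible usage inside advanced-cache.php:
//     if ( ! page_cache_should_bypass( $_COOKIE ) ) { /* serve or save the cache */ }
```

    The check should guard both reading and writing: a logged-in user should neither receive a cached anonymous page nor populate the cache with their own view.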


    Complicating the task


    Let's say our site gives different content to users from different regions or countries. In this case, the cache key should be not only the page URL, but also the region:
    $region = get_region_by_client_ip( $_SERVER['REMOTE_ADDR'] );
    $key = 'host:' . md5( $_SERVER['HTTP_HOST'] ) . ':uri:' . md5( $_SERVER['REQUEST_URI'] ) . ':region:' . md5( $region );

    Following this principle, we can create separate caches for different user groups based on any parameters.
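
    A key builder that accepts any number of such parameters might look like the sketch below. The function name page_cache_key_with_vary and the parameter names in the usage comment are hypothetical.

```php
<?php
// Hypothetical helper: builds "host:<md5>:uri:<md5>" and appends one
// ":<name>:<md5>" segment per extra parameter the content varies by.
function page_cache_key_with_vary( string $host, string $uri, array $vary = [] ): string {
    $key = 'host:' . md5( $host ) . ':uri:' . md5( $uri );
    // Sort by parameter name so the same set always yields the same key,
    // regardless of the order in which the caller listed the parameters.
    ksort( $vary );
    foreach ( $vary as $name => $value ) {
        $key .= ':' . $name . ':' . md5( (string) $value );
    }
    return $key;
}

// Possible usage:
//     $key = page_cache_key_with_vary(
//         $_SERVER['HTTP_HOST'],
//         $_SERVER['REQUEST_URI'],
//         [ 'region' => $region ]
//     );
```

    Every extra parameter multiplies the number of cache entries per page, so it is worth varying only by values that really change the output.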

    wp_cache_postload()


    This function is called after the core is loaded, and in some cases it is also convenient to use.
    From experience, I can say that this option is much more stable:
    function wp_cache_postload() {
        add_action( 'wp', function () {
            ob_start( function( $html ) {
                // ...
                return $html;
            });
        }, 0);
    }

    By the time wp_cache_postload() is called, the add_action function already exists and can be used.

    There are situations when generating the cache key requires data that cannot be obtained from cookies, the IP address or other sources available at the initialization stage. For example, you may need to generate an individual cache for each user (sometimes this makes sense).
    function wp_cache_postload() {
        $key = 'host:' . md5( $_SERVER['HTTP_HOST'] ) . ':uri:' . md5( $_SERVER['REQUEST_URI'] ) 
            . ':user:' . get_current_user_id();
        if( $data = wp_cache_get( $key, 'advanced-cache' ) ) {
            $html = $data['html'];
            die($html);
        }
        // pass $key into both closures so the save step can use it
        add_action( 'wp', function () use( $key ) {
            ob_start( function( $html ) use( $key ) {
                // ...
                return $html;
            });
        }, 0);
    }

    As you can see in the example, all the logic is placed in the body of wp_cache_postload, and all the platform functions are already available here, including get_current_user_id(). This option is a bit slower than the previous one, but it gives us endless possibilities for fine-tuning the page cache.

    What you should not forget


    1. These examples are heavily simplified. If you use them in your projects, do not be too lazy to add caching conditions:
      • only GET requests
      • only if there are no errors on the page
      • only if the response does not set cookies
      • only if the status is 200 or 301
    2. The effectiveness of the cache depends on its lifetime. When increasing $timeout, take the time to think through cache invalidation when data changes.
    3. WP-Cron starts later than advanced-cache.php, so with a high cache hit rate it may simply not run.
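
    Most of the caching conditions from the first point can be gathered into a single check, sketched below. The function name page_cache_is_cacheable is hypothetical, and the sketch does not cover the "no errors on the page" condition, which depends on how your site reports errors.

```php
<?php
// Hypothetical helper: decides whether a generated response may be stored.
// Covers three conditions: GET only, status 200 or 301, no Set-Cookie header.
function page_cache_is_cacheable( string $method, int $status, array $headers ): bool {
    if ( strtoupper( $method ) !== 'GET' ) {
        return false;
    }
    if ( ! in_array( $status, [ 200, 301 ], true ) ) {
        return false;
    }
    foreach ( $headers as $line ) {
        // Header names are case-insensitive, hence stripos at position 0.
        if ( stripos( $line, 'Set-Cookie:' ) === 0 ) {
            return false;
        }
    }
    return true;
}

// Possible usage inside the ob_start callback, before wp_cache_set():
//     $ok = page_cache_is_cacheable(
//         $_SERVER['REQUEST_METHOD'],
//         http_response_code(),
//         headers_list()
//     );
```

    Running the check inside the output buffer callback matters, because the final status code and headers are only known once the page has been generated.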


    Conclusion


    There is nothing complicated about writing your own page caching. Of course, it makes no sense for a typical site, but if you have spawned a monster, this material should be useful.
