How to track file downloads from your WordPress site



    I needed to track file downloads from a site (images, documents, videos, distributions, ...), which regular statistics services cannot do without changing the file URLs. The statistics also had to appear in the usual place (for example, Google Analytics or Firebase).

    After going through several plugins (most have the words Download and Manager in the name), I found that they all work by manually compiling a list of files to monitor. Many of them also implement protection against unauthorized downloads, which is redundant for this task. You could use them, but with a lot of files you end up with two problems:

    • creating an entry for each file is tedious and slow;
    • files can change location, and then the entry has to be fixed again.

    As a result, we wrote our own implementation as a WordPress plugin: you simply specify a directory (as a path relative to the site root), and the plugin tracks downloads of everything in it.

    For those who need nothing more than the summary above, the link to the free plugin is here. What follows covers example statistics and details of the technical implementation.

    Where statistics go


    So far, two basic destinations for aggregated statistics are supported.

    Google Analytics


    Statistics are published as events. Each event carries a category (Event Category), the URI of the file as its action (Event Action), and, if the corresponding setting is enabled, the query parameters as its label (Event Label). As a result, you can conveniently watch the download dynamics of each file in a directory in the Google Analytics console.
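
    A minimal sketch of how such an event could be assembled, assuming the Universal Analytics Measurement Protocol field names (v, tid, cid, t, ec, ea, el). The function name dlstat_build_ga_event and all argument values are hypothetical and not taken from the plugin.

```php
<?php
// Sketch only: builds the body of one Measurement Protocol "event" hit.
// dlstat_build_ga_event() and the sample values below are hypothetical.
function dlstat_build_ga_event(
    string $trackingId,
    string $clientId,
    string $category,
    string $fileUri,
    string $queryParams = ''
): string {
    $fields = [
        'v'   => '1',          // protocol version
        'tid' => $trackingId,  // property ID (placeholder format UA-XXXXX-Y)
        'cid' => $clientId,    // anonymous client ID
        't'   => 'event',      // hit type
        'ec'  => $category,    // Event Category
        'ea'  => $fileUri,     // Event Action: URI of the downloaded file
    ];
    if ($queryParams !== '') {
        $fields['el'] = $queryParams; // Event Label: query parameters, if enabled
    }
    // RFC 3986 encoding so that "/" and "=" inside values are percent-encoded.
    return http_build_query($fields, '', '&', PHP_QUERY_RFC3986);
}

echo dlstat_build_ga_event('UA-XXXXX-Y', '555', 'Downloads', '/mypath/doc.pdf', 'ver=2'), "\n";
```

    The resulting string would be sent as the body of a POST request to the Measurement Protocol endpoint. The label is left out entirely when there are no query parameters.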


    WordPress Database Table


    Mainly for debugging. Here downloads are simply counted; changes over time are not visible. Table fields: IP, file URI, query parameters (if any), and a counter. The data can be viewed with any SQL editor (e.g. phpMyAdmin).

    Each record is assigned an ID so that it can be deleted individually if necessary.
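
    The counting itself can be sketched roughly as follows. This assumes a table with the columns ip, uri, query_params and cnt plus a unique key over the first three. The column names, the table name and the function dlstat_count_download are all hypothetical, and the plugin's real schema may differ. Only $wpdb->prepare() and $wpdb->query() are actual WordPress calls.

```php
<?php
// Sketch only. Assumed (hypothetical) table layout:
//   id INT AUTO_INCREMENT PRIMARY KEY, ip, uri, query_params, cnt,
//   UNIQUE KEY (ip, uri, query_params)
// $wpdb is the WordPress database object (global $wpdb inside WordPress).
function dlstat_count_download($wpdb, string $table, string $ip, string $uri, string $queryParams): bool
{
    // The first download inserts a row with cnt = 1; repeats hit the unique
    // key and only increment the counter.
    $sql = $wpdb->prepare(
        "INSERT INTO {$table} (ip, uri, query_params, cnt) VALUES (%s, %s, %s, 1) "
        . "ON DUPLICATE KEY UPDATE cnt = cnt + 1",
        $ip,
        $uri,
        $queryParams
    );
    return $wpdb->query($sql) !== false;
}
```

    Inside WordPress this would be called with the global $wpdb and a table name built from $wpdb->prefix.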


    Interception of file access


    File downloads are served by the Apache web server itself, so a handler was added to .htaccess that redirects such requests to a PHP script.

    System files of the types htaccess, php, js and css are deliberately excluded. To minimize response time, the script is invoked through the seraph_dlstat_api parameter of index.php, which is checked almost immediately after WordPress has loaded the scripts it needs for processing. This is done on the do_parse_request filter hook, one of the earliest callbacks available once the whole working environment has loaded (wp-load.php execution). The rules look like this:



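    RewriteEngine On
    # Skip system files that must be served directly
    RewriteCond %{REQUEST_URI} !\.(htaccess|php|js|css)$
    # Track only the monitored directory (/mypath/ is a placeholder)
    RewriteCond %{REQUEST_URI} ^/mypath/(.*)
    # Hand the request to index.php, passing the original URI along
    RewriteRule ^(.*) /index\.php\?seraph_dlstat_api=Get&uri=%{REQUEST_URI} [L,QSA]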
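
    On the PHP side, the handler might look roughly like the sketch below. Only the seraph_dlstat_api and uri parameters come from the rewrite rule above. The function names dlstat_resolve_path and dlstat_handle_request are hypothetical, and the path check is an extra precaution I would suggest rather than something the plugin is known to do.

```php
<?php
// Sketch only: maps the "uri" parameter to a real file below the document
// root and rejects anything that escapes it (for example via "../").
function dlstat_resolve_path(string $docRoot, string $uri): ?string
{
    $root = realpath($docRoot);
    $real = realpath($docRoot . '/' . ltrim($uri, '/'));
    if ($root === false || $real === false || !is_file($real)) {
        return null;
    }
    $prefix = $root . DIRECTORY_SEPARATOR;
    return strncmp($real, $prefix, strlen($prefix)) === 0 ? $real : null;
}

// Callback for the do_parse_request filter (first argument is a boolean).
function dlstat_handle_request($continue)
{
    if (($_GET['seraph_dlstat_api'] ?? '') !== 'Get') {
        return $continue; // not a tracked download: let WordPress carry on
    }
    $file = dlstat_resolve_path($_SERVER['DOCUMENT_ROOT'] ?? '', (string) ($_GET['uri'] ?? ''));
    if ($file === null) {
        http_response_code(404);
        exit;
    }
    // A real handler would record the download here and send the usual
    // headers (Content-Type and so on) before streaming.
    header('Content-Length: ' . filesize($file));
    readfile($file);
    exit; // skip the rest of the WordPress page routing
}

// add_filter() exists only inside WordPress; the guard keeps the file loadable elsewhere.
if (function_exists('add_filter')) {
    add_filter('do_parse_request', 'dlstat_handle_request', 1);
}
```

    Returning the unchanged filter value for every other request keeps normal page handling intact.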

    Next, the script processes and records the URI and returns the contents of the file through PHP's readfile function. Partial downloads via the HTTP Range header (HTTP_RANGE) are also supported; in that case the file is read in blocks.
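
    To illustrate the partial-download case, here is a minimal sketch of parsing a single-range header and streaming that range in blocks. The function names and the block size are hypothetical. Multi-range requests are simply treated as unsupported here, which may not match the plugin's behavior.

```php
<?php
// Sketch only: parses "bytes=first-last", "bytes=first-" or "bytes=-suffix".
// Returns [start, end] (inclusive offsets) or null if the range is unusable.
function dlstat_parse_range(string $header, int $size): ?array
{
    if ($size <= 0 || !preg_match('/^bytes=(\d*)-(\d*)$/', trim($header), $m)) {
        return null; // malformed, or a multi-range request
    }
    [, $first, $last] = $m;
    if ($first === '' && $last === '') {
        return null;
    }
    if ($first === '') {                  // suffix form: the last N bytes
        $length = (int) $last;
        if ($length === 0) {
            return null;
        }
        $start = max(0, $size - $length);
        $end = $size - 1;
    } else {
        $start = (int) $first;
        $end = $last === '' ? $size - 1 : min((int) $last, $size - 1);
    }
    return ($start >= $size || $start > $end) ? null : [$start, $end];
}

// Streams bytes $start..$end of $file in blocks instead of loading it whole.
function dlstat_send_range(string $file, int $start, int $end, int $blockSize = 8192): void
{
    $handle = fopen($file, 'rb');
    if ($handle === false) {
        return;
    }
    fseek($handle, $start);
    $left = $end - $start + 1;
    while ($left > 0 && !feof($handle)) {
        $chunk = fread($handle, min($blockSize, $left));
        if ($chunk === false || $chunk === '') {
            break;
        }
        echo $chunk;
        $left -= strlen($chunk);
    }
    fclose($handle);
}
```

    Before streaming, a handler would normally answer with status 206 and a Content-Range header of the form "bytes start-end/size", where $_SERVER['HTTP_RANGE'] supplies the request header.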

    Delayed data sending


    To minimize response time, statistics can be sent asynchronously. When a file is accessed, a record is created in the database and the file is returned to the client immediately. Later, when the WordPress scheduler (WP Cron) fires, the accumulated records are read from the table in a batch and the statistics are sent.

    For Google Analytics this is acceptable, because it accepts delayed messages: each one can specify how long ago the event occurred.

    By default, WP Cron is triggered by page loads. To further optimize response time, you can have the system scheduler trigger WP Cron instead.
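
    A rough sketch of the batch step, assuming the Measurement Protocol's qt (queue time, in milliseconds) parameter is what carries the delay. The created_at timestamp column and the function name dlstat_batch_with_queue_time are hypothetical; the table described earlier does not list a time field.

```php
<?php
// Sketch only: turns queued rows into event fields with a queue time.
// Each row is assumed to hold 'uri' and 'created_at' (Unix timestamp).
function dlstat_batch_with_queue_time(array $rows, int $now): array
{
    $hits = [];
    foreach ($rows as $row) {
        $ageSeconds = max(0, $now - (int) $row['created_at']);
        $hits[] = [
            'ea' => (string) $row['uri'],  // Event Action: file URI
            'qt' => $ageSeconds * 1000,    // queue time in milliseconds
        ];
    }
    return $hits;
}

$rows = [
    ['uri' => '/mypath/doc.pdf', 'created_at' => 1700000000],
    ['uri' => '/mypath/img.png', 'created_at' => 1700000090],
];
print_r(dlstat_batch_with_queue_time($rows, 1700000120));
```

    Google's documentation warns that hits queued for more than about four hours may not be processed, so the scheduler interval should stay well below that.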
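
    One common way to do this is sketched below. DISABLE_WP_CRON and wp-cron.php are standard WordPress; the domain and the five-minute interval are placeholders.

```php
<?php
// In wp-config.php, above the "stop editing" comment:
// stop WordPress from triggering scheduled tasks on page loads.
define('DISABLE_WP_CRON', true);

// Then let the system scheduler call wp-cron.php instead, for example with
// a crontab entry like this (example.com and */5 are placeholders):
//   */5 * * * * curl -s "https://example.com/wp-cron.php?doing_wp_cron" > /dev/null 2>&1
```

    With this in place, page requests no longer pay for the cron check, and queued statistics go out at the interval set in crontab.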

    Conclusion


    As a result, from the client's point of view a file download is indistinguishable from standard handling by the web server, yet it can now be tracked.

    I would be grateful for any feedback.
