Logreplica: collecting logs from the entire cluster into a single point in real time

    I continue to share useful utilities that I use in various projects. This time we will talk about logreplica, a simple tool for reliably transferring logs from the servers of a cluster to a single machine with large disks, in real time. This is very convenient if you want to monitor or analyze logs from the entire cluster centrally, as if they were written directly to a single machine.

    Logreplica was conceived as a more convenient and reliable way to collect logs in a central place than configuring syslog or syslog-ng.

    The advantage of logreplica is its ease of configuration: you only specify the “masks” for the log file names and the addresses of the source machines, and from then on the logs matching the masks are automatically appended on the central machine on the fly (including new log files that appear on the source machines after logreplica has started). When adding a new machine, you do not need to configure anything on it: just include its name in the config file.
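
    To make the idea of a "mask" concrete, here is a minimal Python sketch of how a file path can be checked against masks that use brace groups and `*` wildcards. It is an illustration only, not logreplica's own code: the helper names `expand_braces` and `matches_any` are hypothetical, and the exact matching rules of logreplica (for example, whether `*` may cross a `/`) are an assumption.

```python
import fnmatch
import re


def expand_braces(mask):
    """Expand {a,b,...} groups: '/var/log/{x,y}' -> ['/var/log/x', '/var/log/y']."""
    match = re.search(r"\{([^{}]*)\}", mask)
    if match is None:
        return [mask]
    head, tail = mask[:match.start()], mask[match.end():]
    expanded = []
    for option in match.group(1).split(","):
        # Recurse so that masks with several brace groups are fully expanded.
        expanded.extend(expand_braces(head + option.strip() + tail))
    return expanded


def matches_any(path, masks):
    """Return True if the path matches at least one mask.

    Note: fnmatch lets '*' match '/' as well, which a real tool may not do.
    """
    return any(
        fnmatch.fnmatchcase(path, pattern)
        for mask in masks
        for pattern in expand_braces(mask)
    )


# Two masks of the kind used in the sample config.
masks = ["/var/log/{messages,maillog}", "/var/log/httpd/*_log"]
print(matches_any("/var/log/httpd/access_log", masks))  # True
print(matches_any("/var/log/maillog", masks))           # True
print(matches_any("/var/log/secure", masks))            # False
```

    Because the check is applied to file names rather than to a fixed list, a log file created after startup is picked up as soon as its name matches a mask.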

    Example config, /etc/dklab_logreplica.conf:

    # What directory at the current machine to store logs to.
    destination = /var/log/cluster
    # Skip these log files prefixes while storing to destination directory.
    skip_destination_prefixes = /var/log:/var/lib/pgsql/data/logs
    # Path to internal state directory (may not change).
    scoreboard = /var/run/dklab_logreplica.scoreboard
    # Time interval to sleep before log growth checks.
    delay = 0.25
    # Default user to connect to remote hosts.
    user = root
    # What files to monitor at all of logs source machines.
    [files]
    /var/log/{messages,maillog}
    /var/log/httpd/*_log
    # What hosts to connect to gather logs from.
    [hosts]
    first = machine1.example.com
    second=nobody@machine2.example.com
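
    The interplay of destination and skip_destination_prefixes is easiest to see on an example. The Python sketch below shows one plausible mapping from a remote log path to a local one. It rests on assumptions that the config does not spell out: that each host gets its own subdirectory named after its alias from [hosts], and that only the first matching prefix is removed. The function name `destination_path` and the file name `postgresql.log` are hypothetical.

```python
import posixpath


def destination_path(destination, host_alias, source_path, skip_prefixes):
    """Map a remote log path to a local path, dropping the first matching prefix."""
    relative = source_path
    for prefix in skip_prefixes:
        prefix = prefix.rstrip("/")
        # Match whole path components only, so '/var/log' does not match '/var/logs'.
        if source_path == prefix or source_path.startswith(prefix + "/"):
            relative = source_path[len(prefix):]
            break
    # Assumption: one subdirectory per host alias under the destination directory.
    return posixpath.join(destination, host_alias, relative.lstrip("/"))


# The prefix list is colon-separated, as in the sample config.
skip = "/var/log:/var/lib/pgsql/data/logs".split(":")

print(destination_path("/var/log/cluster", "first", "/var/log/httpd/access_log", skip))
# /var/log/cluster/first/httpd/access_log
print(destination_path("/var/log/cluster", "second", "/var/lib/pgsql/data/logs/postgresql.log", skip))
# /var/log/cluster/second/postgresql.log
```

    Stripping the common prefixes keeps the tree under the destination directory shallow, while paths that match no prefix keep their full structure.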
    

    How to install and configure logreplica


    Nothing needs to be configured on the source machines: they become part of the system as soon as they are listed in the central config file. So dozens of servers can be connected to logreplica without touching them: all settings live in a single file, /etc/dklab_logreplica.conf.
    1. Select the server on which all the logs will be collected; copy the logreplica distribution to /opt/dklab_logreplica/.
    2. Copy the init script dklab_logreplica.init to /etc/init.d and configure it to start at boot time (see chkconfig, update-rc.d, or the equivalent utility of your Linux distribution).
    3. Copy the default config dklab_logreplica.conf.sample to /etc/dklab_logreplica.conf and modify it according to your system:
      • specify the machines from which the logs will be collected;
      • specify the masks for the names of the log files to be monitored;
      • specify the directory where the logs will be copied in real time.
    4. Create a private/public key pair using ssh-keygen -t rsa. Install the public key on each machine the logs will be pulled from: ssh-copy-id root@machine-to-be-pulled. As a result, you should have passwordless access from the log server to all source machines; otherwise logreplica will not work.
    5. Now run /etc/init.d/dklab_logreplica start, and logreplica will begin to track changes in the log files on the source machines and store the data in the destination directory.
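
    Since everything depends on passwordless SSH access, it can be worth checking it for every host before starting the daemon. The Python sketch below is one way to do that and is not part of logreplica. It assumes that a host entry has the form `[user@]host` with the config's default user as a fallback; the helper names are hypothetical. The ssh options `BatchMode=yes` (fail instead of asking for a password) and `ConnectTimeout` are standard OpenSSH options.

```python
import subprocess


def parse_host(entry, default_user="root"):
    """Split '[user@]host' into (user, host), falling back to the default user."""
    user, sep, host = entry.rpartition("@")
    return (user if sep else default_user), host


def ssh_check_command(entry, default_user="root"):
    """Build an ssh command that fails instead of prompting for a password."""
    user, host = parse_host(entry, default_user)
    return ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5",
            f"{user}@{host}", "true"]


def can_connect(entry, default_user="root"):
    """Return True if key-based login works (this really runs ssh)."""
    result = subprocess.run(ssh_check_command(entry, default_user),
                            capture_output=True)
    return result.returncode == 0


# Only print the commands here; call can_connect() on the log server itself.
for entry in ["machine1.example.com", "nobody@machine2.example.com"]:
    print(" ".join(ssh_check_command(entry)))
```

    If `can_connect` returns False for a host, repeat the ssh-copy-id step for that host before starting logreplica.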
    The logreplica daemon reconnects to a source machine if it becomes unreachable because of network problems. All the data accumulated in the logs during the downtime is then transferred, so nothing is lost. Logreplica also tracks log rotation on the source machines and handles it correctly.
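
    The general technique behind "nothing is lost" can be sketched as follows: remember, per file, how far you have read, and on every check copy only what lies beyond that point. The Python sketch below works on a local file and is only an illustration under stated assumptions; it is not logreplica's actual implementation. The function `read_new_data` and the `state` dictionary (a stand-in for the scoreboard) are hypothetical, and rotation is detected simply by a changed inode or a shrunken file. A real tool would also have to fetch the unread tail of the rotated file, which this sketch skips.

```python
import os
import tempfile


def read_new_data(path, state):
    """Return the bytes appended since the last call.

    state keeps {'inode': ..., 'offset': ...} between calls, so reading
    resumes where it stopped, no matter how long the pause was.
    """
    st = os.stat(path)
    if state.get("inode") != st.st_ino or st.st_size < state.get("offset", 0):
        # A new inode or a shrunken file: treat it as rotated or truncated
        # and start again from the beginning.
        state["inode"], state["offset"] = st.st_ino, 0
    with open(path, "rb") as f:
        f.seek(state["offset"])
        data = f.read()
    state["offset"] += len(data)
    return data


with tempfile.TemporaryDirectory() as tmp:
    log = os.path.join(tmp, "messages")
    state = {}
    with open(log, "w") as f:
        f.write("line 1\n")
    print(read_new_data(log, state))  # b'line 1\n'
    os.rename(log, log + ".1")        # simulate log rotation
    with open(log, "w") as f:
        f.write("line 2\n")
    print(read_new_data(log, state))  # b'line 2\n'
```

    With the offset stored on the collecting side, a network outage only delays the transfer: the next successful check picks up everything written in the meantime.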

    Problems solved by logreplica


    Suppose a cluster has several physical or virtual machines that perform various tasks (for example, an SQL server, a web frontend, a balancer, a mail server, etc.). Sometimes you want to have the logs from all these machines in a single place, for example to compute various statistics or simply to watch conveniently (even if only with tail -f) what is happening in the whole system.

    Of course, you can configure syslog or syslog-ng to send all the logs over the network to a central log server. If you do so, however, you will run into a number of inconveniences:
    1. During temporary interruptions in communication between machines (which, unfortunately, happen more often than we would like), pieces of the logs may be lost.
    2. In practice, it’s rather difficult to maintain the syslog or syslog-ng configuration synchronized with the real state of affairs in the cluster: new machines can be added, the set of log files can change, etc.
    3. Not all services support sending logs to syslog (for example, Apache and nginx write them directly to files to minimize delays). So you have to use named pipes to feed the data to syslog, which means that the configs and dependencies keep growing.
    4. Often you want to replicate log files to the central machine by mask, rather than by hard-coding the name of each log file. Syslog and syslog-ng cannot do this.
    5. Finally, in practice it turns out to be very convenient to store logs not only on the central machine, but also on the machine where they are created (a narrower “rotation window” can be configured on the source machine). So your syslog and syslog-ng configs grow yet again.
    The logreplica utility was created to solve all these problems. The logs simply “magically” appear on the central machine in real time, without any changes to the settings of the services or the remote machines.

    Links: logreplica on dklab.ru, logreplica on GitHub. Write in the comments if you know of similar tools or see the described problem differently.
