Monitoring file changes in Node.js

Original author: Dave Johnson

This article, which we are publishing in translation, covers how to monitor file changes in Node.js. Its author, Dave Johnson, needed a file monitoring system while building an IoT project for feeding aquarium fish. When a family member feeds the fish, they press one of three buttons: a button on an expansion board connected to a Raspberry Pi, an Amazon Dash button, or a button in a web interface. Any of these actions writes a line to a log file with the date, time, and type of event. Looking at the contents of this file shows whether it is time to feed the fish. Here is a fragment of it:

2018-5-21 19:06:48|circuit board
2018-5-21 10:11:22|dash button
2018-5-20 11:46:54|web
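
As an illustration of this format, each line can be split at the | character into a timestamp and an event source. The following is a minimal sketch; the function name parseLogLine and the field names timestamp and source are assumptions made for this example, not part of the original project.

```javascript
// Hypothetical helper: splits one log line of the form
// "<date> <time>|<source>" into its two parts.
function parseLogLine(line) {
  const separatorIndex = line.indexOf('|');
  if (separatorIndex === -1) {
    return null; // not a valid log line
  }
  return {
    timestamp: line.slice(0, separatorIndex).trim(),
    source: line.slice(separatorIndex + 1).trim(),
  };
}

const entry = parseLogLine('2018-5-21 19:06:48|circuit board');
console.log(entry.timestamp); // 2018-5-21 19:06:48
console.log(entry.source);    // circuit board
```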

Once the log file is being written correctly, the Node.js-based system needs to react to changes in this file and take the necessary actions. Several approaches to this problem are considered and analyzed below.

File Monitoring Packages and Node.js Built-in Features


This article explores the file monitoring capabilities built into Node.js. Such tasks can be solved using only Node's own tools, without resorting to third-party packages. However, if you do not mind external dependencies, or simply want a working solution as quickly as possible without going into details, you can use packages such as chokidar and node-watch. These are great libraries built on Node’s internal file system monitoring capabilities. They are easy to use and do their job well, so if you need to monitor files without digging into how Node implements this, they will serve you. If, besides a practical result, you are also interested in how the corresponding Node subsystems work, let's examine them together.

First steps


In order to explore various Node tools for organizing file monitoring, we first create and configure a new project. We will focus on novice Node developers, so we will describe everything in sufficient detail.

So, in order to create a project, create a new folder and go to it using the terminal. In the terminal, execute the following command:

$ npm init -y

In response, the system will create a package.json file for the Node.js project.
Now install the log-timestamp package from npm and save it in package.json as a dependency:

$ npm install --save log-timestamp

The log-timestamp package prepends a timestamp to messages printed to the console with console.log. This makes it possible to see when the events associated with file monitoring occur. The package is needed here purely for educational purposes; if you are preparing something similar for production use, you will not need log-timestamp.

Using fs.watchFile


The built-in Node.js fs.watchFile method may seem like a logical choice for organizing monitoring the state of our log file. The callback passed to this method will be called every time the file is modified. We will test fs.watchFile.

const fs = require('fs');
require('log-timestamp');
const buttonPressesLogFile = './button-presses.log';
console.log(`Watching for file changes on ${buttonPressesLogFile}`);
fs.watchFile(buttonPressesLogFile, (curr, prev) => {
  console.log(`${buttonPressesLogFile} file Changed`);
});

Here we start monitoring changes in the file button-presses.log. The callback is called after the file changes.

The callback function is passed two arguments of type fs.Stats: an object describing the current state of the file (curr) and an object describing its previous state (prev). This makes it possible, for example, to find out the time of the previous file modification using prev.mtime.
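
As a minimal sketch of how these two objects might be used, the hypothetical helper below (describeChange is a name chosen for this example) compares the two modification times and reports how much time passed between them. It relies only on the mtime property, which fs.Stats objects expose as a Date.

```javascript
// Hypothetical helper: summarizes a change using the two fs.Stats
// objects that fs.watchFile passes to its callback.
function describeChange(curr, prev) {
  const elapsedMs = curr.mtime.getTime() - prev.mtime.getTime();
  if (elapsedMs === 0) {
    return 'modification time unchanged';
  }
  return `modified ${Math.round(elapsedMs / 1000)} s after the previous change`;
}

// Stand-in objects with the same mtime shape as fs.Stats, for demonstration only.
const prevStats = { mtime: new Date('2018-05-21T00:00:00Z') };
const currStats = { mtime: new Date('2018-05-21T00:00:05Z') };
console.log(describeChange(currStats, prevStats));
// modified 5 s after the previous change
```

Inside a watcher callback, describeChange(curr, prev) could be logged in place of the fixed "file Changed" message.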

If you run the fs.watchFile example, then open button-presses.log and make changes to it, the program will respond and the corresponding entry will appear in the console.

$ node file-watcher.js
[2018-05-21T00:54:55.885Z] Watching for file changes on ./button-presses.log
[2018-05-21T00:55:04.731Z] ./button-presses.log file Changed

When experimenting, you may notice a delay between the moment the file is modified and the moment the message about it appears in the console. Why? By default, fs.watchFile polls files for changes every 5.007 seconds. This interval can be changed by passing fs.watchFile an options object containing the interval property:

fs.watchFile(buttonPressesLogFile, { interval: 1000 }, (curr, prev) => {
  console.log(`${buttonPressesLogFile} file Changed`);
});

Here we set the polling interval to 1000 milliseconds, thereby indicating that we want the system to poll our log file every second.

Note that the fs.watchFile documentation states that the callback is invoked whenever the file is accessed. While preparing this material I worked with Node v9.8.0, and in my case the system behaved differently: the callback was called only when changes were made to the watched file.

Using fs.watch


The fs.watch method is a much better way to monitor files. While fs.watchFile spends system resources on polling files, fs.watch relies on the operating system and its notifications of file system changes. The documentation says that Node uses inotify on Linux, FSEvents on macOS, and ReadDirectoryChangesW on Windows to receive asynchronous notifications when files change (compare this with synchronous file polling). The performance gain from using fs.watch instead of fs.watchFile is even more significant when, for example, you need to monitor all the files in a directory, since the first argument to fs.watch can be the path to either a specific file or a folder. We will test fs.watch.

const fs = require('fs');
require('log-timestamp');
const buttonPressesLogFile = './button-presses.log';
console.log(`Watching for file changes on ${buttonPressesLogFile}`);
fs.watch(buttonPressesLogFile, (event, filename) => {
  if (filename) {
    console.log(`${filename} file Changed`);
  }
});

Here we observe what happens with the log file, and, upon detecting the changes, we display the corresponding message in the console.

Change the log file and see what happens. The output below comes from running the example on a Raspberry Pi (Raspbian), so what you see on your system may look different. Here is what happened after the file was changed.

$ node file-watcher.js
[2018-05-21T00:55:52.588Z] Watching for file changes on ./button-presses.log
[2018-05-21T00:56:00.773Z] button-presses.log file Changed
[2018-05-21T00:56:00.793Z] button-presses.log file Changed
[2018-05-21T00:56:00.802Z] button-presses.log file Changed
[2018-05-21T00:56:00.813Z] button-presses.log file Changed

It turns out interesting: one change was made to the file, and a handler that responded to the file change was called four times. The number of these events varies by platform. It is possible that one change causes several events due to the fact that the operation of writing a file to disk lasts for a certain period of time X, and the system detects several changes to the file in this period of time. In order to get rid of such “false positives”, we need to modify our solution and make it less sensitive.

There is one technical detail of fs.watch worth noting. This method lets you respond to events that occur either when a file is renamed (rename events) or when its contents change (change events). If we need precision and want to observe only changes in the contents of the file, the code must be brought to the following state:

const fs = require('fs');
require('log-timestamp');
const buttonPressesLogFile = './button-presses.log';
console.log(`Watching for file changes on ${buttonPressesLogFile}`);
fs.watch(buttonPressesLogFile, (event, filename) => {
  if (filename && event === 'change') {
    console.log(`${filename} file Changed`);
  }
});

In our case this modification does not fundamentally change anything, but the technique may come in handy if you build your own file monitoring system. It should also be noted that, when experimenting with this code, the rename event was detected when Node ran under Windows, but not under Raspbian.
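
Because fs.watch also accepts a directory path, the same event filter can be applied to every file in a folder. The sketch below is one possible shape for this, not code from the project: watchDirectory and shouldReport are hypothetical names, and the check for a missing filename reflects the Node.js documentation's note that filename is not guaranteed to be provided on every platform.

```javascript
const fs = require('fs');

// Hypothetical filter: report only 'change' events that carry a file name.
function shouldReport(eventType, filename) {
  return eventType === 'change' && Boolean(filename);
}

// Hypothetical wrapper: watches a directory and calls onChange(filename)
// for content changes only. Returns the fs.FSWatcher so the caller can
// stop watching with watcher.close().
function watchDirectory(directory, onChange) {
  return fs.watch(directory, (eventType, filename) => {
    if (shouldReport(eventType, filename)) {
      onChange(filename);
    }
  });
}

console.log(shouldReport('change', 'button-presses.log')); // true
console.log(shouldReport('rename', 'button-presses.log')); // false
console.log(shouldReport('change', null));                 // false
```

The watcher itself is not started here; calling watchDirectory('.', name => console.log(`${name} file Changed`)) would begin watching the current folder.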

Attempted Improvement # 1: Compare File Modification Times


We need the handler to be called only when real changes are made to the log file. Therefore, we will try to improve the fs.watch code by checking the file's modification time, which should let us identify real changes and avoid false positives.

const fs = require('fs');
require('log-timestamp');
const buttonPressesLogFile = './button-presses.log';
console.log(`Watching for file changes on ${buttonPressesLogFile}`);
let previousMTime = new Date(0);
fs.watch(buttonPressesLogFile, (event, filename) => {
  if (filename) {
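    // Stat the watched path itself: filename is only a base name and
    // would be resolved against the current working directory.
    const stats = fs.statSync(buttonPressesLogFile);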
    if (stats.mtime.valueOf() === previousMTime.valueOf()) {
      return;
    }
    previousMTime = stats.mtime;
    console.log(`${filename} file Changed`);
  }
});

Here we store the previous modification time of the file in the variable previousMTime and call console.log only when the modification time changes. The idea seems sound, and everything should now work as we need. Let's check.

$ node file-watcher.js
[2018-05-21T00:56:50.167Z] Watching for file changes on ./button-presses.log
[2018-05-21T00:56:55.611Z] button-presses.log file Changed
[2018-05-21T00:56:55.629Z] button-presses.log file Changed
[2018-05-21T00:56:55.645Z] button-presses.log file Changed

The result, unfortunately, does not look much better than what we saw last time. Obviously, the system (Raspbian in this case) generates many events in the process of saving the file, and in order to avoid seeing unnecessary messages, we will have to find another way to improve the code.

Attempted Improvement # 2: Compare MD5 Checksums


We will compute an MD5 hash (checksum) of the file contents, and then recompute it on every file change event that fs.watch reports. Perhaps we can get rid of the unnecessary change messages if we take the state of the file contents into account.

To do this, we first need to install the md5 package.

$ npm install --save md5

Now we will use this package and write code designed to detect real file changes using a checksum.

const fs = require('fs');
const md5 = require('md5');
require('log-timestamp');
const buttonPressesLogFile = './button-presses.log';
console.log(`Watching for file changes on ${buttonPressesLogFile}`);
let md5Previous = null;
fs.watch(buttonPressesLogFile, (event, filename) => {
  if (filename) {
    const md5Current = md5(fs.readFileSync(buttonPressesLogFile));
    if (md5Current === md5Previous) {
      return;
    }
    md5Previous = md5Current;
    console.log(`${filename} file Changed`);
  }
});

In this code, we use an approach reminiscent of the one we used to compare the file modification time, but here we analyze the changes in the file contents using its checksum. Let's see how this code behaves in practice.

$ node file-watcher.js
[2018-05-21T00:56:50.167Z] Watching for file changes on ./button-presses.log
[2018-05-21T00:59:00.924Z] button-presses.log file Changed
[2018-05-21T00:59:00.936Z] button-presses.log file Changed

Unfortunately, this is still not what we need. The system probably generates file change events while the file is still being saved.
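
As a side note, the checksum does not have to come from a third-party package. As an alternative to the md5 package used above (this is not the approach taken in the article), Node's built-in crypto module can compute the same digest. The helper name md5Hex is an assumption made for this sketch.

```javascript
const { createHash } = require('crypto');

// Hypothetical helper: returns the MD5 digest of a string or Buffer
// as a hexadecimal string, using only Node's built-in crypto module.
function md5Hex(data) {
  return createHash('md5').update(data).digest('hex');
}

console.log(md5Hex('hello')); // 5d41402abc4b2a76b9719d911017c592
```

In the watcher, md5Hex(fs.readFileSync(buttonPressesLogFile)) could stand in for the md5(...) call.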

Recommended Way to Use fs.watch


We examined various ways of using fs.watch, but did not achieve what we wanted. All is not lost, though: in searching for a solution we learned a lot of useful things. Let's make another attempt. This time we will use the technique of suppressing event “bounce” (debouncing): we introduce a small delay into our code so that, after handling one event, we do not respond to further file change events within the specified time window.

const fs = require('fs');
require('log-timestamp');
const buttonPressesLogFile = './button-presses.log';
console.log(`Watching for file changes on ${buttonPressesLogFile}`);
let fsWait = false;
fs.watch(buttonPressesLogFile, (event, filename) => {
  if (filename) {
    if (fsWait) return;
    fsWait = setTimeout(() => {
      fsWait = false;
    }, 100);
    console.log(`${filename} file Changed`);
  }
});

This debounce logic was written with some help from StackOverflow users. As it turned out, a delay of 100 milliseconds is enough to display just one message for a single file change. At the same time, the solution still works when the file changes fairly frequently. This is how the output of the program now looks.

$ node file-watcher.js
[2018-05-21T00:56:50.167Z] Watching for file changes on ./button-presses.log
[2018-05-21T01:00:22.904Z] button-presses.log file Changed

As you can see, this works fine. We have found a working formula for building a file monitoring system. If you look at the code of npm packages for Node that monitor file changes, you will find that many of them implement functions for filtering out “bounce”. We took a similar approach, building a solution on standard Node mechanisms, which allowed us not only to solve the problem but also to learn something new.
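
The same idea can be packaged as a small reusable helper. The sketch below is an illustration rather than the code used in the project: createEventGate is a hypothetical name, and the injectable now parameter is an assumption added so that the behavior can be checked without real timers. Like the fsWait flag above, it lets the first event through and ignores further events until the window has elapsed.

```javascript
// Hypothetical helper: returns a function that answers "should this
// event be handled?". The first event passes; later events are ignored
// until windowMs milliseconds have elapsed since the last accepted one.
function createEventGate(windowMs, now = Date.now) {
  let lastAccepted = -Infinity;
  return function shouldHandle() {
    const current = now();
    if (current - lastAccepted < windowMs) {
      return false; // still inside the quiet window
    }
    lastAccepted = current;
    return true;
  };
}

// Demonstration with a fake clock instead of real file events.
let fakeTime = 0;
const gate = createEventGate(100, () => fakeTime);
console.log(gate()); // true  (first event)
fakeTime = 20;
console.log(gate()); // false (20 ms later, suppressed)
fakeTime = 150;
console.log(gate()); // true  (window has elapsed)
```

In a watcher callback, a line such as if (!shouldHandle()) return; would take the place of the fsWait check and the setTimeout call.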

Finally, I want to note that the “bounce” suppression can be combined with the MD5 checksum comparison, so that messages are displayed only when the file contents have really changed, and not when an event fires without any real change to the file.

const fs = require('fs');
const md5 = require('md5');
require('log-timestamp');
const buttonPressesLogFile = './button-presses.log';
console.log(`Watching for file changes on ${buttonPressesLogFile}`);
let md5Previous = null;
let fsWait = false;
fs.watch(buttonPressesLogFile, (event, filename) => {
  if (filename) {
    if (fsWait) return;
    fsWait = setTimeout(() => {
      fsWait = false;
    }, 100);
    const md5Current = md5(fs.readFileSync(buttonPressesLogFile));
    if (md5Current === md5Previous) {
      return;
    }
    md5Previous = md5Current;
    console.log(`${filename} file Changed`);
  }
});

Perhaps all this looks a little complicated, and in 99% of cases it is not necessary, but I think it provides some food for thought.

Summary


In Node.js, you can monitor file changes and execute code in response to them. In the aquarium IoT project, this makes it possible to monitor the log file to which fish-feeding records are written.

There are many situations in which monitoring files is helpful. Note that fs.watchFile is not recommended for this purpose, since it detects changes by polling the file system at regular intervals. Instead, consider fs.watch combined with a function that suppresses the "bounce" of events.

Dear readers! Do you use file change monitoring mechanisms in your Node.js projects?
