Node.js in production (building a cluster)
- Translation
- Tutorial
When you run node.js applications in production, you have to think about stability, performance, security, and ease of maintenance. This article describes my thoughts on best practices for running node.js in production.
By the end of this guide you will have a system of three servers: a load balancer (lb) and two application servers (app1 and app2). The balancer monitors server availability and distributes traffic between them. The application servers use a combination of systemd and node.js clustering to balance traffic across multiple node processes on each server. You will be able to roll out code with a single command from your machine, with no service interruptions or dropped requests.
All this can be represented in the form of a diagram:
Image credit: DigitalOcean
From the translator: with the spread of the isomorphic approach to building web applications, more and more developers face the need to run Node.js in production. I liked this article by Jeff Dickey for its practical approach and its broad view of the topic.
UPD (2018): Fixed links to the author’s github.
About this article
This article is addressed to those who are just starting to set up servers for production. You should, however, have a general understanding of the process: know what upstart, systemd, or init are, and what process signals in unix are. I suggest you try this guide on your own servers (but still use my demo code). I will also provide some useful configuration settings and scripts that should serve as a good starting point when setting up your own environment.
The final version of the application is here: https://github.com/jdxcode/node-sample .
In this guide I will use Digital Ocean and Fedora. However, the article is written as independent of the technology stack as possible.
I will use Digital Ocean servers with vanilla Fedora 20. I have run through this guide several times, so you should have no trouble reproducing these steps.
Why Fedora?
All Linux distributions (except Gentoo) are moving from their various init systems to systemd. Ubuntu (perhaps the most popular distribution in the world) has not switched yet, but has already announced that it will, so I think it would be wrong to teach you Upstart.
systemd offers several significant advantages over Upstart, including advanced centralized logging, simplified configuration, performance, and many more features.
Install Node.js
First you need to install node.js on a fresh server. On Digital Ocean it took me just four commands.
bootstrap.sh
yum update -y
yum install -y git nodejs npm
npm install -g n
n stable
Here we install node via yum (which may give us an outdated version), then install the excellent n package, which can install and switch between different node versions. We will use it to upgrade node.js.
Now run
# node --version
and you will see the latest node version. Later I will show how to automate this step with Ansible.
Create a web user
Since it is not safe to run applications as root, we will create a separate web user.
To do this, run:
# useradd -mrU web
Add application
Now that we have a node server, we can add our application:
Create a directory:
# mkdir /var/www
Set the web owner for it:
# chown web /var/www
As well as the web group:
# chgrp web /var/www
Change into it:
# cd /var/www/
Switch to our user:
$ su web
Clone the repository with the Hello world application:
$ git clone https://github.com/jdxcode/node-hello-world
This is the simplest node application:
app.js
var http = require('http');
var PORT = process.env.PORT || 3000;
http.createServer(function (req, res) {
console.log('%d request received', process.pid);
res.writeHead(200, {'Content-Type': 'text/plain'});
res.end('Hello world!\n');
}).listen(PORT);
console.log('%d listening on %d', process.pid, PORT);
Start it:
$ node app.js
You can now access the server by IP in a browser and see that the application is working:
Note: you may need to flush iptables or open the port on the firewall:
# iptables -F
# firewall-cmd --permanent --zone=public --add-port=3000/tcp
Another note: by default the application runs on port 3000. To serve it on port 80 you would need a proxy server (like nginx), but in our setup the application servers run on port 3000 and the balancer (on another server) listens on port 80.
systemd
Now that we know how to start the application server, we need to add it to systemd so that it restarts after a crash.
We will use the following systemd script:
node-sample.service
[Service]
WorkingDirectory=/var/www/node-hello-world
ExecStart=/usr/bin/node app.js
Restart=always
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=node-hello-world
User=web
Group=web
Environment='NODE_ENV=production'
[Install]
WantedBy=multi-user.target
Copy this file (as root) to
/etc/systemd/system/node-sample.service
Enable it:
# systemctl enable node-sample
Start it:
# systemctl start node-sample
Check the status:
# systemctl status node-sample
Look at the logs:
# journalctl -u node-sample
Try killing the node process by pid and watch it come back up!
Process clustering
Now that we can run a single process of our application, we need to use node's built-in clustering, which will automatically distribute traffic across several processes.
Here is a script you can use to run the Node.js application as a cluster. Just put this file next to
app.js
and run:
$ node boot.js
This script will launch 2 instances of the application, and will restart them if necessary. It also performs a seamless restart when it receives a SIGHUP signal.
Let's try it. Change what
app.js
returns, run
$ kill -hup [pid]
and refresh the browser to see the new response. The script restarts one process at a time, providing a seamless restart.
For the clustered version of the application to work, you must update the systemd configuration. Adding
ExecReload=/bin/kill -HUP $MAINPID
lets systemd itself perform a seamless restart when it receives
# systemctl reload node-sample
Here is an example node-sample.service for the clustered version:
[Service]
WorkingDirectory=/var/www/node-hello-world
ExecStart=/usr/bin/node boot.js
ExecReload=/bin/kill -HUP $MAINPID
Restart=always
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=node-sample
User=web
Group=web
Environment='NODE_ENV=production'
[Install]
WantedBy=multi-user.target
Balancing
In production you need at least two servers in case one of them goes down. I would not run a real system on just one. Keep in mind that servers go offline not only when they break, but also when you take one down for maintenance. The balancer checks server availability, and if it notices a problem, it removes that server from rotation.
First, set up the second application server by repeating the previous steps. Then create a new server in Digital Ocean (or elsewhere) and connect to it via ssh.
Install HAProxy:
# yum install haproxy
Replace the file
/etc/haproxy/haproxy.cfg
with the following (substitute the IPs of your own servers):
haproxy.cfg
defaults
    log global
    mode http
    option httplog
    option dontlognull
    option http-server-close
    option forwardfor
    option redispatch
    retries 3
    timeout http-request 10s
    timeout queue 1m
    timeout connect 10s
    timeout client 1m
    timeout server 1m
    timeout http-keep-alive 10s
    timeout check 10s

frontend main *:80
    stats enable
    stats uri /haproxy?stats
    stats auth myusername:mypass
    default_backend app

backend app
    balance roundrobin
    server app1 107.170.145.120:3000 check
    server app2 192.241.205.146:3000 check
Now restart HAProxy:
systemctl restart haproxy
You should now see the application running on port 80 of the balancer. You can go to
/haproxy?stats
to see the HAProxy status page; log in with
myusername/mypass
For more on configuring HAProxy, see the guide I used, or the official documentation.
Deploying code using Ansible
Most server configuration guides end here, but I don't think a guide is complete without covering deployment! Without automation the process is not too scary:
- Connect via SSH to app1 and go to the project directory:
cd /var/www/node-hello-world
- Pull the latest code:
git pull
- Reload the application:
systemctl reload node-sample
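Those manual steps can be sketched as a small shell loop. The host names below are placeholders, and the loop only prints each command rather than running it:

```shell
#!/bin/sh
# Dry-run sketch of the manual deploy loop (hypothetical host names).
DEPLOY_CMD='cd /var/www/node-hello-world && git pull && systemctl reload node-sample'
for host in app1 app2; do
  cmd="ssh root@$host \"$DEPLOY_CMD\""
  echo "$cmd"   # dry run; replace this echo with: eval "$cmd"
done
```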
The main drawback is that you have to repeat this on every server, which takes time. With Ansible we can roll out the code straight from our own machine and reload the application properly.
People are afraid of Ansible. Many assume it is as sophisticated as Chef or Puppet, but it is actually closer to Fabric or Capistrano. In the simplest case it just connects to servers over ssh and runs commands. No agents, no master servers, no complex cookbooks: just commands. It does have excellent provisioning capabilities, but you don't have to use them.
Here is the Ansible file that just deploys the code:
deploy.yml
---
- hosts: app
tasks:
- name: update repo
git: repo=https://github.com/jdxcode/node-hello-world version=master dest=/var/www/node-hello-world
sudo: yes
sudo_user: web
notify:
- reload node-sample
handlers:
- name: reload node-sample
service: name=node-sample state=reloaded
production
[app]
192.241.205.146
107.170.233.117
Run it from your development machine (make sure you have Ansible installed):
ansible-playbook -i production deploy.yml
The file named production is the inventory file: it simply lists the addresses of all servers and their roles. The .yml file is called a playbook; it defines the tasks to run. Ours pulls the latest code from github. If there are changes, the "notify" step fires and reloads the application server; if nothing changed, the handler does not run. If you want to install npm packages, say, you can do it here too. Also, make sure you use
npm shrinkwrap
if you don't commit your dependencies to the repository.
Note: if you want to use a private git repository, you will need to set up SSH agent forwarding.
Ansible for provisioning
Ideally, we would automate the provisioning of an application server so we don't have to repeat all the steps by hand each time. For that we can use the following Ansible playbook to set up an application server:
app.yml
---
- hosts: app
tasks:
- name: Install yum packages
yum: name={{item}} state=latest
with_items:
- git
- vim
- nodejs
- npm
- name: install n (node version installer/switcher)
npm: name=n state=present global=yes
- name: install the latest stable version of node
shell: n stable
- name: Create web user
user: name=web
- name: Create project folder
file: path=/var/www group=web owner=web mode=755 state=directory
- name: Add systemd conf
template: src=systemd.service.j2 dest=/etc/systemd/system/node-sample.service
notify:
- enable node-sample
handlers:
- name: enable node-sample
shell: systemctl enable node-sample
systemd.service.j2
[Service]
WorkingDirectory={{project_root}}
ExecStart=/usr/bin/node boot.js
ExecReload=/bin/kill -HUP $MAINPID
Restart=always
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier={{project_name}}
User=web
Group=web
Environment='NODE_ENV=production'
[Install]
WantedBy=multi-user.target
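Note that the template above references {{project_root}} and {{project_name}}, which must be defined somewhere in your Ansible setup. The exact place the author used is not shown here, but a vars section in app.yml (or a group_vars file) would work; the values below are illustrative:

```yaml
# Illustrative variable definitions for the systemd template above.
vars:
  project_root: /var/www/node-hello-world
  project_name: node-sample
```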
Run it as follows:
ansible-playbook -i [inventory file] app.yml
There is a similar playbook for the balancer.
Final application
Here is the final result of all these steps. To bring up the application you need to: update the inventory file, provision the servers, and run the application deployment.
Test environment?
Creating a new environment is easy. Add one more inventory file (like the ansible/production one) for the test environment and refer to it when you call
ansible-playbook
Testing
Test your system. Even setting other reasons aside, it is genuinely fun to look for ways to bring your cluster down. Use Siege to generate load. Try sending kill -9 to various processes. Shut a server down completely. Send random signals to processes. Fill up the disk. Just look for things that can ruin your cluster and guard against dips in your uptime percentage.
What can be improved
No cluster is perfect, and this one is no exception. I would be comfortable putting it into production, but here are some things that could be hardened later:
Haproxy failover
HAProxy is currently a single point of failure, albeit a reliable one. We could remove it with DNS failover, which is not instantaneous and would mean a few seconds of downtime while the DNS record propagates. I am not so much worried about HAProxy crashing on its own as about human error when changing its configuration.
Serial Deploys
In case a deployment breaks the cluster, I would set up rolling (serial) deployment in Ansible to push changes out gradually, checking server availability along the way.
Dynamic inventory files
This may matter to some more than it does to me. In this guide we had to keep server addresses in the source tree. You can configure Ansible to request the list of hosts from Digital Ocean (or another provider) dynamically. You can even create new servers that way. Then again, creating a server on Digital Ocean is hardly the hard part.
Centralized logging
JSON logs are great if you want to aggregate and search them easily. I would look at Bunyan for this.
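To make the idea concrete: Bunyan-style logging is essentially one JSON object per line, which even a dependency-free sketch can demonstrate. The makeLogger function and its field names here are my own invention, not Bunyan's API:

```javascript
// Minimal JSON-lines logger sketch (not Bunyan itself; just the idea).
function makeLogger(name) {
  return function log(level, fields, msg) {
    var record = Object.assign({
      name: name,
      pid: process.pid,
      level: level,
      time: new Date().toISOString(),
      msg: msg
    }, fields);
    console.log(JSON.stringify(record)); // one self-describing JSON object per line
    return record;
  };
}

var log = makeLogger('node-sample');
var rec = log('info', { path: '/' }, 'request received');
```

Because each line is valid JSON, tools can filter and aggregate on any field without fragile regex parsing.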
It would also be great to have the logs from all servers flow into one place. You could use something like Loggly, or try other approaches.
Error reporting and monitoring
There are many solutions for error collection and monitoring. I didn't like any of the ones I tried, so I won't presume to recommend anything. If you know a good tool for this, please mention it in the comments.
I also recommend Joyent's excellent guide to running Node.js in production; it has many additional tips.
That's all! We built a simple, stable cluster of Node.js. Let me know if you have any ideas how to improve it!
From the translator: thank you for reading. This is my first attempt at translation, and I'm sure not everything came out right, so please send corrections, typos, and formatting problems via private messages.