12 tips for scaling Node.js

    Node.js already operates successfully at global scale, as evidenced by the applications deployed on it by companies such as Netflix, Reddit, Walmart, and eBay. However, it has its own set of scaling challenges, both in terms of scaling the number of people working on a single code base and in terms of vertical and horizontal scaling in the cloud. Drawing on my personal experience scaling Node.js at companies such as Reddit and Netflix, and on conversations with experts working on Microsoft Azure, I have put together some tips for scaling Node.js at your company.

    Write quality Node.js code


    The sooner you start using linters, formatters, and type-checking tools in your code base, the better.

    Introducing these tools in the middle of a project can be complicated, because of the potentially large amount of refactoring involved and the noise it adds to your git history, but in the end they will help keep your code readable.
    If you are not using them yet, turn your eyes toward ESLint and Prettier. ESLint protects your code from bad patterns, while Prettier automatically formats your code before each pull request.
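
    For example, a minimal configuration wiring the two together might look like the sketch below; it assumes the eslint-config-prettier package is installed so that ESLint's formatting rules don't fight with Prettier.

        // .eslintrc.js — a minimal sketch, not a prescription
        module.exports = {
          root: true,
          env: { node: true, es2022: true },
          parserOptions: { ecmaVersion: 2022, sourceType: 'module' },
          // 'prettier' (from eslint-config-prettier) disables the rules
          // that would conflict with Prettier's formatting
          extends: ['eslint:recommended', 'prettier'],
          rules: {
            'no-unused-vars': 'warn', // surface dead code without failing builds
          },
        };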

    A more substantial step is to add a type system, such as Flow or TypeScript, to your code base. These tools catch more subtle errors, such as calling a function with a numeric parameter instead of a string one, or calling the .filter method on an object instead of an array. Despite the complexity and the need to train your team, these tools deserve your attention: they can speed up development through IntelliSense and prevent runtime errors through type safety.

    Write tests


    Tests have always been a contentious topic among developers. Some believe thoroughly in test-driven development, while others rarely write tests at all. But there is a middle ground:

    • Identify key modules and write comprehensive unit tests for them. Pay special attention to more than just the "happy paths": cover edge cases and scenarios where errors can occur. For the remaining modules, write one or two unit tests covering the happy path and any common failure cases you can anticipate.
    • Write minimal UI tests. The UI changes constantly, and it is often impractical to spend a lot of time maintaining tests for code that will change frequently.
    • Write tests to catch bugs. Whenever you find and fix a bug in the code, write a single test that will catch that bug in the future (see the sketch after this list).
    • Write a few integration tests to make sure all the parts fit together.
    • Write even fewer end-to-end tests. Cover the key paths through your site: for example, on an e-commerce site you might write a test that signs in, adds a product to the cart, and goes through checkout. These tests are expensive to maintain, so keep only a small core of them that you can motivate yourself to support.
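
    To illustrate the bug-catching point, here is a minimal sketch of a regression test using Node's built-in node:test runner (available in Node 18+); the parsePrice module and the bug it once had are hypothetical.

        // regression.test.js — pins a previously fixed (hypothetical) bug
        const test = require('node:test');
        const assert = require('node:assert');
        const { parsePrice } = require('./parse-price'); // hypothetical module

        test('parsePrice handles whole-dollar prices (regression)', () => {
          // This once returned NaN for prices without a decimal part;
          // the test keeps that bug fixed forever.
          assert.strictEqual(parsePrice('$19'), 1900);
        });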

    The yardstick for how many tests to write is your ability to deploy new code with confidence. Write as many tests as you need to feel that confidence, but try not to write many more than the list above suggests.

    Stateless Design


    The key to writing scalable Node.js is that your servers must not store state for anyone or anything; server-side state prevents horizontal scaling. Move the state into a separate application and solve the problem elsewhere (for example, in Redis or etcd). This is worth thinking about in advance, because it becomes very difficult to untangle later if you have not done it from the start. It will also help if you ever decide to decompose a monolith into microservices.
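
    A minimal sketch of what this looks like in practice, assuming the redis (v4+) and express npm packages; the cart key scheme is illustrative:

        const express = require('express');
        const { createClient } = require('redis');

        const app = express();
        const redis = createClient({ url: process.env.REDIS_URL });

        app.get('/cart', async (req, res) => {
          // Any server in the pool can answer this request, because the
          // state lives in Redis, not in this process's memory.
          const cart = await redis.get(`cart:${req.query.userId}`);
          res.json(JSON.parse(cart || '[]'));
        });

        redis.connect().then(() => app.listen(3000));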

    Static assets: Node.js in development, a CDN in production


    This is a mistake I wish companies would stop making. Serving your static assets from your web application (in particular, through something like webpack-dev-server or Parcel's dev server) is a great developer experience, because it shortens the iteration cycle while you write code. However, you should never serve your statics through Node.js in production. They should be shipped separately via a CDN, for example Azure CDN.

    Serving statics from Node.js is unnecessarily slow, since CDNs are widely distributed and therefore physically closer to the end user, and CDN servers are heavily optimized for serving small assets. Serving statics from Node is also unnecessarily expensive, since Node.js server time costs much more than CDN server time.
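
    One common pattern is to switch the asset host by environment, so the dev server is used locally and the CDN in production; the hostname below is purely illustrative:

        const ASSET_HOST =
          process.env.NODE_ENV === 'production'
            ? 'https://myapp.azureedge.net' // CDN in production (hypothetical URL)
            : ''; // same-origin dev server locally

        const bundleUrl = `${ASSET_HOST}/static/bundle.js`;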

    Start deploying early


    I don't know about you, but the first time I deploy something, it never works. Usually that is because I forgot to ship the correct secret keys or hardcoded a localhost path: small things that work locally but refuse to work remotely. These problems can accumulate, and what would have been easy to fix if caught early can turn into a huge pile of incomprehensible errors that is nearly impossible to untangle.
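
    A small habit that prevents much of this: never hardcode hosts or credentials; read them from the environment with local fallbacks. A sketch (the variable names are illustrative):

        const API_BASE = process.env.API_BASE || 'http://localhost:3000';
        const DATABASE_URL =
          process.env.DATABASE_URL || 'postgres://localhost:5432/dev';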

    Visual Studio Code, by the way, can help with this kind of problem: it lets you deploy your application directly to Azure with a single click, which is a fairly simple way to check for problems in a different environment early.

    Deploy to two servers from the start


    This advice comes from hard-won experience and a sea of heartache. The essence of it is that there is little difference between deploying to two servers and deploying to ten, and not much difference between ten servers and a hundred, but there is a huge difference between deploying to one server and deploying to two.
    As with stateless design, starting with two servers forces you to work through your scaling problems early, so that when a sudden spike in traffic arrives, you are ready to scale out.

    Do not be afraid of queues


    Modern databases handle a fair amount of reading and writing on their own, without any help from you. While you are still validating your idea, feel free to rely on your database to carry a small to medium load.

    Premature scaling is more likely to kill you than to save you. But at some point your application will grow to where reading from and writing to the database becomes a throughput problem. For some applications, with light write loads or a database such as Cassandra that handles massive scale on its own, that point will come later; for others it will come sooner.

    If this problem has already arrived, or soon will, you have options for which technology to adopt next. One of them is a message queue. The de facto standard at the moment is Apache Kafka, which organizes messages into topics that applications can then subscribe to. For example, an application listening to a specific topic can accumulate messages and then write the data to your database in batches, so the database is not hammered constantly. And Kafka runs easily on Azure.
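
    As a sketch of that batching pattern, assuming the kafkajs npm package (an assumption; the article does not prescribe a client); the topic, group, and db helper below are illustrative:

        const { Kafka } = require('kafkajs');

        const kafka = new Kafka({ clientId: 'batch-writer', brokers: ['broker:9092'] });
        const consumer = kafka.consumer({ groupId: 'db-writers' });

        async function run() {
          await consumer.connect();
          await consumer.subscribe({ topics: ['page-views'] });
          await consumer.run({
            // eachBatch delivers a whole batch of messages at once, so we can
            // turn many small events into one bulk insert.
            eachBatch: async ({ batch }) => {
              const rows = batch.messages.map((m) => JSON.parse(m.value.toString()));
              await db.insertMany(rows); // hypothetical bulk-insert helper
            },
          });
        }

        run().catch(console.error);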

    Microservices for scaling


    As your application grows, natural logical divisions begin to appear: one part of the application processes payments, while another serves the API for your frontend. When such logical divisions emerge, consider turning them into separate microservices. But be careful: adopting microservices also brings considerable complexity. Still, it can be worth it; for example, each microservice can expose its own metrics, and by watching them you can scale each service independently.

    Use containers


    Your application may work fine locally, but when you try to deploy it you can run into serious problems. To avoid them, you can use tools like Docker and Kubernetes: think of Docker as a mini-instance (a container) of Linux or Windows in which you run your application, and of Kubernetes as the tool that wires all your containers together in the cloud.
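
    A minimal Dockerfile sketch for a Node.js service; the Node version and start command are assumptions about your project:

        FROM node:18-alpine
        WORKDIR /app
        COPY package*.json ./
        RUN npm ci --omit=dev
        COPY . .
        EXPOSE 3000
        CMD ["node", "server.js"]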

    Kubernetes can be a complex beast, but it is a beast that solves a hard problem. If you are not an experienced DevOps sorcerer, it may be a struggle, so I recommend starting with Draft. If you know Yeoman from JavaScript projects, you can think of Draft as a similar tool for Kubernetes projects: it scaffolds your project for you. From there, you can use Helm to install the additional pieces of infrastructure your application needs (for example, nginx, more Node.js servers, MongoDB, Kafka, and so on); Helm is almost like npm for Kubernetes.

    Once you understand the Kubernetes ecosystem, working with the cloud will feel like child's play.

    Collect metrics


    If you cannot answer the question "How is my application doing?", you have problems, or soon will. Tracking various indicators over time lets you continuously improve your application, both in terms of cost and in terms of user experience, such as response time. You should definitely keep up with metrics like slow paths, page views, session length, and whatever other indicators matter to your business.
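
    For example, a response-time metric can be collected with a few lines of Express middleware; reportMetric below is a stand-in for whichever metrics client you actually use:

        const express = require('express');
        const app = express();

        app.use((req, res, next) => {
          const start = process.hrtime.bigint();
          res.on('finish', () => {
            // Elapsed time in milliseconds for this request
            const ms = Number(process.hrtime.bigint() - start) / 1e6;
            reportMetric('http.response_time_ms', ms, { path: req.path }); // hypothetical
          });
          next();
        });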

    There are many ways to collect these metrics. Services such as New Relic and AppDynamics will give you invaluable insight into how to improve your application.

    If you are working with Azure, Application Insights covers this need well and also connects easily to other tools, such as your CI/CD pipeline.

    CI/CD will save you from a lot of pain


    How many times have you botched a deployment over FTP and knocked your server down for a few minutes? It has certainly happened to me. You should never trust yourself to hand-deploy production code. One-click deployment from Visual Studio Code is pretty cool, but it is intended primarily for development and demonstration purposes. When you are ready to build a production-grade system, you should use continuous integration and continuous deployment (CI/CD).

    Continuous integration is the software development practice of merging working copies into a shared mainline several times a day and running frequent automated builds, so that potential defects and integration problems are found quickly.

    Continuous deployment takes code that has passed CI, runs the necessary build steps, containerizes or packages it, and ships it to your servers. A good practice is to have several levels of checks. The code might first go to an internal dev server, where you can look at it in a low-risk environment. From there it can move to a QA environment, where your QA engineers, or perhaps an external service, confirm that everything works as expected. Next comes a staging environment, still internal-only but running with production data and settings, so you can validate the application under production-like conditions before shipping it to production itself. Finally, you can designate a small group of canary servers for new code: only a small percentage of real traffic is routed to them, to make sure nothing breaks when real users hit the new code. If something breaks, you know where to look for the problem; if not, you can roll the release out from the small group to everyone.

    Many vendors and open-source projects address these needs. Jenkins, Travis, and CircleCI are great options for CI. Azure has its own CI/CD service called Azure Pipelines; it is quite intuitive to use and, again, connects easily to the rest of the Azure ecosystem.
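
    A minimal azure-pipelines.yml sketch for a Node.js project; the Node version and npm scripts are assumptions about your setup:

        trigger:
          - main

        pool:
          vmImage: 'ubuntu-latest'

        steps:
          - task: NodeTool@0
            inputs:
              versionSpec: '18.x'
          - script: npm ci
            displayName: 'Install dependencies'
          - script: npm test
            displayName: 'Run tests'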

    Keep secrets


    Any application inevitably has some secrets: keys and credential strings for databases, external services, and more. It would be very bad if these fell into the wrong hands, yet the application needs them to run. So what do we do? In development, we typically use a tool like dotenv to keep the configuration in a local file and read it through process.env in Node.js. That is great for development, but terrible for production.
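
    The development pattern looks something like this, assuming the dotenv npm package and a .env file kept out of version control:

        require('dotenv').config(); // loads .env into process.env

        const dbPassword = process.env.DB_PASSWORD; // never hardcode these
        const apiKey = process.env.API_KEY;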

    Instead, it is worth using a dedicated secret-management tool. Fortunately, Kubernetes has one built in, and it is pretty simple to use: you provide your secrets to Kubernetes outside the container, and it shares them with your application as environment variables, which makes them much harder to attack.
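
    A sketch of consuming a Kubernetes secret as an environment variable; the secret, key, image, and container names are all illustrative:

        apiVersion: v1
        kind: Pod
        metadata:
          name: web
        spec:
          containers:
            - name: app
              image: myapp:latest
              env:
                - name: DB_PASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: app-secrets
                      key: db-password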

    Another tool that deserves your attention is Azure Key Vault. What is great about Key Vault is that even though Microsoft cannot read your keys (only you can decrypt them), Azure still keeps audit logs and watches for any questionable use of your keys, so it can warn you of a possible compromise.

    Conclusion


    Node.js, like any other platform, needs to be scaled, and like any other platform it has its own scaling characteristics and challenges that you should know about and take into account when designing large projects.

    Original article: “Eleven Tips to Scale Node.js” (En).

    Please share your own tips for scaling Node.js in the comments. It will be interesting to hear them.
