Practical application of AST transformations, using Putout as an example

Tutorial

Introduction

Every day, on the way to implementing functionality that is useful to the user, we are forced to make changes to the code that are inevitable or simply desirable. This may be refactoring, updating a library or framework to a new major version, or adopting new JavaScript syntax (which has not been uncommon recently). Even if the library is part of the project you are working on, changes are inevitable. Most of these changes are routine. On the one hand, there is nothing interesting in them for the developer; on the other, they bring nothing to the business; and on top of that, you need to be very careful during the update not to break existing functionality. So we come to the conclusion that it is better to shift this routine onto programs, so that they do all the work themselves while a person only checks that everything was done correctly. This is what this article is about.

AST

To process code programmatically, it must first be translated into a representation that is convenient for programs to work with. Such a representation exists; it is called an Abstract Syntax Tree (AST).
Parsers are used to obtain it. The resulting AST can be transformed in any way you like, and then a code generator is needed to save the result. Let us consider each of these steps in more detail, starting with the parser.

Parser

Suppose we have the following code:

a + b

Usually parsers are divided into two parts:

Lexical analysis

Splits the code into tokens, each of which describes a part of the code:

[{
    "type": "Identifier",
    "value": "a"
}, {
    "type": "Punctuator",
    "value": "+"
}, {
    "type": "Identifier",
    "value": "b"
}]
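To make the lexical step more concrete, here is a minimal tokenizer sketch. It is not how any of the real parsers is implemented; the function name tokenize and the small set of supported characters are assumptions made purely for illustration.

```javascript
// Minimal tokenizer sketch (illustrative only): recognizes identifiers,
// integer literals and a handful of punctuators, and skips whitespace.
function tokenize(code) {
    const tokens = [];
    const pattern = /([A-Za-z_$][\w$]*)|(\d+)|([-+*\/()])|\s+|(.)/g;

    for (const [, identifier, numeric, punctuator, unknown] of code.matchAll(pattern)) {
        if (identifier)
            tokens.push({type: 'Identifier', value: identifier});
        else if (numeric)
            tokens.push({type: 'Numeric', value: numeric});
        else if (punctuator)
            tokens.push({type: 'Punctuator', value: punctuator});
        else if (unknown)
            throw new SyntaxError(`Unexpected character: ${unknown}`);
        // whitespace matches the pattern but produces no token
    }

    return tokens;
}

console.log(tokenize('a + b'));
```

For the input a + b the sketch produces the same three tokens as the listing above.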

Syntax analysis

Builds a syntax tree from the tokens:

{
    "type": "BinaryExpression",
    "left": {
        "type": "Identifier",
        "name": "a"
    },
    "operator": "+",
    "right": {
        "type": "Identifier",
        "name": "b"
    }
}
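The step from tokens to a tree can be sketched in the same spirit. The function below is a toy, written under the assumption that the input consists only of identifiers joined by + or -; the name parseExpression is illustrative and does not come from any real parser.

```javascript
// Minimal syntax-analysis sketch: turns a flat token list into an ESTree-shaped tree.
// Only identifiers joined by "+" or "-" are supported.
function parseExpression(tokens) {
    let position = 0;

    const parseIdentifier = () => {
        const token = tokens[position++];

        if (!token || token.type !== 'Identifier')
            throw new SyntaxError('Identifier expected');

        return {type: 'Identifier', name: token.value};
    };

    let left = parseIdentifier();

    while (position < tokens.length) {
        const {type, value} = tokens[position++];

        if (type !== 'Punctuator' || !['+', '-'].includes(value))
            throw new SyntaxError(`Unexpected token: ${value}`);

        // left-associative: "a + b - c" becomes ((a + b) - c)
        left = {
            type: 'BinaryExpression',
            left,
            operator: value,
            right: parseIdentifier(),
        };
    }

    return left;
}

const ast = parseExpression([
    {type: 'Identifier', value: 'a'},
    {type: 'Punctuator', value: '+'},
    {type: 'Identifier', value: 'b'},
]);

console.log(JSON.stringify(ast, null, 4));
```

Real parsers additionally handle operator precedence, statements, source locations and error recovery, which is why they are considerably larger.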

Now we have the very representation that can be worked with programmatically. Note that there are a large number of JavaScript parsers; here are some of them:

babel-parser - the parser used by babel;
espree - the parser used by eslint;
acorn - the parser on which the previous two are based;
esprima - a popular parser that supports JavaScript up to EcmaScript 2017;
cherow - a new player among JavaScript parsers, claiming to be the fastest.

There is a standard for JavaScript parsers called ESTree; it defines which nodes a parser should produce for each construct.
For a more detailed look at how a parser (as well as a transformer and a generator) is implemented, you can read super-tiny-compiler.

Transformer

To transform an AST, you can use the Visitor pattern, for example with the @babel/traverse library. The following code prints the names of all identifiers found in the JavaScript code stored in the variable code.

import * as parser from "@babel/parser";
import traverse from "@babel/traverse";
const code = `function square(n) {
    return n * n;
}`;
const ast = parser.parse(code);
traverse(ast, {
    Identifier(path) {
        console.log(path.node.name);
    }
});
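To show what the Visitor pattern does under the hood, here is a dependency-free sketch. The helper walk is an assumption made for illustration and is far simpler than @babel/traverse: it passes the node itself to the visitor rather than a path object, and it has no notion of scope.

```javascript
// Minimal Visitor sketch over a plain ESTree-shaped object.
function walk(node, visitor) {
    if (!node || typeof node.type !== 'string')
        return;

    // call the handler registered for this node type, if any
    const visit = visitor[node.type];

    if (visit)
        visit(node);

    // descend into every property that holds a node or an array of nodes
    for (const value of Object.values(node)) {
        const children = Array.isArray(value) ? value : [value];

        for (const child of children) {
            if (child && typeof child === 'object')
                walk(child, visitor);
        }
    }
}

const tree = {
    type: 'BinaryExpression',
    left: {type: 'Identifier', name: 'a'},
    operator: '+',
    right: {type: 'Identifier', name: 'b'},
};

const names = [];

walk(tree, {
    Identifier(node) {
        names.push(node.name);
    },
});

console.log(names); // [ 'a', 'b' ]
```

The visitor object only lists the node types it cares about, so adding a new kind of check does not require touching the traversal code.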

Generator

You can generate code, for example, with @babel/generator, like this:

import {parse} from '@babel/parser';
import generate from '@babel/generator';
const code = 'class Example {}';
const ast = parse(code);
// the second argument is the generator options, the third is the original source
const output = generate(ast, {}, code);
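The idea behind a generator can also be sketched without any dependencies. The function print below is a toy written for this article's a + b example, not part of any library; it supports only two node types and ignores operator precedence.

```javascript
// Minimal code-generation sketch: prints an ESTree-shaped tree back to source text.
// Real generators also handle precedence, comments, formatting and source maps.
function print(node) {
    switch (node.type) {
    case 'Identifier':
        return node.name;
    case 'BinaryExpression':
        return `${print(node.left)} ${node.operator} ${print(node.right)}`;
    default:
        throw new Error(`Unsupported node type: ${node.type}`);
    }
}

const tree = {
    type: 'BinaryExpression',
    left: {type: 'Identifier', name: 'a'},
    operator: '+',
    right: {type: 'Identifier', name: 'b'},
};

console.log(print(tree)); // a + b
```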

At this point, the reader should have a basic idea of what is needed to transform JavaScript code and which tools are used to do it.

It is also worth mentioning astexplorer, an online tool that brings together a large number of parsers, transformers and generators.

Putout

Putout is a code transformer with plugin support. In essence, it is a cross between eslint and babel, combining the advantages of both tools.

Like eslint, putout shows problem areas in the code, but unlike eslint, putout changes the behavior of the code, that is, it is able to fix all the errors it can find.

Like babel, putout converts the code, but it tries to change it as little as possible, so it can be used to work with code that is stored in a repository.

Another tool worth mentioning is prettier: it is a formatting tool, and it differs radically.

jscodeshift is not far from putout, but it does not support plugins, does not display error messages, and uses ast-types instead of @babel/types.

History of creation

In my work, eslint helps me a lot with its hints. But sometimes I want more from it: for example, to remove debugger, fix test.only, and also delete unused variables. The last item formed the basis of putout. During development it became clear that this task is far from simple and that many other transformations are much easier to implement. Thus, putout gradually grew from a single function into a plugin system. Removing unused variables is still the most complex transformation, but this does not prevent us from developing and maintaining many other equally useful ones.

How Putout Works from the Inside

The work of putout can be divided into two parts: the engine and the plugins. This architecture lets you avoid being distracted by transformations when working on the engine, and focus on a plugin's purpose when working on plugins.

Built-in plugins

putout is built on a plugin system. Each plugin represents one rule. Using the built-in rules, you can do the following:

Find and remove:
- unused variables
- debugger statements
- test.only calls
- test.skip calls
- console.log calls
- process.exit calls
- empty blocks
- empty patterns

Find and split variable declarations:

// before
var one, two;
// after
var one;
var two;

Convert esm to commonjs:

// before
import one from 'one';
// after
const one = require('one');

Apply destructuring:

// before
const name = user.name;
// after
const {name} = user;

Merge destructuring properties:

// before
const {name} = user;
const {password} = user;
// after
const {
    name,
    password
} = user;

Each plugin is built according to the Unix philosophy: plugins are as simple as possible and each performs a single action, which makes them easy to combine because, in essence, they are filters.

For example, having the following code:

const name = user.name;
const password = user.password;

It is first converted with apply-destructuring into:

const {name} = user;
const {password} = user;

After that, merge-destructuring-properties converts it to:

const {
    name,
    password
} = user;

Thus, plugins can work both separately and together. When creating your own plugins, it is recommended to follow this rule: implement a plugin with minimal functionality that does only what you need, and let the built-in and user plugins take care of the rest.
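The filter idea can be illustrated with a small sketch. It is not the code of the real apply-destructuring and merge-destructuring-properties plugins, which work with Babel paths; the helpers declare, identifier, member, pattern, applyDestructuring and mergeDestructuring are hypothetical and cover only the single case from the example, working directly on ESTree-shaped objects.

```javascript
// Sketch only: two tiny "filters" over a list of ESTree-shaped declarations.
const identifier = (name) => ({type: 'Identifier', name});
const declare = (id, init) => ({
    type: 'VariableDeclaration',
    kind: 'const',
    declarations: [{type: 'VariableDeclarator', id, init}],
});
const member = (object, property) => ({
    type: 'MemberExpression',
    object: identifier(object),
    property: identifier(property),
    computed: false,
});
const pattern = (name) => ({
    type: 'ObjectPattern',
    properties: [{
        type: 'Property',
        key: identifier(name),
        value: identifier(name),
        shorthand: true,
    }],
});

// const name = user.name  ->  const {name} = user
function applyDestructuring(body) {
    return body.map((node) => {
        const [{id, init}] = node.declarations;
        const isSameName = id.type === 'Identifier'
            && init
            && init.type === 'MemberExpression'
            && !init.computed
            && init.property.name === id.name;

        return isSameName ? declare(pattern(id.name), init.object) : node;
    });
}

// const {name} = user; const {password} = user  ->  const {name, password} = user
// Simplification: declarations are merged regardless of what stands between them.
function mergeDestructuring(body) {
    const result = [];
    const seen = new Map();

    for (const node of body) {
        const [{id, init}] = node.declarations;
        const canMerge = id.type === 'ObjectPattern' && init && init.type === 'Identifier';
        const first = canMerge && seen.get(init.name);

        if (first) {
            first.declarations[0].id.properties.push(...id.properties);
            continue;
        }

        if (canMerge)
            seen.set(init.name, node);

        result.push(node);
    }

    return result;
}

const body = [
    declare(identifier('name'), member('user', 'name')),
    declare(identifier('password'), member('user', 'password')),
];

// each filter can run alone, or they can be chained
const merged = mergeDestructuring(applyDestructuring(body));
console.log(merged.length); // 1
```

Because each function takes a list of declarations and returns a list of declarations, either one can be used on its own, and the second one benefits from whatever the first one has produced.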

Usage example

Now that we are familiar with the built-in rules, let's look at an example of using putout.
Create a file example.js with the following contents:

const x = 1, y = 2;
const name = user.name;
const password = user.password;
console.log(name, password);

Now run putout, passing example.js as an argument:

coderaiser@cloudcmd:~/example$ putout example.js
/home/coderaiser/example/example.js
 1:6   error   "x" is defined but never used            remove-unused-variables
 1:13  error   "y" is defined but never used            remove-unused-variables
 6:0   error   Unexpected "console" call                remove-console
 1:0   error   variables should be declared separately  split-variable-declarations
 3:6   error   Object destructuring should be used      apply-destructuring
 4:6   error   Object destructuring should be used      apply-destructuring
 6 errors in 1 files
  fixable with the `--fix` option

We get a report with 6 errors of the kinds discussed above. Now let's fix them and see what happens:

coderaiser@cloudcmd:~/example$ putout example.js --fix
coderaiser@cloudcmd:~/example$ cat example.js
const {
  name,
  password
} = user;

As a result of the fix, the unused variables and the console.log call were removed, and destructuring was applied.

Settings

The default settings do not always suit everyone, so putout supports a configuration file, .putout.json, which consists of the following sections:

Rules
Ignore
Match
Plugins

Rules

The rules section contains the set of rules. By default, the rules are configured as follows:

{
    "rules": {
        "remove-unused-variables": true,
        "remove-debugger": true,
        "remove-only": true,
        "remove-skip": true,
        "remove-process-exit": false,
        "remove-console": true,
        "split-variable-declarations": true,
        "remove-empty": true,
        "remove-empty-pattern": true,
        "convert-esm-to-commonjs": false,
        "apply-destructuring": true,
        "merge-destructuring-properties": true
    }
}

To enable remove-process-exit, it is enough to set it to true in the .putout.json file:

{
    "rules": {
        "remove-process-exit": true
    }
}

This is enough to report all process.exit calls found in the code, and to delete them if the --fix option is used.

Ignore

If you need to exclude some folders from processing, it is enough to add an ignore section:

{
    "ignore": [
        "test/fixture"
    ]
}

Match

If you need a more fine-grained system of rules, for example, to enable remove-process-exit for the bin directory, it is enough to use the match section:

{
    "match": {
        "bin": {
            "remove-process-exit": true
        }
    }
}
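One way to picture how match overrides are layered on top of rules is the sketch below. It is hypothetical: the function getRulesForFile is not part of putout, and it assumes that a key such as bin applies to files located inside a directory with that name; the real matching logic of putout may differ.

```javascript
// Hypothetical sketch of layering "match" overrides on top of "rules".
// Assumption: a key applies when it equals one of the directory names in the file path.
function getRulesForFile(filePath, {rules = {}, match = {}}) {
    const directories = filePath.split('/').slice(0, -1);
    const effective = {...rules};

    for (const [directory, overrides] of Object.entries(match)) {
        if (directories.includes(directory))
            Object.assign(effective, overrides);
    }

    return effective;
}

const config = {
    rules: {
        'remove-process-exit': false,
    },
    match: {
        bin: {
            'remove-process-exit': true,
        },
    },
};

console.log(getRulesForFile('bin/cli.js', config)); // { 'remove-process-exit': true }
console.log(getRulesForFile('lib/index.js', config)); // { 'remove-process-exit': false }
```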

Plugins

Plugins that are not built in and have the putout-plugin- prefix must be listed in the plugins section before they are enabled in the rules section. For example, to connect the plugin putout-plugin-add-hello-world and enable the rule add-hello-world, it is enough to specify:

{
    "rules": {
        "add-hello-world": true
    },
    "plugins": [
        "add-hello-world"
    ]
}

Putout engine

The putout engine is a command-line tool that reads the settings, parses files, loads and runs plugins, and then writes the result of the plugins' work.

It uses the recast library, which helps accomplish a very important task: after parsing and transformation, the code is reassembled in a state as close as possible to the original.

For parsing, an ESTree-compatible parser is used (currently babel with the estree plugin, but this may change in the future), and babel tools are used for transformation. Why babel? It's simple: it is a very popular product, much more popular than other similar tools, and it develops much faster. Every new proposal for the EcmaScript standard comes with a babel plugin. Babel also has a book, the Babel Handbook, which describes quite well all the features and tools for traversing and transforming an AST.

Own plugin for Putout

The putout plugin system is quite simple and very similar to eslint plugins as well as babel plugins. However, instead of one function, a putout plugin must export three. This is done to increase code reuse: duplicating functionality across 3 functions is not very convenient, so it is much easier to move it into separate functions and simply call them in the right places.

Plugin structure

So a putout plugin consists of 3 functions:

report - returns the message;
find - looks for places with errors and returns them;
fix - fixes these places.

The main thing to remember when creating a plugin for putout is its name: it must start with putout-plugin-. This can be followed by the name of the operation the plugin performs; for example, the remove-wrong plugin must be named putout-plugin-remove-wrong.

You should also add the words putout and putout-plugin to the keywords section of package.json, and specify "putout": ">=3.10" in peerDependencies, or whichever version is the latest at the time you write the plugin.

Example plugin for Putout

As an example, let's write a plugin that removes debugger statements from the code. Such a plugin already exists: it is @putout/plugin-remove-debugger, and it is simple enough to examine right now.

It looks like this:

// return the message that corresponds to each of the found nodes
module.exports.report = () => 'Unexpected "debugger" statement';
// in this function we look for nodes containing debugger using the Visitor pattern
module.exports.find = (ast, {traverse}) => {
    const places = [];
    traverse(ast, {
        DebuggerStatement(path) {
            places.push(path);
        }
    });
    return places;
};
// remove the code found in the previous step
module.exports.fix = (path) => {
    path.remove();
};

If the remove-debugger rule is enabled in .putout.json, the @putout/plugin-remove-debugger plugin will be loaded. First, the find function is called; with the help of the traverse function it walks the nodes of the AST and saves all the places it needs.

Next, putout calls report to get the message to display.

If the --fix flag is used, the plugin's fix function is called and the transformation is performed; in this case, the node is removed.
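The order of calls described above can be sketched with a toy engine. Everything except the three plugin functions is an assumption made for illustration: runPlugin, the simplified traverse and the path objects it creates are not putout's real internals, which work with Babel paths and do much more.

```javascript
// Sketch of the find -> report -> fix flow with a toy engine standing in for putout.
const plugin = {
    report: () => 'Unexpected "debugger" statement',
    find: (ast, {traverse}) => {
        const places = [];

        traverse(ast, {
            DebuggerStatement(path) {
                places.push(path);
            },
        });

        return places;
    },
    fix: (path) => {
        path.remove();
    },
};

// toy traverse: visits only the top-level statements of a Program
function traverse(ast, visitor) {
    // iterate over a copy so that removing a node during traversal is safe
    for (const node of [...ast.body]) {
        const visit = visitor[node.type];

        if (!visit)
            continue;

        visit({
            node,
            remove: () => {
                ast.body.splice(ast.body.indexOf(node), 1);
            },
        });
    }
}

// toy engine: collect places, build messages, apply fixes only when asked to
function runPlugin(ast, {report, find, fix}, {isFix = false} = {}) {
    const places = find(ast, {traverse});
    const messages = places.map((path) => report(path));

    if (isFix)
        places.forEach((path) => fix(path));

    return messages;
}

const ast = {
    type: 'Program',
    body: [
        {type: 'DebuggerStatement'},
        {type: 'ExpressionStatement', expression: {type: 'Identifier', name: 'a'}},
    ],
};

const messages = runPlugin(ast, plugin, {isFix: true});

console.log(messages); // [ 'Unexpected "debugger" statement' ]
console.log(ast.body.length); // 1
```

Keeping find separate from fix is what allows the same plugin to be used both for reporting and for fixing.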

Sample plugin test

To simplify plugin testing, the @putout/test tool was written. At its core, it is nothing more than a wrapper over tape, with several methods added for convenient and simple testing.

A test for the remove-debugger plugin may look like this:

const removeDebugger = require('..');
const test = require('@putout/test')(__dirname, {
    'remove-debugger': removeDebugger,
});
// check that the message is exactly as expected
test('remove debugger: report', (t) => {
    t.reportCode('debugger', 'Unexpected "debugger" statement');
    t.end();
});
// check the result of the transformation
test('remove debugger: transformCode', (t) => {
    t.transformCode('debugger', '');
    t.end();
});

Codemods

Not every transformation needs to be used every day. For one-time transformations, it is enough to do everything the same way, except that instead of publishing to npm you place the plugin in the ~/.putout folder. At startup, putout looks into this folder, picks up the transformations and runs them.

Here is an example of a transformation that replaces requiring tape and try-to-tape with a supertape call: convert-tape-to-supertape.

eslint-plugin-putout

Finally, one more thing is worth adding: putout tries to change the code as little as possible, but if some formatting rules do get broken, eslint --fix is always ready to help, and there is a special plugin for this purpose, eslint-plugin-putout. It can smooth over many formatting errors, and of course it can be configured to match the preferences of the developers on a particular project. Connecting it is easy:

{
    "extends": [
        "plugin:putout/recommended"
    ],
    "plugins": [
        "putout"
    ]
}

So far it contains only one rule, one-line-destructuring, which does the following:

// before
const {
    one
} = hello;
// after
const {one} = hello;

It also enables many eslint rules, which you can read about in more detail.

Conclusion

I want to thank the reader for their attention to this text. I sincerely hope that the topic of AST transformations will become more popular, and that articles about this fascinating process will appear more often. I would be very grateful for any comments and suggestions about the future direction of putout's development. Create an issue, send a pull request, test it, write about the rules you would like to see and how you would like to transform your code programmatically. We will work together to improve this AST transformation tool.
