Writing a custom AST transformer in TypeScript

Original author: Kevin Saldaña

The TestMace team is back. This time we are publishing a translation of an article on transforming TypeScript code with the compiler. Enjoy reading!


Introduction


This is my first post, and in it I would like to show how I solved one problem using the TypeScript compiler API. To find this solution, I spent a long time digging through blogs and StackOverflow answers, so to spare you the same fate, I will share everything I learned about this powerful but poorly documented toolbox.


Key concepts


The basics of TypeScript compiler API (parser terminology, transformation API, layered architecture), abstract syntax tree (AST), Visitor design pattern, code generation.


Small recommendation


If this is your first time hearing about the AST concept, I highly recommend reading this article by @Vaidehi Joshi. Her entire basecs series turned out great; you'll love it.


Task description


At Avero, we use GraphQL and wanted to add type safety to our resolvers. At some point I came across graphqlgen, and it solved many of my problems around the concept of models in GraphQL. I will not go into that here; I plan to write a separate article about it. In short, models describe the return values of resolvers, and in graphqlgen these models are mapped to interfaces through a configuration (a YAML or TypeScript file with type declarations).


In production we run gRPC microservices, and GraphQL mostly serves as a facade. We already publish TypeScript interfaces that match our proto contracts, and I wanted to use these types as models, but I ran into problems caused by limited support for exported types and by the way our interfaces are declared (heavily nested namespaces and a large number of references).


Following good open source etiquette, my first step was to build on what already existed in the graphqlgen repository and make a meaningful contribution. To implement its introspection mechanism, graphqlgen uses @babel/parser to read a file and collect information about interface names and declarations (interface fields).


Whenever I need to do something with an AST, I first open astexplorer.net and only then start working. This tool lets you inspect the AST produced by various parsers, including @babel/parser and the TypeScript compiler parser. With astexplorer.net, you can visualize the data structures you will be working with and get familiar with the AST node types of each parser.
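For readers who prefer a terminal to a browser, a rough text version of such a tree can be printed with the TypeScript parser alone. This is only a minimal sketch, assuming the typescript package is installed; the helper name dumpTree is hypothetical and not part of the original article. Note that a few node kinds may print under an alias name, because several members of the ts.SyntaxKind enum share the same numeric value.

```typescript
import * as ts from "typescript";

// Parse a snippet without creating a Program; no type information is available here.
const exampleFile = ts.createSourceFile(
  "example.ts",
  "import { protos } from 'my_company_protos'\n\nexport type User = protos.user.User;",
  ts.ScriptTarget.Latest
);

// Hypothetical helper: collect one line per node, indented by its depth in the tree.
function dumpTree(node: ts.Node, depth: number = 0, lines: string[] = []): string[] {
  const indent = new Array(depth + 1).join("  ");
  lines.push(indent + ts.SyntaxKind[node.kind]);
  // The callback must not return a truthy value, otherwise forEachChild stops early.
  ts.forEachChild(node, child => {
    dumpTree(child, depth + 1, lines);
  });
  return lines;
}

const treeLines = dumpTree(exampleFile);
console.log(treeLines.join("\n"));
```

The first line of the output is the SourceFile root, and its direct children include the ImportDeclaration and TypeAliasDeclaration nodes.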


Take a look at the example of the source data file and the AST created on its basis using babel-parser:


example.ts
import { protos } from 'my_company_protos'

export type User = protos.user.User;

ast.json
{
  "type": "Program",
  "start": 0,
  "end": 80,
  "loc": {
    "start": {
      "line": 1,
      "column": 0
    },
    "end": {
      "line": 3,
      "column": 36
    }
  },
  "comments": [],
  "range": [
    0,
    80
  ],
  "sourceType": "module",
  "body": [
    {
      "type": "ImportDeclaration",
      "start": 0,
      "end": 42,
      "loc": {
        "start": {
          "line": 1,
          "column": 0
        },
        "end": {
          "line": 1,
          "column": 42
        }
      },
      "specifiers": [
        {
          "type": "ImportSpecifier",
          "start": 9,
          "end": 15,
          "loc": {
            "start": {
              "line": 1,
              "column": 9
            },
            "end": {
              "line": 1,
              "column": 15
            }
          },
          "imported": {
            "type": "Identifier",
            "start": 9,
            "end": 15,
            "loc": {
              "start": {
                "line": 1,
                "column": 9
              },
              "end": {
                "line": 1,
                "column": 15
              },
              "identifierName": "protos"
            },
            "name": "protos",
            "range": [
              9,
              15
            ],
            "_babelType": "Identifier"
          },
          "importKind": null,
          "local": {
            "type": "Identifier",
            "start": 9,
            "end": 15,
            "loc": {
              "start": {
                "line": 1,
                "column": 9
              },
              "end": {
                "line": 1,
                "column": 15
              },
              "identifierName": "protos"
            },
            "name": "protos",
            "range": [
              9,
              15
            ],
            "_babelType": "Identifier"
          },
          "range": [
            9,
            15
          ],
          "_babelType": "ImportSpecifier"
        }
      ],
      "importKind": "value",
      "source": {
        "type": "Literal",
        "start": 23,
        "end": 42,
        "loc": {
          "start": {
            "line": 1,
            "column": 23
          },
          "end": {
            "line": 1,
            "column": 42
          }
        },
        "extra": {
          "rawValue": "my_company_protos",
          "raw": "'my_company_protos'"
        },
        "value": "my_company_protos",
        "range": [
          23,
          42
        ],
        "_babelType": "StringLiteral",
        "raw": "'my_company_protos'"
      },
      "range": [
        0,
        42
      ],
      "_babelType": "ImportDeclaration"
    },
    {
      "type": "ExportNamedDeclaration",
      "start": 44,
      "end": 80,
      "loc": {
        "start": {
          "line": 3,
          "column": 0
        },
        "end": {
          "line": 3,
          "column": 36
        }
      },
      "specifiers": [],
      "source": null,
      "exportKind": "type",
      "declaration": {
        "type": "TypeAlias",
        "start": 51,
        "end": 80,
        "loc": {
          "start": {
            "line": 3,
            "column": 7
          },
          "end": {
            "line": 3,
            "column": 36
          }
        },
        "id": {
          "type": "Identifier",
          "start": 56,
          "end": 60,
          "loc": {
            "start": {
              "line": 3,
              "column": 12
            },
            "end": {
              "line": 3,
              "column": 16
            },
            "identifierName": "User"
          },
          "name": "User",
          "range": [
            56,
            60
          ],
          "_babelType": "Identifier"
        },
        "typeParameters": null,
        "right": {
          "type": "GenericTypeAnnotation",
          "start": 63,
          "end": 79,
          "loc": {
            "start": {
              "line": 3,
              "column": 19
            },
            "end": {
              "line": 3,
              "column": 35
            }
          },
          "typeParameters": null,
          "id": {
            "type": "QualifiedTypeIdentifier",
            "start": 63,
            "end": 79,
            "loc": {
              "start": {
                "line": 3,
                "column": 19
              },
              "end": {
                "line": 3,
                "column": 35
              }
            },
            "qualification": {
              "type": "QualifiedTypeIdentifier",
              "start": 63,
              "end": 74,
              "loc": {
                "start": {
                  "line": 3,
                  "column": 19
                },
                "end": {
                  "line": 3,
                  "column": 30
                }
              },
              "qualification": {
                "type": "Identifier",
                "start": 63,
                "end": 69,
                "loc": {
                  "start": {
                    "line": 3,
                    "column": 19
                  },
                  "end": {
                    "line": 3,
                    "column": 25
                  },
                  "identifierName": "protos"
                },
                "name": "protos",
                "range": [
                  63,
                  69
                ],
                "_babelType": "Identifier"
              },
              "range": [
                63,
                74
              ],
              "_babelType": "QualifiedTypeIdentifier"
            },
            "range": [
              63,
              79
            ],
            "_babelType": "QualifiedTypeIdentifier"
          },
          "range": [
            63,
            79
          ],
          "_babelType": "GenericTypeAnnotation"
        },
        "range": [
          51,
          80
        ],
        "_babelType": "TypeAlias"
      },
      "range": [
        44,
        80
      ],
      "_babelType": "ExportNamedDeclaration"
    }
  ]
}

The root of the tree (a node of type Program) contains two statements in its body: ImportDeclaration and ExportNamedDeclaration.


In ImportDeclaration, we are particularly interested in two properties, source and specifiers, which contain information about the source text. In our case, for example, the value of source is my_company_protos. From this value alone it is impossible to tell whether it is a relative path to a file or a reference to an external module. Working that out is a task we would have to solve ourselves with the parser approach.


Similarly, ExportNamedDeclaration contains source text information. Namespaces only complicate this structure by adding arbitrary nesting, so more and more QualifiedTypeIdentifier nodes appear. This is another task we would have to solve with the parser approach.
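To give a feel for that nesting, here is a minimal sketch that flattens a nested type name into its segments. It uses the TypeScript parser rather than Babel, where the analogous node is a QualifiedName; the helper name entityNameToSegments is hypothetical, and the sketch assumes the typescript package is installed.

```typescript
import * as ts from "typescript";

// Hypothetical helper: flatten a (possibly nested) name such as protos.user.User.
function entityNameToSegments(name: ts.EntityName): string[] {
  if (ts.isIdentifier(name)) {
    return [name.text];
  }
  // QualifiedName: `left` is itself an EntityName, `right` is always an Identifier.
  return entityNameToSegments(name.left).concat(name.right.text);
}

const aliasSource = ts.createSourceFile(
  "example.ts",
  "export type User = protos.user.User;",
  ts.ScriptTarget.Latest
);

// Map each type alias name to the segments of the type it refers to.
const segmentsByAlias: { [alias: string]: string[] } = {};
aliasSource.statements.forEach(statement => {
  if (ts.isTypeAliasDeclaration(statement) && ts.isTypeReferenceNode(statement.type)) {
    segmentsByAlias[statement.name.text] = entityNameToSegments(statement.type.typeName);
  }
});

console.log(segmentsByAlias); // { User: [ 'protos', 'user', 'User' ] }
```

Even with such a helper, the segments are only names: nothing here tells us what protos.user.User actually contains.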


And I haven't even reached resolving types from imports yet! Since the parser and the AST provide only a limited amount of information about the source text by default, adding this information to the final tree would require parsing every imported file. But each of those files can have its own imports!


It seems that solving these tasks with the parser alone would take far too much code. Let's take a step back and think again.


Imports are not important to us, and neither is the file structure. We want to resolve all the properties of the type protos.user.User and inline them instead of using import references. So where do we get the type information needed to create a new file?


TypeChecker


Since we found that the solution with the parser is not suitable for obtaining information about the types of imported interfaces, let's look at the process of compiling TypeScript and try to find another way out.


Here's what immediately comes to mind:


TypeChecker is the core of the TypeScript type system, and it can be created from a Program instance. It is responsible for figuring out the relationships between symbols from different files, assigning types to symbols, and generating semantic diagnostics (that is, errors).
The first thing TypeChecker does is consolidate all the symbols from different source files into a single view and build a single symbol table, merging any common symbols (for example, namespaces that span several files).
After initializing its original state, TypeChecker is ready to answer any questions about the program. Such questions might be:
What is the symbol for this node?
What is the type of this symbol?
What symbols are visible in this portion of the AST?
What signatures are available for a function declaration?
What errors should be reported for a file?

TypeChecker is exactly what we need! With access to the symbol table and the API, we can answer the first two questions: what is the symbol for this node, and what is the type of this symbol? Because it merges all common symbols, TypeChecker even solves the problem of heavily nested namespaces mentioned earlier!


So how do you get to this API?


Here is one example that I managed to find online. It shows that TypeChecker can be obtained through a method on the Program instance. The checker has two interesting methods, checker.getSymbolAtLocation and checker.getTypeOfSymbolAtLocation, which look very similar to what we are looking for.


Let's start working on the code.


models.ts

import { protos } from './my_company_protos'
export type User = protos.user.User;

my_company_protos.ts

export namespace protos {
  export namespace user {
    export interface User {
      username: string;
      info: protos.Info.User;
    }
  }
  export namespace Info {
    export interface User {
      name: protos.Info.Name;
    }
    export interface Name {
      firstName: string;
      lastName: string;
    }
  }
}

ts-alias.ts
import ts from "typescript";
// hardcode our input file
const filePath = "./src/models.ts";
// create a program instance, which is a collection of source files
// in this case we only have one source file
const program = ts.createProgram([filePath], {});
// pull off the typechecker instance from our program
const checker = program.getTypeChecker();
// get our models.ts source file AST
const source = program.getSourceFile(filePath);
// create TS printer instance which gives us utilities to pretty print our final AST
const printer = ts.createPrinter();
// helper to give us Node string type given kind
const syntaxToKind = (kind: ts.Node["kind"]) => {
  return ts.SyntaxKind[kind];
};
// visit each node in the root AST and log its kind
ts.forEachChild(source, node => {
  console.log(syntaxToKind(node.kind));
});

$ ts-node ./src/ts-alias.ts
prints
ImportDeclaration
TypeAliasDeclaration
EndOfFileToken

We are only interested in the type alias declaration, so let's rewrite the code a bit:


kind-printer.ts
ts.forEachChild(source, node => {
  if (ts.isTypeAliasDeclaration(node)) {
    // node.kind is a number, so convert it to its name before logging
    console.log(syntaxToKind(node.kind));
  }
})
// prints TypeAliasDeclaration

TypeScript provides a type guard for each kind of node, which lets you narrow a node to its exact type.
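As a rough illustration of what such a guard gives you, the sketch below collects the names of all type aliases in a file. It assumes the typescript package is installed; the variable names are made up for this example.

```typescript
import * as ts from "typescript";

const guardSource = ts.createSourceFile(
  "models.ts",
  "import { protos } from './my_company_protos';\nexport type User = protos.user.User;",
  ts.ScriptTarget.Latest
);

const aliasNames: string[] = [];
ts.forEachChild(guardSource, node => {
  // Here `node` is a plain ts.Node, so `node.name` would not type-check.
  if (ts.isTypeAliasDeclaration(node)) {
    // Inside the guard `node` is a ts.TypeAliasDeclaration: `name` and `type` are available.
    aliasNames.push(node.name.text);
  }
});

console.log(aliasNames); // [ 'User' ]
```

The same pattern applies to the other guards used in this article, such as ts.isTypeReferenceNode and ts.isImportDeclaration.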



Now back to the two questions that were posed earlier: Which symbol corresponds to this node? What type of symbol is this?


In checker-example.ts below, we retrieve the names of the properties declared by the interface behind the type alias by querying the TypeChecker symbol table. We are still at the very beginning of the journey, but this is a good starting point for introspection.


checker-example.ts
ts.forEachChild(source, node => {
  if (ts.isTypeAliasDeclaration(node)) {
    const symbol = checker.getSymbolAtLocation(node.name);
    const type = checker.getDeclaredTypeOfSymbol(symbol);
    const properties = checker.getPropertiesOfType(type);
    properties.forEach(declaration => {
      console.log(declaration.name);
      // prints username, info
    });
  }
});
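Going one step further, the checker can also report the type of each property through checker.getTypeOfSymbolAtLocation, the second method mentioned earlier. The sketch below is a self-contained variation that keeps a single source file in memory instead of reading ./src/models.ts from disk. The virtual file name, the host override, and the variable names are assumptions made for this example, and the typescript package is assumed to be installed.

```typescript
import * as ts from "typescript";

// Hypothetical in-memory file: the namespace and the alias live together for brevity.
const virtualFileName = "/models.ts";
const virtualFileText = `
export namespace protos {
  export namespace user {
    export interface User { username: string; info: protos.Info.User; }
  }
  export namespace Info {
    export interface User { name: string; }
  }
}
export type User = protos.user.User;
`;

const options: ts.CompilerOptions = { types: [] };
const host = ts.createCompilerHost(options);
const readFromDisk = host.getSourceFile;
// Serve the virtual file from memory; everything else (lib files) still comes from disk.
host.getSourceFile = (fileName, languageVersion, onError, shouldCreateNewSourceFile) =>
  fileName === virtualFileName
    ? ts.createSourceFile(fileName, virtualFileText, languageVersion)
    : readFromDisk(fileName, languageVersion, onError, shouldCreateNewSourceFile);

const program = ts.createProgram([virtualFileName], options, host);
const checker = program.getTypeChecker();
const modelsFile = program.getSourceFile(virtualFileName);

// Map each property of the aliased type to a printable version of its type.
const propertyTypes: { [name: string]: string } = {};
if (modelsFile) {
  ts.forEachChild(modelsFile, node => {
    if (!ts.isTypeAliasDeclaration(node)) return;
    const symbol = checker.getSymbolAtLocation(node.name);
    if (!symbol) return;
    const type = checker.getDeclaredTypeOfSymbol(symbol);
    checker.getPropertiesOfType(type).forEach(property => {
      const propertyType = checker.getTypeOfSymbolAtLocation(property, node);
      propertyTypes[property.name] = checker.typeToString(propertyType);
    });
  });
}

console.log(propertyTypes);
```

The username property is reported as string, while info is reported as a reference to another interface, which is exactly the kind of reference the transformer will later have to expand.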

Now let's think about code generation.


Transformation API


As stated earlier, our goal is to parse and introspect a TypeScript source file and create a new file. AST-to-AST conversion is so common that the TypeScript team even came up with an API for creating custom transformers!


Before moving on to the main task, let's try writing a simple transformer. Special thanks to James Garbutt for the original template for it.


Let's make the transformer replace numeric literals with string literals.


number-transformer.ts

import ts from "typescript";

const source = `
  const two = 2;
  const four = 4;
`;
function numberTransformer(): ts.TransformerFactory<ts.SourceFile> {
  return context => {
    const visit: ts.Visitor = node => {
      if (ts.isNumericLiteral(node)) {
        // replace the numeric literal with a string literal holding the same text
        // (on TypeScript 4.0+ this function is exposed as ts.factory.createStringLiteral)
        return ts.createStringLiteral(node.text);
      }
      // not a numeric literal: keep the node and visit its children
      return ts.visitEachChild(node, child => visit(child), context);
    };
    return node => ts.visitNode(node, visit);
  };
}
let result = ts.transpileModule(source, {
  compilerOptions: { module: ts.ModuleKind.CommonJS },
  transformers: { before: [numberTransformer()] }
});
console.log(result.outputText);
/*
  var two = "2";
  var four = "4";
*/

Its most important part is the Visitor and VisitResult types:


type Visitor = (node: Node) => VisitResult<Node>;
type VisitResult<T extends Node> = T | T[] | undefined;

The main goal when writing a transformer is to write a Visitor. Logically, you need to implement a recursive traversal of every AST node and return a VisitResult (one, several, or zero AST nodes). You can set up the transformer so that only selected nodes are changed.
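The three shapes of a VisitResult can be sketched in one small transformer. This is only an illustration under stated assumptions: it uses the ts.factory API, so it needs TypeScript 4.0 or later, and the transformer name visitResultDemo and the input snippet are made up for this example.

```typescript
import * as ts from "typescript";

const input = ts.createSourceFile(
  "input.ts",
  "debugger;\nconst answer = 42;\nlog(answer);",
  ts.ScriptTarget.Latest
);

const visitResultDemo: ts.TransformerFactory<ts.SourceFile> = context => {
  const visit: ts.Visitor = node => {
    // Zero nodes: returning undefined removes the statement from its parent.
    if (ts.isDebuggerStatement(node)) {
      return undefined;
    }
    // Several nodes: returning an array replaces one statement with two.
    if (ts.isExpressionStatement(node)) {
      return [node, ts.factory.createEmptyStatement()];
    }
    // One node: visit the children and return the (possibly updated) node.
    return ts.visitEachChild(node, visit, context);
  };
  return sourceFile => ts.visitEachChild(sourceFile, visit, context);
};

const transformation = ts.transform(input, [visitResultDemo]);
const output = ts.createPrinter().printFile(transformation.transformed[0]);
transformation.dispose();

console.log(output);
```

The printed result no longer contains the debugger statement, and the call to log is followed by an extra empty statement.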


input-output.ts
// input
export namespace protos { // ModuleDeclaration
  export namespace user { // ModuleDeclaration
    // Module Block
    export interface User { // InterfaceDeclaration
      username: string; // username: string is PropertySignature
      info: protos.Info.User; // TypeReference
    }
  }
  export namespace Info {
    export interface User {
      name: protos.Info.Name; // TypeReference
    }
    export interface Name {
      firstName: string;
      lastName: string;
    }
  }
}
// this line is a TypeAliasDeclaration
export type User = protos.user.User; // protos.user.User is a TypeReference
// output
export interface User {
  username: string;
  info: { // info: { .. } is a TypeLiteral
    name: { // name: { .. } is a TypeLiteral
      firstName: string; 
      lastName: string;
    }
  }
}

Here you can see exactly which nodes we will be working with.


The Visitor has to perform two main actions:


  1. Replace TypeAliasDeclarations with InterfaceDeclarations
  2. Convert TypeReferences into TypeLiterals

Solution


Here is what the Visitor code looks like:


aliasTransformer.ts
import path from 'path';
import ts from 'typescript';
import _ from 'lodash';
import fs from 'fs';
const filePath = path.resolve(_.first(process.argv.slice(2)));
const program = ts.createProgram([filePath], {});
const checker = program.getTypeChecker();
const source = program.getSourceFile(filePath);
const printer = ts.createPrinter();
const typeAliasToInterfaceTransformer: ts.TransformerFactory<ts.SourceFile> = context => {
  const visit: ts.Visitor = node => {
    node = ts.visitEachChild(node, visit, context);
    /*
      Convert type references to type literals
        interface IUser {
          username: string
        }
        type User = IUser <--- IUser is a type reference
        interface Context {
          user: User <--- User is a type reference
        }
      In both cases we want to convert the type reference to
      its primitive literals. We want:
        interface IUser {
          username: string
        }
        type User = {
          username: string
        }
        interface Context {
          user: {
            username: string
          }
        }
    */
    if (ts.isTypeReferenceNode(node)) {
      const symbol = checker.getSymbolAtLocation(node.typeName);
      const type = checker.getDeclaredTypeOfSymbol(symbol);
      const declarations = _.flatMap(checker.getPropertiesOfType(type), property => {
        /*
          Type references declarations may themselves have type references, so we need
          to resolve those literals as well 
        */
        return _.map(property.declarations, visit);
      });
      return ts.createTypeLiteralNode(declarations.filter(ts.isTypeElement));
    }
    /* 
      Convert type alias to interface declaration
        interface IUser {
          username: string
        }
        type User = IUser
      We want to remove all type aliases
        interface IUser {
          username: string
        }
        interface User {
          username: string  <-- Also need to resolve IUser
        }
    */
    if (ts.isTypeAliasDeclaration(node)) {
      const symbol = checker.getSymbolAtLocation(node.name);
      const type = checker.getDeclaredTypeOfSymbol(symbol);
      const declarations = _.flatMap(checker.getPropertiesOfType(type), property => {
        // Resolve type alias to its literals
        return _.map(property.declarations, visit);
      });
      // Create interface with fully resolved types
      return ts.createInterfaceDeclaration(
        [],
        [ts.createToken(ts.SyntaxKind.ExportKeyword)],
        node.name.getText(),
        [],
        [],
        declarations.filter(ts.isTypeElement)
      );
    }
    // Remove all import declarations (returning undefined drops the node)
    if (ts.isImportDeclaration(node)) {
      return undefined;
    }
    return node;
  };
  return node => ts.visitNode(node, visit);
};
// Run source file through our transformer
const result = ts.transform(source, [typeAliasToInterfaceTransformer]);
// Create our output folder
const outputDir = path.resolve(__dirname, '../generated');
if (!fs.existsSync(outputDir)) {
  fs.mkdirSync(outputDir);
}
// Write pretty printed transformed typescript to output directory
fs.writeFileSync(
  path.resolve(__dirname, '../generated/models.ts'),
  printer.printFile(_.first(result.transformed))
);

I like how my solution turned out. It embodies the full power of good abstractions, an intelligent compiler, useful developer tools (VSCode autocompletion, AST explorer, and so on), and bits of experience from other skilled developers. Its full source code with updates can be found here. I am not sure how useful it will be for more general cases that differ from my particular one. I only wanted to show what the TypeScript compiler toolkit can do and to put down on paper my thoughts on solving a non-standard problem that had been bothering me for a long time.


I hope my example helps someone make their life easier. If ASTs, compilers, and transformations are still not entirely clear to you, follow the links to the third-party resources and templates I provided; they should help. I had to spend a lot of time studying this material before I finally found a solution. My first attempts in private GitHub repositories, including 45 // @ts-ignore comments and asserts, made me blush with shame.


Resources that helped me:


Microsoft / TypeScript


Creating a TypeScript Transformer


TypeScript compiler APIs revisited


AST explorer

