Porting 30,000 lines of code from Flow to TypeScript

We recently ported the 30 thousand lines of JavaScript in MemSQL Studio from Flow to TypeScript. In this article I explain why we ported the codebase, how we did it, and how it turned out.

Disclaimer: my goal is not to criticize Flow. I admire the project and think there is room in the JavaScript community for both type checkers. In the end, each team will choose what suits it best. I sincerely hope this article helps with that choice.

First, some background. We at MemSQL are big fans of static and strong typing for JavaScript, which helps avoid the common problems of dynamic and weak typing.

Those common problems are:

Type errors at runtime, because different parts of the code disagree about implicit types.
Too much time spent writing tests for trivial things such as checking parameter types (runtime checks also increase the bundle size).
Poor editor/IDE integration, because without static typing it is much harder to implement Jump to Definition, mechanical refactorings, and similar features.
No way to write code around data models, that is, to design the data types first and then let the code basically “write itself”.

These are just some of the benefits of static typing; more are listed in a recent article on Flow.

At the beginning of 2016 we adopted tcomb to get some type safety at runtime in one of our internal JavaScript projects (disclaimer: I did not work on that project). Although runtime checking is sometimes useful, it does not come close to offering all the advantages of static typing (combining static typing with runtime type checking may suit certain cases; io-ts lets you do this with tcomb and TypeScript, although I have never tried it). With that in mind, we decided to adopt Flow for another project that we started in 2016. At the time, Flow seemed like a great choice:

Backing from Facebook, which has done an amazing job developing React and growing its community (React itself is developed using Flow).
We could stay in roughly the same JavaScript development ecosystem. Abandoning Babel for tsc (the TypeScript compiler) was scary, because we would have lost the flexibility to switch to another type checker later (obviously, the situation has changed since then).
No need to type the entire codebase at once (we wanted to get a feel for statically typed JavaScript before going all in), only some of the files. Note that by now both Flow and TypeScript allow this.
TypeScript (at that time) lacked some basic features that it has now, such as lookup types, default parameters for generic types, and so on.
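
For readers who have not met the two TypeScript features named in the last item, here is a minimal sketch of what they look like. The ClusterNode and Paginated types and the pluck helper are hypothetical names made up for illustration; they do not come from MemSQL Studio.

```typescript
// Hypothetical data model, for illustration only.
interface ClusterNode {
  host: string;
  port: number;
}

// Lookup type: the type of one property, looked up by its key.
type Port = ClusterNode["port"]; // number

// Lookup types combined with keyof give a precisely typed property getter.
function pluck<T, K extends keyof T>(obj: T, key: K): T[K] {
  return obj[key];
}

// Default parameter for a generic type: plain "Paginated" means Paginated<string>.
interface Paginated<T = string> {
  items: T[];
  page: number;
}

const names: Paginated = { items: ["leaf-1", "leaf-2"], page: 1 };
const ports: Paginated<Port> = { items: [3306, 3307], page: 1 };
const port: Port = pluck({ host: "127.0.0.1", port: 3306 }, "port");
```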

When we started working on MemSQL Studio at the end of 2017, we set out to cover the entire application with types (it is written entirely in JavaScript, and both the frontend and the backend run in the browser). We chose Flow because we had used it successfully in the past.

But then Babel 7 with TypeScript support caught my attention. That release meant that adopting TypeScript no longer required switching to the entire TypeScript ecosystem: we could keep using Babel to emit JavaScript. More importantly, we could use TypeScript only as a type checker, not as a full-fledged “language”.

Personally, I think that separating the type checker from the code generator is a more elegant way to get static (and strong) typing in JavaScript, because:

We separate the concerns of generating code and checking types. This reduces the time spent blocked on type checking and speeds up development: if type checking is slow for some reason, the code is still emitted correctly (if you use tsc with Babel, you can configure it to behave the same way).
Babel has great plugins and features that the TypeScript emitter lacks. For example, Babel lets you specify the browsers you support and automatically emits code that works on them. This is a very complex feature, and it makes no sense for two different projects to maintain it in parallel.
I like JavaScript as a programming language (apart from the lack of static typing), and I have no idea how long TypeScript will be around, whereas I trust ECMAScript to last for many years. So I prefer to write and “think” in JavaScript (note that I say “use Flow” or “use TypeScript” rather than “write in Flow” or “write in TypeScript”, because I always think of them as tools, not programming languages).

Of course, this approach has some drawbacks:

The TypeScript compiler could in theory perform type-based optimizations, and with this setup we give up that possibility.
Project configuration gets a little more complicated as the number of tools and dependencies grows. I think this is a relatively weak argument: the combination of Babel and Flow never let us down.

TypeScript as an alternative to Flow

I noticed growing interest in TypeScript in the JavaScript community, both online and among developers I talked to. So as soon as I learned that Babel 7 supported TypeScript, I began exploring a potential migration. On top of that, we had run into some shortcomings of Flow:

Lower-quality editor/IDE integration (compared to TypeScript). Nuclide, Facebook's own IDE with the best Flow integration, is now deprecated.
A smaller community, which means fewer and lower-quality type definitions for libraries (at the time of writing, the DefinitelyTyped repository has 19,682 stars on GitHub, while flow-typed has only 3,070).
No public roadmap and little interaction between the Flow team at Facebook and the community. A comment by a Facebook employee gives a sense of the situation.
High memory consumption and frequent leaks: for some of our engineers, Flow sometimes used almost 10 GB of RAM.

Of course, we also had to find out whether TypeScript suited us. That is a hard question to answer: researching it meant reading the documentation thoroughly, which convinced me that every Flow feature we used has a TypeScript equivalent. I then looked at TypeScript's public roadmap and really liked the features planned for the future (for example, partial type argument inference, a feature we used in Flow).

Porting more than 30 thousand lines of code from Flow to TypeScript

First, we had to upgrade Babel from 6 to 7. This simple task took 16 engineer-hours, because we decided to upgrade Webpack from 3 to 4 at the same time, and some outdated dependencies in our code made the job harder. The vast majority of JavaScript projects will not have such problems.

After that, we replaced Babel's Flow preset with the new TypeScript preset and then, for the first time, ran the TypeScript compiler on all of our source files written with Flow. The result was 8,245 syntax errors (the tsc CLI does not report the project's real type errors until all syntax errors have been fixed).

At first this number scared us (a lot), but we quickly realized that most of the errors came from the fact that TypeScript does not accept type annotations in .js files. After some research, I learned that TypeScript files must end in either .ts or .tsx (if they contain JSX). I find this a clear inconvenience. So as not to have to think about whether a file contains JSX, I simply renamed all the files to .tsx.

That left about 4,000 syntax errors. Most of them were caused by import type, which in TypeScript can be replaced with a plain import, and by the different notation for exact object types ({||} instead of {}). A couple of quick regular expressions brought us down to 414 syntax errors. The rest had to be fixed by hand:

The existential type (*), which we used for partial inference of generic type arguments, had to be replaced with explicit arguments or with the unknown type, to tell TypeScript that some arguments do not matter.
$Keys and other advanced Flow types have a different syntax in TypeScript (for example, $Shape<> corresponds to Partial<> in TypeScript).
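
The mechanical and manual rewrites described above can be sketched as follows. This is only an illustration under stated assumptions: convertFlowSyntax is a hypothetical helper with deliberately naive regular expressions, not the expressions actually used in the migration, and the Settings, update and count names are made up for the example.

```typescript
// Naive regex pass over Flow-only syntax (hypothetical helper, illustration only).
// It would mis-handle unusual code such as a default import that is itself named "type".
function convertFlowSyntax(source: string): string {
  return source
    // `import type { A } from "./a"` becomes `import { A } from "./a"`
    .replace(/\bimport\s+type\s+(?=[{\w])/g, "import ")
    // exact object types: `{| a: number |}` becomes `{ a: number }`
    .replace(/\{\|/g, "{")
    .replace(/\|\}/g, "}");
}

// Manual rewrites of Flow utility types, shown on a made-up Settings type.
type Settings = { theme: string; fontSize: number };

// Flow: $Keys<Settings>      TypeScript: keyof Settings
type SettingKey = keyof Settings; // "theme" | "fontSize"

// Flow: $Shape<Settings>     TypeScript: Partial<Settings>
function update(current: Settings, patch: Partial<Settings>): Settings {
  return { ...current, ...patch };
}

// Flow: Array<*> (existential type)
// TypeScript: unknown, when the element type does not matter to the function
function count(items: Array<unknown>): number {
  return items.length;
}

const firstKey: SettingKey = "theme";
```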

Once all the syntax errors were fixed, tsc finally told us how many real type errors our codebase had: only about 1,300. Now it was time to sit down and decide whether to continue. After all, if the migration was going to take weeks, it would be better to stay on Flow. We estimated, however, that porting the code would take less than a week of one engineer's time, which was perfectly acceptable.

Note that during the migration we had to stop all other work on this codebase. New work can in principle proceed in parallel, but it has to be done on top of potentially hundreds of type errors in the existing code, which is not easy.

What kind of errors?

TypeScript and Flow treat JavaScript code differently in many ways. Flow is stricter about some things, TypeScript about others. An in-depth comparison of the two type systems would be very long, so let's just look at a few examples.

Note: all links to the TypeScript playground assume the “strict” settings. Unfortunately, these settings are not stored in the URL when you share a link, so they must be set manually after opening any playground link from this article.

invariant.js

A very common function in our source code is invariant. To quote its documentation:

var invariant = require('invariant');
invariant(someTruthyVal, 'This will not throw');
// No errors
invariant(someFalseyVal, 'This will throw an error with this message');
// Error raised: Invariant Violation: This will throw an error with this message

The idea is simple: a function that throws an error if some condition does not hold. Let's see how to implement and use it with Flow:

type Maybe<T> = T | void;
function invariant(condition: boolean, message: string) {
  if (!condition) {
    throw new Error(message);
  }
}
function f(x: Maybe<number>, c: number) {
  if (c > 0) {
    invariant(x !== undefined, "When c is positive, x should never be undefined");
    (x + 1); // works because x has been refined to "number"
  }
}

Now load the same snippet into TypeScript. As you can see in the playground, TypeScript reports an error, because it cannot figure out that x is guaranteed not to be undefined after the invariant call. This is a known issue: TypeScript cannot (yet) carry this kind of inference through a function call. Since this is a very common pattern in our codebase, we had to manually replace every use of invariant (over 150 of them) with more explicit code that throws the error directly:

type Maybe<T> = T | void;
function f(x: Maybe<number>, c: number) {
  if (c > 0) {
    if (x === undefined) {
      throw new Error("When c is positive, x should never be undefined");
    }
    (x + 1); // works because x has been refined to "number"
  }
}

It is not as nice as invariant, but it is not a big deal either.
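
Readers on newer compilers may not need this rewrite at all: TypeScript 3.7 added assertion signatures (asserts condition), which let a helper such as invariant narrow types at the call site. The sketch below assumes TypeScript 3.7 or later and is not what was done in the migration described here; it also uses T | undefined rather than T | void for the Maybe type.

```typescript
type Maybe<T> = T | undefined;

// The "asserts condition" return type tells the compiler that, if this function
// returns normally, the condition passed in was truthy.
function invariant(condition: unknown, message: string): asserts condition {
  if (!condition) {
    throw new Error(message);
  }
}

function f(x: Maybe<number>, c: number): number {
  if (c > 0) {
    invariant(x !== undefined, "When c is positive, x should never be undefined");
    return x + 1; // x has been narrowed to "number" by the assertion above
  }
  return 0;
}
```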

$ExpectError vs. @ts-ignore

Flow has a very interesting feature, $ExpectError, which is similar to @ts-ignore except that it reports an error if the next line does not contain a type error. This is very useful for writing “type tests”, which ensure that the type checker (whether TypeScript or Flow) catches certain type errors.

Unfortunately, TypeScript has no such feature, so our tests lost some of their value. I look forward to TypeScript implementing it.
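
For readers on newer compilers: TypeScript 3.9 later added a // @ts-expect-error comment that behaves much like $ExpectError. A minimal sketch of a “type test” written with it, assuming TypeScript 3.9 or later (the double function is a made-up example):

```typescript
function double(n: number): number {
  return n * 2;
}

// Type test: the call below must be rejected by the type checker. If it ever starts
// to type-check, the unused directive itself is reported as a compile error.
// @ts-expect-error a string is not assignable to a number parameter
double("21");

const ok = double(21);
```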

Common type errors and type inference

TypeScript often requires you to be more explicit than Flow, as in this example:

type Leaf = {
  host: string;
  port: number;
  type: "LEAF";
};
type Aggregator = {
  host: string;
  port: number;
  type: "AGGREGATOR";
}
type MemsqlNode = Leaf | Aggregator;
function f(leaves: Array<Leaf>, aggregators: Array<Aggregator>): Array<MemsqlNode> {
  // The next line errors because you cannot concat aggregators to leaves.
  return leaves.concat(aggregators);
}

Flow infers the type of leaves.concat(aggregators) as Array<Leaf | Aggregator>, which can then be cast to Array<MemsqlNode>. I think this is a good example of Flow being a little smarter while TypeScript needs a little help: here we can use a type assertion, but type assertions are dangerous and should be used very carefully.
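
For comparison, here are two ways to get the same result in TypeScript without a type assertion. They are offered as a sketch, not as what was done in MemSQL Studio, and the function names allNodes and allNodesConcat are made up.

```typescript
type Leaf = { host: string; port: number; type: "LEAF" };
type Aggregator = { host: string; port: number; type: "AGGREGATOR" };
type MemsqlNode = Leaf | Aggregator;

// Option 1: array spread. The element type is inferred as Leaf | Aggregator.
function allNodes(leaves: Array<Leaf>, aggregators: Array<Aggregator>): Array<MemsqlNode> {
  return [...leaves, ...aggregators];
}

// Option 2: widen the receiver with an annotation (not an assertion) before concat.
function allNodesConcat(leaves: Array<Leaf>, aggregators: Array<Aggregator>): Array<MemsqlNode> {
  const nodes: Array<MemsqlNode> = leaves;
  return nodes.concat(aggregators);
}
```

An annotation is checked by the compiler, so unlike an assertion it cannot hide a real mismatch between the types.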

Although I have no formal proof, I believe Flow is far superior to TypeScript at type inference. I really hope TypeScript catches up with Flow, since the language is developing very actively and many recent improvements have been made in exactly this area. In many places in our code we had to help TypeScript a little with annotations or type assertions (although we avoided the latter as much as possible). Consider another example (we had more than 200 errors like this one):

type Player = {
    name: string;
    age: number;
    position: "STRIKER" | "GOALKEEPER",
};
type F = () => Promise<Array<Player>>;
const f1: F = () => {
    return Promise.all([
        {
            name: "David Gomes",
            age: 23,
            position: "GOALKEEPER",
        }, {
            name: "Cristiano Ronaldo",
            age: 33,
            position: "STRIKER",
        }
    ]);
};

TypeScript does not allow this, because it will not accept { name: "David Gomes", age: 23, position: "GOALKEEPER" } as an object of type Player (see the exact error in the playground). This is another case where I find TypeScript not smart enough (at least compared to Flow, which understands this code).

There are several options for fixing this:

Assert "STRIKER" as "STRIKER", so that TypeScript understands that the string is a valid member of the union "STRIKER" | "GOALKEEPER".
Annotate each object as a Player.
Or, what I think is the best solution: just help TypeScript, without any type assertions, by writing Promise.all<Player>(...).
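
A minimal sketch of the first two options, reusing the Player type from the example above. The loadPlayers function and the keeper and striker constants are hypothetical names introduced for illustration.

```typescript
type Player = {
  name: string;
  age: number;
  position: "STRIKER" | "GOALKEEPER";
};

// Option 1: assert the literal, so that it is not widened to string.
const keeper = { name: "David Gomes", age: 23, position: "GOALKEEPER" as "GOALKEEPER" };

// Option 2: annotate the object as a Player.
const striker: Player = { name: "Cristiano Ronaldo", age: 33, position: "STRIKER" };

function loadPlayers(): Promise<Array<Player>> {
  // With the element type already known, Promise.all needs no further help.
  const players: Array<Player> = [keeper, striker];
  return Promise.all(players);
}
```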

Here is another example (in TypeScript) where Flow is again better at type inference:

type Connection = { id: number };
declare function getConnection(): Connection;
function resolveConnection() {
  return new Promise(resolve => {
    return resolve(getConnection());
  })
}
resolveConnection().then(conn => {
  // TypeScript errors in the next line because it does not understand
  // that conn is of type Connection. We have to manually annotate
  // resolveConnection as Promise<Connection>.
  (conn.id);
});
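
The manual annotation mentioned in the comment might look like the sketch below. The body of getConnection is a stand-in so that the example runs (the snippet above only declares it), and connectionId is a hypothetical helper.

```typescript
type Connection = { id: number };

// Stand-in implementation, an assumption made only so the sketch is runnable.
function getConnection(): Connection {
  return { id: 1 };
}

// The explicit return type and type argument tell TypeScript what the promise resolves to.
function resolveConnection(): Promise<Connection> {
  return new Promise<Connection>(resolve => {
    resolve(getConnection());
  });
}

// conn is now known to be a Connection, so conn.id type-checks.
function connectionId(): Promise<number> {
  return resolveConnection().then(conn => conn.id);
}
```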

A very small but interesting example: Flow types Array<T>.pop() as T, while TypeScript types it as T | undefined. That is a point in favor of TypeScript, because it forces you to check that the element exists (if the array is empty, Array.pop returns undefined). There are several other small cases like this one where TypeScript outperforms Flow.
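
A short sketch of what that extra check looks like in practice; lastOr is a hypothetical helper, not code from our application.

```typescript
// Returns the last element of the array, or the fallback if the array is empty.
function lastOr<T>(items: Array<T>, fallback: T): T {
  // pop() is typed as T | undefined, so the compiler forces the check below.
  // slice() makes a copy, so the caller's array is not modified.
  const last = items.slice().pop();
  return last === undefined ? fallback : last;
}
```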

TypeScript definitions for third-party dependencies

Of course, any JavaScript application has at least a few dependencies. They need to be typed, otherwise you lose much of the power of static type analysis (as explained at the beginning of the article).

Libraries from npm can ship with Flow type definitions, TypeScript type definitions, both, or neither. Very often (small) libraries ship with neither, so you have to write your own type definitions or borrow them from the community. Both the Flow and TypeScript communities maintain a standard repository of definitions for third-party JavaScript packages: flow-typed and DefinitelyTyped, respectively.

I have to say that we liked DefinitelyTyped much more. With flow-typed, I had to use its CLI tool to bring type definitions for the various dependencies into the project. DefinitelyTyped folds this functionality into the npm CLI by publishing @types/package-name packages to the npm registry. This is very cool and made it much easier to pull in type definitions for our dependencies (jest, react, lodash, react-redux, to name just a few).

I have also had a great time contributing to DefinitelyTyped (do not assume that type definitions are equivalent when porting code from Flow to TypeScript). I have already sent several pull requests, and none of them caused any trouble. Just clone the repository, edit the type definitions, add tests, and send a pull request. The DefinitelyTyped GitHub bot tags the authors of the definitions you edited. If none of them reviews the change within 7 days, the pull request goes to a maintainer for review. After it is merged into the main branch, a new version of the types package is published to npm. For example, when I first updated the @types/redux-form package, version 7.4.14 was automatically published to npm, so updating package.json was enough to get the new type definitions. If you cannot wait for the pull request to be merged, you can always override the type definitions used in your project, as described in one of the previous articles.

Overall, the quality of the type definitions in DefinitelyTyped is much better, thanks to the larger and more thriving TypeScript community. In fact, after porting the project to TypeScript, our type coverage increased from 88% to 96%, mainly because of better type definitions for third-party dependencies, with fewer any types.

Linting and tests

We switched from the eslint linter to tslint (getting started with eslint for TypeScript seemed harder).
We use ts-jest to run the tests written in TypeScript. Some of the tests are typed and others are not (when typing a test would take too long, we keep it as a .js file).

What happened after we fixed all the type errors?

After 40 engineer-hours of work we got down to the last type error, which we postponed for a while with @ts-ignore.

After addressing the code review comments and fixing a couple of bugs (unfortunately, we had to change the runtime code a bit to fix logic that TypeScript could not understand), the pull request was merged, and we have been using TypeScript ever since. (And yes, we fixed that last @ts-ignore in the next pull request.)

Apart from the editor integration, working with TypeScript is very similar to working with Flow. The Flow server is slightly faster, but this is not a big problem, because both report errors for the current file equally quickly. The only performance difference is that TypeScript takes a little longer (0.5 to 1 s) to report new errors after a file is saved. The server startup time is about the same (around 2 minutes), but that is not so important. So far we have not had any problems with memory consumption: tsc seems to consistently use about 600 MB.

It may seem that type inference gives Flow a great advantage, but there are two reasons why this does not really matter:

We converted a Flow codebase to TypeScript, so naturally we only ran into things that Flow can express and TypeScript cannot. Had the migration gone in the opposite direction, I am sure there would have been things that TypeScript infers or expresses better.
Type inference matters because it helps you write more concise code. Still, other things are more important, such as a strong community and the availability of type definitions, because weak type inference can be made up for by spending a little more time on annotations.

Code statistics

$ npm run type-coverage # https://github.com/plantain-00/type-coverage
43330 / 45047 96.19%
$ cloc # ignoring tests and dependencies
--------------------------------------------------------------------------------
Language                      files          blank        comment           code
--------------------------------------------------------------------------------
TypeScript                      330           5179           1405          31463

What's next?

We are not done improving our static type analysis. There are other projects at MemSQL that will eventually switch from Flow to TypeScript (and some JavaScript projects that will start using TypeScript), and we want to make our TypeScript configuration stricter. Currently we have the strictNullChecks option turned on, but noImplicitAny is still disabled. We will also remove a couple of dangerous type assertions from the code.

I am glad to share everything I learned during my adventures with JavaScript typing. If any particular topic interests you, please let me know.
