The future of WebAssembly as a “skill tree”

Original authors: Lin Clark, Till Schneidereit, Luke Wagner

Some people misunderstand WebAssembly. They believe that because browsers have supported WebAssembly since 2017, it is finished. Far from it: only the MVP (minimum viable product) is ready. I can guess where this misconception comes from: after the MVP release, its developers promised to maintain backward compatibility at the level of “any code written now will work in the future.” But this does not mean that the development of WebAssembly is complete. Many features are being developed right now or are planned for the near future, and once they are implemented, a great deal will change.

You can picture all these features as a skill tree in a game. We have a couple of “basic” skills (features that are already implemented) and a whole tree with many branches and leaves that will unlock over time, giving us more and more power. Let's look at what we already have and what we still have to unlock.

Minimum Viable Product (MVP)

WebAssembly's story begins with Emscripten, which made it possible to compile C++ code into JavaScript. This brought a large number of C++ libraries to the web, without which running higher-level code would have been impossible. The generated JS code was far from ideal and ran slowly compared to the native version. Still, Mozilla engineers found a couple of ways to make it faster. The main one was to identify a subset of the language that could be executed at speeds comparable to native code. This subset is called asm.js.

Developers of other browsers noticed and appreciated the speed of asm.js, and all major browsers added support for it. But that was not the end of the story; it was only the beginning. There was still room to go faster, but doing so meant going beyond JavaScript. Native code (for example, C++) needed to be compiled not into JavaScript but into something new, created specifically as a fast alternative to JS. And so WebAssembly appeared.

What is included in the first version of WebAssembly? What was enough to earn the proud title of “minimum viable product”?

Skill: compilation target

The programmers who worked on WebAssembly understood that their task was not to support C or C++ alone. The goal was to make it possible to compile code in any language to WebAssembly. It was meant to be a kind of “assembly language” executed in the browser, just as the machine code of a desktop application is executed on, say, the x86 platform. But this new language must not depend on any specific platform. Its target is a higher-level abstract machine, whose concrete implementations depend on the instruction set of the hardware at hand.

Skill: fast code execution

Everything had to run fast. Otherwise, why bother with all this? In the end, users should be able to run truly “heavy” applications, play top-tier games in the browser, and so on.

Skill: compactness

Not only the speed of the code matters, but also how quickly it loads. Users are accustomed to desktop applications that start very quickly (because they are installed locally and have all the necessary resources at hand). Web applications also start relatively quickly, because they do not load as many resources at once. This sets a new task for us: if we want to create a new type of web application with a code base as large as a classic desktop application's, but downloaded from the Internet, the code must be as compact as possible.

Skill: memory access

Our new applications also need to work with memory a little differently than JavaScript code does. They need direct access to blocks of memory. This comes from the way C and C++ work: they have pointers. A pointer is, roughly speaking, a variable that contains a memory address. An application can read the data at this address, change it, and even use pointer arithmetic to “walk” through memory from that address. A huge amount of C/C++ code uses pointers for efficiency, so a target platform for such code is impossible without pointer support.

But we cannot allow a piece of code downloaded from the Internet to have direct access to the memory of our process; that is too dangerous. We have to create an environment that, on the one hand, lets native code compiled to WebAssembly believe that it has direct memory access, but on the other hand strictly limits the region in which it is allowed to manipulate data.

For this, WebAssembly uses a “linear memory model”. It is implemented using TypedArrays: something like a JavaScript array, but containing only a sequential run of bytes in memory. When you want to put something into it, you access an element by index (which plays the role of a memory address). In this way the array “pretends” to be a block of memory for the C++ code.
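The idea can be sketched in a few lines of TypeScript. This is only an illustration, not how an engine implements it: the class name LinearMemory and its methods are hypothetical, and a plain ArrayBuffer stands in for a module's memory. It shows addresses as byte indexes, pointer arithmetic as index arithmetic, and an out-of-range access being rejected rather than reaching memory outside the block.

```typescript
// A sketch of the linear memory idea: one contiguous block of bytes,
// addressed by integer offsets that play the role of pointers.
// The class and method names are hypothetical, chosen for illustration.
class LinearMemory {
  private readonly view: DataView;

  constructor(sizeInBytes: number) {
    this.view = new DataView(new ArrayBuffer(sizeInBytes));
  }

  // Write a 32-bit integer at a byte address (little-endian, as in WebAssembly).
  storeI32(address: number, value: number): void {
    this.view.setInt32(address, value, true);
  }

  // Read a 32-bit integer back from a byte address.
  loadI32(address: number): number {
    return this.view.getInt32(address, true);
  }
}

const memory = new LinearMemory(64);
const pointer = 8;                   // an "address" is just an index into the block
memory.storeI32(pointer, 1234);
memory.storeI32(pointer + 4, 5678);  // "pointer arithmetic": the next 4-byte slot

console.log(memory.loadI32(pointer));     // 1234
console.log(memory.loadI32(pointer + 4)); // 5678

// An access outside the block is rejected instead of touching other memory.
let outOfBoundsRejected = false;
try {
  memory.loadI32(1000);
} catch (error) {
  outOfBoundsRejected = error instanceof RangeError;
}
console.log(outOfBoundsRejected);         // true
```

The bounds check is what makes the “pretend” memory safe: code can compute any address it likes, but it can only ever reach bytes inside its own block.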

New achievement!

So, with all of the above, people can finally run a desktop application in the browser with roughly the same performance as if it were native. This set of features is what was called the “minimum viable product” (MVP).

At this stage, some applications really could already be built for WebAssembly and run in the browser. But there was still a long way to go.

Heavyweight desktop applications

The next important step is the ability to run really large desktop applications. Can you imagine a full version of Photoshop running in a browser? You did not install it, you just opened a link, and now you have the full power of the product at native speed, in the latest version with all updates and fixes, on any device.

And we are not that far from this: examples are already beginning to appear. For example, AutoCAD, and also Adobe Lightroom. But let's be frank: the current implementation of WebAssembly is not yet ready to run truly large applications. The bottlenecks are being investigated and fixed right now, as you read this article.

Skill: multithreading

Obviously, we need multithreading. Modern computers have many cores. We need to be able to use them.

Skill: SIMD

Besides multithreading, there is another technology that allows parallel data processing to be implemented more efficiently. This is SIMD (single instruction, multiple data): one instruction processes several pieces of data at once. It is an important ingredient for WebAssembly to be really fast.

Skill: 64-bit addressing

Another important feature of modern hardware architecture that is not yet present in WebAssembly is support for 64-bit memory addressing. The idea is simple: with 32-bit addresses you can use only 4 GB of memory (which is very little for large programs), while 64-bit addresses allow up to 16 exabytes (which is plenty for modern software). Of course, not only the theoretical maximum matters, but also the practical one (how much memory the OS will give you). But most modern devices already have 4 GB of RAM or more, and that number will keep growing.
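The two limits follow directly from the number of address bits. A small TypeScript check of the arithmetic, assuming binary units (1 GiB = 2^30 bytes, 1 EiB = 2^60 bytes); the function name is made up for this sketch:

```typescript
// Address-space arithmetic behind the 4 GB and 16 exabyte figures.
// Units here are binary (1 GiB = 2^30 bytes, 1 EiB = 2^60 bytes).
const GIB = 2 ** 30;
const EIB = 2 ** 60;

// Every distinct address names one byte, so n address bits reach 2^n bytes.
function addressableBytes(addressBits: number): number {
  return 2 ** addressBits;
}

const limit32InGiB = addressableBytes(32) / GIB;
const limit64InEiB = addressableBytes(64) / EIB;

console.log(limit32InGiB); // 4
console.log(limit64InEiB); // 16
```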

Skill: streaming compilation

It is not enough to execute applications quickly. We also need to shorten the delay between the moment an application starts downloading and the moment it starts running. Streaming compilation lets the browser start processing a WebAssembly file before it has finished loading. Instructions are parsed as they arrive over the network, so downloading and compilation proceed in parallel. In Firefox, we managed to achieve a compilation speed higher than the download speed: processing N bytes of code takes less time than downloading them over the network. Other browser developers are also working on streaming compilation.

Related to streaming compilation is the use of two compilers. One of them is fast and lets you run the downloaded code right away. It does not, however, perform every optimization that is theoretically possible, since that would take more time. Those optimizations are performed by a second compiler running in the background. As soon as it finishes its work, the optimized version replaces the first one in memory and runs in its place.

This is how we get both a quick start of the application and efficient execution.

Skill: caching

If we have already downloaded and compiled some WebAssembly code with the optimizing compiler, there is no point in doing the same work again when the code is loaded in another tab (or the next time you open the browser, provided the application has not changed). Compiled code can (and should) be cached and then reused from the cache.
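The browser's cache lives inside the engine and is not something page code controls, but the principle (compile once, reuse the result) can be illustrated in user code. The following is a loose analogy only: compiledModules and compileOnce are hypothetical names, and the cache key is a naive string built from the bytes.

```typescript
// A user-land analogy for the caching idea: compile a given binary once and
// reuse the compiled WebAssembly.Module afterwards. The browser's own cache
// works inside the engine; `compiledModules` and `compileOnce` are
// hypothetical names used only for this sketch.
const compiledModules = new Map<string, WebAssembly.Module>();
let compileCount = 0;

function compileOnce(byteValues: number[]): WebAssembly.Module {
  const key = byteValues.join(",");          // content-based key; fine for a tiny demo
  let compiled = compiledModules.get(key);
  if (compiled === undefined) {
    compiled = new WebAssembly.Module(new Uint8Array(byteValues)); // the expensive step
    compiledModules.set(key, compiled);
    compileCount += 1;
  }
  return compiled;
}

// The smallest valid module: the "\0asm" magic number followed by version 1.
const emptyModuleBytes = [0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00];

const firstLoad = compileOnce(emptyModuleBytes);
const secondLoad = compileOnce(emptyModuleBytes);

console.log(firstLoad === secondLoad); // true
console.log(compileCount);             // 1
```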

Skill: other improvements

Now there are a lot of discussions about what other improvements are possible and what the efforts of developers should be focused on. Something will definitely be implemented, something will not immediately, something will not at all. I, with your permission, will define all these points in the general class “other improvements”, and what will be included in it will be understood over time.

Where are we now?

Somewhere around here:

Multithreading

For multithreading, we have an almost complete plan, but one of its key parts (SharedArrayBuffer) had to be disabled in browsers at the beginning of this year. It will be turned back on soon, and then we can continue.
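To show why SharedArrayBuffer is the key piece, here is a minimal sketch of shared memory with atomic updates. It is deliberately simplified: everything runs on one thread, and the two typed-array views merely stand in for two workers that would each hold a view of the same buffer. The variable names are illustrative.

```typescript
// SharedArrayBuffer is a block of memory that several threads can view at
// once. For brevity this sketch stays on one thread and uses two views to
// stand in for two workers.
const shared = new SharedArrayBuffer(4);
const viewOfWorkerA = new Int32Array(shared);
const viewOfWorkerB = new Int32Array(shared);

// Atomics.add performs a read-modify-write that other threads cannot interleave with.
Atomics.add(viewOfWorkerA, 0, 5);
Atomics.add(viewOfWorkerB, 0, 7);

// Both views observe the same underlying memory.
console.log(Atomics.load(viewOfWorkerA, 0)); // 12
console.log(Atomics.load(viewOfWorkerB, 0)); // 12
```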

SIMD

Under active development right now.

64-bit addressing

For wasm-64, we have a fairly clear idea of how everything should work. We based it on the approaches used by x86 and ARM.

Streaming compilation

It was added to Firefox back in 2017; other browsers are working on it.

Using two compilers

This was added to Firefox back in 2017, and to other browsers in 2018.

Implicit HTTP caching

In Firefox, development is almost complete, and a release is coming soon.

Other improvements

Under discussion.

As you can see, most of the items are still under active development. Nevertheless, we can already see applications running on WebAssembly today, because the current capabilities are already enough for some of them. As soon as all the above features are ready, we will unlock another “new achievement”, and even more applications will be able to use WebAssembly.

JavaScript interaction

WebAssembly was not created only as a platform for games and heavyweight applications. It can also be used for regular web development. We realize that there are very large web applications written in JavaScript today, and very few people will decide to rewrite them completely in WebAssembly. The important point is that they do not have to. Most of these applications probably work quite well, and only in a few bottlenecks do you feel a lack of computational performance or data throughput, or miss functionality because some library has no JS version. We want to give developers the ability to rewrite only these bottlenecks in WebAssembly, leaving the rest of the code in familiar JS. And this is already possible. For example, there is a reported case of such a rewrite running 86 times faster.

But to make this practice widespread and comfortable, we need to implement a few more things.

Skill: fast calls between JS and WebAssembly

Calls from JS into WebAssembly should be very fast. A programmer who adds a small WebAssembly module should not feel any performance loss, even if the module is called very often. This is not the case in the MVP (maximizing the performance of such calls was not one of its goals), so the problem still has to be fixed. In Firefox, we have already made some JS -> WebAssembly calls faster than non-inlined JS -> JS calls. The developers of other browsers are also working on this.

Skill: fast data exchange

This task is related to the previous one: it is important not only to call WebAssembly code from JS quickly, but also to pass data between them quickly. There are certain problems here. For example, WebAssembly understands only numbers. It has no objects, but JS does. So we need some kind of translation layer. One already exists, but it is not yet fast enough.
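To make the “only numbers” limitation concrete, here is roughly what a translation layer has to do for something as simple as a string. The sketch assumes UTF-8 encoding and uses two hypothetical helpers, writeString and readString; real glue code generated by tools may look different.

```typescript
// WebAssembly functions accept only numbers, so a string crosses the boundary
// as two numbers: a byte offset into linear memory and a length.
// `writeString` and `readString` are hypothetical helpers of the kind a
// translation layer provides.
const wasmMemory = new WebAssembly.Memory({ initial: 1 }); // one 64 KiB page

function writeString(text: string, offset: number): number {
  const encoded = new TextEncoder().encode(text);
  new Uint8Array(wasmMemory.buffer).set(encoded, offset);
  return encoded.length; // the length travels alongside the offset
}

function readString(offset: number, byteLength: number): string {
  const bytes = new Uint8Array(wasmMemory.buffer, offset, byteLength);
  return new TextDecoder().decode(bytes);
}

const textOffset = 16;
const textLength = writeString("héllo wasm", textOffset);

console.log(textLength);                         // 11 bytes: "é" takes two in UTF-8
console.log(readString(textOffset, textLength)); // "héllo wasm"
```

Every such crossing involves encoding and copying, which is part of why the existing layer is slower than one would like.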

Skill: integration with ES-modules

Today, using a WebAssembly module looks like a special API call that returns a module for you to use. But this means the WebAssembly module is not really part of the web application's JS module graph. To get everything that ES modules have (such as export and import), WebAssembly modules must be able to integrate with ES modules.
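For comparison, this is what the current API-based approach looks like in TypeScript. The bytes are a hand-assembled module exporting a single add function, written out inline so the example is self-contained (a real application would load a .wasm file). The import statement in the final comment is an assumed shape for the integrated form and does not work in the MVP.

```typescript
// Today's "special API call": compile the bytes, instantiate them, then pick
// functions off `exports` by hand.
// The bytes encode a tiny module that exports add(a, b) for two 32-bit integers.
const addModuleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,                   // magic + version
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,             // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                                           // one function of that type
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,             // export it as "add"
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // body: local.get 0, local.get 1, i32.add
]);

const addInstance = new WebAssembly.Instance(new WebAssembly.Module(addModuleBytes));
const add = addInstance.exports.add as unknown as (a: number, b: number) => number;

console.log(add(2, 3)); // 5

// With ES module integration, the hope is that the same thing could be written
// as an ordinary import, for example (assumed syntax, not available in the MVP):
//   import { add } from "./add.wasm";
```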

Skill: toolchain integration

Being able to import and export does not yet make a full-fledged module. We need a place where WebAssembly modules can be distributed. What will be the equivalent of npm for WebAssembly? Hmm... how about npm itself? And what will be the equivalent of webpack or Parcel for WebAssembly? Hmm... how about webpack and Parcel?

WebAssembly modules should not be different from regular modules, which means they can be distributed through the same infrastructure. But we need tools to integrate them into this infrastructure.

Skill: backward compatibility

There is one more important thing we must provide: everything should work well even in older browsers, even those that know nothing about WebAssembly. We have to guarantee that a developer who has written code for WebAssembly once will not have to write a second version of the same code in JavaScript simply because the site must also open in IE11.

Where are we now?

Somewhere here:

Fast calls between JS and WebAssembly

Already implemented in Firefox; work is under way in other browsers.

Fast data exchange

There are several proposals. For example, extending the WebAssembly type system with references to JS objects. This is possible, but it requires writing additional code (for example, to call JS methods), and that code is not very fast. There are, in turn, several proposals to solve this problem as well.

There is one more aspect of data exchange: tracking how long data must be kept in memory. If you have data in memory that JS code needs to access, you must keep it there until the JS code has read it. But if you leave it there forever, you get a memory leak. How do you know when the data can be deleted (that is, the JS code has already read it)? Today this responsibility lies with the programmer: everything is freed manually. As soon as the JS code has finished reading the data, it must call something like a “free” function. But this approach is outdated and error-prone. To solve the problem, we are adding the concept of WeakRef to JavaScript. It lets the JS code read the data, and when the garbage collector runs, the memory in the WebAssembly module can be cleaned up correctly.
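The manual scheme, and how easily it leaks, can be sketched as follows. WasmHeap is a hypothetical stand-in for a module's allocator; it models only the bookkeeping (which blocks are still live), not real memory, and the sizes are arbitrary.

```typescript
// A sketch of the manual scheme: whoever hands data to JS must later be told
// that it can be released. `WasmHeap` is a hypothetical stand-in for a
// module's allocator; only its bookkeeping is modelled here.
class WasmHeap {
  private nextPointer = 8;
  private readonly liveBlocks = new Map<number, number>(); // pointer -> size

  alloc(size: number): number {
    const pointer = this.nextPointer;
    this.nextPointer += size;
    this.liveBlocks.set(pointer, size);
    return pointer;
  }

  free(pointer: number): void {
    if (!this.liveBlocks.delete(pointer)) {
      throw new Error(`double free or unknown pointer: ${pointer}`);
    }
  }

  get liveBytes(): number {
    let total = 0;
    this.liveBlocks.forEach((size) => { total += size; });
    return total;
  }
}

const heap = new WasmHeap();
const resultPointer = heap.alloc(32); // the module produces a result for JS
// ... JS reads the 32 bytes at resultPointer ...
heap.free(resultPointer);             // forgetting this line is the memory leak

heap.alloc(16);                       // never freed: these bytes stay allocated
console.log(heap.liveBytes);          // 16
```

With garbage-collector integration, the call to free would no longer depend on the programmer remembering it.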

All of this is still in development. Meanwhile, the Rust ecosystem has produced tools that automate writing such code for you, substituting their own implementation for the parts that are not yet available. One of these tools deserves special mention: wasm-bindgen. When it notices that your Rust code takes or returns JS objects or DOM objects, it automatically creates a JS layer that can interact with your Rust code. This layer can also interact with WebAssembly modules written in other languages, so it is not only Rust programmers who can use this tool.

Integration with ES-modules

A plan for this work has existed for quite some time. We are actively working on it together with the developers of other browsers.

Toolchain integration

There are already tools such as wasm-pack in the Rust ecosystem that automatically package everything needed for a release to npm. And people are already using this tool to create their own modules.

Backward compatibility

For backward compatibility, we have the wasm2js tool. It turns a wasm file into an equivalent .js file. The resulting JavaScript will not be fast, but it will work in any browser (including those that do not support WebAssembly).

As you can see, we are very close to earning this “achievement”. And as soon as we do, the path to two more will open.

JS frameworks and compile-to-JS languages

The first of these is the ability to rewrite popular heavyweight JS frameworks in WebAssembly.

The second is to let languages that compile to JavaScript target WebAssembly instead. We are talking about languages such as Scala.js, Reason, and Elm.

For both of these tasks, WebAssembly must support a number of new high-level features.

Skill: garbage collector

We need integration with the browser's garbage collector for several reasons. First, recall the task of rewriting JS frameworks (or parts of them). This may well be worthwhile. For example, React has an algorithm for diffing DOM trees, which could be rewritten in Rust with efficient multithreading. We could also speed things up through smarter memory allocation and freeing. In a virtual DOM, instead of creating many small objects that the garbage collector later has to track and delete, one could use a special memory allocation scheme: for example, allocate one block of memory up front, place all objects in it, and then release it with a single call. This would both speed up execution and save memory.
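The allocation scheme described here is commonly known as an arena or bump allocator. A minimal sketch, with Arena as a hypothetical name and arbitrary sizes, shows the two properties that matter: allocation is just advancing an offset, and releasing everything is a single operation.

```typescript
// A bump ("arena") allocator: reserve one block, hand out slices by advancing
// an offset, and release everything at once by resetting that offset.
// `Arena` is a hypothetical name for this sketch.
class Arena {
  private readonly block: Uint8Array;
  private used = 0;

  constructor(capacity: number) {
    this.block = new Uint8Array(capacity); // the single up-front allocation
  }

  // Returns the offset of a fresh region of `size` bytes inside the block.
  alloc(size: number): number {
    if (this.used + size > this.block.length) {
      throw new RangeError("arena exhausted");
    }
    const offset = this.used;
    this.used += size;
    return offset;
  }

  // Frees every object in the arena with one call; no per-object tracking.
  reset(): void {
    this.used = 0;
  }

  get bytesUsed(): number {
    return this.used;
  }
}

const arena = new Arena(1024);
const firstNode = arena.alloc(24);
const secondNode = arena.alloc(24);
console.log(firstNode, secondNode, arena.bytesUsed); // 0 24 48

arena.reset();
console.log(arena.bytesUsed); // 0
```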

But we also have to interact with JS code constantly. You cannot just copy data back and forth all the time; that would be inefficient. So we need to integrate with the garbage collector running in the browser in order to work with objects whose lifetime is managed by the JavaScript virtual machine. Some JS objects will need to refer to blocks of linear memory created in WebAssembly modules, and some blocks of linear memory will contain references to JS objects.

If this creates cycles (and it will), the garbage collector is in trouble: it cannot determine what is still in use and what is not. WebAssembly needs tighter integration with the browser's garbage collector to avoid this.

This will also help languages such as Scala.js, Reason, Kotlin, and Elm. They compile to JavaScript, which means they rely on its garbage collector. If WebAssembly can use the same collector, code in these languages can be compiled to WebAssembly without noticing any difference in garbage collection behavior.

Skill: exception handling

We need exception handling. Yes, some languages, such as Rust, do not use exceptions. But many others do. At the moment you can emulate exception handling with code that does not use exceptions, but it is slow. So for now, when developing for WebAssembly, it is better to avoid exceptions from the start.

In addition, JavaScript has exceptions. Even if you do not use them in your own code, some standard function may throw one, and you have to deal with it. If your WebAssembly code calls JS code and that code throws an exception, we cannot handle it properly. Rust code, for example, will simply crash. This needs to change: we need a properly working exception handling mechanism.
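The situation can be reproduced with a tiny hand-assembled module. It imports a function f from a namespace called env and exports run, which does nothing but call f (both names are chosen for this sketch). When the JS implementation of f throws, the exception passes straight through the WebAssembly frame, which has no way to intercept it in the MVP, and surfaces in the outer JS caller.

```typescript
// A module that imports env.f and exports run(), whose body only calls f.
const callerBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,                   // magic + version
  0x01, 0x04, 0x01, 0x60, 0x00, 0x00,                               // type: () -> ()
  0x02, 0x09, 0x01, 0x03, 0x65, 0x6e, 0x76, 0x01, 0x66, 0x00, 0x00, // import "env"."f"
  0x03, 0x02, 0x01, 0x00,                                           // one local function
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01,             // export it as "run"
  0x0a, 0x06, 0x01, 0x04, 0x00, 0x10, 0x00, 0x0b,                   // body: call 0
]);

const callerInstance = new WebAssembly.Instance(new WebAssembly.Module(callerBytes), {
  env: {
    f: () => { throw new Error("thrown by JS"); },
  },
});
const run = callerInstance.exports.run as unknown as () => void;

let caughtMessage = "";
try {
  run(); // JS -> WebAssembly -> JS, and the innermost JS function throws
} catch (error) {
  caughtMessage = error instanceof Error ? error.message : String(error);
}
console.log(caughtMessage); // "thrown by JS"
```

The WebAssembly code in the middle gets no chance to clean up or recover, which is the gap an exception handling feature would close.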

Skill: debugging

Another thing that JS developers are used to is good debugging tools. All modern browsers have tools for convenient analysis of Javascript code. We need the same level of support for WebAssembly.

Skill: tail calls

To support functional languages, we need something called tail calls. I will not go deep into the topic; simplifying somewhat, a tail call is a way, in certain cases, to call a function without allocating a new stack frame for it. Since this feature is crucial for functional languages, we want WebAssembly to support it.
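A small TypeScript illustration of the idea, using made-up function names. The recursive version makes its call in tail position; the loop is what that call can effectively become when the frame is reused. TypeScript is used here only to show the shape of the code: JavaScript engines generally do not reuse the frame, so the recursive version may exhaust the stack for large inputs.

```typescript
// `sumTo` makes its recursive call in tail position: nothing is left to do in
// the current frame after the call returns, so the frame could be reused.
function sumTo(n: number, accumulator: number = 0): number {
  if (n === 0) {
    return accumulator;
  }
  return sumTo(n - 1, accumulator + n); // tail call
}

// What an implementation with tail calls can effectively turn it into:
// the same computation with a single frame, however large n is.
function sumToLoop(n: number): number {
  let accumulator = 0;
  while (n !== 0) {
    accumulator += n;
    n -= 1;
  }
  return accumulator;
}

console.log(sumTo(100));         // 5050
console.log(sumToLoop(100));     // 5050
console.log(sumToLoop(1000000)); // 500000500000
```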

Where are we now?

Somewhere here:

Garbage collection

Garbage collection work is currently under way in two directions: Typed Objects for JS and the garbage collector for WebAssembly itself. Typed Objects will make it possible to describe the exact structure of an object. There is already a vision of how this should work, and it will be discussed at an upcoming TC39 meeting. The GC for WebAssembly will then be able to use that structure for its own purposes. Work on its implementation is already under way.

As soon as both parts are complete, we will have a system in which JS and WebAssembly interact and both sides understand, at every level, what an object consists of and can use its internal data efficiently. We already have a working prototype. The prototype cannot simply be shipped as is, though; standardization and revisions will take some time. We expect it to be released sometime in 2019.

Exception handling

Work on exceptions is at the research and development stage. We are considering various proposals, trying to implement them, and measuring how efficiently they work.

Debugging

For debugging, there is already some support in the Firefox developer tools. But it is still far from ideal. We want to show developers their source code and the current position in it, not just assembly instructions. We have to develop and implement support for symbol files, which will make it possible to map each instruction to a line of source code. Right now we are working on the specification of this mechanism.

Tail calls

Work in progress.

When all of the above is complete, we can consider the “JS frameworks and languages compiled to JS” achievement unlocked. So, that was the plan for unlocking achievements in the browser. What about what is happening outside the browser?

Outside the browser

Perhaps the words “outside the browser” confuse you. Is there anything other than the browser when we talk about the web? After all, “web” is right there in the name “WebAssembly”. But in reality, HTML, CSS, and JavaScript are just the tip of the iceberg.

Yes, they are the most visible part, because they form the user interface. But there is another very important part of the web: links. Links from everything to everything.

I can link to your page right now. I do not need your permission or anyone else's. I just create the link and add it to my site. Anyone can follow it, and your content will be shown and the code you wrote will run. This ease of creating links and following them made our Internet what it is. Now we have social networks and other sites that, in effect, extend the concept of a “link” to connect anything: people, devices, businesses, and so on.

But all these links come with two problems.

First, what should a link lead to? If you go somewhere and the site hands you code that must run in your browser, that code must be cross-platform. It has to compile to something that runs on a Mac, on Windows, on Android, everywhere. Portability of downloadable code is an important part of the concept of web connectivity.

But just downloading and running code is not enough. Keep in mind that we know nothing about this code. We do not trust it enough to give it full control over the user's computer. What if it is malicious? It could do something bad. So we need some kind of security model. We need a sandbox where we can put unfamiliar code and give it some controlled tools to work with, while keeping everything critical and unsafe away from it.

So, there are two aspects to the concept of a “link”: portability and security. We know that we can definitely run the code and that it will definitely not harm us. Why do I insist on these concepts, and how does this view differ from seeing the web as a combination of HTML, CSS, and JavaScript? Because this approach radically changes what we understand WebAssembly to be.

On the one hand, you can think of WebAssembly as "another tool available in a modern browser." And it is.

But portability and security of code execution open other doors for us.

Node.js

How can WebAssembly help Node? By bringing portability.

Node gets a fairly high level of portability from JavaScript. But there are still many cases where the performance of JS code is not enough, or where the necessary JS code simply has not been written yet but a native version exists. In those cases Node uses native modules. They are written in languages like C and must be compiled for the particular platform on which your Node runs.

Native modules can either be compiled during installation or shipped prebuilt for the popular platforms. Both approaches are possible, but it is a choice between two evils: an extra headache either for the user or for the module's author.

If these modules were written in WebAssembly instead, they would not need to be compiled per platform at all. Portability would let them run on any platform, just like JavaScript code, but with the performance of native versions.

And so happiness comes to the Node world, in the form of full portability of everything, everywhere. You can move a Node application from Linux to Windows and everything keeps working without any recompilation. At the same time, however, a WebAssembly module has no access to system resources (it runs in its sandbox). Native (and even non-native) Node modules do not run in a sandbox and have access to everything; that is the Node philosophy. So for a WebAssembly module to get the same capabilities, an additional layer for accessing OS resources is needed. Something like the POSIX functions (not necessarily those; they are given only as an example of a relatively stable and sufficient interface for accessing resources).

Skill: portable interface

So, what do Node developers need in order to use WebAssembly modules? Some kind of interface for accessing these functions. It would be good to standardize it, so that not only Node but anyone at all could call them. If you want to use a WebAssembly module in your application, you plug it in and use it. Something like “POSIX for WebAssembly”. PWSIX (portable WebAssembly system interface)?

Where are we now?

There is a document describing a mechanism for resolving a module name to a path. It will probably be used by both browsers and Node (they will be able to provide different paths). There is no active development yet, but there is a lot of discussion.

Most likely it will be implemented in some form. This is good because it opens up a number of possibilities.

CDN, Serverless, and Edge Computing

Examples include things like CDNs, serverless, and edge computing: cases where you put your code on someone else's server, which takes care of making it available to clients. Why might WebAssembly be needed here? There was recently an excellent talk on this topic. In short, there is a need to run code from different sources (that do not trust each other) within the same process. These pieces of code must be isolated from each other and from the OS. Solutions based on a JS virtual machine (SpiderMonkey or V8) work after a fashion, but do not provide the necessary performance and scalability. WebAssembly does.

What is needed to make it work?

Skill: runtime

A runtime environment is needed, and some companies are creating their own. We already have WebAssembly compilers (such as Cranelift) that are fast and use memory efficiently. But the code they generate cannot live in a vacuum: it has to rely on something and interact with its environment somehow. Right now some companies, such as Fastly, write this runtime themselves. But this is not a great approach: many companies will need it, and they will all do the same work over and over again. We could do it once, add it to the standard, and save everyone a lot of resources.

Where are we now?

Somewhere here:

There is no runtime standard yet. That has not prevented several independent runtimes from appearing and being used in real projects, for example wavm and wasmjit.

We also plan to release a runtime built on top of Cranelift; it will be called wasmtime. And once we have something standardized and working, it opens the way to a number of things, such as...

Portable command-line utilities

WebAssembly can be used not only in the browser but also in traditional operating systems. We will not talk about the kernel (although there are brave souls aiming for that too), but WebAssembly code can certainly run in user mode. This makes it possible to create command-line utilities that, once built, are guaranteed to work the same way under any OS.

Internet of things

The “Internet of Things” usually means low-power devices (such as wearables or the various sensors and controllers in smart homes). The limits on available processor resources and RAM make it hard to run JS code there, but WebAssembly is another matter. Optimizing compilers like Cranelift and a runtime like wasmtime will shine in such conditions, because they were written with resource efficiency in mind. In the most extreme cases, WebAssembly even lets you compile your module into a native binary for the target platform. And again, portability: there are a great many IoT devices today, and they are built on different platforms. With WebAssembly you will not have to worry about that; the code you develop will run everywhere.

Conclusions

Let's step back and look at our “skill tree” again. I started this article by saying that some people do not understand why WebAssembly is not yet finished. As you can now see, its journey has barely begun. Yes, the MVP already opens up some possibilities. We can already compile something to WebAssembly and run it in a browser. But there is still a lot of work ahead: support for everything that heavy applications and high-level languages need, the replacement of JS frameworks, and all the “outside the browser” things I talked about. When all of this is ready, we will see a new web: high-performance, more ambitious, more portable. There will no longer be a kind of software that cannot be written to run in the browser: games, blockchain, the Internet of Things, command-line utilities, everything will run.

WebAssembly is not finished. It has only just begun.
