“You only need to gently generate the LLVM IR.” Egor Bogatov on Mono and .NET Core

Egor Bogatov is a Microsoft developer on the Mono team who works on Mono and its integration with .NET Core. We talked with him about what it is like to work at Xamarin and Microsoft and about his love of game development. We also discussed why an SSD is a developer's best friend and why the usefulness of conference talks does not always match their complexity. As always, the interview is conducted by Oleg Chirukhin (olegchir) from JUG.ru Group.

Introduction: about encrypted demos and how to get into Xamarin

- Tell the Habr readers who you are and what you do.

- I'm a developer. I've been working on the .NET stack for ten years, I've worked a little in Java and written quite a bit for Android.

I have worked in different companies: I started in outsourcing, then moved to product companies such as Viber and Playtika. Then I freelanced a little, including in Java, and went to work at Xamarin.

- How did you get there?

- I have long been fond of .NET and Mono. I liked C#, but I didn't like Microsoft's policy, which tied it tightly to Windows. So I have been following the cross-platform implementation since its inception.

I followed Mono closely, and Xamarin as soon as it appeared: I liked the concept itself. I took part in their contests and took second place several times. I was noticed and offered work as a contractor, and Miguel de Icaza himself wrote to me, which was a complete surprise, because to me he was a legend.

- How did you start?

- Miguel suggested that I write a demo that included a chat with end-to-end encryption for mobile platforms. I had experience with chat applications and was interested in encryption. Initially they took me on for the backend, but I said that I could also develop it for Android. After that I worked on various side projects at Xamarin; I was not yet allowed into the runtime and the components themselves.

Miguel has many interesting projects. Sometimes it seems to me that he is a group of people under one name. One person simply cannot know his way around everything, answer everyone and be aware of everything.

Several times I made demos for him for large conferences such as Xamarin Evolve and MS Build, the biggest developer conference from Microsoft. What was the commercial value of these demos, what were they for? Simply to advertise the technology to potential customers. For example, one of the demos showed how easily you can embed 3D visualization into a regular application on any platform, and this capability interested several serious companies in that field.

About work: tasks and the eternal dispute "remote or office"

- And what are you doing now?

- I was transferred to the runtime team, that is, directly to Mono. My main job is to bring Mono and .NET Core together, that is, to sit somewhere between the two runtimes. This lets me understand .NET better, because I study and examine all these types, from the most basic to the complex ones. In two years I have built up a good base of experience and got to know all the key developers.

- Do you work from home?

- We have a small Microsoft office in Minsk. I occasionally visit there, but mostly I work from home.

- And which is better: to work in the office or at home?

- Working from home takes a lot of self-discipline. From time to time I try to combine it with travel, but that does not work very well. For example, I need a full-fledged, big desktop computer with three monitors. I cannot work comfortably on a laptop.

- Do you need it for the monitors, or is the computer's power important?

- Both for the monitors and for the power. From time to time I need to compile different runtimes (mono, coreclr, corert), spin up virtual machines and so on. For this I need a proper top-end processor, not some misunderstanding throttled by TDP, and, of course, a fast SSD.

- So if you want to work on the Mono code, you need a proper PC?

- Mono includes the source code of .NET and .NET Core as submodules, so in the end there is a huge number of files that have to be moved around quickly, and the most important thing is a fast SSD. You should take anything from a Samsung 960 Pro and up. The bottleneck is always in IO.

- Describe your working day.

- I work remotely from Minsk. Most of my team is in the USA, although there are a few people in Europe, and there are people in Japan, Australia, even Africa. It is that kind of distributed team. We communicate mainly in Slack, and a couple of times a week we hold meetings. From time to time we meet in Boston or Redmond.

The tasks are mostly quite abstract, for example, to port the types from a specific namespace. In parallel I can go to GitHub and fix some bugs. From time to time I do something for .NET Core: I try to optimize or clean something up.

- Where do tasks come from, how is it organized? Any endless backlog?

- Tasks come from users and team leads. Once a month we have a bug-fixing week: for a whole week you only fix bugs and drop everything else.

The rest of the time it is also advisable not to forget about bugs, but you need to stick to your main goals. My goal, for example, is to port the main types from mscorlib and make Mono/Xamarin compliant with .NET Standard 2.1. Porting a type usually means throwing away the old implementation and replacing it with a reference to the code in the .NET Core submodule, with some adaptation.

About Microsoft, OSes and "betrayal"

- Well, yes, the license allows it. And anyway you are all in the same company.

- Yes, it does. We have done this before. Mono was part of some distros; I think Mono was even in Ubuntu and GNOME. Miguel was told that he would get everyone into trouble.

- Yes, I remember, Stallman called him a traitor.

- They were afraid that at any moment Microsoft lawyers might appear and sue everyone, which Microsoft, fortunately, did not do.

- Well, yes, Microsoft did the exact opposite: it started using Linux itself.

- With the new CEO, Microsoft is a completely different company. The focus on cloud technologies has led us into the world of open source and everything that we could not even think of before. You can now download Ubuntu for WSL from the Marketplace with one click, install MS SQL Server on Linux and develop for .NET on macOS.

- So you can safely write code under open licenses and no one will say anything?

- Yes, of course. Naturally, before releasing any internal project as open source, a little bureaucracy is needed, but in general I have not come across any prohibitions on using something.

- Do you have tasks that require three platforms at once?

- I have a whole set: one computer with Windows, a MacBook with macOS and a laptop with Fedora, plus a whole scattering of virtual machines, including WSL. Most often bugs fall into two types: Windows ones and non-Windows ones, which reproduce on both macOS and Linux.

Understanding .NET Core and Mono

- Which directions do you like, and which of them can be developed in .NET Core and in Mono?

- Personally, I like the strong emphasis on performance and cross-platform support. Performance is constantly being improved under real-world load, from Bing to public benchmarks such as TechEmpower, where .NET Core shows itself very well alongside solutions based on Go, Java and C++. Many people still hold the stereotype that .NET is a Windows-only technology with a slow virtual machine, and we are successfully fighting this stereotype.
Our team pays great attention to AOT scenarios and to using LLVM as a backend for generating machine code. LLVM is a very powerful tool with a huge number of optimizations. You only need to gently generate LLVM IR with a minimum number of safe points, so as not to get in the way of those optimizations. Not so long ago I even wrote my own simple LLVM transformation pass.

I am also pleased that C# and .NET are mainstream in game development alongside C++, thanks to Unity and some other engines that offer C# scripting.
There is another potentially interesting direction: compiling C# to WebAssembly for the browser.

- I don't know how it is in .NET, but sometimes you have to drag along a bunch of standard libraries just to compile. In Java you run Hello World and you have 2,000 classes loaded. The browser would have to download a huge number of megabytes. What do you think about this?

- The minimum size of the Mono runtime with the base library is about two megabytes. But even Apple has this problem: applications written in Swift each drag along their own runtime. So far the Mono-wasm technology is still raw. It is based on the runtime AOT-compiled to WASM, plus an interpreter for user code. By the way, the runtime is now being rewritten from C to C++; I hope this will not affect the resulting size.

- Have you tried rewriting Mono in C# instead of C++ or C?

- The idea sounds good, but it would require simply unrealistic resources. Still, we have some progress here. The .NET Core team has brought C# and .NET to such a level that C++ code is being replaced with C#, so as not to bother with cross-platform issues, and without losing performance. A recent example is the move of the parsing and conversion of numeric types to C#, and the whole of Decimal has been rewritten in C#. I am very happy about this, and it greatly simplifies the work of migrating the code.

About Garbage Collector

- I have seen the .NET Core GC, which is used to scare children, because it is one and a half megabytes of C++ source code! One and a half megabytes, Carl! How many books is that?!

- Yes, and according to GitHub 47 contributors have left their mark on that file. I am not an expert on the garbage collector, but in general a GC rests on a fairly common theory along the lines of the mark-and-sweep algorithm, which is complicated by generations and by attempts to avoid full stop-the-world pauses and to do everything in parallel with the main execution thread.

- Do you have plugins or the ability to swap the garbage collector, or is there only one?

- In Mono there are several implementations, and in .NET Core a public API was made not so long ago that lets you take a couple of headers, write your own GC and plug it into any application via a single environment variable. As an example, there is an article on how to write ZeroGC for .NET Core. In the world of containers, where you do not have to clean up the garbage after yourself, it may be relevant. In general, this lets someone take the current implementation and optimize it as much as possible for, say, game development, so that stopping the world and walking all the objects does not hurt FPS, or optimize memory consumption. I think Samsung made a couple of GC modifications for Tizen.
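For readers unfamiliar with the term, the basic mark-and-sweep idea mentioned above can be sketched in a few lines. This is only a toy illustration in Python, not how the .NET Core or Mono collectors are implemented: the `Obj` class, the `heap` list and the `collect` function are hypothetical names, and generations, concurrency and stop-the-world avoidance are left out entirely.

```python
class Obj:
    """A toy heap object that may reference other heap objects."""

    def __init__(self, name):
        self.name = name
        self.refs = []       # outgoing references to other Obj instances
        self.marked = False  # set during the mark phase


def mark(roots):
    """Mark phase: flag every object reachable from the roots."""
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if not obj.marked:
            obj.marked = True
            stack.extend(obj.refs)


def sweep(heap):
    """Sweep phase: keep marked objects, drop the rest, reset the marks."""
    survivors = [obj for obj in heap if obj.marked]
    for obj in survivors:
        obj.marked = False
    return survivors


def collect(heap, roots):
    """One full collection: mark from the roots, then sweep the heap."""
    mark(roots)
    return sweep(heap)


# a -> b is reachable from the root; c <-> d is an unreachable cycle.
a, b, c, d = Obj("a"), Obj("b"), Obj("c"), Obj("d")
a.refs.append(b)
c.refs.append(d)
d.refs.append(c)

heap = [a, b, c, d]
live = collect(heap, roots=[a])
print([obj.name for obj in live])
```

Note that the unreachable cycle `c <-> d` is collected, which is the main reason tracing collectors are preferred over plain reference counting.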

- The fact that Microsoft let go of total control over everything is good, because GC and JIT are a very good control tool.

- Yes. Look at the .NET Foundation: it is not just Microsoft anymore. Google, Red Hat, Samsung and Intel are there, in general all the companies that, it once seemed, would never stand next to Microsoft. Only Apple is missing.

About IDE support

- About the IDE: how good is the tooling, both in the compiler itself and in IDE support? There are now all sorts of languages like Swift, where the compiler gives very few opportunities to integrate with its internal structures, caches and so on. That is endless pain, because when you build your own tooling you have to reinvent the whole world. How good is this with Mono? Do you have your own IDE?

- Roslyn, the C# compiler, was written from the start not only as a compiler from C# to IL, but also as an IDE backend and analyzer; it can even digest broken code. Based on its output you can simply render some views, and it will tell you directly: “show a menu here”, “offer this refactoring to the user”, “here is a preview of the changes” and so on. That is, this compiler lets you build your own IDE quickly.

In fact, you simply implement a set of interfaces for your GUI, and you already have an IDE that supports a large set of refactorings and the like.

In general, many modern languages let you obtain an AST, an abstract syntax tree of the code. For example, Clang makes it possible to get the abstract tree of C++ code; by the way, we use this feature to generate C# bindings to C++ and Objective-C code.
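To make "getting an AST" concrete, here is a minimal sketch using Python's standard `ast` module. Python is used purely as an illustration and stands in for neither Roslyn nor Clang; the sample expression and the variable names are made up for the example.

```python
import ast

# A made-up line of source code to analyze.
source = "total = price * quantity + tax"

# Parse the text into an abstract syntax tree.
tree = ast.parse(source)

# Walk every node and collect the identifiers and binary operators,
# the kind of information an IDE or binding generator works from.
names = sorted(
    {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
)
operators = sorted(
    type(node.op).__name__
    for node in ast.walk(tree)
    if isinstance(node, ast.BinOp)
)

print(names)
print(operators)
```

A tool built on such a tree never has to parse text itself, which is the same property that makes Roslyn convenient as an IDE backend.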

- Have you tried using Visual Studio Code for something?

- Well, I would say that this is my main tool.

- Let's say someone wants to download the Mono repository and build it. What do they need for this?

- On Windows, you just open the two solutions, one of which is the runtime, and build both. Thanks to efficient parallelization, msbuild should manage it in about 5 minutes. On macOS and Linux the usual approach with Makefiles is used.

About preparing talks and a few spoilers

- You are coming to DotNext with a talk; what will it be about?

- My talk will consist of a set of interesting examples of micro-optimizations applied in .NET Core by its developers and third-party contributors, which I think can be useful to application programmers. I will also cover unsuccessful attempts at optimization, for example when contributors want to optimize one particular case but it backfires as a regression in others. Separately, there will be a dozen slides on the newly created SIMD API.

The guys from Intel, together with the Microsoft guys, have introduced a low-level SIMD API in C#, which lets you write super-fast algorithms without relying on the compiler, which many people believe can optimize and vectorize everything on its own.

- In general, it is theoretically impossible.

- Yes, there is no getting away from inserting intrinsics by hand. I doubt that in any language you can describe matrix multiplication or transposition on simple types and expect the compiler to emit the most efficient SSE/AVX instructions. By the way, I have already applied these C# intrinsics inside .NET Core to optimize System.Numerics.Matrix with SSE and the GetHexDigits function with Lzcnt. They can serve as examples of using the API in your own projects.
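The idea behind counting hexadecimal digits with a leading-zero-count instruction can be sketched as follows. This is not the .NET Core code: Python has no Lzcnt intrinsic, so `int.bit_length()` stands in for it, and the function names and the 64-bit width are assumptions made for the illustration.

```python
def leading_zero_count(value, width=64):
    """Leading zero bits of an unsigned `width`-bit value.

    Stand-in for a hardware LZCNT instruction (an assumption of this sketch).
    """
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit into the given width")
    return width - value.bit_length()


def count_hex_digits(value, width=64):
    """Number of hex digits needed to print `value`, without any loop."""
    # `value | 1` makes zero behave like one, so "0" still counts as one digit.
    significant_bits = width - leading_zero_count(value | 1, width)
    # Each hex digit covers four bits; round up to a whole digit.
    return (significant_bits + 3) >> 2


for sample in (0, 0xF, 0x10, 0xDEADBEEF, 2**64 - 1):
    print(hex(sample), count_hex_digits(sample))
```

The branch-free arithmetic replaces a loop that divides by 16 until the value reaches zero, which is where the speedup from a single hardware instruction would come from.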

- When core developers show up, others come along who would also like to take part. Is there a path for newcomers?

- Any first-time contributor gets a lot of attention and help. Many simple tasks or bugs that do not require extensive knowledge and are not high priority are marked on GitHub with a special label, “up-for-grabs” or “good first issue”.

You can go to the repository, find issues with these labels and pick one that appeals to you. For example, there are quite a few issues about covering some pieces of code with tests. Increasing test coverage is rightly considered the ideal first task. Another good way is to measure something, compare it with other runtimes and try to figure out why a particular piece of code, for example string.GetHashCode, runs slower than in .NET 4.x. On benchmarking there are plenty of talks and blog posts by Andrey Akinshin and Adam Sitnik about a very convenient tool, BenchmarkDotNet, which will show you how fast the code runs, compare it with other runtimes, report memory usage and show the assembly code with a flick of the wrist.

That is, the minimum set of actions is to look through all the pull requests and issues, follow people like Matt Warren and Ben Adams on Twitter, join the corefx and coreclr channels on Gitter and read the BenchmarkDotNet documentation.

- Yes. I have just filtered by the up-for-grabs label, and there are about 600 issues here, some without any comments at all, and you can take them.

- Yes, exactly. Quite recently the .NET Core team ran a hackathon. We picked a couple of dozen issues, and participants had one day to fix them and get a prize for it.

- That's great. You have told me a lot of interesting things, and now I want to try to close some issue myself. True, I don't know C#, that's the problem.
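The "measure first, then compare" workflow described above looks roughly like this in miniature. The sketch uses Python's standard `timeit` module only as an analogy and does not show the BenchmarkDotNet API; the two functions being compared are made up for the example.

```python
import timeit


def concat_with_plus(parts):
    """Build a string by repeated concatenation."""
    result = ""
    for part in parts:
        result += part
    return result


def concat_with_join(parts):
    """Build the same string with a single join call."""
    return "".join(parts)


parts = [str(i) for i in range(1000)]

# Run each candidate several times and keep the best time,
# which reduces noise from other processes on the machine.
timings = {
    fn.__name__: min(timeit.repeat(lambda: fn(parts), number=200, repeat=5))
    for fn in (concat_with_plus, concat_with_join)
}

for name, seconds in timings.items():
    print(f"{name}: {seconds:.6f} s for 200 runs")
```

The absolute numbers depend on the machine, so only the comparison between the two candidates is meaningful, and both must produce the same result for that comparison to be fair.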

- C#, I would like to believe, is a fairly predictable language despite having more syntactic sugar. With experience in Java or C++, I think you can fairly quickly start even optimizing something in the runtime; experience in other languages will even help you see things from a different angle.

- I am looking at the .NET Core repository right now and it looks very decent. And people really do talk in the comments, there are proper discussions going on.

- Yes, it is quite active. There are threads with both 100 and 200 comments. And you can start with the base class library; there are quite a lot of interesting tasks there that anyone can take.

- Thank you very much for the answers! See you at DotNext.

This time the minute of advertising will be unusual, because while we were preparing the interview the conference tickets sold out. Want to see the talks but did not manage to buy a ticket? The live webcast is still available on the site.

If you have questions or an irresistible desire to attend DotNext 2018 Moscow in person, write to us at tickets@dotnext.ru (maybe someone will return a ticket and we can help you).
