Interview with C# legend Eric Lippert

Original author: DotNetCurry Magazine
Material taken from DotNetCurry magazine, which is devoted to technologies based on the .NET platform.

Dear readers, we are very pleased to have Eric Lippert in this issue of DNC. Eric needs no introduction to people familiar with C#; for everyone else, he is known for his work on the C# compiler development team. He spent a significant part of his career at Microsoft in various positions. Before joining Microsoft, Eric worked for Watcom. The “old-timers” among us remember Watcom as a company that made very good C++ and Fortran compilers. Eric currently works at Coverity, helping to build static code analysis products.

DNC: Hello Eric, we are very glad to see you here with us.

EL: Thanks. I'm glad to be here.

DNC: You worked at Microsoft for 16 long years. Describe your journey (if you can call it that) from intern, to working on VBScript, JScript and VSTO (Visual Studio Tools for Office), to becoming a principal developer on the C# compiler team.

EL: I grew up in Waterloo and was interested in science and mathematics from an early age, so it was natural for me to go to the University of Waterloo. In addition, I had relatives on staff there, I already knew a number of the professors, and, as you said, I had worked as a student for Watcom, a UW spin-off company. UW has an excellent co-op program, one of the largest in the world, through which I managed to get three internships at Microsoft on the Visual Basic team. They gladly extended me a job offer when I finished my internships, and I stayed in the developer tools division for my whole career at Microsoft.

DNC: Since you started at Microsoft as an intern, it is fair to assume that in those early years you received a lot of good advice from senior engineers. What is the best programming advice you have ever received?

EL: I received a lot of good advice from senior engineers throughout my career, not only at the beginning; Microsoft encourages both formal and informal mentoring. I recently talked elsewhere about the best career advice I got at Microsoft: essentially, to become a domain expert by answering as many user questions as I can. But saying which piece of programming advice was the best is not so simple. I learned so much at Microsoft from world-class experts in programming language design, performance analysis and many other fields that it is hard for me to name just one thing.

Here is one thing I remember from before I got to Microsoft. One day, many years ago, Brian Kernighan gave a talk on programming at UW. One slide showed some code that had something wrong with it. You could tell it was wrong because the comment and the code did not match each other. Kernighan asked: which one is actually right, the code or the comment? I still ask myself this rhetorical question when I am trying to understand code that contains a bug. Often the comments are misleading because they are out of date or were poorly written in the first place, and just as often the comments are correct and you do not even think to dig into the code that contains the bug. Kernighan's talk completely changed my attitude toward commenting code. Ever since, I have been trying to write comments,

DNC: When your team started developing C#, what were your main goals? Are you happy with what the C# language has become?

EL: To be honest, I started working on C# when the core concepts of C# 3.0 were already well under way. I have been following C# since its inception, over 10 years ago, but I was not part of the C# 1.0 or C# 2.0 teams.

When we talk about goals, I try to distinguish between “business” goals and “technical” goals; they are closely related but different. From a business perspective, the main goal of C# was and is to create a rich language that lets you get all the benefits of the .NET platform and, more broadly, to advance the state of the art for the Windows ecosystem as a whole. Better tools make developers more productive, more productive developers create better applications for their users, better applications make the platform more attractive, and everyone wins.

In terms of language design, there are a number of basic principles that the designers come back to again and again. C# is meant to be a modern, practical, general-purpose language used by professional programmers who build software.

C# 1.0 started out as a fairly simple, modern programming language. It was clearly influenced by C and C++; the language design team sought to mitigate some of the shortcomings of C/C++ while still providing access to unsafe code. Over the following 10 years the language grew, adding features such as generic types, sequence generators, functional closures, query expressions, interoperability with dynamic languages and, more recently, greatly improved support for writing asynchronous code. I am excited about how the language has evolved over the past 12 years, and it has been an honor to be part of some of the most exciting changes. I look forward to continuing to work in the C# ecosystem.

DNC: As a member of the C# language team, how much interaction did you have with the Windows OS team? To what extent is the operating system taken into account when designing particular language features? Do you work largely in isolation, or is it a more collaborative effort?

EL: It has varied over time. In the distant past we usually worked in isolation; during the development of C# 5.0, the Windows team was heavily involved. Personally, I had little contact with the Windows team while working on C# 5.0, but the C# program management team was in almost constant contact with their counterparts on the Windows team throughout the Windows RT effort. Several technical issues required extra care to ensure that there was as little mismatch as possible between C# developers and the Windows RT programming model. In particular, it was important that “async / await” met the needs of Windows RT developers using C#.

However, that is a relatively recent development. Historically, C# has not interacted directly with the Windows team, because it is based on the managed CLR environment and uses the BCL class library to access operating system functionality. Since the C# team knew that the CLR and BCL teams would act as intermediaries to the operating system's services, it could concentrate on designing a language that takes full advantage of the CLR and BCL, and let those teams deal with the operating system.

DNC: We heard your podcast with the StackExchange team, in which you mentioned the things at the top of your list for “if I had a genie to fix C#...”. It was about unsafe array covariance. Could you tell our readers about this?

EL: Of course. First, let's define what the term covariance means. A proper definition would require a course in category theory, but we do not have to go that far to get the gist. The idea of covariance, as the name implies, is that a relationship is preserved, or “varies in the same direction”, when some transformation is applied to the things it relates.

In C# there is the following rule: if T and U are reference types, and T is convertible to U by a reference conversion, then T[] is convertible to U[] by a reference conversion as well. This rule is said to be covariant because the statement “T is convertible to U” can be transformed into the statement “T[] is convertible to U[]” and the statement remains true. This does not hold for every transformation; for example, you cannot conclude that List<T> is convertible to List<U> just because T is convertible to U.

Unfortunately, array covariance weakens the type safety of the language. A language is type-safe when the compiler catches type errors, such as assigning an integer to a variable of type string, so that the program does not compile until all type mismatches are fixed. Array covariance is an example of a situation where a type mismatch cannot be caught at compile time and can only be checked at run time.

static void M(Animal[] animals) {
    animals[0] = new Turtle();
}
static void N(Giraffe[] giraffes) {
    M(giraffes);
}

Because array conversions are covariant, an array of giraffes can be converted to an array of animals. And since a turtle is an animal, we can put it into an array of animals. But this array is actually an array of giraffes, so the type mismatch causes an exception to be thrown at run time.

Array covariance has two negative consequences. First, assigning a value to a variable is something that should always be checked at compile time, and in this case that is not possible. Second, every time you assign a non-null reference to an element of an array whose element type is an unsealed reference type, the runtime has to check the actual element type of the array against the type of the reference being stored. This check takes time! To make array covariance work, every correct program has to run more slowly on each such write to an array element.
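A self-contained version of the example may make the run-time check easier to see. This is only a sketch: the empty Animal, Giraffe and Turtle classes and the ArrayCovarianceDemo wrapper are assumptions added for illustration, not code from the interview.

```csharp
using System;

// Hypothetical class hierarchy, assumed for illustration.
class Animal { }
class Giraffe : Animal { }
class Turtle : Animal { }

static class ArrayCovarianceDemo
{
    // Compiles because a Turtle converts to Animal; the store itself
    // is checked by the runtime against the array's real element type.
    static void M(Animal[] animals)
    {
        animals[0] = new Turtle();
    }

    // Returns true if the runtime rejected the store.
    public static bool StoreIsRejected()
    {
        Giraffe[] giraffes = { new Giraffe() };
        try
        {
            M(giraffes); // legal: Giraffe[] converts to Animal[] by array covariance
            return false;
        }
        catch (ArrayTypeMismatchException)
        {
            return true; // the array really holds giraffes, so the turtle is refused
        }
    }
}
```

Calling ArrayCovarianceDemo.StoreIsRejected() from any entry point exercises the check; nothing in the code is flagged at compile time.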


The C# team added type-safe covariance in C# 4.0. If you are interested in how this was done, I wrote a long series of articles about the design of this feature.

If I had a genie that could fix any code, I would remove unsafe array covariance entirely and fix all the code that relies on it to use the type-safe covariance added in C# 4.0.
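As a rough illustration of the type-safe alternative, the sketch below uses IEnumerable<T>, whose type parameter has been declared covariant (out) since C# 4.0. The Animal and Giraffe classes and the SafeCovarianceDemo wrapper are hypothetical names introduced here.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical class hierarchy, assumed for illustration.
class Animal { }
class Giraffe : Animal { }

static class SafeCovarianceDemo
{
    public static int CountAnimals(IEnumerable<Animal> animals)
    {
        return animals.Count();
    }

    public static int CountGiraffes()
    {
        List<Giraffe> giraffes = new List<Giraffe> { new Giraffe(), new Giraffe() };

        // Allowed since C# 4.0: IEnumerable<out T> is covariant in T.
        // The interface is read-only, so no other kind of animal can be
        // stored through this reference and no run-time check is needed.
        IEnumerable<Animal> animals = giraffes;

        // List<Animal> wrong = giraffes; // does not compile: List<T> is not covariant

        return CountAnimals(animals);
    }
}
```

The design point is that covariance is only permitted on interfaces and delegates where T appears in output positions, so the compiler can verify safety up front.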

DNC: Before we get to your current job, tell us a little about what static code analysis is.

EL: Static analysis is the analysis of a program based only on its source code. It differs from dynamic analysis, which analyzes the program as it runs. Compilers perform static analysis, while profilers perform dynamic analysis. Compilers use static analysis for three things: first, to determine whether the program is a legal program and, if not, to produce appropriate error messages. Second, to translate a legal program into some other language, usually bytecode or machine code, though it could also be another high-level language. And third, to identify constructs that are legal but questionable, and to issue appropriate warnings.

At Coverity we mostly deal with the third kind of static analysis. We assume that the code is already legal; the compiler will check that. We then perform a much deeper analysis to identify questionable constructs and bring them to your attention. The sooner you find a bug, the cheaper it is to fix.


There are other things you can do with static analysis. For example, Coverity also makes a product that uses static analysis to find code changes that have no matching unit tests.

DNC: How does your deep C# knowledge benefit Coverity?

EL: Basically in two ways. First of all, C# is a huge language; its specification runs to about 800 pages. Ordinary developers, of course, do not need to know the entire language in detail to use it effectively, but compiler writers certainly do. I spent about 15 thousand hours carefully studying the design and implementation of the C# compiler, and I learned from language designers such as Anders Hejlsberg, Neal Gafter and Erik Meijer, so I have a pretty solid understanding of what a good C# static analyzer should do. Second, I have seen thousands of lines of buggy C# code. I know what kinds of mistakes C# developers make, which helps us identify the places where static analysis deserves special effort.

DNC: Since you started working at Coverity, have you run into a situation where you thought, “hmm, this would help make C# more statically typed (provable)”?

EL: Some language features make static analysis harder while at the same time making the language more powerful. For example, virtual methods complicate static analysis, because the whole point of virtual methods is that the method actually called is chosen at run time, based on the run-time type of the object.
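A small sketch of why this is hard for an analyzer: in the hypothetical code below (Shape, Circle and VirtualDispatchDemo are names assumed for illustration), the call site in Report looks the same regardless of which method body ends up running.

```csharp
// Hypothetical types, assumed for illustration.
class Shape
{
    public virtual string Describe() { return "shape"; }
}

class Circle : Shape
{
    public override string Describe() { return "circle"; }
}

static class VirtualDispatchDemo
{
    // Statically, all that is known here is that 'shape' is some Shape.
    // Which Describe() body runs depends on the object's run-time type.
    public static string Report(Shape shape)
    {
        return shape.Describe();
    }
}
```

Report(new Shape()) and Report(new Circle()) go through the same line of code but execute different methods, so an analyzer has to reason about every possible override.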

As I said earlier, C# was designed with the shortcomings of C/C++ in mind. The C# 1.0 designers did a good job: it is quite hard to cause a buffer overflow, create a memory leak, use a variable before it is initialized, or accidentally use the same name for two completely different things. But what was really instructive for me after moving to Coverity was that most of the defect patterns that Coverity checks for in C/C++ apply equally well to modern languages like Java and C#.

DNC: We heard you developed one of the first fan pages of The Lord of the Rings. Tell us more about how this happened, as well as about your interests in books and movies.

EL: My father read me The Hobbit when I was very young, and from then on I was interested in Tolkien as a writer; collecting his biographies was my hobby as a teenager. When I was studying mathematics at UW in the early 1990s, the World Wide Web (WWW) was something new. One day I was surfing the Internet and found a fan page for the original Star Trek series. I thought it would be a great idea to create something similar for Tolkien, so I searched the entire Internet, which did not take long in 1993, and found all the FTP sites, newsgroups and so on about Tolkien. I created a simple web page that was just a collection of links and put it on the Computer Science Club's GOPHER server. As the web grew, more and more people sent me the addresses of their pages. I kept adding links until there were too many of them. I stopped maintaining the page and, eventually, my membership lapsed. I think you can still find it in the Internet archive.

The thing is that around that time companies such as Open Text and Google began indexing the Internet, and their search algorithms took into account things like how long a page had existed, how often it had changed over time, and how many inbound links it had. Even after I stopped actively maintaining it, the page continued to rank highly. As a result, for many years my name was the first result for a search on Tolkien. When the film came out, a lot of people did exactly that search. I ended up giving interviews to several newspapers, received an e-mail from one of Tolkien's grandchildren, and one day the quiz show Jeopardy! called me to ask, “Who are the Ents?” I had a lot of fun.

However, these days I read very little science fiction and fantasy. Most of my leisure reading is non-fiction.

I love watching movies and inviting friends over to watch them, and our movie-night themes vary widely. One month we watch Oscar-nominated films, another month horror films.

// One way to tell that a language is getting really big is when users request a new feature and you already have it.

Here are three small features that not many people know about.

  • You can add the prefix “global::” to a name to force the name resolution algorithm to start its search from the global namespace. This is useful when a name in the global namespace conflicts with a local namespace or type. For example, if your code is in the poorly named namespace “Foo.Bar.System”, then referring to “System.String” inside “Foo.Bar.System” results in an error, because “System” binds to the local namespace. If you write “global::System.String” instead, String is looked up in the global namespace System.
  • C# prohibits “falling through” from one section of a switch statement to the next. What many people do not know is that you do not have to end every section with break. You can force control to pass from one section to another by ending the section with a goto to a case label. You can also end a switch section with goto, return, throw, yield break, continue, or even an infinite loop.
  • And one last neat feature that few people know about: you can chain null-coalescing operators to get the first non-null value in a sequence of expressions. If you have variables x, y and z of type int?, then the result of the expression x ?? y ?? z ?? -1 is the first of x, y or z that is not null, or -1 if they are all null.
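The three features above can be sketched in one small file. The namespace Foo.Bar.System comes from the example in the text; the HiddenFeaturesDemo class and its method names are hypothetical, chosen only for illustration.

```csharp
namespace Foo.Bar.System
{
    static class HiddenFeaturesDemo
    {
        // 1. Inside Foo.Bar.System the simple name "System" binds to this
        //    namespace, so "System.String" would not compile here.
        //    "global::" restarts the lookup at the global namespace.
        public static string Shout(string text)
        {
            return global::System.String.Concat(text, "!");
        }

        // 2. A switch section does not have to end with break.
        public static string Classify(int n)
        {
            string result = "";
            switch (n)
            {
                case 0:
                    result += "zero,";
                    goto case 1;     // explicit transfer into the next section
                case 1:
                    result += "small";
                    break;
                default:
                    return "large";  // return also ends a section
            }
            return result;
        }

        // 3. Chained null-coalescing: the first non-null operand wins.
        public static int FirstOrMinusOne(int? x, int? y, int? z)
        {
            return x ?? y ?? z ?? -1;
        }
    }
}
```

With these definitions, Classify(0) passes through both sections, and FirstOrMinusOne(null, 7, 3) skips the null and stops at the first value it finds.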

I often talk about unusual features of the language on my blog, so visit it if you want more examples.
