“It takes 5-10 years of experience to become a good systems engineer” - interview with Alexei Shipilev from the Java Performance Team

    In anticipation of the Java conference Joker 2015, which begins tomorrow, I am publishing a long interview with Alexei Shipilev, an engineer on Oracle's Java Performance Team, one of the coolest and best-known performance specialists in the world. And, of course, a great speaker.

    We talked with Alexei in detail:
    • about the upcoming changes to the String class;
    • about who is actually developing OpenSource;
    • about system developers and their careers;
    • about technology exchange, “scientific” and “product” development;
    • about the complexity of low-level tasks;
    • about the development of the Java community and the benchmark wars;
    • about mutable vs immutable;
    • about Unsafe;
    • about JMH, benchmarks and narrow specialization.


    Here is a video of our conversation. It runs over an hour, so you can listen to it on the road.



    Below the cut is a transcript of our conversation, for those who do not really enjoy watching video.

    About changes to String


    - Alexei, you used to talk a lot about performance, and lately you have been working a lot on the String class. Tell me, please, what is the reason for that?

    - I still talk about String from the performance point of view, because I participate in projects aimed at optimizing strings. There are many small optimizations in this area every day, but we are making two major changes.

    The first is Compact Strings. The characters of most strings in Java applications fit almost entirely into ASCII. This means that for each character in a String you can spend 1 byte rather than the 2 that char requires by specification. Today the storage inside String is a char array. It is highly optimized and people expect high performance from it. Therefore, to try to compress these strings, you need two representations of String: one is the usual char array, and the other is a byte array in which each byte corresponds to one character. And that requires a lot of performance work to ensure, as we wrote in the release criteria, “non-regression”: so that users who move from Java 8 to Java 9 not only feel no pain, but actually get happiness and a performance boost from this feature.
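
    To make the idea concrete, here is a minimal sketch of a string-like class keeping two representations behind one API. This is purely illustrative and not the actual JDK code; the real implementation lives inside java.lang.String and is far more involved.

```java
// A minimal sketch of the Compact Strings idea; NOT the actual JDK implementation.
final class CompactString {
    private static final byte LATIN1 = 0; // one byte per character
    private static final byte UTF16  = 1; // two bytes per character

    private final byte[] value;
    private final byte coder;

    CompactString(String s) {
        char[] chars = s.toCharArray();
        boolean compressible = true;
        for (char c : chars) {
            if (c > 0xFF) { compressible = false; break; }
        }
        if (compressible) {
            value = new byte[chars.length];
            for (int i = 0; i < chars.length; i++) value[i] = (byte) chars[i];
            coder = LATIN1;
        } else {
            value = new byte[chars.length * 2];
            for (int i = 0; i < chars.length; i++) {
                value[2 * i]     = (byte) chars[i];
                value[2 * i + 1] = (byte) (chars[i] >> 8);
            }
            coder = UTF16;
        }
    }

    int length() {
        return coder == LATIN1 ? value.length : value.length / 2;
    }

    char charAt(int i) {
        return coder == LATIN1
                ? (char) (value[i] & 0xFF)
                : (char) ((value[2 * i] & 0xFF) | ((value[2 * i + 1] & 0xFF) << 8));
    }

    public static void main(String[] args) {
        CompactString ascii = new CompactString("hello");  // stored as 5 bytes
        CompactString mixed = new CompactString("привет"); // stays at 2 bytes per char
        System.out.println(ascii.charAt(1) + " / " + mixed.charAt(0));
        System.out.println(ascii.length() + " chars, " + mixed.length() + " chars");
    }
}
```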

    And there are many different moving parts there: how the library works, how and what is done with string concatenation, how the runtime handles all of this. In general, there are many small details.

    - Isn't that a bit scary? The String class runs through the entire JDK.

    - It is scary! In fact, you can even see that under the original proposal from the guys on the Class Library Team there are some funny comments. One is from Martin Buchholz of Google, who is known as one of the JSR-166 maintainers. Martin said something like “this is a difficult task, good luck!” The second comment is from me, something in the spirit of: “Well, guys, I do not believe this can be done, because who knows... String is quite a class. It is really dangerous to touch it.”

    We have now spent six months carefully prototyping this change, making accurate measurements, developing an understanding of how all this affects performance, choosing the right code generation strategy... And now I am pretty happy, because I understand that this change, which seemed dangerous, became clear after all this work. Knowing all the pros and cons helps, and with that everything is fine.

    - And what will happen now to those who, to speed up their code, brazenly reached into the char array that lies inside the string?

    - A surprise awaits them. The specification nowhere guarantees that a char array lives inside the string. I think that people who dig into the internals to win some performance sign up from the start to track all such changes.
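
    For illustration, here is the sort of hack that breaks; a hypothetical snippet, not something from the interview. On JDK 8 the private value field is a char[]; with Compact Strings it becomes a byte[], so the cast fails (and newer JDKs additionally restrict such reflective access).

```java
import java.lang.reflect.Field;

// Reflectively reading String's private "value" field: works on JDK 8,
// breaks on JDK 9+ once the field becomes a byte[] (ClassCastException),
// and may be blocked outright by later JDKs.
public class StringGuts {
    public static void main(String[] args) throws Exception {
        Field valueField = String.class.getDeclaredField("value");
        valueField.setAccessible(true);
        char[] chars = (char[]) valueField.get("hello"); // breaks on JDK 9+
        System.out.println(chars.length);
    }
}
```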

    - And what is “no performance regression”? Is there a set of tests, are there any criteria for evaluating it?

    - There is a formal attitude to this and an informal one. The informal one is that we have a large community that is used to a new major release running no slower than the previous major release, because otherwise problems arise with migrating to the new version. Therefore, when you develop a new feature, you remember that you cannot simply give performance away; you have to win it back somewhere.

    And there are formal criteria that say how many percent of performance you are allowed to lose on which workloads. But that is the internal business of the development team, not of OpenJDK. The attitude is rather this: we understand that the community wants Java not to regress, and we also want Java not to regress. Our development rests on this unspoken assumption.

    - Do you communicate in any way with large vendors, with makers of enterprise software whose products are strings upon strings, driven by strings?

    - It's pretty simple. We work in a large corporation and we have plenty of internal customers, and they are quite enough. It only seems that open source hangs in the air and develops by itself, by enthusiast hackers...

    Usually the opposite is true. There are large players who finance the development for some of their internal reasons. Why did Oracle buy Java from Sun and start investing in its development?

    The fact is that when you invest in a platform, you give a big advantage to your own development organization. If you have a product with a performance problem, and you know it is localized in the runtime, you can come to your developers and say: “Guys, here's a bug, go fix it quickly.” And the developers understand that this is what they are paid for, and they go and fix it.

    About system developers and their career


    - You are a super-famous person. Surely a bunch of headhunters write to you on LinkedIn every day and try to poach you. Why are you still at Oracle?

    - You know, I have been trying to answer this question for myself for a long time, and I constantly answer it for others. I read in a blog or in some book that if you look at how the product layers in our industry are arranged (hardware, operating systems, libraries, applications, and so on), and then look at the population of developers in all these layers, you get an inverted pyramid.
    At the bottom are the people who work on the low-level foundation, and there are very few of them. For objective reasons: the entry threshold is quite high, and you need a lot of experience and a good education to do anything reasonable there.

    - In other words, to improve the Intel processor, you need ...

    - You need to learn a lot, a whole lot. Much more than, say, to write an Android application. To become a good systems engineer, you need 5-10 years of industrial experience. And once you have gained that experience, it feels like your place in this ecosystem is exactly at this low level. You can, of course, leave this level and go write some enterprise applications, but that means letting down all the people next to you down here who are digging into all of this. There is a significant staff shortage at the bottom. There is a huge amount of work. I try to assess more or less soberly the ratio between the work we can do and the work that is needed, and that ratio is one to 10, or even to 100.

    - That is, there are really not enough people?

    - Yes, and this is not a feature of Java, it is a feature of the system level as such. There are few people and a lot of work, so there are good, interesting puzzles. This makes it possible to choose which tasks you want to do. Out of the 100 tasks in front of you, you can choose the ones that give:
    • a) community profit;
    • b) the profit of your company;
    • c) profit for you personally.


    - In general, this is a very interesting question, the question of money. Typically, the fewer the people, the more expensive they are. I have a feeling that in our industry this is somehow arranged differently. Is that so?

    - No, systems engineers are simply expensive. Good systems engineers are very expensive. And most importantly, the supply is substantially smaller than the demand. And there is significantly less competition. There is no such thing as me sitting in my garage hacking on my startup, praying that someone in the next garage doesn't come up with exactly the same startup idea and beat me to market...

    In my area, all the people who are doing something similar to what I do could be listed in a single email. And there is no particular competition. That is why industry get-togethers like JVMLS are a fun sight.

    When you read a programming forum, there are endless disputes: “which language is better?”, “which platform is better?” and so on. And when you are at JVMLS, where the people who really work at the “lower level” sit next to each other, there is agreement and mutual understanding. Because everyone is in the same boat, everyone is wildly overloaded, everyone has similar problems...

    It is a great psychotherapy session. There is no chest-thumping, no claims that “we did it better and you all suck.” There is natural understanding and fraternity, modulo some personal likes and dislikes.

    - And where can a modern student or recent graduate learn systems programming? I have a feeling that modern universities teach from the textbooks and patterns of the '80s.

    - And that is not such a bad thing, because fundamental science remains fundamental, and it does not depend on market conditions. There are classic books, textbooks that have become classics in university programs. Maybe they are missing from post-Soviet universities; then you can simply take the program of some Stanford or MIT and see what they teach and from which textbooks.

    - For example, Computer Science 101?

    - Computer Science 101 is an ordinary introductory course. But there are specialized courses and textbooks for them. If there is no way to watch a course online, you can simply find the textbook, which is usually a classic work, and learn from it directly. In Russia I know quite a few schools that do this. In Novosibirsk, for example, there was at one time a compiler school that went back to academician Ershov.

    - Is today's Excelsior related to that?

    - Excelsior, I think, was born there precisely because there were people with sufficient expertise who could get together and build that kind of product...

    And if after the textbooks you want practice, then most companies engaged in this kind of programming have open positions for interns and juniors. In practice you can feel the difference between what is written in the textbook and what happens in reality.

    - Did it happen roughly the same way for you?

    - Yes. As a student I got an internship on the Intel team that worked on Java runtimes. On the one hand, I studied at the university and read the books of Muchnik, Hennessy and Patterson, Tanenbaum and others; on the other hand, at work I had a real product into which I could try to carry an idea from a textbook. Or simply read the project code and realize that this is exactly what you read about in the textbook two months ago. That is theory and practice.

    This is similar to what is called the “PhysTech system”: in the first three years students are given the fundamentals (mathematics, physics, computer science), and then they are sent to so-called base departments at research institutes and industrial sites, where, under the wing of practitioner supervisors, they do real scientific work.

    In our industry the effective tool is about the same: the student needs to get a foundation at the university and then join some organization where they start applying their knowledge in practice. Yandex, ABBYY, Sberteh and other companies, as I understand it, are moving in exactly this direction. They want to take students and train them.

    - The problem is well known. The shortage in the industry is growing faster than academic institutions can supply personnel. Industry growth is around 15% per year, and that is a lot.

    - I would nevertheless separate the applied software industry from the systems software industry. The size of the applied software industry inflates or contracts depending on what is happening in the market: investors' willingness to put money into specific areas or startups.

    And as far as I can see, there is always a demand for system-level programming, because this is the foundation on which everything works. And there is no big fluctuation in demand, because people who make platforms are always needed.



    About technology exchange, “scientific” and “product” development


    - Does Java actively “steal” technologies, features and ideas from other languages?

    - I would not call it “stealing”, because in technologies such as runtimes a significant part of the development happens either in academia or in semi-academic R&D labs that publish scientific articles or technical reports along the lines of “we tried this idea; it worked / it did not”. People who implement runtimes read these articles: “Aha, that suits us, let's try to implement it in our runtime.” Clearly the same article will be read by people who write runtimes for different languages. The main breeding ground for this kind of development is that semi-scientific, semi-industrial R&D blend.

    - Are there such laboratories in Russia now?

    - Such work requires a unique set of skills. I do not know of whole laboratories based in Russia; rather, there are individual people who participate in such developments.
    If we are talking about R&D laboratories funded by the industry, the line between research and applied implementation is quite blurry. Often these are the same people.

    - It is known that many engineering problems are ultimately solved by scientists, and many scientific problems are solved by engineers. In applied mathematics, this happens all the time.

    - There are people at Oracle Labs who are more science-oriented and who try out solutions in isolation from the product. Their output is technical reports or articles. And there are product teams who, trying to improve the product, take ideas from such scientific articles. At the same time, quite often solutions are born inside the product itself, which the researchers then notice and try to develop further. It is a symbiotic process. That is why large corporations fund R&D.

    - So it is your feeling that companies are investing in this?
    - It is not a feeling, I simply see it.

    About Who Moves OpenSource


    - Let's go back a bit to the question of open source, and Java in particular. On the one hand, it is developed and driven by the vendor, i.e. Oracle; on the other hand, there is a community that exists separately from organizations. In the Java ecosystem there is Doug Lea, who drives a lot of the concurrency work but does not work at Oracle. How unique is this situation, when the leader of an area is outside the organization?

    - Not unique. For example, things associated with ports to alternative architectures are not actively developed by Oracle. ARM64, for example, is mainly driven by Red Hat, because Red Hat is interested in it. There are also the likes of Intel and AMD, who are also interested in contributing improvements to the code generators.

    - Are they visible at all? Do people from Intel and AMD come to you and say, “here is an optimization for our latest processor”?

    - They don't put it that way. They say: “Look, if we generate code like this, it will be better. Here is our performance data.” And if the compiler folks agree that the change is really worthwhile, it all gets accepted.

    - And what percentage of people in the Java organization work on low-level tasks?

    - I have never counted. Roughly, I would say that the people I come into contact with more or less regularly in my small area of JDK activity number about fifty. Frankly, I don't know how many people are involved in other features. You can, of course, look at the org chart and estimate, but that would include a lot of people doing development.

    - Hundreds of people?

    - Two or three hundred, I think.

    - In your opinion, do the low-level parts account for a large percentage of those people?

    - Yes, a large one. But still, the tasks are so complex that even these people are not enough.



    About the complexity of low-level tasks


    - And why is the task difficult?

    - Mostly because there are so many moving parts. There are a lot of things you need to know and be prepared for. You write a compiler, for example, and you need to know the processor errata. To know, first of all, that they exist at all (for many people it is news that processors have bugs), and that you will need to look at these bugs to understand that your compiler is not always the one to blame for non-standard behavior. You should know, and be able to understand, that the bug you are fixing right now may be connected with some wild interplay of earlier code transformations made before the part you are nominally responsible for. You need to know this whole stack in depth.

    Runtimes themselves are products whose components can be separated only at a very coarse granularity; in reality these components are very tightly coupled to each other. If you want to fix bugs, that often means fixing bugs in different components, and if you work on performance, you will definitely work across a bunch of components at once.

    I am happy as a child when my performance patch is five lines in a single file, because that is a great performance change: it is obviously correct and it helps. But big, good performance changes usually require changes in many small places across the whole big product. Therefore you need to know this product, and the product is huge.

    - Does this mean that it is poorly designed?

    - No.

    - Why then do such things happen? Locality is considered one of the criteria of good design: to eliminate a problem, you want to dig in one place, not ten.

    - It all works fine on paper. In reality two things happen: first, when you start chasing performance, it turns out that abstractions have to leak in certain places, because that is where the gains come from. And second, bugs pop up.

    You know something about the platform, about how it behaves: that the processor, for example, faithfully moves data from register to register when you tell it “move”. If you rely on this assumption, you can write a beautiful compiler, but then it suddenly turns out that there is an error in the processor. What do you do to fix this? You hammer a workaround into the code generator. Because it is a practical solution to a practical problem.

    - And if the client has a server farm, and there are 1000 of those broken processors on the farm...

    - If you read the sources of the HotSpot JVM, you can see various horrors there, and most importantly, many of these horrors are annotated. For example, you may come across comments in the spirit of “this code is written in an ugly way, but it is ugly for a reason.”

    - And is this rule generally observed?

    - Usually it is. When you fix such bugs, you are expected to write down why you actually did such a devious thing.

    And in such places something like this is usually written: “A naive person might suggest this could be written differently. But here it cannot be written otherwise, because the transformations at this point turn the graph into such-and-such a form. In general, go and read the bug at this link; there you will find a fifteen-page epic about why this thing does not work the way it should.” Those are the kinds of small details.

    Hardware transactional memory


    - Returning to errata and processors. The classic book by Herlihy and Shavit has a separate chapter on hardware transactional memory. Could you tell us a little more about what transactional memory is?

    - The rationale for transactional memory is that there is a synchronization problem when you need to make a coordinated change in several places in memory. When you need to make an atomic change in a single place in memory, you just do an atomic operation.

    It is another matter when you need to make some non-trivial transformation in which you do several reads and several writes, and you need to make this whole block atomic. You can take a lock and say “we grabbed the lock here, did everything under its cover, released the lock”, and from a functional point of view this works.

    The trouble is, what will you get on those locks? Contention! So you want some hardware mechanism you can tell: “at the start of the transaction, remember what we had; then, behind the scenes, I will do something to the machine state; and when I commit the transaction, all of that machine state will atomically become visible to everyone else.” That is the whole point of a transaction: not just individual reads and writes, but a whole transaction that either publishes all of this state at once or publishes nothing at all.

    - It is similar to a CAS that swaps a reference.

    - Yes, you can do it with CAS, but there are problems: you have to build wrapper objects that you will then CAS, whereas here you can work, as it were, with bare memory.
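
    As an illustration of the wrapper-based CAS approach just mentioned, here is a minimal sketch (hypothetical names, not from the interview): two values that must change together are wrapped in an immutable snapshot and a reference to it is CAS-ed. Every update allocates a new wrapper, which is exactly the overhead HTM lets you avoid.

```java
import java.util.concurrent.atomic.AtomicReference;

// Updating two fields atomically without locks by CAS-ing a reference
// to an immutable snapshot object.
public class PairUpdater {
    // Immutable snapshot of two values that must change together.
    static final class Pair {
        final long x, y;
        Pair(long x, long y) { this.x = x; this.y = y; }
    }

    private final AtomicReference<Pair> state = new AtomicReference<>(new Pair(0, 0));

    // Atomically add deltas to both fields.
    void add(long dx, long dy) {
        Pair cur, next;
        do {
            cur = state.get();
            next = new Pair(cur.x + dx, cur.y + dy); // new wrapper on every attempt
        } while (!state.compareAndSet(cur, next));
    }

    public static void main(String[] args) {
        PairUpdater u = new PairUpdater();
        u.add(1, 2);
        u.add(3, 4);
        Pair p = u.state.get();
        System.out.println(p.x + ", " + p.y); // 4, 6
    }
}
```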

    - And the wrappers entail allocation, memory traffic...

    - Yes. HTM helps avoid all of that and do without creating unnecessary wrapper objects. You can say: “now I start the transaction, I do 10 stores to memory, I commit the transaction, and either all 10 of those stores become visible, or none of them do.”

    - Like in a database?

    - Yes. That is why such memory is called transactional: it is a transaction. But it requires hardware support, because software implementations of transactional memory already existed.

    Software implementations are significantly slower, which is why hardware support is needed. You need to somehow explain to the hardware that when I execute the “start transaction” instruction, from that moment the machine says: “OK, I will take it that everything done after the start of this transaction is not visible to anyone yet, but I will publish it all on commit.”

    - So right in the assembly code there are instructions like “start a transaction”?

    - XBEGIN. Then you say XEND, and the hardware tells you whether it managed to commit the transaction or whether it aborted; there is also XABORT. And everything was fine. Everyone had been waiting for hardware transactional memory to appear. Azul had been building transactional memory into Vega for a long time, and they achieved, by their own account, success exploiting this HTM memory in Java.

    - Except that Vega has since died.

    - It is not that Vega died. Gil Tene, CTO of Azul Systems, said that x86_64 got so close to Vega in performance characteristics that it became economically unprofitable to keep supporting Vega.

    - Yes, supporting your own hardware is expensive.

    - Yes, and why bother, when there is a hardware vendor with a fab who sells you everything for pennies? Compared to the cost of your own microprocessor production, it really is pennies.

    And everyone was happy and began making their own small prototypes. Even we did. The natural place to use HTM in Java is when you have, say, a trivial synchronized block with 2, 3, 4 stores in it. You start a transaction on entry and commit it on exit. The semantics are exactly the same.
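
    A sketch of the kind of trivial synchronized block described above. The HotSpot flags mentioned in the comment (RTM lock elision on TSX-capable hardware) are real but experimental, and their availability depends on the JDK version and platform, so treat this as an assumption-laden illustration rather than a recipe.

```java
// A few stores under one lock: exactly the shape of block that the JVM can
// speculatively run as a hardware transaction instead of taking the lock,
// e.g. with -XX:+UnlockExperimentalVMOptions -XX:+UseRTMLocking on suitable
// hardware. The Java source does not change at all.
public class Counters {
    private long hits;
    private long bytes;

    // Two stores that must stay consistent with each other.
    public synchronized void record(long size) {
        hits += 1;
        bytes += size;
    }

    public synchronized long[] snapshot() {
        return new long[] { hits, bytes };
    }

    public static void main(String[] args) {
        Counters c = new Counters();
        c.record(128);
        c.record(256);
        long[] s = c.snapshot();
        System.out.println(s[0] + " hits, " + s[1] + " bytes");
    }
}
```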

    - That is, the JIT suddenly says: now we don't use locks, we use these cunning transactional instructions?

    - It was even almost done in [JDK] 8. But then suddenly, like a bolt from the blue, it turned out that some folks had found a bug in the hardware implementation of this very HTM in Haswell. And since this bug, as I understand it, was already baked into the silicon, it cannot be fixed.

    - That is, the silicon itself is buggy?

    - Yes. So Intel apologized to everyone and released a microcode update in which HTM was turned off. HTM is an optional feature, and to use it you have to check a processor flag, so you can release a processor microcode update that simply says “I do not support it.” It is a hardware manufacturer's mistake, but such things happen sometimes.

    - With the Pentium III there was a famous processor recall. Apparently this happens every 10 years.

    - Not with the Pentium III, but with the original Pentium, I believe. And that is despite the fact that many people, myself included, bought Haswell specifically to try TSX, and then it suddenly turned out that your processor had turned into a pumpkin...

    - Did you update the microcode in the end, or did you just not bother?

    - It automatically updates.

    - And how does this happen?

    - The operating system updates it at boot time. It has a special package in which the microcode lives.

    - So you don't need to update the BIOS for this?

    - As I understand it, it goes through the BIOS/UEFI: the operating system tells the processor, “here is your new microcode.”

    - Does everyone already have this?

    - Yes, and it is a sensible strategy from a correctness standpoint, because there was a bug that corrupts memory. Better slower but correct.

    - And what are Intel's future plans for HTM?

    - They have a new revision, called Skylake, I believe. Haswell was touted as “the processor that supports transactional memory”, and Skylake as “the processor that finally supports transactional memory correctly”. Let's see how it goes this time.



    About the development of the Java community and the benchmark wars


    - When Sun worked with the community, there was a complete feeling that Sun owned everything. When Oracle bought it, everyone was scared that things would get even worse. And now I have the feeling that Oracle, on the contrary, has invested heavily in developing the Java community. How accurate is this feeling?

    - When you talk about such things, you need to separate subjective impressions from objective ones. Sun, of course, worked great with the community and talked at conferences about how everything was developing. But if you read reviews of the JavaOne conferences Sun ran, Sun announced the same features year after year, in the same words. You can simply look at what actually happened, by objective criteria. How many years ago was Java 7 released?

    - 5 years.

    - And when was Java 7 released?

    - in 2011.

    - When did Oracle buy Sun?

    - In 2009-2010.

    - There you go. Because Oracle, having come in and bought Sun, said: “Guys, stop trying to eat the whole elephant at once, let's eat the elephant in pieces. Here is the basic set of features that goes into JDK 7, we are releasing JDK 7, go.”

    - You rather talked about the release model, but I'm more interested in the community.

    - It is hard for me to compare, because when I worked at Sun I did not interact much with the community.

    - But you were at Intel, and you worked on Harmony there, as I understand it. So you saw it all from that side. And that was before the JDK went open source, I believe.

    - Yes. OpenJDK was released in 2007, and we worked on Harmony from 2004 to 2008. The story here is that until a certain point few people were interested in developing the runtime itself jointly, because there were quite a few vendors making their own JVMs.

    There was Oracle, which made JRockit and a stack built on top of it. There was, and still is, IBM, which makes J9. There were SAP and others who made their own JVMs. In other words, there was competition between vendors.

    I participated in the so-called “benchmark wars”. When you have Java as a standard, as a specification, there are several different implementations of it from different vendors, and you show on benchmarks which of these implementations is cooler. This led to quite large improvements in the runtimes; that cannot be denied.

    However, these benchmark wars have one small minus: after a certain point they turn into penny-pinching micro-optimizations that do not help real applications but do help specific benchmarks. And this is natural. First you optimize things that help everyone, and then there is nothing left to do, so, as in any competition, you start optimizing things you would never optimize in your right mind. For example, putting a cache in front of HashMap that caches the first twenty thousand longs. Clearly this is not an optimization the average user needs; a specific benchmark needs it.

    Now, in my personal perception, benchmark wars at the software level have died down; they still happen at the hardware level. And now we have OpenJDK as an industry-defining project, a collaboration of many companies. Where before everyone had a proprietary implementation and tinkered in their own sandbox with a screwdriver, there is now a common implementation with a lot of shared runtime infrastructure. So the sane strategy is to develop one implementation together.

    - Are there any projects similar to OpenJDK in terms of impact on the industry?

    - GCC, LLVM and similar big projects. After all, open source exists to save companies money, so that not every company reinvents the wheel. On the one hand, it is beneficial to use other people's work; on the other, each company has its own contributions and shares them with others. Therefore it seems to me that the community around runtimes really took off with the advent of OpenJDK.

    Returning to your question: I do not think the rapid development of the community was tied to Oracle's purchase of Sun. It was rather tied to the structure of the work. When you have a proprietary implementation, you are less likely to get the vendor to hear you. If you have open source, you can safely make changes at home, and if they turn out well, you can give them back to the community. That is the model LinkedIn, Twitter and others are trying to follow. So it is not really about the relationship between companies, but about the approach to development, the structure of development.

    About mutability and immutability


    - Recently a very fashionable approach to development based on immutable objects has appeared. But immutability generates a huge amount of allocation, memory traffic, load on the GC and a hit to performance. So do we need immutability or not? Who is right?

    - No one. There is no silver bullet, forget about it. A fully lock-based application is bad. And a fully immutable application is bad, for the reasons you mentioned. Locks lead to various kinds of performance problems, correctness problems, deadlocks and livelocks. With immutability you can build applications that are correct by construction, but on real hardware, with real data, with a real runtime, they will show many performance effects due to the large number of allocations.

    Where exactly the boundary lies, how to combine these approaches, is an open question. I understand well, and largely share, the point of view of those who write in Scala and other expressive, concise languages. They say, and in this they are right, that a very small part of a program is responsible for the program's performance, so the primary problem you solve during development is how quickly you can write a correct program. So these people say: “Let's write everything in Scala, make a good design, and the program will be maintainable and readable. We will know its characteristics. And then, in the right places, we hack as needed.” And this is quite a normal approach.

    In Java we do exactly the same. You write idiomatic Java, then you start tuning for performance. You hard-code something here, tear out objects and switch to primitive arrays there, remove accessors altogether, pull out Unsafe and start the whole extravaganza and bacchanalia... It is exactly the same story: languages exist to solve their target problems more conveniently. Different languages have different goals, so there is no perfect language. I do not think this is bad, and I do not think this approach is “hipsterish”. It is quite normal: if you write in a language that lets you quickly write correct programs, and it then also gives you the means to tune them so that they are also fast, then by all means.
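
    A small, purely illustrative sketch of that progression (names invented, not from the interview): an idiomatic version with value objects, and a tuned version of the same hot loop over flat primitive arrays that avoids per-element allocation.

```java
import java.util.ArrayList;
import java.util.List;

public class Distances {

    // Idiomatic version: a clear, immutable value object per point.
    static final class Point {
        final double x, y;
        Point(double x, double y) { this.x = x; this.y = y; }
    }

    static double totalIdiomatic(List<Point> points) {
        double sum = 0;
        for (Point p : points) {
            sum += Math.hypot(p.x, p.y);
        }
        return sum;
    }

    // Hand-tuned version of the same hot loop: parallel primitive arrays,
    // no per-point objects, better locality, less GC pressure.
    static double totalFlat(double[] xs, double[] ys) {
        double sum = 0;
        for (int i = 0; i < xs.length; i++) {
            sum += Math.hypot(xs[i], ys[i]);
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 1000;
        List<Point> points = new ArrayList<>();
        double[] xs = new double[n], ys = new double[n];
        for (int i = 0; i < n; i++) {
            points.add(new Point(i, i));
            xs[i] = i;
            ys[i] = i;
        }
        System.out.println(totalIdiomatic(points) + " == " + totalFlat(xs, ys));
    }
}
```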

    - I'll explain where the hipster story came from. At the Java Tech Days conference in 2011, where we met, we talked about debunking performance legends. Now I have the feeling that this whole story about immutability has also become an urban legend, and the people who repeat it do not look very hard at the root of it...

    - People love binary answers. “Locks are bad and CAS-based solutions are good.” Or “Java is bad and Scala is good.” For me this is a litmus test. As soon as someone tells me that something is the perfect solution for absolutely everything, a red flag goes up in my head, and I start carefully asking whether they really think so. Because when you gain experience, you understand that there are no perfect solutions.

    The wisdom of an experienced developer is to look at a problem and understand which tool solves it. That is what we are paid good money for, not for sitting in the office, lounging in an office chair and writing on forums. We are expected to be able to pick, for a specific task, a tool with which the problem can be solved cheaply, cheerfully and correctly. That is what professionalism is all about. People will, of course, keep repeating the same mistakes, and that is normal, because the principle of economy of thinking sits in our heads.

    Of course, you develop general rules for yourself, but it is very easy to forget that these rules do not work in 100% of cases. They are rules with limits of applicability; they are based on your complete or incomplete understanding of the problems, which in turn rests on the current state of the industry, and so on.

    One of the tasks I set for myself as a performance engineer is to find answers to performance questions, not just recite memorized phrases like “this is good and that is bad”. No. You have to understand when “A” is good and when “B” is good. And the case you are being asked about, is it case “A” or case “B”? You have to choose correctly. If there were no need to choose, I could be replaced by a simple script that says yes or no. Nature is complex. Programming (which is still part of nature, however much we are taught that it is an abstract thing) is also complex.

    About Unsafe


    - We touched on the Unsafe topic. It really blew up this summer. Can you tell me what your position on this issue is?

    - The trick is that if you have some speculation, you can freely write it up on your blog and collect a bunch of votes and a bunch of forum discussions. But if you are an authority, you cannot give a flame-war kind of answer; you have to think it over and weigh all the pros and cons. It is a completely different level of discussion.

    What I observed this summer was real hysteria, when people who did not understand what was going on began sharpening their pitchforks and lighting their torches in order to go after the developers.

    The fact is that Unsafe has always been a private interface. It is a sort of dumping ground needed so that the standard class library can get access to low-level features. That it was possible to get hold of Unsafe from somewhere and start using it is a historical accident. Java simply did not have protection levels that would forbid it.
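
    For context, this is roughly what that “historical accident” looked like in practice: the well-known reflection trick that libraries used to grab the private sun.misc.Unsafe instance. It is shown here only as an illustration of the kind of access being discussed; javac warns that it is an internal proprietary API, and newer JDKs restrict or remove this access.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Grabbing the private Unsafe singleton via reflection, then using it for
// raw off-heap memory access. Compiling this triggers the proprietary-API
// warning; running it depends on the JDK still exposing sun.misc.Unsafe.
public class GrabUnsafe {
    public static void main(String[] args) throws Exception {
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        Unsafe unsafe = (Unsafe) f.get(null);

        long addr = unsafe.allocateMemory(8);
        unsafe.putLong(addr, 42L);
        System.out.println(unsafe.getLong(addr)); // 42
        unsafe.freeMemory(addr);
    }
}
```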

    And people began writing on StackOverflow and in their blogs: “Look what the evil developers are hiding from us! What riches are in there, what amazing things we can do with it!” And they started using it, despite javac telling them that it is a proprietary API and can be removed in a future release... And when it “suddenly” turns out that it will be removed in the next release, they are somehow surprised!

    I understand the people who argue that if there had been no access to Unsafe, some products in the Java ecosystem would never have been born. I also understand that reflection can hack into some parts of the JDK. But I do not understand the situation where your product and business, with a huge profit, depend on a private, non-standard API, and you are sincerely surprised that when this API goes away, your business falls apart. At that moment my question is: “Where were your damn business analysts who made such a mistake? You had 8 years to propose a solution that could be standardized, in order to protect your business, so that a standard solution for your particular task would appear.”

    This is the normal approach to product development; this is what the community should do. If you want some feature, you invest in its development: with money, with time, with engineers, with whatever. Open source is not a place where people do free work for you. Open source is a platform where you are given the opportunity to work together, build your own features and use other people's features. That means you will have to do some features with your own hands rather than have the vendor do everything for you.

    - In my understanding there are two problems, one of which reinforces the other. First, the lack of clear evangelism that there is a Java specification and there are implementations of it. For many people it was a revelation that “Java 8” is a specification and “JDK 8” is a specific implementation. Second, all this has led to people still distinguishing these concepts very poorly. Wasn't it a mistake on the part of Sun and Oracle that they did not immediately start slapping hands for this?

    - Sorry, what does “slapping hands” mean? javac tells you: “sun.misc.Unsafe is internal proprietary API and may be removed in a future release.” Your release notes say what you can and cannot do with sun.misc classes; they say these are proprietary APIs that are not guaranteed to keep working. Every Oracle employee has told you for years that Unsafe must not be used, because it is a private API. Many people who used Unsafe in their own projects said that if your product runs on Unsafe but you have no fallback strategy for when Unsafe is not available, you have a broken product.

    - But many do use Unsafe. Say I am building a startup, I look at what is on the market, and I deploy, for example, Hazelcast. And it seems to me that I have a contract with them, that they are my vendor... They did not think about it, and I built my business on top of it. It turns out that in this situation everyone is the idiot. Now it is clear that this needed to be talked about more...

    - It seems to me the industry denied the existence of this problem for a long time. The same guy from Hazelcast who writes all kinds of articles about how he heroically bends Oracle to save sun.misc.Unsafe sat with us over a beer at the Joker conference a year ago and said: “Well, you will never actually remove Unsafe!” And we told him that Unsafe would go away. And suddenly, a year later, it turned into some kind of mega-news.

    And in general, you must always remember the difference between private and public. If I am a developer and I need a private implementation, I use the means of the language to make it private, and if you then crawl into that API with your dirty little hands through reflection, you have only yourself to blame. I wrote my program so that it could not be reached by normal means. It is my private implementation, and that gives me the right to do anything I want with that private API.

    - You have raised a very valid point. And the problem is that in the design of the language, if we dig deep, for a long time there was no proper tool for real encapsulation. And whatever is not forbidden is allowed... Here, in my opinion, it is too late to be surprised; something needs to be done.

    - At the same time, I was both pleased and disappointed by one answer in one of the Mechanical Sympathy threads about Unsafe. A guy there said: “You know, our product had exactly the same problem. No matter what we wrote in our documentation and blogs, we could not guarantee that people would not use our private APIs. As a result we had to, first, stop giving out any documentation for this API, and second, deliberately change the implementation from release to release.”

    I read that answer and think: is this what you want? If you violate the rules of decency by climbing into private APIs and demanding their stability, you push the developer of that private API to hide it even more. Even though there are legitimate cases and sane ways to use a private API, for example to work around some bug. In other words, by using a private API abnormally, you can easily deprive yourself of the opportunity to use it “normally” at all.

    About JMH, benchmarks and narrow specialization


    - Then here is a question. For the past three years you have been actively developing the Java Microbenchmark Harness (JMH). You make most of the commits; it is almost your personal project, now under the auspices of OpenJDK. Right?

    - It is a project maintained by the performance team.

    - And there is one problem with it. Saying “this is faster than that, and here is the proof” is not proof; the analysis does not end there. It seems that most people simply do not understand this.

    - With JMH, the story is as follows. You always need to ask yourself the question: who benefits? Why does a performance team build JMH? It does so to make its own work easier, because we do studies that inevitably lead to benchmarks. To do this research we need tools, but it does not end with tools, because the main thing you get from your experiment is not numbers. The main thing you can do is extract knowledge from the numbers. As a rule, to extract more or less reliable knowledge you need a system of theories, each of which must be confirmed by these experiments. There is nothing new in this: take the philosophy of science, and it works exactly the same way.

    Performance engineers relate to the product the way natural scientists relate to nature. We build high-level models and do it through experiment. JMH helps you do experiments, and the key word in that phrase is “helps”. It does not do the experiment for you. It helps you avoid stepping on the obvious benchmarking rakes, so that you save time and can deal with the non-obvious, specific things in your experiment. So you can quickly write an initial benchmark and spend the rest of the time understanding how to write that benchmark correctly, instead of spending hours fixing a stupid mistake with dead code elimination, for example. That is what it is for.
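
    For readers who have not seen it, here is roughly what a minimal JMH benchmark looks like; the workload is invented purely for illustration. Returning the computed value, or sinking it into a Blackhole, is precisely what keeps the JIT from removing the measured code via dead code elimination.

```java
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.infra.Blackhole;

// A minimal JMH benchmark sketch (illustrative workload, not from the interview).
@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class LogBench {

    private double x = Math.PI;

    @Benchmark
    public double returnResult() {
        return Math.log(x);  // returned result prevents dead code elimination
    }

    @Benchmark
    public void sinkIntoBlackhole(Blackhole bh) {
        bh.consume(Math.log(x)); // explicitly consumed, same effect
    }

    public static void main(String[] args) throws Exception {
        // Run via the JMH runner (requires JMH on the classpath / annotation processing).
        org.openjdk.jmh.Main.main(args);
    }
}
```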

    It helps solve the easy problems that still need to be solved in every experiment. It does not solve the specific problems of your particular experiment. Analysis is needed to understand what insight you can extract from the data, whether that insight can be trusted, and what place it occupies in your overall system of knowledge.

    We often sin in this respect ourselves: when you write a blog post, you benchmark some feature, and we show experiments there that target that feature specifically. In fact, there is a whole chain of links behind such an experiment. For example, we know that this harness behaves normally on this machine, that we have calibrated this machine, that it does not go into thermal shutdown. Or we know that such-and-such an approach to testing works, because we validated it separately. In principle, this could also be written up on the blog, but then it would not be a blog post but a whole book about which preparatory experiments we did beforehand to make sure the experiment we are doing now can be trusted. You cannot just post a benchmark and assume by default that the benchmark is correct. That does not happen. A benchmark is correct only once you have done the work to show that it is.

    - Now look what happens: I need to find something out about performance, but I am not much of an expert in it. I defer to certain authorities, to you, for example. I write a benchmark, measure, build a hypothesis about a performance model, about specific instructions. I take the assembly listing, I find (or do not find) those instructions from my application there, and voila, I have checked my hypothesis. The catch is this: to understand which instruction to look for in that assembly listing, you need background; you need to be able to solve without JMH the very problem I am trying to avoid solving by using JMH. Right?

    - If you want an answer to a question in a complex area you do not understand, you must find a person who does understand it and ask them.

    My company teaches me one simple thing: if, for example, I am replying to an email with a legal question and start writing “I am not a lawyer, but...”, then I must close that email, pick up the phone and call a real lawyer who knows the correct answer to the question. It is exactly the same story with performance. You can say: “Of course, I am not a performance engineer and not much of an expert in this, but the performance data seems to show such-and-such.” But at that moment you should do one of two things: either become a performance engineer and interpret the data properly, or go to a person who understands the issue and ask their opinion.

    This is one of the reasons why large organizations that invest seriously in performance have performance teams: they keep people who know how to answer such questions. Not because they are exceptionally smart, but because they have experience and a system of knowledge; their job is to keep that general knowledge in their heads. Whereas saying “I will close my eyes, look at this data, and it will confirm my assumption” is confirmation bias.

    For example, I will not speculate about which application server is better. I do not know the answer to that question; I understand this area about as well as a pig understands oranges. Of course, I have worked with application servers, but there are a lot of non-obvious things there. And there are experts who are immersed in this, who know what is going on there, and they can be asked for advice. For me it would be the height of arrogance to write on a blog, say: “I ran one thing on Glassfish and another on Weblogic; Glassfish fell over with one setup, and Weblogic with another. And from this I conclude that Weblogic is better, or Glassfish is better.” Well, that is just stupid!

    From some fragmentary information I would be trying to extrapolate a huge body of knowledge that, in theory, should have been accumulated over decades. So the most you can do is either become a professional in the field where you need answers, or ask professionals in that field. Everything else is shaky ground.
