“It Is an Engineer's Job to Make Complaints” - Interview with Sergey Kuksenko of the Java Performance Team

Imagine that you have come to a JUG.ru or CodeFreeze meetup, or to a Java conference where Sergey "Walrus" Kuksenko, a developer on the Java Performance Team, has just spoken. For some reason all the other attendees have left, and you and Sergey are alone. He is in no hurry and has a free hour to answer your questions, of which you have so many...

Today we have an absolute exclusive: a long interview with Sergey Kuksenko! From the interview you will learn:

how the Java Performance Team works
in which areas of Java active performance work is under way
why JUG meetups and conferences need hardcore talks
what a performance engineer should know
what Highload is and where its boundary lies
what is happening right now with Java strings
in which direction runtime tuning is evolving

- Sergey, you often speak at our events, and each time your talks become more difficult and more complex in their material, going deeper and deeper into hardware details. When you are speaking, you look people in the eye: do you see understanding there, or does it get lost as you go deeper?

Kuksenko: That is the most important feedback I try to catch during my talks. When I see the audience respond, I get some pleasure from knowing that the talk was not wasted. On average, if I see at least a dozen active listeners, I consider the talk a success. This happens quite often.

- So there is a certain percentage of the audience you aim for?

Kuksenko: Yes. I am bringing information, and I need it to reach someone, so that I am not giving the talk just for myself.

- And how do you make that choice when the audience has people with different backgrounds and different levels of understanding? Do you say in advance, “I will speak at a fixed level of complexity”, or do you adjust the complexity?

Kuksenko: I announce in advance that the talk will be rather complex. If I have information about the audience and know that there will not be enough interested people, I will not give the talk; I do not take part in such weak conferences.

- How do you find out information about the audience?

Kuksenko: As a rule, after the fact.

- As a conference organizer, what could I tell you about the audience that would affect your talk?

Kuksenko: I do not know the answer to that question. As a rule, I judge the level of the audience from my own visits to a conference and decide for myself whether I will come again.

- You have been speaking in Russia for at least four years now. Did you start in 2012?

Kuksenko: Somewhere since 2011.

- Do you see the audience growing? Not in numbers, but in how well it understands what you are talking about?

Kuksenko: Honestly, no. What feedback I do see tells me that the audience is tired, that people are getting tired of the complex talks we give year after year. At least that is how it seems to me. I see quite interesting people, whom I remember from various conferences, leaving this field. Either their careers take off, or they move into management, or something else. And I do not see a serious replacement yet.

Perhaps the first talks benefited from a hunger for technical material, when people asked a lot of questions. Now I see the audience's interest in complex technical talks declining.

- That is interesting, because I really like that you and Lesha (Alexei Shipilev - author's note) do not repeat yourselves. In 2011-2012 the topic was performance, and then it grew a bit.

Kuksenko: We have said it all. Why repeat it? Nothing has changed in that area since 2012.

- How active is performance science now, and perhaps performance in Java in particular?

Kuksenko: I cannot speak for performance science. I do not know what is happening in science; something is probably developing. Performance is not so much a science as applied engineering: we have something real, and it has to be squeezed into certain time frames, requirements, and so on.

Java is changing in various areas. There is no global plan along the lines of “By 2020 universal happiness will arrive, and we will perform any operation in one picosecond.” We have a product, we find weak spots, we tighten things up here and there. New things are invented, adaptation to new hardware is under way, and so on; that is, the usual process.

- How fast is hardware changing, so that people who work on performance can use new hardware features?

Kuksenko: Intel rolls out a new microarchitecture with reasonably interesting features every two years. But if you look at more or less serious studies, it turns out that some super-duper new feature came out, then a new architecture... and how much performance did we actually gain on average, as an “average temperature across the hospital”? Well, plus 6%.

New hardware features appear periodically, and the software side does not keep up. It tries to catch up, but especially at the Java level, where we must offer more or less architecture-independent solutions, we are still chasing.

- Vectorization and vector instructions have existed in modern processors for many years. The Java virtual machine is often accused of making little use of them.

Kuksenko: The problem here is twofold. As a rule, all these instructions are tailored to specific scenarios and use cases, and the question of how to recognize those use cases in abstract code that was not written for this architecture remains open. Automatic vectorization is a solved problem for simple cases. But take a step to the left or a step to the right, and vectorization algorithms stop working.

It is clear why hardware needs this: there are millions of developers who can quickly tune code for a specific platform at a near-assembler level, using intrinsics, native code and other things, and get what they need. Our goal in Java is slightly different, although we do not shy away from this approach, and the most critical things are tuned in exactly the same way.
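To illustrate the contrast between "simple cases" and "a step to the side", here is a minimal Java sketch. It is not code from the interview: the class and method names are hypothetical, and whether a particular JIT compiler actually vectorizes the first loop depends on the JVM version, its flags, and the CPU.

```java
import java.util.Arrays;

public class VectorizationSketch {

    // Element-wise add: every iteration is independent of the others.
    // This is the "simple case" shape that auto-vectorizers are built for.
    // Assumes a, b and out have the same length.
    public static void addArrays(int[] a, int[] b, int[] out) {
        for (int i = 0; i < out.length; i++) {
            out[i] = a[i] + b[i];
        }
    }

    // Running total: iteration i needs the result of iteration i - 1.
    // This loop-carried dependency is one example of a small change
    // that a straightforward vectorizer may not handle.
    public static void prefixSum(int[] a, int[] out) {
        int running = 0;
        for (int i = 0; i < a.length; i++) {
            running += a[i];
            out[i] = running;
        }
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4};
        int[] b = {10, 20, 30, 40};
        int[] sum = new int[a.length];
        int[] prefix = new int[a.length];
        addArrays(a, b, sum);
        prefixSum(a, prefix);
        System.out.println(Arrays.toString(sum));
        System.out.println(Arrays.toString(prefix));
    }
}
```

Both loops look almost identical in source form, which is part of why recognizing vectorizable patterns in arbitrary code is hard.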

- Can you give an example of where you tune by hand?

Kuksenko: First, string operations. Most of them are tailored to specific hardware, and there will soon be some small updates in this area, and then more and more. That is one area.

The second area, which is classically always tuned by hand, is cryptography, because hardware keeps getting new instructions aimed at faster and more secure cryptography (for example, the latest Intel hardware has a true random number generator), and so on. From a security standpoint, ignoring these hardware capabilities would obviously not be good. But with cryptography the key question is not application performance but application security. By using a true random number generator we increase entropy and strengthen our protection; by using clever vectorized instructions for the encryption itself we can fit more complex encryption algorithms into the same time budget, which increases security. Here performance plays an indirect role. And this raises a big question: how much does a real user need vectorization?

- So it is not a given that it is needed at all? It is just that at every conference this is a favorite question whenever Java and performance come up: the question of vectorization. Everyone likes to point a finger and say that...

Kuksenko: And how often do C programmers really achieve good vectorization in their products? In the whole history of talks at JPoint and Joker, I received exactly one piece of feedback, at the last Joker, after I showed: “Here is an example. We speed it up like this. And here a little vectorization happened.” A man came up to me on the sidelines and said: “Yes, cool, that was a wonderful demonstration. I wanted to speed up one place, and now I will try to get it vectorized.” In all this time I have come across exactly one person who really needs it and who knows his tasks.

About performance hardcore

- When you talk about performance basics, it is clear that more or less everyone needs them. But when you cover very advanced things, for example the case where the CPU is 100% loaded, as in your recent talks, what share of people really need it? It feels like one in a hundred.

Kuksenko: If not fewer. That is the audience we focus on.

- How do you feel about the educational aspect of such talks?

Kuksenko: I do not see them as educational; I see them more as an introduction that simply shows: “Look, there is a lot that can be done here.” The person who needs it will understand that he can move in this direction and not stagnate.

- A couple of days ago I spoke with my friend Oleg Bunin, who runs the Highload conference. I asked him what Highload is, in his opinion, and he said: “Highload starts when it becomes important to you what is happening inside your hardware. As long as you do not care what you are running on, it is not Highload; as soon as you start to care, it is Highload.” How would you comment on that statement?

Kuksenko: For me, Highload is just another buzzword. As I have always said in my talks, performance is a binary metric: the client is either satisfied or dissatisfied. Then our whole inner kitchen begins: how to measure it, how much money to spend on it, whether to buy more hardware or, on the contrary, turn the software knobs, and so on. So the question of where Highload begins and ends is the same as the question of where we draw the borders between colors. The rainbow spectrum is continuous; we say “this is green, and this is red”, but in reality we cannot draw a clear line and say “green begins here, and red here.” Yet green and red are obvious to everyone except the color blind. It is the same with Highload, and with everything else. But there is one small requirement that we voice in all our talks: “If you want to work on the performance of your application, if you need it, you must understand how everything works from top to bottom, the whole stack: how your application works, how the application server, the operating system, the hardware, the Ethernet wire and so on work. If you know how all of that works, you can achieve something and gain something there.”

- You and Lesha began your famous series of talks in 2011-2012 with a story about what software engineering is, what performance engineering is, and what the difference is; it was one of your first slides. What about the toolset? As a Java engineer, I can roughly imagine what a typical Java engineer has, and maybe even a Java enterprise engineer. But what is in a performance engineer's toolkit? Which performance engineering tools in particular do you use in your everyday work?

Kuksenko: bash! First of all, it depends on the task. The classic toolset, if you are working with an external benchmark, is some profiler attached to it. It can be any profiler; they are all the same, by and large. And naturally Oracle products such as VisualVM or Mission Control have an advantage.

If you have to go to a lower level, the standard tool is what is called Oracle Solaris Studio Performance Analyzer (a long name), a fairly effective tool. The second one, for small things, for a quick look, to get a general picture, is Linux perf. In practice, lately I usually use this pair: perf for an overview, and Oracle Solaris Studio Performance Analyzer for more or less serious digging. Other utilities are unnecessary. Occasionally I might use JFR to look at something that is not working out. That is Java Flight Recorder and Mission Control, but only to look at things that do not go below the Java level.

- What about Amplifier, what do you think of it?

Kuksenko: I cannot say anything about Amplifier, because I have never used it under that name. I used the product five or six years ago, when it was simply called Intel VTune, and for working on Intel hardware on the Windows platform it was ideal at the time. But in those years there were problems with non-Intel hardware and with anything other than Windows. I have heard that the VTune Amplifier team has since made very serious progress in automatic analysis, in the sense that it no longer just shows tons of numbers but tries to highlight the key problems.

- That is, in fact, give an interpretation?

Kuksenko: Yes, it tries to do some bookkeeping, a classification of the problems it finds. I caught a glimpse of this at a presentation; I have not used it in practice.

- So it is not clear how much of this is true and how much is marketing?

Kuksenko: If you believe the presentation, it is true and it should work, but it is tied to Intel hardware.

- Ten years ago AMD was a player everywhere: laptops, desktops, servers. Now we hear less and less about AMD, but ARM and the ARM architecture have become a very serious presence in recent years. How often do you deal with ARM of any kind in your daily work, and can you say whether there are any differences?

Kuksenko: In my everyday work I have dealt with ARM exactly zero times. Only once did I run experiments on ARM: when I was preparing one of my talks, I was curious to take parallel measurements on ARM, and I simply compared those measurements with Intel architectures. So, since I do not really work with ARM, I have nothing to say about it yet; there is no basis for it. That said, the platform is quite promising and is rather aggressively pushing Intel out of various niches.

- It feels as if ARM is developing in a much more interesting way, because the 6% growth you mentioned gives the impression that Intel has stagnated a little. It feels as if performance is not growing. Energy efficiency may be growing, other things such as encryption appear, but the growth seems to be in breadth. A five-year-old laptop is still relevant in terms of performance.

Kuksenko: In terms of the performance the end user actually consumes, it is indeed exactly the same. Five years ago you could watch your videos on YouTube, and now you watch your videos on YouTube and see no difference. Five years ago you went on Twitter and wrote some mail, and you do the same now. You do not notice a difference. The fact is that base platforms long ago reached the performance the end user needs. Of course there are all sorts of 3D, Blu-ray and super-high-definition images, but that is a separate area, not a matter of general computer performance. But really, thanks go to my colleague Lesha Shipilev, who recently assessed the industry in retrospect on his Twitter. He wrote that he ran a benchmark whose score he remembers well from our work at Intel 10 years ago, and he notes that the benchmark now runs 50 times faster on his laptop than it did 10 years ago on a server machine. I think a 50-fold increase in performance over 10 years is quite normal progress.

- So there actually is progress? How significant is it? Do you remember which benchmark it was?

Kuksenko: I remember it well, but I think we will not discuss it. When you interview Alexei Shipilev, you had better ask him about the relevance of that benchmark, because he wrote a replacement for it.

Strings

- Let's talk about String. It is no secret that various research is under way into how strings can be stored more compactly. It is no secret that this class has begun to change in recent years: the replacement of substring() in JDK 7u6, and so on. How scary is it to change a base class of the platform? How different is working on this class from working on any other, and how much harder is it to get changesets accepted there? It is a very visible area, and it is surprising that active work is going on there now.

Kuksenko: The question here is not “Let's come up with some changeset and see what happens”, but what gain we get from it. If we have a class that we know occupies 50% of memory in any application, if not more, then let's finally do something about it. It is time to do it, especially since we already had earlier work and experiments, which were never hidden.

- Why did you work in this area before, but only intensify the work now? Over the years a lot of questions about strings have piled up.

Kuksenko: I would not say that work in this area had stopped. It was being done. Perhaps it was not always carried through to the end and remained experimental, but it was still available to play with. A vision of how it should be done has simply taken shape. Optimizing is very simple when you have 10 usage patterns and everyone stays within those 10 patterns.

You sit down, analyze those 10 patterns, optimize behavior for them, and get a win.
When your class is used in a thousand, a million different ways, and the cost of a mistake there is huge, you obviously will not rush in and write something off the cuff. First you need to see what the gain is for some uses, make sure that others do not get worse while these get better, that nothing breaks along the way, and so on. That is where all the effort, all the work goes.

Oracle has never hidden the fact, and neither did Sun, that the largest share of man-hours spent on Java goes not to development but to QA.

- As of today, in your estimation and without naming names, how strong is QA in Java, in Oracle Java, in OpenJDK? That is, does something serious often slip through, in your opinion? Or is that hard to say?

Kuksenko: It is hard for me to say. I try not to go into that area because it is too big. We have wonderful specialists who I think would answer much better. We have QA architects who know more about the state of affairs.

- In JDK 7 update 6, substring() was changed. It is interesting: the change seems small but is very visible. As for its consequences: how long did customers keep running to you and kicking you for this change, or did you work with them beforehand? It is very interesting how the analysis for such changes is done. Roughly speaking, this is an understandable trade-off.

Kuksenko: It is not so much about the trade-off as about how clear it was that customers would come running... First, they never come running to us personally; we hardly work with specific customers. I know there are customers who run around shouting: “We will not switch to JDK 7 Update 6, because we do not know what this change is and we do not want to test it,” and so on. But it is not as if we just up and made this change to substring, removing the offset field and so on. It is one of the necessary steps toward the final goal: lightweight strings, whose advantages and benefits are obvious.

- What are they?

Kuksenko: Right now a string is an object plus an array. If we fuse them together, we get a string that has all the benefits of locality and takes up less memory, because the extra headers are not needed, which matters especially for short strings, and so on. There are many wins and many benefits, and a huge number of academic experiments in this area; our beloved University of Linz did something. That is, there are people who say: “You need to do this; we even built a proof of concept and got such-and-such gains.” Clearly we are moving in that direction too, but we cannot afford to stop at a proof of concept; we must deliver a final solution, so we move, perhaps slowly. And it was obvious that this change was an important and necessary step on the way there, as we see it. But let's take it one step at a time.
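The trade-off behind the substring() change discussed above can be modeled in a few lines of Java. This is only an illustrative sketch: `SharedString` and `CopyingString` are hypothetical classes invented here, and the real `java.lang.String` is considerably more involved. The first class models a string as a window (offset and count) onto a shared array; the second models a string that owns exactly its own characters, as after the removal of the offset field.

```java
import java.util.Arrays;

public class SubstringSketch {

    // Hypothetical model of the older layout: a string is a window
    // (offset, count) onto a character array that may be shared.
    static final class SharedString {
        final char[] value;
        final int offset;
        final int count;

        SharedString(char[] value, int offset, int count) {
            this.value = value;
            this.offset = offset;
            this.count = count;
        }

        // O(1), no copying, but the whole parent array stays reachable.
        SharedString substring(int begin, int end) {
            return new SharedString(value, offset + begin, end - begin);
        }

        int retainedChars() {
            return value.length;
        }

        @Override
        public String toString() {
            return new String(value, offset, count);
        }
    }

    // Hypothetical model of the layout without offset and count:
    // each string owns exactly the characters it represents.
    static final class CopyingString {
        final char[] value;

        CopyingString(char[] value) {
            this.value = value;
        }

        // O(n) copy, but the parent array can be collected afterwards.
        CopyingString substring(int begin, int end) {
            return new CopyingString(Arrays.copyOfRange(value, begin, end));
        }

        int retainedChars() {
            return value.length;
        }

        @Override
        public String toString() {
            return new String(value);
        }
    }

    public static void main(String[] args) {
        char[] big = new char[1_000_000];
        Arrays.fill(big, 'x');
        SharedString shared = new SharedString(big, 0, big.length).substring(0, 3);
        CopyingString copied = new CopyingString(big).substring(0, 3);
        System.out.println(shared + " keeps " + shared.retainedChars() + " chars alive");
        System.out.println(copied + " keeps " + copied.retainedChars() + " chars alive");
    }
}
```

The sketch shows why the change is a trade-off: the shared form makes substring cheap but can pin a large array in memory, while the copying form pays for a copy and drops two fields from every string.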

About the performance of the Stream API

- Java 8 was released a year ago. You have talked a lot about Streams, the Stream API, and Bulk Data Operations, which are part of that whole project. Do you actively use all of this in everyday development? And second: the hopes pinned on it, that it would give an advantage on multiprocessor machines in particular, that it is another level of abstraction, another approach to programming - how have they come true, in your opinion? Do Streams give any real performance boost?

Kuksenko: If I write code that is not constrained by external requirements, I always write for Java 8 and always use Streams, although I already use JDK 9 for such purposes. I am very pleased that, when I poll people at various conferences, I find that a fairly large number have switched to JDK 8. What I really like about Streams is that the code is much more compact, easier to understand and to take in. After all, programmers read code more often than they write it.

In terms of performance, this is a well-known question. As I said when answering the same question a year ago in Kiev, I believe that ordinary users, in their statistical mass, will notice neither huge performance gains nor huge slowdowns from switching to the Stream API. People who target this specifically will surely squeeze a lot of benefit out of it.
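The compactness argument is easy to see side by side. The following sketch is an illustration added for readers, not code from the interview; the class, the method names and the sample data are hypothetical. Both methods compute the same result, one with a classic loop and one with the Stream API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

public class StreamVsLoopSketch {

    // Classic loop: keep names of at most four characters, upper-cased.
    public static List<String> shortNamesLoop(List<String> names) {
        List<String> result = new ArrayList<>();
        for (String name : names) {
            if (name.length() <= 4) {
                result.add(name.toUpperCase(Locale.ROOT));
            }
        }
        return result;
    }

    // The same pipeline with the Stream API: filter, map, collect.
    public static List<String> shortNamesStream(List<String> names) {
        return names.stream()
                .filter(name -> name.length() <= 4)
                .map(name -> name.toUpperCase(Locale.ROOT))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("java", "hotspot", "g1");
        System.out.println(shortNamesLoop(names));
        System.out.println(shortNamesStream(names));
    }
}
```

The stream version reads as a description of the result rather than of the iteration, which is the readability benefit described above; it says nothing by itself about speed.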

- And are there cases where the Stream API loses in performance to the classic (Collections) API?

Kuksenko: As a rule, the Stream API loses in performance to the classic API simply because it implies some overhead, some actions we must take to build a proper Stream and start working with it. The question is what we gain by paying those costs. Without parallelization there is nothing to gain in terms of performance, only more compact and more readable code. As for parallelization, a lot has been said about it too, and various examples, templates and charts have been produced showing when it is worth switching.

The most cautious current estimate, which is higher than our internal estimate and is, in principle, quite reasonable, was proposed by Doug Lea: if the amount of work to be parallelized exceeds 100 milliseconds (in fact, microseconds - correction suggested by apangin), then by parallelizing it you will get a win with 99% probability. If the amount of work is less than 100 microseconds, it is better not to bother with parallelization. According to our measurements the figure is 10 microseconds, but there it is already... 100 is the safe boundary.

- So 100 is the more conservative estimate?

Kuksenko: Yes, and more than sufficient. I would switch to the Stream API just because the code is compact. There are some problems in HotSpot and the JDK. Not problems exactly, but we know that things are not very good here and here, we know that these are the places we need to improve, and we are now thinking about how to do it.
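The 100-microsecond rule of thumb quoted above can be turned into a tiny guard around `parallel()`. This is a sketch under stated assumptions, not an API from the JDK: the class, the constant and the idea that the caller supplies an estimate of the sequential running time (`estimatedNanos`) are all hypothetical, and in practice the estimate would come from measurement.

```java
import java.util.stream.LongStream;

public class ParallelThresholdSketch {

    // The conservative boundary quoted in the interview: 100 microseconds.
    static final long THRESHOLD_NANOS = 100_000L;

    // estimatedNanos is a caller-supplied guess at how long the work
    // takes sequentially (an assumption of this sketch).
    public static boolean worthParallelizing(long estimatedNanos) {
        return estimatedNanos > THRESHOLD_NANOS;
    }

    // Sums the squares of 1..n, going parallel only above the threshold.
    public static long sumOfSquares(long n, long estimatedNanos) {
        LongStream numbers = LongStream.rangeClosed(1, n);
        if (worthParallelizing(estimatedNanos)) {
            numbers = numbers.parallel();
        }
        return numbers.map(x -> x * x).sum();
    }

    public static void main(String[] args) {
        // Small job: stays sequential.
        System.out.println(sumOfSquares(10, 2_000L));
        // Large job: estimated well above the threshold, so it runs in parallel.
        System.out.println(sumOfSquares(10_000_000L, 50_000_000L));
    }
}
```

Either path returns the same sum; the guard only decides whether the fork/join overhead is likely to pay for itself.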

Doug

- You mentioned Doug Lea. I was somewhat surprised by how the interaction with Doug is set up. There is an opinion that Oracle, even more than Sun, was a very conservative organization, even with respect to OpenJDK, and did not readily let in external committers. In principle the situation has improved under Oracle, and more and more external committers contribute to the platform, but Doug's story is very special. Having worked inside for a while, I was lucky enough to see how the interaction with Doug is built, and it turns out that he is a very special person who stands in a special relationship to the organization. In that sense he is a phenomenon. And concurrency is one of the main directions in which modern performance, hardware, science and everything else is developing.

Kuksenko: If we take a specific field, we can find a person in it who is a recognized expert. It is good if the field has 100 experts, and we can simply hire a dozen of them to work on it with us. That is useful for our product and for the whole system. It is harder when the experts in a field are outstanding individuals, because an outstanding expert requires outstanding conditions. In that case it may not be possible to hire someone and give orders from above, as large corporations like to do, and you have to negotiate with such a person, because if we can agree with an outside person on his contribution to this particular area, it benefits everyone and the development of the product. Doug Lea is an outstanding person in his field, an expert whose equal is hard to find. So a collaboration scheme has developed, I do not know whether historically or otherwise, in which he sits at his institute, invents things, checks them, builds them, and they come to us in Java, because everyone benefits from it. We develop our product, and so on.

- And besides Doug, are there other people who work with the platform in a special mode, on special terms?

Kuksenko: Maybe; I do not know.

- That is, at least in performance there are none?

Kuksenko: We have Oracle Labs, and many things related to high performance and high concurrency come from them. Oracle Labs is a more academic organization, a more academic world; they have their own contacts, their own kitchen, and ready-made solutions come to us that we simply bring to life.

- Performance is everywhere, at least in the runtime. Look at GC - performance, look at collections - performance, Streams - performance, VM - performance, JIT - performance...

Kuksenko: What does performance mean? Performance is the speed of some process. And time is everywhere, hence the speed of execution of any process is everywhere. Here we are already leaving computer science and arriving at Einstein.

Team

- You have a performance team in Java; it is probably correct to call it the Oracle Java SE Performance Team. Tell me about the team: what kind of people are on it, how many, and how your work is organized.

Kuksenko: The team has fewer than a dozen people. Purely technically, our job now is to take various features, various JDK development subprojects, and work on their performance at two levels. First, make sure there are no serious regressions. Second, once we are convinced there are no regressions and we can offer some improvements, we offer them. The projects are quite independent; as a rule, we rarely overlap in our work.

- That is, one engineer - one project?

Kuksenko: As a rule, yes.

- Can you give examples of such projects?

Kuksenko: Project Lambda, Project Jigsaw, Application Data Sharing. Some specific improvements proposed for String and for synchronization. Performance improvements for the G1 garbage collector.

- How significant are external performance contributors, apart from Doug (Doug probably cannot be considered external)? There are concurrency-interest and other forums where discussion goes on all the time. How actively do you accept community patches, and what is the quality of such patches like?

Kuksenko: In this sense we are in the same position as they are: we do not accept patches, we propose them. There is a code owner, for example the HotSpot team, and if we have some performance improvement, we offer them our idea of a patch. In the same way, some Vasya Pupkin can come to an OpenJDK mailing list and propose a change, an improvement, a patch. Whether to accept a given patch is decided by the code owners, in this case the HotSpot team or the Class Libraries team, so we do not make decisions, we propose. We can verify a third-party solution and provide measurement data, but the final word always rests with the code owner, however much we argue, because besides performance there are requirements for code quality, functional correctness, security, and so on.

- All such changes are subject to mandatory reviews ...

Kuksenko: It is a little easier for us; they know us.

- So can we say that reputation matters more in this area than belonging to a specific team or organization?

Kuksenko: Even more than that, because if I were a third-party developer, I could pick some interesting thing in HotSpot, start digging into it, get some gains, polish it and start promoting it loudly. Being at Oracle, I have to do things that may not bring me such great fame but that need to be done: dig here, finish there, check this, and we have to do it.

- What is the ratio of routine to interesting research in the work you do? Do you often find anything surprising? Do you often come across places in Java where no performance engineer has set foot?

Kuksenko: Every day.

- How does that happen? From the outside, when I was such an outsider myself, it seemed that the people who make Java are some kind of celestials for whom every line is strictly justified, who do not fool around at all, and that mere mortals should not meddle there. Apparently that is not entirely true?

Kuksenko: It is not entirely true. Very often, unless we are doing some kind of Highload, performance is not a functional requirement. First the product should work correctly, and only then should it work fast, and only if necessary. Hence the huge number of places that nobody has worked on: a) because nobody got around to looking, b) because there is no need to look there and nobody cares, c) because nobody worried about it before, but now it has become a big enough problem. But let's not forget that the main effort must go into correctness. We have a huge amount of code where no performance engineer has set foot, but where every line is justified in terms of correctness. And let's not forget that the team working on Java SE performance now has fewer than 10 people.

- And does that feel like enough?

Kuksenko: When there are a number of highly qualified teams developing the main code, it is more than enough. Since our key engineers, who develop the compiler, the garbage collector and the class libraries, are highly qualified, we help them. Not in the sense that we push or kick them somewhere; we act as a supporting tool. If our development teams were merely average in terms of performance, we would take a more active position. Right now that is simply not necessary, because many things get done without our involvement.

- To what extent can we say that Oracle has succeeded in gathering experts in performance and in VMs for Java? Are these world-class people, or do the world-class people work somewhere else, in your estimation?

Kuksenko: In this sense, practice is the criterion of truth. If the world-class experts worked somewhere else, I think we would now be talking about some other product.

About other technologies

- And how do the things now happening in the Java platform, in terms of performance and rocket science, in terms of how cool they are, compare with the coolness of what other people do, say Google V8, or .NET, for example?

Kuksenko: I cannot say anything about .NET; I have never even seen it. I have seen Google V8, and it is quite a decent thing.

- It is no secret that many JVM engineers consider Dalvik a hack job. I do not know why, but whoever you talk to, everyone sneers that as a VM it is so-so. How did it happen that everyone seems to use Android, while VM engineers have complaints about its virtual machine?

Kuksenko: It's the job of engineers to make claims. If engineers had no complaints about a product, I would think something was wrong with either the engineers or the product. I have plenty of complaints about Java, but I will keep them to myself.

- Maybe share one or two with our viewers and listeners, from those that can be made public, things you would like to improve or change.

Kuksenko: Backward compatibility. I believe you need to throw everything out and write it all from scratch. But that is not a workable approach.

- That approach is a favorite, it seems to me, not only in performance but everywhere.

Kuksenko: Design errors in the core API slowly accumulate over the years, and it becomes inconvenient. Without even touching on how the language itself and all the systems work, I would throw out the entire core API and design a new one, more compact and correct, taking the mistakes into account. But I understand that this work is not just unrealistic, it is impossible. No one is able to design this API from scratch, so what remains is to quietly add new APIs and quietly deprecate the old ones. That's how we live.

About different JVMs

- Work on creating a runtime, your own virtual machine... There is an opinion that creating a virtual machine is a fairly trivial matter, at least the simplest kind, without optimizations and so on.

Kuksenko: Yes, and with optimizations it is not difficult either. You just take it and do it.

- Then what about the people at Oracle who have been doing this for years, spending thousands of man-years?

Kuksenko: Why thousands of man-years?

- How many, then? The VM has existed in one form or another for 20 years. How many VMs are there in the Oracle JDK now? There are different variants, after all.

Kuksenko: We have an Embedded VM, but I don't know the embedded world, so I won't say anything about what is happening there.

- Isn't that Hotspot?

Kuksenko: It is Hotspot, but it is Embedded Hotspot; a separate group of people develops it.

- Is the emphasis there on footprint or something else?

Kuksenko: The emphasis is on working well on that kind of hardware.

- That's ARM, most likely?

Kuksenko: There is ARM, and a VM for ARM. And that's all; we have one Hotspot. We still have JRockit, which will remain supported for customers for some time. It is supported by completely different people, because the support team is fairly distributed. There were, and maybe still are, people in St. Petersburg working on it, but JRockit has remained at the Java 6 level.

- So you don't work on JRockit?

Kuksenko: We don't.

- The story from early 2010: Oracle's purchase of Sun and the talk about merging JRockit and Hotspot. Do you know or remember how that decision was made and why it was made that way? What criteria could apply here?

Kuksenko: There is a company, Oracle. Oracle has two virtual machines, both good. Obviously they need to be merged. We make a decision: merge.

- But they decided to keep Hotspot and integrate the best features and solutions from JRockit into it.

Kuksenko: Yes.

- Why not the other way around? It is clear that, most likely, features have to be integrated from one into the other, because any other kind of merge is probably impossible given the complexity of both programs.

Kuksenko: It is simply impossible; the result would be something truly terrible.

- How can one analyze which solution to choose in such a case? I can't even imagine how that is done.

Kuksenko: It depends on the requirements. I don't remember very well, and don't really know, what the requirements were, but it comes down to what our customers want. That is the key question. If customers say, “We are used to Hotspot and we will never move to JRockit”, and that is 70-80% of them, then that's it, further discussion stops. If other customers come running and say, “We want JRockit, we don't want Hotspot”, then the situation is reversed. Then the question arises: “What do you like so much about JRockit, or about Hotspot, that you want to keep?” After that, once all the customer feedback has been collected and the results understood, we look at the features: how much it will cost to add features in one direction, how much in the other, the overall cycle. I think it was just a general decision that the developments on Hotspot, including the same QA cycle (00:48:

- We have heard about Metaspace, Flight Recorder and Mission Control. What else, besides these, was carried over from JRockit to Hotspot?

Kuksenko: As far as I know, work is ongoing (I don't know its status, whether it is finished or not) to make the synchronization mechanisms similar to those in JRockit. It will not be a 100% rewrite, but some ideas...

- The basic things are already done?

Kuksenko: Of course. They were done back when these VMs lived in different companies.

- There is an opinion that the complexity of the existing JIT compilers, especially the C2 compiler, is such that it already hinders further development. Hence all the projects to rewrite the JIT into something new.

Kuksenko: There is such a point of view, but there is a milder one: it is not that developing the JIT is impossible, but that the existing C2 compiler has a high entry threshold. That is, a person who wants to do something in the compiler spends a lot of time getting started before producing useful changes and improvements. First, I would not say that this impedes further development; the compiler is developing. Second, whether a high entry threshold is a plus or a minus is itself debatable. Maybe, on the contrary, it is a good filter for people.

- So that all sorts of lamers don't get in?

Kuksenko: That can be discussed. It is a political or organizational discussion, about counting the man-hours spent on development. I would not want to delve into it.

- That is, there are no major engineering problems?

Kuksenko: There are engineering issues: maybe do it this way, maybe do it that way. The banal third-party dependency on C and C++ compilers is starting to get annoying.

- And in what direction is all this heading? Writing it in Java?

Kuksenko: For example, there is the Graal project: take it, download it, play with it, it works.

- It is probably hard to say when; it is clear, at least, that it will not be in Java 9.

Kuksenko: No. It is a third-party project run under a Swiss university, with Oracle's participation among others. People actively use it for prototyping, for work on GPU acceleration, and so on. That is, it occupies its own rather large niche in research, and what will come of it remains to be seen. At the very least, it has long held its niche as a good academic tool for Hotspot-based research.

- And there is a similar project in the field of GC: Shenandoah, which RedHat proposed at one point. Do you know anything about what is happening there now?

Kuksenko: Unfortunately, I do not know any details.

- Have you looked at it? How interesting is it?

Kuksenko: I know that RedHat is adapting Shenandoah to Hotspot, and some work is underway.

- There is also the story that five years ago very high hopes were placed on the Garbage First collector, but if you talk to people at conferences, almost everyone runs the CMS collector in production. Is that so, judging by the people you talk to and what they tell you, and if so, why?

Kuksenko: I think, first of all, out of laziness. An ideal product should have one button: “On”.

- Have the Garbage First engineers, the people who develop it, managed to achieve what they wanted?

Kuksenko: It is hard for me to say; I don't work with Garbage First and don't know what is happening in that area. For the upcoming JavaOne this fall, talks have been announced, including by people from our performance team, about what is really happening with Garbage First. Let's wait for JavaOne and see the announced results. Of course, I could write to the person right now and ask, “What are you going to talk about in the fall, what happened there?”, but right now, in this discussion, I don't know the answer.

Like anything else, Garbage First needed finishing; some work had to be completed. That work was carried out, and some results were achieved. Obviously, with the right process, Garbage First will sooner or later behave normally. I think it will be fairly soon.

Why don't people want to leave CMS? First, because there are plenty of ready-made recommendations for CMS tuning. You don't have to think: you open the book, read, and follow the algorithm: “If this, then do that; if that, then do this.” Comparable large, well-developed descriptions of which knobs to turn for Garbage First do not yet exist; they are only now being written. That is the first reason people don't move to Garbage First: with CMS I know what to turn, and here I don't yet, so switching to something new is scary. The second reason is that many are still too lazy to switch from Java 7 to Java 8, and Java 7...

- Is almost at end of life already... It was announced that the last public update would be in April.

Kuksenko: Something like that. And the third reason is laziness.

- Yes. Maybe we should have started with that.

Kuksenko: Programmers always have two kinds of laziness: plain native laziness, which is naturally present, and the second kind, which follows the principle “The sun rises in the east. Does it work? It works. Don't touch it.”

- From the point of view of the development approach, there are, roughly speaking, imperative things, where we say how we want something done, and declarative ones, where we say what we want done. What were the parameters in the old GCs? The size of this region, the size of that region; that is, an answer to the question of how we want it done. In Garbage First there are ways to declare what we want, for example the pause length. That is, we are already working with target user-level metrics. How interesting is this direction for development?

Kuksenko: It is obvious to everyone; we ourselves know what the ideal product should look like. We even showed on a slide in our presentation that the ideal product should have one button: “On”. You turn it on, and it works. It should not have 10 thousand levers, knobs and buttons for you to start twiddling. And this is not even a question of a functional or imperative paradigm; it is a matter of ending up, sooner or later, with one big “On” button. Not that this is a goal we will reach and then all retire; perhaps the goal is unattainable, but it is worth striving for. And the fewer the parameters, the less room for games, the better for everyone. Because if we now take the Java options of the Hotspot virtual machine that control the operation of the Garbage Collector, and we just want to understand,

- That is a story about heuristics. I understood what you wanted to say, so let me go off slightly to the side. There is a certain product: Hotspot, a GC, a runtime, or a chair. It has engineering characteristics: it is made of wood, it weighs so much, it is nailed together, and so on. And it has consumer characteristics: it is such-and-such a color, it falls apart or doesn't, its splinters catch on your backside or don't, and so on. It seems that the old GC parameters, all those kilobytes and kilobits, are engineering characteristics, while these milliseconds are characteristics the user perceives.

Kuksenko: Yes. Here is another analogy. We are driving an old-fashioned car: here we press the gas pedal, here we change gears, here we also adjust the fuel intake because it doesn't work well, here we still have to tighten some handle, I don't really know which. Or we get into another car, turn the ignition key, press the gas a little to get going, then turn on the cruise control and say, “I want 60 km/h”, and go. Which is more convenient?

- And how it copes is its own business?

Kuksenko: That is absolutely not our concern. If I set my requirement to 60 km/h and it gives me 60 km/h, why should I find out what happens beyond that? Obviously, for the end user such a product will be quicker to use. Yes, of course, for a number of auto mechanics who used to sit in their garages adjusting the ignition advance or something else, this means loss of work, and for CMS tuners it means loss of work too, but in my opinion it would in the end be more convenient for everyone.
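
The contrast between “how” and “what” in this exchange maps onto real Hotspot options: `-Xmn`, `-XX:NewSize`, `-XX:NewRatio` and `-XX:SurvivorRatio` size regions of the heap, while `-XX:MaxGCPauseMillis` and `-XX:GCTimeRatio` state a target. The following is only a minimal sketch of that distinction, not anything shown in the interview: the class name `GcKnobs`, the `classify` helper and the labels “goal” and “mechanism” are hypothetical, and the list of flags is deliberately incomplete. It prints the collectors the running JVM uses and sorts the arguments the JVM was started with into the two groups.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper class, for illustration only.
class GcKnobs {

    // Labels a JVM argument as a declared goal ("what we want"),
    // a mechanism setting ("how to do it"), or anything else.
    // The flag lists are a small, incomplete sample.
    static String classify(String arg) {
        if (arg.startsWith("-XX:MaxGCPauseMillis=")
                || arg.startsWith("-XX:GCTimeRatio=")) {
            return "goal";
        }
        if (arg.startsWith("-Xmn")
                || arg.startsWith("-XX:NewSize=")
                || arg.startsWith("-XX:NewRatio=")
                || arg.startsWith("-XX:SurvivorRatio=")) {
            return "mechanism";
        }
        return "other";
    }

    public static void main(String[] args) {
        // Names of the collectors active in this JVM.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println("collector: " + gc.getName());
        }
        // Arguments passed to the JVM itself (not to main).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        for (String arg : jvmArgs) {
            System.out.println(classify(arg) + ": " + arg);
        }
    }
}
```

Started, for example, as `java -XX:+UseG1GC -XX:MaxGCPauseMillis=200 GcKnobs`, it would report the pause target as a goal and the collector selection as “other”; the collector names printed depend on the JVM version and the options chosen.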

- If we are moving toward fewer levers, and we want to arrive at a point where the user...

Kuksenko: We are not moving toward fewer levers. We are saying: “These levers stay on the dashboard, and those go under the hood. Open the hood and turn them, please.”

- So the ability to turn them still remains?

Kuksenko: Of course.

- It's just that, on the one hand, there is a movement toward fewer knobs, and on the other hand, what you do as an evangelist at conferences and JUGs, and even right now, and what we do as conference organizers, is try to show the insides. How much do the insides need to be shown, given that people have fewer and fewer levers, to stay with the same terminology, for influencing how their runtime behaves?

Kuksenko: First of all, I would not claim that people have fewer levers over how the runtime behaves and how it all works.

I would say something else. If there are requirements for the performance of the runtime, then we can sit down, understand how it behaves, build a performance model, understand what can be improved, and produce a result. The current state, with 100 thousand knobs, leads to people coming and saying: “No, we don't want to understand, or we don't have time to understand, how it all works; we'd rather turn the knobs and see what happens.” That is a completely valid way of acting in certain cases, but it does not always work, to the point that there is another layer of people who do not understand in detail how it works but know how to turn the knobs. They know there are knobs that can be turned, and they go around turning them. This complicates the whole chain of work. Yes, of course, performance engineers like that,

About the talk at JPoint

- I'd like to ask you to say a little about the talk you will give at JPoint. It seems to have the same title as the one at Joker. Will it be the same talk, or a reworked one? What is the talk about, and how will the two differ?

Kuksenko: The talk will again focus on performance analysis, on performance problems, this time with the help of the hardware: how to see what happens in the real hardware while our program runs, and what specifically can be improved for the program's performance.

The core of the talk will almost completely coincide with what was given at Joker, but I realized long ago that a talk cannot be given just once, because each time you give it a second or third time, you correct mistakes, add something more interesting, and so on. That is, the talk becomes more useful: new information appears in it, unnecessary information is removed, and it finally becomes interesting to you yourself. Having given a talk two or three times, you can bring it to a better state, where it can be put on the shelf and left unchanged. As a rule, a talk given once is not yet in that state, and there is always something to improve.

- Will your fans who heard you in St. Petersburg and will come to hear you in Moscow in three weeks learn something new from it?

Kuksenko: I hope so. At the very least there will be less water. The talk I gave in St. Petersburg was 40% water. I think we need to leave five percent and add more examples and a slightly clearer explanation of what is going on than there was.

- Finally, give our viewers some performance advice.

Kuksenko: There can only ever be one piece of performance advice, and I give it at every conference: if you want to work on performance, you need to understand and know how everything works, everything and everywhere.

PS:
Sergey's talks discussed in the article have been carefully collected by us here.
The announcement of his talk at JPoint is here.
