Looking for Performance: Monitoring JVM Performance under Linux with BPF

    Sasha Goldstein, a specialist in low-level application optimization, will step away from his usual .NET theme in his talk at JPoint and discuss tools that help you fight for the performance of Java applications under Linux. We wanted to find out in advance what these tools are, who needs them and why, so we interviewed Sasha.

    JUG.Ru Group: Please tell us a few words about yourself and your work.

    Sasha Goldstein: My name is Sasha Goldstein. For the last 10 years I have been the CTO of Sela, an Israeli consulting company.
    My work focuses on performance optimization, production diagnostics, monitoring and all kinds of low-level tasks.
    My typical work week is filled with a variety of tasks: I teach, fix bugs or performance problems for clients, and also work on internal projects. I also serve on the program committees of a couple of conferences: our own SDP (Tel Aviv, Israel), as well as DotNext (Moscow and St. Petersburg, Russia), which takes a surprising amount of time.

    "The performance of most applications is not determined by the hardware or the runtime" - Sasha Goldshtein on monitoring Java performance under Linux

    JUG.Ru Group: You usually talk a lot about .NET performance. What pushed you towards Java?

    Sasha Goldstein: Indeed, most of my work is related to C# and C++ under Windows. I have spent a lot of time optimizing .NET applications and resolving their performance issues. However, low-level optimization and debugging have a lot in common across technologies: the tools may have different names, but the general principles, methodology, and thought process are the same. In the last couple of years, I have become closely acquainted with BPF, the Linux tracing framework, and this prompted me to use BPF to analyze JVM performance.

    JUG.Ru Group: How does the fight for performance in Java differ from that in .NET?

    Sasha Goldstein: Like I said, many things are identical. The performance of most applications is determined not by the hardware or the runtime (JVM, CLR, Python, or something else), but by the environment: how databases are accessed, how fast the disk is, and how network requests are processed. For this class of applications, by and large, it does not matter which runtime you use. When it comes to low-level optimization, for example minimizing memory consumption or optimizing individual CPU-bound algorithms, there are situations in which the difference between platforms really matters, especially if you need to tune the runtime for your application. In general, the JVM is more flexible than the CLR; and it seems to me

    JUG.Ru Group: In what cases is the fight for performance really needed? Is this task "expensive" in terms of time? What factors clearly indicate that there are performance problems?

    Sasha Goldstein: Often, performance is not a functional requirement that has to be met. But even if you are not building real-time systems or ultrafast client applications, there are probably some minimum (reasonable) speed thresholds below which your users will not be willing to go. For example, a web API that takes 5 seconds to process a login request is likely to infuriate people. There is also a cost issue: optimizing performance usually means that you will need fewer hardware resources, which means direct, immediate cost savings, given the cloud-first policy adopted by many.
    I hope that most people have a process for defining performance targets and at least a simple way to monitor and verify them as development moves forward.



    JUG.Ru Group: Where should one start when investigating performance issues?

    Sasha Goldstein: The critical point is having a good description of the system, for example a functional block diagram. When you understand the "mechanics" of the system, that is, what the main components are and how they are interconnected, it is much easier to guess where the bottlenecks are and where to start looking for a problem. Tools are secondary. Before firing up a bunch of tools, you need to understand what the various resources are, how they can become overloaded, and how to test your hypotheses in order to make progress. For example, you can spend days optimizing the CPU performance of some sorting algorithm, only to find that 99% of the time is spent querying the database, so a more or less efficient sort makes no real difference to the total execution time.
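
    To put numbers on the sorting example in this answer, here is a minimal sketch that applies Amdahl's law. The 99%/1% split comes from the answer; the function name and the 10x and 2x speedup factors are hypothetical and serve only as illustration.

```python
def overall_speedup(fraction: float, local_speedup: float) -> float:
    """Amdahl's law: speedup of the whole request when a part that takes
    `fraction` of the total time becomes `local_speedup` times faster."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if local_speedup <= 0:
        raise ValueError("local_speedup must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


# Assumed split, following the example above: 99% database, 1% sorting.
sort_10x = overall_speedup(fraction=0.01, local_speedup=10.0)
db_2x = overall_speedup(fraction=0.99, local_speedup=2.0)

print(f"sorting 10x faster -> request {sort_10x:.3f}x faster")  # 1.009x
print(f"database 2x faster -> request {db_2x:.3f}x faster")     # 1.980x
```

    Under these assumptions, even a tenfold faster sort improves the request by less than 1%, while a modest improvement in the database query almost halves the total time.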

    JUG.Ru Group: Can you talk about the main capabilities of the toolkit using the example of BPF?

    Sasha Goldstein: BPF is a powerful kernel engine, introduced in recent versions of the Linux kernel, that allows dynamic tracing programs to be injected into the kernel. These programs are checked for safety and cannot crash the system, and they do not require compiling and loading kernel modules. As a result, we have a tracing framework that can work very close to the source of the main events, in particular network packet processing, disk requests, hardware interrupt handling, and the like. Anticipating your question, I'll note that there are also some JVM-specific events that I will cover in my talk at JPoint: garbage collection, object allocation, blocking on monitors, and many others.
    Moreover, BPF lets you build tools in which aggregation happens inside the tracer. For example, if you are interested in a latency histogram (say, of HTTP request latencies), you do not need to dump a million events and then post-process them to calculate the histogram. Instead, your BPF program aggregates in real time and hands over only the final result for analysis.
    There is a very powerful toolkit that is being developed by people from Facebook, Netflix, PLUMgrid (VMware) and other companies (including my own modest contributions :-)).
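
    The idea of aggregating inside the tracer can be illustrated without any kernel code. The sketch below is plain Python, not a BPF program: it consumes latency events one at a time and keeps only a handful of power-of-two bucket counters, which is the general shape of a latency histogram. The function names and the sample latencies are hypothetical.

```python
from collections import Counter
from typing import Iterable


def log2_bucket(latency_us: int) -> int:
    """Index i of the power-of-two bucket [2**i, 2**(i+1)) holding the value.
    Values 0 and 1 both land in bucket 0."""
    return max(latency_us, 1).bit_length() - 1


def aggregate(latencies_us: Iterable[int]) -> Counter:
    """Consume events one by one, keeping only per-bucket counters
    instead of storing every event for later post-processing."""
    histogram = Counter()
    for latency in latencies_us:
        histogram[log2_bucket(latency)] += 1
    return histogram


def render(histogram: Counter) -> str:
    """Format the bucket counters as 'low -> high : count' lines."""
    lines = []
    for bucket in sorted(histogram):
        low = 0 if bucket == 0 else 2 ** bucket
        high = 2 ** (bucket + 1) - 1
        lines.append(f"{low:>8} -> {high:<8} : {histogram[bucket]}")
    return "\n".join(lines)


# Hypothetical HTTP request latencies in microseconds.
sample = [3, 5, 6, 120, 130, 900, 1100, 1500, 40000]
hist = aggregate(sample)
print(render(hist))
```

    Memory use depends on the number of buckets rather than on the number of events, which is why this style of aggregation scales to millions of events.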

    JUG.Ru Group: How difficult is it to implement and master the workflow?

    Sasha Goldstein: BPF is not difficult to use, because there are many ready-made tools that can be run with a single command line to identify performance problems. For example, there is a tool called mysqld_qslower that displays slow MySQL queries.
    The only problem is that you need a recent Linux kernel in order to use BPF tools. Most of the functionality was included in Linux 4.1 and 4.4 (which you get in Ubuntu 16.04, for example), but other functions require even newer versions, in particular 4.9, which most people do not yet run in production. This can, of course, be worked around by updating only the kernel; that is how companies like Facebook, Netflix and others got all the benefits of BPF.
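
    As a rough illustration of the version requirements mentioned in this answer, the following sketch classifies a kernel release string against the 4.1, 4.4 and 4.9 thresholds. The function names and tier labels are hypothetical, and the version number alone is only a hint: what is actually available also depends on the kernel configuration and on distribution backports.

```python
import platform
import re
from typing import Tuple


def parse_kernel_release(release: str) -> Tuple[int, int]:
    """Extract (major, minor) from a string such as '4.4.0-21-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def bpf_readiness(release: str) -> str:
    """Map a kernel release to a tier, using the thresholds from the interview.
    The tier names are illustrative labels, not official terminology."""
    version = parse_kernel_release(release)
    if version >= (4, 9):
        return "newest"    # the newer functions mentioned above
    if version >= (4, 4):
        return "most"      # most of the functionality
    if version >= (4, 1):
        return "partial"   # only the first part of the functionality
    return "too-old"


for example in ["3.13.0-24-generic", "4.4.0-21-generic", "4.9.0"]:
    print(f"{example:>20} -> {bpf_readiness(example)}")

# On a Linux machine, also classify the running kernel.
if platform.system() == "Linux":
    print(f"this machine: {bpf_readiness(platform.release())}")
```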



    JUG.Ru Group: Can you give an example of a typical performance pitfall that BPF-based tools can help with?

    Sasha Goldstein: BPF tools are useful for diagnosing CPU-bound applications, time spent blocked on I/O, file access problems, slow database queries, network requests, garbage collection: a very wide range of problems, in fact. I will cover many of them in my talk.

    JUG.Ru Group: Are there tasks that only this toolkit can deal with?

    Sasha Goldstein: Yes. When you need to process a large number of events with a tracer, BPF is indispensable. Even fairly simple scenarios, such as CPU profiling, can be made much more efficient by using the profiling support in BPF. In most cases, tasks such as handling every incoming request and aggregating latency information are simply not practical with other performance analysis tools.
    In my talk, we will look at lock monitoring, DNS resolution, MySQL queries, and a bunch of other problems that are typical of production systems.

    JUG.Ru Group: Your talk is quite practical. Who is it primarily aimed at?

    Sasha Goldstein: My talk is intended for developers and operations engineers who build software for Linux. The focus will be on the JVM (because it's JPoint!), so all the examples will be in Java. We'll look at a bunch of examples that I hope will be useful for diagnosing problems in your own systems, and even if you don't have a sufficiently recent version of the Linux kernel today, you will in the very near future. I think every Linux developer will one day find a use for BPF tools.



    If you have questions, suggestions or remarks, go ahead and ask: Sasha is ready to answer them in the comments.

    P.S. In addition to Sasha, Alexey Shipilev (@shipilev), Sergey Kuksenko (Walrus), Vladimir Sitnikov (vladimirsitnikov) and Nikolai Alimenkov (xpinjection) will talk about performance at JPoint 2017. What exactly? See the list of talks.

    And if you live in Siberia and can't get to Moscow, we recommend that you take a closer look at JBreak 2017.


    UPD: On the fifth of November in St. Petersburg we are holding a training with Sasha, "Profiling JVM Applications in Production". Registration and terms of participation are on the site.
