What Java thread is loading my processor

What do you do when your Java application consumes 100% of the CPU? It turns out you can easily find problematic threads using the built-in Unix and JDK utilities. No profiling tools are required.
For testing purposes, we will use a simple program:

public class Main {
    public static void main(String[] args) {
        new Thread(new Idle(), "Idle").start();
        new Thread(new Busy(), "Busy").start();
    }
}
class Idle implements Runnable {
    @Override
    public void run() {
        try {
            TimeUnit.HOURS.sleep(1);
        } catch (InterruptedException e) {
        }
    }
}
class Busy implements Runnable {
    @Override
    public void run() {
        while(true) {
            "Foo".matches("F.*");
        }
    }
}

As you can see, in this piece of code 2 threads are launched. Idle does not consume CPU resources (
remember, sleeping threads consume memory, but not a processor), while Busy heavily loads the CPU by parsing regular expressions and other complex processes.
How can we quickly find the problematic piece of code in our program? First, we will use 'top' to find the process id (PID) of the java application. It is quite simple:
top -n1 | grep -m1 java

We will see the first line of output 'top' containing the word 'java':
22614 tomek     20   0 1360m 734m  31m S    6 24.3   7:36.59 java 

The first column is the PID. Unfortunately, it turned out that 'top' uses ANSI escape codes for colors . Fortunately, I found a perl script to remove extra characters and finally extract the PID.
top -n1 | grep -m1 java | perl -pe 's/\e\[?.*?[\@-~] ?//g' | cut -f1 -d' '

Returns:
22614

Now that we know the PID of the process, we can use top -H to search for problematic Linux threads. The -H switch displays a list of all threads, and now the PID column is the thread ID:
top -n1 -H | grep -m1 java
top -n1 -H | grep -m1 java | perl -pe 's/\e\[?.*?[\@-~] ?//g' | cut -f1 -d' '

Returns:
25938 tomek     20   0 1360m 748m  31m S    2 24.8   0:15.15 java
25938

So we have the JVM process ID and Linux thread ID. And now for the fun part: if you look at the output of jstack (available in the JDK), each thread has an NID that is written after the name.
Busy' prio=10 tid=0x7f3bf800 nid=0x6552 runnable [0x7f25c000]
    java.lang.Thread.State: RUNNABLE
        at java.util.regex.Pattern$Node.study(Pattern.java:3010)

The parameter nid = 0x6552 is the Hex representation of the stream ID:
printf '%x' 25938
6552

Now combine everything into one script:
#!/bin/bash
PID=$(top -n1 | grep -m1 java | perl -pe 's/\e\[?.*?[\@-~] ?//g' | cut -f1 -d' ')
NID=$(printf '%x' $(top -n1 -H | grep -m1 java | perl -pe 's/\e\[?.*?[\@-~] ?//g' | cut -f1 -d' '))
jstack $PID | grep -A500 $NID | grep -m1 '^$' -B 500

The last line starts jstack with a specific PID and displays a stream with a matching NID. That same flow will be problematic.
We carry out:
./profile.sh
"Busy" prio=10 tid=0x7f3bf800 nid=0x6552 runnable [0x7f25c000]
    java.lang.Thread.State: RUNNABLE
        at java.util.regex.Pattern$Node.study(Pattern.java:3010)
        at java.util.regex.Pattern$Curly.study(Pattern.java:3854)
        at java.util.regex.Pattern$CharProperty.study(Pattern.java:3355)
        at java.util.regex.Pattern$Start.(Pattern.java:3044)
        at java.util.regex.Pattern.compile(Pattern.java:1480)
        at java.util.regex.Pattern.(Pattern.java:1133)
        at java.util.regex.Pattern.compile(Pattern.java:823)
        at java.util.regex.Pattern.matches(Pattern.java:928)
        at java.lang.String.matches(String.java:2090)
        at com.blogspot.nurkiewicz.Busy.run(Main.java:27)
        at java.lang.Thread.run(Thread.java:662)


A source

Also popular now: