Tuesday, January 4, 2011

Flat and Graph Profiles for JRuby 1.6

Sometimes it's the little things that make all the difference in the world.

For a long time, we've extolled the virtues of the amazing JVM tool ecosystem. There's literally dozens of profiling, debugging, and monitoring tools, making JRuby perhaps the best Ruby tool ecosystem you can get. But there's a surprising lack of tools for command-line use, and that's an area many Rubyists take for granted.

To help improve the situation, we recently got the ruby-debug maintainers to ship our JRuby version, so JRuby has easy-to-use command-line Ruby debugging support. You can simply "gem install ruby-debug" now, so we'll stop shipping it in JRuby 1.6.

We also shipped a basic "flat" instrumented profiler for JRuby 1.5.6. It's almost shocking how few command-line profiling tools there are available for JVM users; most require you to boot up a GUI and click a bunch of buttons to get any information at all. Even when there are tools for profiling, they're often bad at reporting results for non-Java languages like JRuby. So we decided to whip out a --profile flag that gives you a basic, flat, instrumented profile of your code.

To continue this trend, we enlisted the help of Dan Lucraft of the RedCar project to expand our profiler to include "graph" profiling results. Dan previously implemented JRuby support for the "ruby-prof" project, a native extension to C Ruby, in the form of "jruby-prof" (which you can install and use today on any recent JRuby release). He was a natural to work on the built-in profiling support.

For the uninitiated, "flat" profiles just show how much time each method body takes, possibly with downstream aggregate times and total aggregate times. This is what you usually get from built-in command-line profilers like the "hprof" profiler that ships with Hotspot/OpenJDK. Here's a "flat" profile for a simple piece of code.

~/projects/jruby ➔ jruby --profile.flat -e "def foo; 100000.times { (2 ** 200).to_s }; end; foo"
Total time: 0.99

total self children calls method
0.99 0.00 0.99 1 Object#foo
0.99 0.08 0.90 1 Fixnum#times
0.70 0.70 0.00 100000 Bignum#to_s
0.21 0.21 0.00 100000 Fixnum#**
0.00 0.00 0.00 145 Class#inherited
0.00 0.00 0.00 1 Module#method_added

A "graph" profile shows the top N call stacks from your application's run, breaking them down by how much time is spent in each method. It gives you a more complete picture of where time is being spent while running your application. Here's a "graph" profile (abbreviated) for the same code.

~/projects/jruby ➔ jruby --profile.graph -e "def foo; 100000.times { (2 ** 200).to_s }; end; foo"
%total %self total self children calls name
100% 0% 1.00 0.00 1.00 0 (top)
1.00 0.00 1.00 1/1 Object#foo
0.00 0.00 0.00 145/145 Class#inherited
0.00 0.00 0.00 1/1 Module#method_added
1.00 0.00 1.00 1/1 (top)
99% 0% 1.00 0.00 1.00 1 Object#foo
1.00 0.09 0.91 1/1 Fixnum#times
1.00 0.09 0.91 1/1 Object#foo
99% 8% 1.00 0.09 0.91 1 Fixnum#times
0.70 0.70 0.00 100000/100000 Bignum#to_s
0.21 0.21 0.00 100000/100000 Fixnum#**
0.70 0.70 0.00 100000/100000 Fixnum#times
69% 69% 0.70 0.70 0.00 100000 Bignum#to_s
0.21 0.21 0.00 100000/100000 Fixnum#times
21% 21% 0.21 0.21 0.00 100000 Fixnum#**
0.00 0.00 0.00 145/145 (top)
0% 0% 0.00 0.00 0.00 145 Class#inherited
0.00 0.00 0.00 1/1 (top)
0% 0% 0.00 0.00 0.00 1 Module#method_added

As you can see, you get a much better picture of why certain methods are taking up time and what component calls are contributing most to that time.

We haven't settled on the final command-line flags, but look for the new graph profiling (and the cleaned-up flat profile) to ship with JRuby 1.6 (real soon now!)


  1. Dave: Yep, that works just fine too, and JRuby can be set to force methods to compile before invocation, so they'll show up in Java tools.

    The problem with hprof, though, is that it's not possible to narrow its focus to just Ruby methods (or to exclude core JDK or JRuby implementation methods). As a result, hprof instrumented profiling is *really slow* since it's profiling every piece of Java code loaded.

    If you can get past that, the next issue is that our generated bytecode for Ruby methods often has long, mangled names that would be hard for a Rubyist to interpret, and hprof doesn't show filenames or line numbers.

    Ultimately I'd love to see better JVM tool support for alternative languages with "peculiar" bytecode, but a JRuby-specific profiler will probably always be needed.

  2. William: Of course I know about JXInsight and OpenCore, and I didn't mean to imply that there are no such command-line tools for profiling. I merely wanted to lament the fact that there's so *few* command-line tools. Out of all the dozens of profiling options for the JVM, I know of exactly three command-line-only tools: "hprof", which is very limited, slow, and profiles *everything*; The Java Interactive Profiler (JIP) library, which seems to have been abandoned (and so we stopped shipping it with JRuby); and OpenCore. Of these, only OpenCore is really useful for instrumented profiling.

    The main reason we don't ship OpenCore as the built-in profiler is the fact that it's not open source and therefore not distribution-compatible with JRuby's dist licenses. But there's no reason JRuby users shouldn't look into it.

    OpenCore really is an amazing tool, and we'd love to promote it more. Perhaps there's a way it could be easily bundled up as a Ruby gem so that JRuby users could have easier access to it? It could be the "professional grade" profiler while we continue to ship our own "basic" profiling logic as part of JRuby.

    I would also like to standardize the hooks we put in place for profiling in JRuby, so that OpenCore can use "blessed" entry points and hopefully have a better guarantee of working from version to version.

    Toss me an email :) I didn't mean to slight OpenCore or JXInsight in any way, and you know I'd love to get them in front of more JRuby users. We just wanted a dead simple (and perhaps not as scalable) built-in option, so people have an easily accessible "first tool" to reach for.

  3. How can I redirect the output? In windows > isn't working.