New JRuby Compiler: Progress Updates

I've been cranking away on the new compiler. I'm a bit tired and planning to get some sleep, but I've gotten the following working:

all three kinds of calls
local variables
string, fixnum, array literals
'def' for simple methods and arg lists
closures

Now that last item comes with a big caveat: I have no way to pass closures I create. The code is compiling, basically as a Closure class that you initialize with local variables and invoke. But since block management in JRuby is still heavily dependent on ThreadContext nonsense, there's no easy way to pass it to a given method. So the next step to getting closures to work in the compiler is to start passing them on the call path as a parameter, much like we do for ThreadContext.

I've managed to keep the compiler fairly well isolated between node walking and bytecode generation, though the bytecode generator impl I have currently is getting a little large and cumbersome. It's commented to death, but it's pushing 900 LOC. It needs some heavy refactoring. However, it's behind a fairly straightforward interface, so the node-walking code doesn't ever see the ugliness. I believe it will be much easier to maintain, and it's certainly easier to follow.

In general, things are moving along well. I'm skipping edge cases for some nodes at the moment to get bulk code compiling. There's a potential that as this fills out more and handles compiling more code, it could start to be wired in as a JIT. Since it can fail gracefully if it can't compile an AST, we'd just drop back to interpreted mode in those cases.

So that's it.

...

Ok, ok, here's performance numbers. Twist my arm why don't you.

(best times only)

The new method dispatch benchmark tests 100M calls to a simple no-arg method that returns 'self', in this case Fixnum#to_i. The first part of the test is a control run that just does 100M local variable lookups.

method dispatch, control (var access only):
 interpreted, client VM: 1.433
 interpreted, server VM: 1.429
 ruby 1.8.5: 0.552
 compiled, client VM: 0.093
 compiled, server VM: 0.056

Much better. The compiler handles local var lookups using an array, rather than going through ThreadContext to get a DynamicScope object. Much faster, and HotSpot hits it pretty hard. At worst it takes about 0.223s, so it's faster than Ruby even before HotSpot gets ahold of it. The second part of the test just adds in the method calls.

method dispatch, test (with method calls):
 interpreted, client VM: 5.109
 interpreted, server VM: 3.876
 ruby 1.8.5: 1.294
 compiled, client VM: 3.167
 compiled, server VM: 1.932

Better than interpreted, but slow method lookup and dispatch is still getting in the way. Once we find a single fast way to dynamic dispatch I think this number will improve a lot.

So then, on to the good old fib tests.

recursive fib:
 interpreted, client VM: 6.902
 interpreted, server VM: 5.426
 ruby 1.8.5: 1.696
 compiled, client VM: 3.721
 compiled, server VM: 2.463

Looking a lot better, and showing more improvement over interpreted than the previous version of the compiler. It's not as fast as Ruby, but with the client VM it's under 2x and with the server VM it's in the 1.5x range. Our heavyweight Fixnum and method dispatch issues are to blame for the remaining performance trouble.

iterative fib:
 interpreted, client VM: 17.865
 interpreted, server VM: 13.284
 ruby 1.8.5: 17.317
 compiled, client VM: 17.549
 compiled, server VM: 12.215

Finally the compiler shows some improvement over the interpreted version for this benchmark! Of course this one's been faster than Ruby in server mode for quite a while, and it's more a test of Java's BigInteger support than anything else, but it's a fun one to try.

All the benchmarks are available in test/bench/compiler, and you can just run them directly. If you like, you can open them up and see how to use the compiler yourself; it's pretty easy. I will be continuing to work on this after I get some sleep, but any feedback is welcome.

Written on January 4, 2007