Friday, October 26, 2007

Easy JRuby 1.0.2 Bugs

For those of you looking to start helping JRuby along, here's a few open 1.0.2 bugs that would be pretty easy for a newb to look at. They'd also help you get your first taste of JRuby internals. Good eatin!

This should help get you started:
FYI: Most of these are also 1.1 bugs, so you'll be helping two releases at once (provided you fix them and provide patches for both branches, of course!)

This is fairly simple, isolated core class code. Have a look at src/org/jruby/

There's a patch here that looks pretty clean, but it needs updating to apply cleanly to trunk and the 1.0 branch.

We have File.sysopen defined, but for some reason we don't have IO.sysopen. If you can't fix it, at least do some research as to *why*.

This one throws a NullPointerException. Those are always really ugly, really broken, and usually really easy to figure out and fix. The spec in question is our (oldish) copy in test/externals/rubinius/spec/core.

This one just needs proper error handling, and there's a preliminary patch provided.

I'm not sure if we can support these, but if we can we should. If we can't, we should raise an error. Either way it's probably not hard to resolve.

There's a patch here that just needs a little more work.

I don't even understand what's happening in this one, because I keep falling asleep reading the description. I think it's probably easy.

Again, there's a patch here that appears to just need a bit more work.

This one has all the background work done on the JRuby side. We're simply not using the right algorithm for Range#step over character ranges. A quick peek at MRI source or a little insight on how to "just fix it" is all that's needed here, and the code is pretty simple. (Update: Fixed, thanks to Mathias Biilmann Christensen.)

This has a patch that's been applied to trunk, but it didn't apply cleanly to the 1.0 branch. Tidy it up, and it's done.

...but Kernel#trap does. Huh? (Update: Fixed! Ola had already repaired it, but not closed the bug.)

This is pretty straightforward...we just don't have Q/q implemented. (Update: Fixed, thanks to Riley Lynch.)

Thursday, October 25, 2007

JRuby 1.1 on Rails: More Performance Previews

Nick Sieger, a Sun coworker and fellow JRuby core team member, has posted the results of benchmarking his team's large Rails-based project on MRI and the codebase soon to be released as JRuby 1.1 (trunk). Read it here:

Now there's a couple things I get out of this post:
  1. JRuby on Rails is starting, at least for this app, to surpass MRI + Mongrel for simple serial-request benchmarks. And this is a week before the 1.1 beta release, with all the continuing performance enhancements we know are still out there. I think we're gonna make it, folks. If this continues, there's no doubt in my mind JRuby 1.1 will run Rails fastest.
  2. JRuby on Rails will perform at least as well as MRI + Mongrel for the app Nick and his teammates are building. Several months ago they committed to eventually deploying on JRuby, and if you knew what I know about this app, you'd know why that scared the dickens out of me. But I'm glad the team had faith in JRuby and the JRuby community, and I'm glad we've been able to deliver.
I hope we're approaching a time when people believe that JRuby is the real deal--an excellent choice for running Rails now and potentially the best choice in the near future. There's always more work to do, but it's good to see the effort paying off.

Give it a try today, why not?

Tuesday, October 23, 2007

Help Us Complete JRuby 1.1 and 1.0.2 Releases

We're looking to do releases of a 1.1 beta and 1.0.2 over the next couple weeks, and we're hoping to pull in help from the community to turn over as many bugs as possible. A lot of the bugs we'd like to fix for each release wouldn't be very difficult for folks to get into, even if you only have a bit of Ruby experience and a good Java background. Currently we're looking at the following counts:
  • JRuby 1.0.2: 54 scheduled - These are almost exclusively compatibility issues, though there's a few minor performance items and several stability fixes.
  • JRuby 1.1: 147 scheduled - Almost all the 1.0.2 bugs are in here, plus many additional performance enhancements and bug fixes that can't be cleanly/easily backported to 1.0.2.
  • Post 1.1: 7 JRuby 1.x, 71 unscheduled - These bugs are in danger of getting punted to a future release, so if you have a bug in these lists or you see something you'd like to fix, now's the time.
The 1.0.2 release is basically a maintenance release to the 1.0 branch, and as such includes primarily compatibility and stability fixes. There's a couple minor perf issues that will be resolved, generally only where they were considered so slow as to be "broken".

The big story is going to be 1.1. It will include a massive number of performance enhancements and bug fixes. It will include a complete Ruby-to-Java bytecode compiler, which is responsible for many of the performance improvements. And it should also include a completely reworked regular expression subsystem that eliminates one of the last really major bottlenecks running Rails code. We may not have that done for the 1.1 beta release at RubyConf, but it will be there for 1.1 final around a month later.

So if you've been looking for a chance to get involved in JRuby, hop on the mailing lists or start browsing the bug lists above, and feel free to email me directly if you have questions about any specific bug. JRuby needs your support!

Wednesday, October 17, 2007

Another Performance Discovery: REXML

I've discovered a really awful bottleneck in REXML processing.

Look at these results for parsing our build.xml:
read content from stream, no DOM
2.592000 0.000000 2.592000 ( 2.592000)
1.326000 0.000000 1.326000 ( 1.326000)
0.853000 0.000000 0.853000 ( 0.853000)
0.620000 0.000000 0.620000 ( 0.620000)
0.471000 0.000000 0.471000 ( 0.471000)
read content once, no DOM
5.323000 0.000000 5.323000 ( 5.323000)
5.328000 0.000000 5.328000 ( 5.328000)
5.209000 0.000000 5.209000 ( 5.209000)
5.173000 0.000000 5.173000 ( 5.173000)
5.138000 0.000000 5.138000 ( 5.138000)

When reading from a stream, the content is read in chunks, with each chunk being matched in turn. Because our current regexp engine uses char[] instead of byte[], each chunk must be decoded into UTF-16 characters, matched, and encoded back into UTF-8 bytes.

For small chunks, like those read off a stream, this decode/encode cycle is fairly quick. Here, the streamed numbers are pretty close to MRI. However, when an XML string is parsed from memory, the process goes like this:
  1. set buffer to entire string
  2. match against the buffer
  3. set buffer to post match (remainder of the string)
Now this is obviously a little inefficient, since it creates a lot of extra strings, but a copy-on-write String implementation helps a lot. However, in our case it also means that we decode/encode the entire remaining XML string for every element match. For any nontrivial file, this is *terrible* overhead.
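To put a rough number on that overhead, here is a small sketch of the three-step loop above. It is not REXML's actual code: the method name and the tag regexp are made up for illustration, and it only counts how many characters would pass through a decode/encode cycle if every match converted the whole remaining buffer.

```ruby
# Hypothetical illustration, not REXML code: walk a string tag by tag using
# the "match, then keep post_match" pattern, and count how many characters
# would be transcoded if each match converted the entire remaining buffer.
def scan_with_post_match(xml)
  buffer = xml                      # step 1: buffer is the entire string
  transcoded = 0
  tags = []
  while (m = /<[^>]+>/.match(buffer))  # step 2: match against the buffer
    transcoded += buffer.length     # whole remaining buffer converted per match
    tags << m[0]
    buffer = m.post_match           # step 3: buffer becomes the remainder
  end
  [tags, transcoded]
end

tags, cost = scan_with_post_match("<a><b></b></a>")
puts "#{tags.length} tags, #{cost} characters transcoded"
```

The input here is only 14 characters long, yet the count comes out well above that, and the gap widens roughly with the square of the document size. Reading in small chunks keeps each conversion bounded by the chunk size instead.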

So what's the fix? Here's the same second benchmark using a StringIO object passed to the parser instead, with a simple change to rexml/source.rb:

Index: lib/ruby/1.8/rexml/source.rb
--- lib/ruby/1.8/rexml/source.rb (revision 4596)
+++ lib/ruby/1.8/rexml/source.rb (working copy)
@@ -1,4 +1,5 @@
require 'rexml/encoding'
+require 'stringio'

module REXML
# Generates Source-s. USE THIS CLASS.
@@ -8,7 +9,7 @@
# @return a Source, or nil if a bad argument was given
def SourceFactory::create_from arg#, slurp=true
if arg.kind_of? String
elsif arg.respond_to? :read and
arg.respond_to? :readline and
arg.respond_to? :nil? and

New numbers:
read content once, no DOM
0.640000 0.000000 0.640000 ( 0.640000)
0.693000 0.000000 0.693000 ( 0.693000)
0.542000 0.000000 0.542000 ( 0.542000)
0.349000 0.000000 0.349000 ( 0.349000)
0.336000 0.000000 0.336000 ( 0.336000)
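If you can't patch source.rb, the same idea can be applied from the calling side by wrapping the string yourself. This is only a sketch: the TagCounter listener and count_tags method are names I made up for the example, and it assumes the REXML library is available to your Ruby.

```ruby
require 'stringio'
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical listener for the example: counts element start tags.
class TagCounter
  include REXML::StreamListener

  attr_reader :count

  def initialize
    @count = 0
  end

  # Called by the stream parser for every opening tag.
  def tag_start(_name, _attrs)
    @count += 1
  end
end

# Wrap the in-memory string in a StringIO so the parser treats it as a
# stream and reads it in pieces rather than as one big buffer.
def count_tags(xml_string)
  listener = TagCounter.new
  REXML::Document.parse_stream(StringIO.new(xml_string), listener)
  listener.count
end

puts count_tags('<project><target name="a"/><target name="b"/></project>')
```

Whether this helps on a given Ruby implementation depends on how its regexp engine handles strings, so measure before and after.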

This is a perfect indication of why JRuby's Rails performance is currently nowhere near what it will be. We continue to find these little gems...and there's no telling how many more are out there. With recent execution performance numbers looking extremely solid and recent Rails performance getting closer and closer, the upcoming 1.1 release ought to be amazing.

Friday, October 12, 2007

Performance Update

As some of you may know, I've been busily migrating all method binding to use Java annotations. The main reasons for this are to simplify binding and to provide end-to-end metadata that can be used for optimizing methods. It has enabled using a single binding generator for 90% of methods in the system (and increasing). And today that has enabled making some impressive perf improvements.

The method binding ends up looking like this:
@JRubyMethod(name = "[]", name2 = "slice",
             required = 1, optional = 1)
public IRubyObject aref(IRubyObject[] args) {
    // ... method body omitted ...
}

This binds the aref Java method to the two Ruby method names [] and slice and enforces a minimum of one argument and a maximum of two. And it does this all automatically; no manual arity checking or method binding is necessary. Neat. But that's not the coolest result of the migration.
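For those who think better in Ruby, here is a rough analogue of what that annotation asks the binder to do. It is a sketch, not JRuby internals: MethodBinder, bind, and TinyList are hypothetical names, and the real binding is generated on the Java side.

```ruby
# Hypothetical helper, not JRuby API: binds one implementation method to
# several Ruby-visible names and checks the argument count before dispatch.
module MethodBinder
  def bind(impl, names:, required:, optional: 0)
    min = required
    max = required + optional
    names.each do |name|
      define_method(name) do |*args|
        unless (min..max).cover?(args.length)
          raise ArgumentError,
                "wrong number of arguments (given #{args.length}, expected #{min}..#{max})"
        end
        send(impl, *args)
      end
    end
  end
end

class TinyList
  extend MethodBinder

  def initialize(items)
    @items = items
  end

  # The single implementation, playing the role of the aref Java method.
  def aref(*args)
    @items[*args]
  end

  # Same shape as the annotation: two names, one required arg, one optional.
  bind :aref, names: [:[], :slice], required: 1, optional: 1
end

list = TinyList.new(%w[a b c d])
p list[1]
p list.slice(1, 2)
```

Calling list.slice(1, 2, 3) raises ArgumentError before aref is ever reached, which is the arity enforcement the annotation gives you for free.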

The first big step I took today was migrating all annotation-based binding to directly generate unique DynamicMethod subclasses rather than unique Callback subclasses that would then be wrapped in a generic DynamicMethod implementation. This moves generated code closer to the actual calls.

The second step was to completely disable STI dispatch. STI, we shall miss you.

So, benchmarks. Of course fibonacci numbers are indicative of only a very narrow range of performance, but I think they're a good indicator of where general performance will go in the future, as we're able to expand these optimizations to a wider range of methods.
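For anyone who hasn't seen this kind of benchmark, a recursive fib script has roughly the following shape. This is a sketch under my own assumptions, not the contents of bench_fib_recursive.rb: the method name, the input size, and the iteration count are all placeholders, and the input is kept small so it finishes quickly.

```ruby
require 'benchmark'

# Naive doubly recursive Fibonacci: almost all the time goes to method
# dispatch and small-integer math, which is why it stresses call performance.
def fib_ruby(n)
  n < 2 ? n : fib_ruby(n - 2) + fib_ruby(n - 1)
end

# Each line printed is user, system, total, and (real) time in seconds.
# Repeating the run lets a JIT-compiling runtime warm up.
5.times do
  puts Benchmark.measure { fib_ruby(20) }
end
```

Later iterations are the ones to compare across implementations, since the early ones include warm-up.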

JRuby before the changes:
$ jruby -J-server -O bench_fib_recursive.rb
1.039000 0.000000 1.039000 ( 1.039000)
1.182000 0.000000 1.182000 ( 1.182000)
1.201000 0.000000 1.201000 ( 1.201000)
1.197000 0.000000 1.197000 ( 1.197000)
1.208000 0.000000 1.208000 ( 1.208000)
1.202000 0.000000 1.202000 ( 1.202000)
1.187000 0.000000 1.187000 ( 1.187000)
1.188000 0.000000 1.188000 ( 1.188000)

JRuby after:
$ jruby -J-server -O bench_fib_recursive.rb
0.864000 0.000000 0.864000 ( 0.863000)
0.640000 0.000000 0.640000 ( 0.640000)
0.637000 0.000000 0.637000 ( 0.637000)
0.637000 0.000000 0.637000 ( 0.637000)
0.642000 0.000000 0.642000 ( 0.642000)
0.643000 0.000000 0.643000 ( 0.643000)
0.652000 0.000000 0.652000 ( 0.652000)
0.637000 0.000000 0.637000 ( 0.637000)

This is probably the largest performance boost since the early days of the compiler, and it's by far the fastest fib has ever run. Here's MRI (Ruby 1.8) and YARV (Ruby 1.9) numbers for comparison:

$ ruby bench_fib_recursive.rb
1.760000 0.010000 1.770000 ( 1.813867)
1.750000 0.010000 1.760000 ( 1.827066)
1.760000 0.000000 1.760000 ( 1.796172)
1.760000 0.010000 1.770000 ( 1.822739)
1.740000 0.000000 1.740000 ( 1.800645)
1.750000 0.010000 1.760000 ( 1.751270)
1.750000 0.000000 1.750000 ( 1.778388)
1.740000 0.000000 1.740000 ( 1.755024)

$ ./ruby -I lib bench_fib_recursive.rb
0.390000 0.000000 0.390000 ( 0.398399)
0.390000 0.000000 0.390000 ( 0.412120)
0.400000 0.010000 0.410000 ( 0.424013)
0.400000 0.000000 0.400000 ( 0.415217)
0.400000 0.000000 0.400000 ( 0.409039)
0.390000 0.000000 0.390000 ( 0.415853)
0.400000 0.000000 0.400000 ( 0.415201)
0.400000 0.000000 0.400000 ( 0.504051)

What I think is really awesome is that I'm comfortable showing YARV's numbers, since we're getting so close--and YARV has a bunch of additional integer math optimizations we don't currently support and thought we'd never be able to compete with. Well, I guess we can.

However, a more reasonable benchmark is the "pentomino" benchmark in the YARV suite. We've always been slower than MRI...much slower some time ago when nothing compiled. But times they are a-changin'. Here's JRuby before the changes:
$ time jruby -J-server -O bench/bm_app_pentomino.rb

real 1m50.463s
user 1m49.990s
sys 0m1.131s

And after:
$ time jruby -J-server -O bench/bm_app_pentomino.rb

real 1m25.906s
user 1m26.393s
sys 0m0.946s

MRI and YARV, for comparison:
$ time ruby test/bench/yarv/bm_app_pentomino.rb

real 1m47.635s
user 1m47.287s
sys 0m0.138s

$ time ./ruby -I lib bench/bm_app_pentomino.rb

real 0m49.733s
user 0m49.543s
sys 0m0.104s

Again, keep in mind that YARV is optimized around these benchmarks, so it's not surprising it would still be faster. But with these recent changes--general-purpose changes that are not targeted at any specific benchmark--we're now faster than MRI on this benchmark and less than 2x slower than YARV.

My confidence has been wholly restored.