Sunday, September 16, 2012

An experiment in static compilation of Ruby: FASTRUBY!

While at GoGaRuCo this weekend, I finally made good on an experiment I had been thinking about for a while: a static compiler for Ruby. I thought I'd share it with you good people today.

First we have a simple Ruby script with a class in it:



We compile it with fastruby, and it produces two .java source files: Hello.java and RObject.java.

Hello.java implements the same methods as the Ruby class in the script and makes the same calls. Ruby method names that are not valid Java identifiers, such as + and <, are mangled into names like _plus_ and _lt_.
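To make the mangling concrete, here is a rough sketch of how such a mapping could work. Only _plus_ and _lt_ come from the generated code described here; the NameMangler class and the other entries are assumptions added for illustration, not fastruby's actual implementation.

```java
// Hypothetical helper, not part of fastruby.
class NameMangler {
    // Returns a Java-legal method name for a Ruby method name.
    static String mangle(String rubyName) {
        switch (rubyName) {
            case "+": return "_plus_";   // name mentioned in the post
            case "<": return "_lt_";     // name mentioned in the post
            case "-": return "_minus_";  // assumed by analogy
            case ">": return "_gt_";     // assumed by analogy
            default:  return rubyName;   // plain names such as "fib" pass through
        }
    }
}
```

Names ending in ? or ! would need their own rule; this sketch passes them through unchanged.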



RObject.java implements stubs for all method names seen in the script. As a result, all dynamic calls can just be virtual invocations against RObject. Classes that implement one of the methods will just work and the call is direct. Classes that don't implement the called method will raise an error.
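As a rough illustration of the stub approach (not the actual generated source), the shape might look something like this. RObject, _plus_ and _lt_ are names from the generated code; the fib method, _minus_, the Sketch* classes and the error text are assumptions made up for this example.

```java
// Sketch only. One stub per method name seen in the compiled script;
// each stub fails unless a subclass overrides it.
class RObject {
    RObject fib(RObject n)      { throw undefinedMethod("fib"); }
    RObject _plus_(RObject o)   { throw undefinedMethod("+"); }
    RObject _minus_(RObject o)  { throw undefinedMethod("-"); }  // assumed name
    RObject _lt_(RObject o)     { throw undefinedMethod("<"); }
    boolean toBoolean()         { return true; }

    RuntimeException undefinedMethod(String name) {
        return new RuntimeException(
            "undefined method '" + name + "' for " + getClass().getSimpleName());
    }
}

// Hypothetical boolean wrapper so _lt_ can return an RObject.
class SketchBoolean extends RObject {
    final boolean value;
    SketchBoolean(boolean value) { this.value = value; }
    @Override boolean toBoolean() { return value; }
}

// Hypothetical boxed integer; every operation allocates a new object.
class SketchFixnum extends RObject {
    final long value;
    SketchFixnum(long value) { this.value = value; }
    @Override RObject _plus_(RObject o)  { return new SketchFixnum(value + ((SketchFixnum) o).value); }
    @Override RObject _minus_(RObject o) { return new SketchFixnum(value - ((SketchFixnum) o).value); }
    @Override RObject _lt_(RObject o)    { return new SketchBoolean(value < ((SketchFixnum) o).value); }
}

// Hypothetical compiled class: every call is a plain virtual call on RObject.
class SketchHello extends RObject {
    @Override RObject fib(RObject n) {
        if (n._lt_(new SketchFixnum(2)).toBoolean()) return n;
        return fib(n._minus_(new SketchFixnum(1)))
            ._plus_(fib(n._minus_(new SketchFixnum(2))));
    }
}
```

Calling fib on a SketchFixnum hits the inherited stub and raises the error, while calling it on a SketchHello dispatches straight to the override.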



RKernel comes with fastruby, and provides Kernel-level methods like "puts", plus methods for coercing to Java types like toBoolean and toString. It also caches some built-in singleton values like nil.
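A minimal sketch of that idea, assuming a much simpler layout than the real class: RKernel, puts, toBoolean and toString are the names mentioned above, while NIL, RNilSketch and RStringSketch are hypothetical names used only here.

```java
// Sketch only, not the RKernel that ships with fastruby.
class RKernel {
    // A single shared nil, created once and reused everywhere.
    static final RKernel NIL = new RNilSketch();

    // Kernel-level puts: prints the string form of the value and returns nil.
    RKernel puts(RKernel value) {
        System.out.println(value.toString());
        return NIL;
    }

    // Coercion to a Java boolean: objects are truthy unless they override this.
    boolean toBoolean() { return true; }

    // Coercion to a Java String.
    @Override public String toString() {
        return "#<" + getClass().getSimpleName() + ">";
    }
}

// Hypothetical nil: falsy, and its string form is empty like Ruby's nil.to_s.
class RNilSketch extends RKernel {
    @Override boolean toBoolean() { return false; }
    @Override public String toString() { return ""; }
}

// Hypothetical string wrapper.
class RStringSketch extends RKernel {
    final String value;
    RStringSketch(String value) { this.value = value; }
    @Override public String toString() { return value; }
}
```

Because nil is a cached singleton, identity comparison against RKernel.NIL is enough to test for it.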




There are also a few other classes needed for this script to work. It should be easy to see how we could fill them out to do everything the equivalent Ruby classes do.




I don't have any support for a "main" method yet, so I wrote a little runner script to test it.


And away we go!


This is about 30% faster than JRuby with invokedynamic. It is not doing any bounds checking (for rolling over to Bignum), but it is also not caching 1...256 Fixnum objects like JRuby does, nor caching them in any calls along the way (note that it creates three new RFixnums for every recursion that JRuby would not recreate). I call that pretty good.
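For reference, the missing overflow check could be sketched along these lines. This is an assumption about one way to do it, not what fastruby or JRuby actually does; OverflowSketch is a made-up name, and java.math.BigInteger stands in for Bignum.

```java
// Hypothetical helper showing overflow detection with promotion.
class OverflowSketch {
    // Adds two fixnum-sized values, promoting to BigInteger on overflow.
    static Number add(long a, long b) {
        try {
            // Math.addExact throws ArithmeticException if the sum overflows a long.
            return Math.addExact(a, b);
        } catch (ArithmeticException overflow) {
            return java.math.BigInteger.valueOf(a)
                .add(java.math.BigInteger.valueOf(b));
        }
    }
}
```

The common case stays on the primitive path, and only an overflowing sum pays for the BigInteger allocation.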

Obviously because this is designed to compile the whole system at once, we could also emit optimized versions of methods that look like they're doing math. That is yet to come, if I continue this little experiment at all.

There are also some fun possibilities here. By specifying Java types, the compiler could add normal Java methods. Implementing interfaces could be done directly. And Android applications built with this tool would be entirely statically optimizable, only shipping the small amount of code they actually call and having a very minimal runtime.

Pretty neat?

19 comments:

  1. Very neat actually. Perhaps the start of a RubyMotion for Android?

  2. Hi Charles,
    what are the differences with respect to the Mirah language?

  3. How does it compare to the Java equivalent?

  4. Niclas: RubyMotion was indeed part of the inspiration behind my attempting this. I have no idea how their compiler works, but this is how I figured I might be able to do a minimal optimized compiler for Ruby.

    francesco: I guess this lives somewhere between JRuby and Mirah. It has the potential to be closer in spirit to Ruby, but there will definitely be dynamic features of Ruby that aren't possible.

    Remi: It should be exactly the same performance as Java...when Java is allocating the same objects. You can pretty much see exactly what the code is, so it comes down to boxed number performance again.

    1. This is very exciting,

      I think it might be especially helpful, when all you need to do is glue code between one or more Java libraries sitting in jars.
      One would essentially pay virtually nothing in performance in comparison to Java in this specific use case.

      I hope this experiment continues. Are there any lists of ideas worth exploring? I'd like to participate in whatever way I can.

  5. Kind of what CoffeeScript is for JavaScript, no? :)

  6. Very cool!

    Here's an idea I've toyed with for a while: Have a static Ruby compiler such as this. Decide what features of Ruby it is incapable of handling (things like "eval" and such). Then, if the parser encounters any "forbidden" constructs, it halts compilation, invokes a more capable compiler, and restarts. As an added bonus, emit warnings whenever the system has to drop down to a less optimizable compiler (similar to Clojure's "warn-on-reflection" option).

    In this way, I think you could get significant gains for a large number of Ruby programs, and possibly even encourage devs (through informative warnings) to reconsider "magic/performance" trade-offs.

  7. What amazes me the most is that the repo is so small for this.

    You might want to consider renaming it though, since people might confuse it with https://github.com/tario/fastruby

  8. Josh: Yeah, I'm kinda debating how far I would want to take this. If you're writing Ruby, part of the benefit is those features that aren't statically optimizable, right? So would a Ruby that's just classes, methods, instance vars, constants, and basic core classes be useful? Or perhaps would it be useful enough that people would accept the loss of evals, metaprogramming, some block binding behaviors, etc? Debatable.

    Davor: The repo is small, but of course it's taking advantage of JRuby to do a lot. But yeah, I wouldn't imagine it getting very big for a while. And good call on renaming...I'll think about that.

  9. It's a good question to ask, I think. If you consider a spectrum from Lua to Ruby, there's definitely a lot of ground in the middle. If you add procs/lambdas (with lexical-only scope and no live rebinding) to that list of yours, I think you still have a very useful Ruby. For example, I'd bet you could still run Bacon (https://github.com/chneukirchen/bacon/blob/master/lib/bacon.rb).

  10. Josh: Yeah, no rebinding was the limitation I was thinking of, since you could simply copy values into the closure on create, rather than keeping them on-heap somewhere. In any case I think we could go very far with the subset and have a pretty useful Ruby-like tool.

  11. Awesome! This is somewhat along the lines of what I was thinking recently: http://www.ruby-forum.com/topic/4405774 and we kind of reached the same conclusion--it would be nice to have a very fast version of Ruby that still does duck typed runtime checking, but perhaps avoids some of the slowdowns Ruby has.

    I wouldn't refer to it as "static compilation" though, since every method returns an RObject so you don't get type checking at compile time...I'm not sure what to call it instead though..."Easily optimizable Ruby"? Anyway if you want to pursue it, I'd definitely subscribe to a mailing list and I think the idea is great--I would love the world to have a "super fast Ruby" option for some of my scripts :)

    Another few ideas: 1) You could have automatic promotion to "RBigInteger" by checking the result for overflow, as JRuby does. 2) You might be able to integrate it with JRuby objects themselves, to be able to reuse some of the JRuby standard library. 3) The name suggested in the post was "Amethyst", as a Ruby sequel. 4) Also, the startup time will be way less than JRuby's normally is.

    Cheers!
    -roger-

  12. Charles,

    Another great contribution!

    I think this is a cool niche language.
    There are plenty of times when I want to write Ruby, but I need the power of Java.

    And, as long as the Java that is generated is easy enough to work with (i.e. extend, etc.), this would be a very useful tool. The easy stuff would be easy, and the harder stuff I could punt on ... ;)

    Thanks!

  13. Code translated to Java... I'm not sure you know what 'static compilation' means...

  14. I agree. This is not static compilation. Groovy pre 2.0 is similar. Compiles to bytecode that is compatible with Java being dynamic. Since 2.0 they added the ability to do static compilation. You can specify if you want your entire class to be statically compiled or just a function. The interpreter then throws an error if you try to use dynamic typing in those methods or classes. I don't think you would get a compilation error using your example since your code is still dynamically typed.

  15. Nothx, Roberto: I respectfully disagree.

    Static compilation does not necessarily mean all types are resolved at compile time...or at least that's not the meaning I'm familiar with. The meaning I know means that it's compilation based on a static view of the world. In this case, the entire set of Ruby scripts you want to compile must be compiled at the same time, and based on those scripts a static view of all methods ever invokable is used to build the RObject superclass. All types then essentially resolve to RObject, all dispatches *are* statically typed, and there's no dynamic invocation anywhere.

    You might say it's static compilation where all types are RObject :) But it is indeed statically resolving everything, and does no dynamic dispatch at runtime (unless you consider Java's normal virtual dispatch to be dynamic).

  16. I guess by "static compilation" you mean that method definitions aren't going to change during runtime. To me I'd call that "static JVM runtime ability" or something like that :)

    It would be great to have this as an inline ability in normal Ruby. Start a mailing list so we can discuss this a bit :)
