The Power of Java's NIO

Akira Tanaka just did a lightning talk on IO.copy_stream, a new method added to Ruby 1.9 last night that does a fast stream transfer all in C code, avoiding the intermediate strings. It's a good idea, one I hope takes hold for other IO operations that could use some native lovin.

It's so good, in fact, I took five minutes out of my day to produce a crude implementation of it in JRuby (and note, I have not looked at Akira's implementation, so I'm sure it handles cases my version does not):
@JRubyMethod(name = "copy_stream", meta = true, compat = RUBY1_9)
public static IRubyObject copy_stream(
IRubyObject recv, IRubyObject stream1, IRubyObject stream2)
throws IOException {
RubyIO io1 = (RubyIO)stream1;
RubyIO io2 = (RubyIO)stream2;

ChannelDescriptor d1 = io1.openFile.getMainStream().getDescriptor();
if (!d1.isSeekable()) {
throw recv.getRuntime().newTypeError("only supports file-to-file copy");
}
ChannelDescriptor d2 = io2.openFile.getMainStream().getDescriptor();
if (!d2.isSeekable()) {
throw recv.getRuntime().newTypeError("only supports file-to-file copy");
}

FileChannel f1 = (FileChannel)d1.getChannel();
FileChannel f2 = (FileChannel)d2.getChannel();

long size = f1.size();

f1.transferTo(f2.position(), size, f2);

return recv.getRuntime().newFixnum(size);
}
It's primitive, but it does the trick. NIO streams unfortunately only provide this nice "transferTo" method for FileChannel, though there's other ways you can do fast copies from stream to stream using buffers. But this was a nice simple way to wire it up for now. So, let's see numbers.
require 'benchmark'

Benchmark.bmbm do |bm|
bm.report("Control") do
10000.times do
File.open(__FILE__) do |file|
File.open(__FILE__ + ".tmp", 'w') do |file2|
end
end
end
end
bm.report("10k file copies") do
10000.times do
File.open(__FILE__) do |file|
File.open(__FILE__ + ".tmp", 'w') do |file2|
IO.copy_stream(file, file2)
end
end
end
end
end
First JRuby numbers (excluding rehearsal):
                      user     system      total        real
Control 2.191000 0.000000 2.191000 ( 2.191000)
10k file copies 5.190000 0.000000 5.190000 ( 5.190000)
Not bad for 10k copies of a file...it works out to about 3s above and beyond the cost of running the blocks and opening/closing the files. How about Ruby 1.9?
                      user     system      total        real
Control 0.360000 0.550000 0.910000 ( 0.903211)
10k file copies 0.850000 2.450000 3.300000 ( 5.363433)
Ruby 1.9's numbers come out around 4.4s above and beyond the control logic. Obviously we need to look at improving JRuby's numbers for the blocks and file opening, but it's good to know that Java's IO actually *can* be nice and fast when you need it to be.

This feature will only be available as part of JRuby's 1.9 support, which is still in development.
Written on March 30, 2008