FastR: An implementation of the R language in Java [pdf] (oracle.com)
85 points by susi22 on Nov 16, 2013 | 33 comments



The programming language analysis is pretty interesting, but you have to ask: what's the point of a brand-new Java implementation? R isn't just a programming language; it's a software framework/ecosystem. They mentioned this in the slides, but it's problematic because R crucially relies on C and Fortran interaction (which I thought the JVM can't do efficiently, since it doesn't like giving C/Fortran raw memory access to its internals). Decades of work have gone into highly optimized Fortran linear algebra libraries, for example -- which R and all the other high-level numerical languages (NumPy/SciPy, Matlab, Julia) use. And many of the CRAN packages (the availability of which is a major reason anyone uses R in the first place) are partly or mostly C/Fortran code.

There are many other R implementation efforts going on right now -- Radford Neal lists a few (as well as his own) here: http://radfordneal.wordpress.com/2013/07/24/deferred-evaluat...

The presentation focuses on the R programming language, which they nicely show has all sorts of misfeatures that impede rapid execution. If you're not going to try for compatibility with R and CRAN, you might as well start from scratch with design and performance in mind, as in Julia: http://julialang.org/


> ... C and Fortran interaction (which I thought the JVM can't do efficiently, since it doesn't like giving C/Fortran raw memory access to its internals).

As of 2002 (JDK 1.4) Java has excellent integration with native memory (you can freely pass pointers from C/FORTRAN to Java and vice versa[1]).
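For illustration, a minimal sketch of the Java side (class and variable names here are just for the example): a direct buffer lives outside the Java heap, so native code can read and write it in place via JNI's GetDirectBufferAddress, and JNI's NewDirectByteBuffer can expose a C/FORTRAN pointer to Java the same way.

    import java.nio.ByteBuffer;
    import java.nio.ByteOrder;
    import java.nio.DoubleBuffer;

    public class DirectBufferExample {
        public static void main(String[] args) {
            // Off-heap allocation; the GC never moves this memory.
            ByteBuffer raw = ByteBuffer.allocateDirect(8 * 1024)
                                       .order(ByteOrder.nativeOrder());
            DoubleBuffer vec = raw.asDoubleBuffer(); // view it as 1024 doubles
            vec.put(0, 3.14);                        // visible to native code as well
            System.out.println(vec.get(0));
        }
    }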

There are numerous Java math libraries that use BLAS/LAPACK already[2]. In fact, AFAIK, most Java matrix math libraries use FORTRAN code (at least as an option).

[1]: Java side: http://docs.oracle.com/javase/7/docs/api/java/nio/ByteBuffer... C side: http://docs.oracle.com/javase/7/docs/technotes/guides/jni/sp...

[2]: For example, https://github.com/fommil/matrix-toolkits-java, http://mikiobraun.github.io/jblas/


Is there any way to ensure that accessing ByteBuffers (or whatever) is fast?

I had used memory-mapped buffers (which should be equivalent in performance), and there was no way to make the JIT inline access to these arrays. It was all calls (indirect ones, not properly branch-predicted at that). Code equivalent to the C++ version ran 10 times slower, with no way to speed it up.

(And the reason I was using memory maps, if you insist: a 2GB read-only dataset used by multiple processes at the same time. I eventually went to C++, because there was no way to get reasonable performance from Java, in either memory use or speed. This was circa 2010.)
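(For reference, a minimal sketch of that kind of setup, with a hypothetical file name: a read-only file mapped into memory so multiple processes share the same pages. A single mapping is capped at Integer.MAX_VALUE bytes, so anything much past 2GB needs several mappings.)

    import java.io.RandomAccessFile;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;

    public class MappedDataset {
        public static void main(String[] args) throws Exception {
            try (RandomAccessFile file = new RandomAccessFile("dataset.bin", "r");
                 FileChannel channel = file.getChannel()) {
                // One shared, read-only mapping; reads go through the OS page cache.
                MappedByteBuffer buf =
                        channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
                System.out.println(buf.getDouble(0));
            }
        }
    }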


Yes. High-performance Java code sometimes makes use of JVM intrinsics, accessible through the sun.misc.Unsafe class. Those are JITted down to a simple memory-access instruction. The class also has intrinsics for CAS, and in JDK 8 it gains intrinsics for the different memory fences as well; those calls compile to a single native instruction.
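Roughly, a minimal sketch of the raw-memory intrinsics (the reflective grab of the Unsafe instance is the usual workaround, since the class isn't public API):

    import java.lang.reflect.Field;
    import sun.misc.Unsafe;

    public class UnsafeMemoryExample {
        public static void main(String[] args) throws Exception {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            Unsafe unsafe = (Unsafe) f.get(null);

            long addr = unsafe.allocateMemory(8); // off-heap, untracked by the GC
            unsafe.putLong(addr, 42L);            // JITs down to a plain store
            System.out.println(unsafe.getLong(addr));
            unsafe.freeMemory(addr);
        }
    }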

For example, the Java Chronicle library[1] uses these techniques, as well as memory mapped files, to implement fast persistent message queues.

[1]: https://github.com/OpenHFT/Java-Chronicle


Cool. Was this in Java 6? (Circa 2010 I couldn't find a way to do it back then.) Also, why would you need (even on JDK 7 or JDK 8) unsafe access to get a memory-mapped access JITted inline? Is there an underlying philosophical reason, or is it just that they never got around to it?


Yes, it was in Java 6 (except for direct access to fences, which was added in Java 8).

The sun.misc.Unsafe class is used extensively by JDK classes and is meant for internal use. It provides intrinsics that are translated to a single machine instruction. Normally you don't use the class directly: you use, say, AtomicInteger for CAS operations (which internally uses s.m.Unsafe) or the ByteBuffer class (which internally uses s.m.Unsafe for direct pointer access). The JDK classes add all sorts of protection (like range checks) around s.m.Unsafe, but if you know what you're doing and use s.m.Unsafe directly, eschewing some of those protections (and usually adding ones more pertinent to your domain), you get performance gains that may be significant depending on your use case.
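A minimal sketch of what that looks like in practice -- the same compareAndSwapInt intrinsic AtomicInteger uses internally, just without the wrapper object (class and field names here are only illustrative):

    import java.lang.reflect.Field;
    import sun.misc.Unsafe;

    public class UnsafeCasExample {
        private static final Unsafe UNSAFE;
        private static final long VALUE_OFFSET;

        private volatile int value;

        static {
            try {
                Field f = Unsafe.class.getDeclaredField("theUnsafe");
                f.setAccessible(true);
                UNSAFE = (Unsafe) f.get(null);
                VALUE_OFFSET = UNSAFE.objectFieldOffset(
                        UnsafeCasExample.class.getDeclaredField("value"));
            } catch (Exception e) {
                throw new Error(e);
            }
        }

        // Equivalent in effect to AtomicInteger.compareAndSet(expected, update).
        boolean casValue(int expected, int update) {
            return UNSAFE.compareAndSwapInt(this, VALUE_OFFSET, expected, update);
        }
    }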


Ah, wonderful! So if I understand this correctly, this doesn't give C/Fortran access to Java-native primitive arrays; but instead, it's specific to NIO byte buffers (and then the matrix libraries have to build on top of that). But that should be fine for doing R replacements, at least in theory.

(Personally, when programming Java I find it more convenient to use primitive arrays as opposed to matrix libraries, but that might be dependent on the operations I tend to do: lots of increment/decrements and only occasional linear algebra. I guess this isn't exactly relevant to the R replacement question.)


Java also gives native libraries direct access to primitive arrays, but that requires pinning them in place for the duration of the call (i.e. not letting the GC move them) so it incurs some performance penalty.

Direct byte buffers are very common in high-performance Java code. Reading/writing from/to those buffers can be made just as fast as plain Java arrays.


Thanks for the information. Sorry I was misinformed before...


I think compatibility with GNU R and CRAN is a pretty reasonable goal - even if it's not the focus of this particular Oracle research project.

We've been working hard, and now systematically, to get Renjin (also R on the JVM) to run CRAN packages: http://packages.renjin.org. Renjin also compiles C and Fortran code to JVM bytecode, though there is still some work to do there as well.

Regarding the hand-tuned matrix math libraries, there's nothing to stop you from using them with Renjin - you can drop in MKL or Atlas as desired, or fall back to pure-Java versions in a pinch.


Julia is quite cool, actually.

As for Java, Oracle, together with AMD, is in the process of making the GPU transparent to Java developers as part of the Sumatra project.

So this is one area where R could benefit from running on Oracle's JVM. It remains to be seen whether other Java vendors would adopt such a feature.


FWIW, there's also Renjin that does this: https://github.com/bedatadriven/renjin


Given the problems with Java's floating-point implementation [1], would this be reliable for statistical analysis?

[1] https://news.ycombinator.com/item?id=6585828


1. It is more a philosophical disagreement than a problem, and whatever semantics you want, the JVM does not limit you. The disagreement revolves around the Java compiler and language semantics, not the JVM.

2. The math in FastR, if I understand the presentation correctly, is performed by FORTRAN libraries anyway. Using battle tested FORTRAN libraries for matrix computations is common practice in C, Java, Julia, Matlab and most other environments. They basically all share the same underlying matrix math code.
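A small sketch of what that looks like from Java, using jblas (linked elsewhere in the thread): mmul() dispatches to a native BLAS dgemm routine under the hood, the same kind of battle-tested Fortran library code described above.

    import org.jblas.DoubleMatrix;

    public class BlasExample {
        public static void main(String[] args) {
            DoubleMatrix a = new DoubleMatrix(new double[][]{{1, 2}, {3, 4}});
            DoubleMatrix b = new DoubleMatrix(new double[][]{{5, 6}, {7, 8}});
            DoubleMatrix c = a.mmul(b); // matrix product via the native BLAS
            System.out.println(c);
        }
    }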


I do think that Kahan's objections to Java's floating-point support were more than just philosophical, although it's unclear to me at this point how many of them still apply. The biggest issue that seems to still remain is that you can't change rounding modes or check flags, both of which are essential for verifying numerical stability and correctness. Far worse than any floating-point issues on the JVM is that your indices and integers are 32-bit, so you're limited to 2GB arrays before you have to take bizarre measures to access larger amounts of data.
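For the 32-bit index point, a minimal sketch of the kind of "bizarre measure" involved (a hypothetical helper, not anything from the JDK): a long-indexed array faked by chunking the data over several int-indexed ones.

    // Long-indexed double "array" backed by 1 GiB chunks of plain double[].
    public class LongDoubleArray {
        private static final int CHUNK_BITS = 27;            // 2^27 doubles = 1 GiB
        private static final int CHUNK_SIZE = 1 << CHUNK_BITS;
        private static final int CHUNK_MASK = CHUNK_SIZE - 1;
        private final double[][] chunks;

        public LongDoubleArray(long length) {
            int n = (int) ((length + CHUNK_SIZE - 1) >>> CHUNK_BITS);
            chunks = new double[n][];
            long remaining = length;
            for (int i = 0; i < n; i++) {
                chunks[i] = new double[(int) Math.min(CHUNK_SIZE, remaining)];
                remaining -= CHUNK_SIZE;
            }
        }

        public double get(long i) {
            return chunks[(int) (i >>> CHUNK_BITS)][(int) (i & CHUNK_MASK)];
        }

        public void set(long i, double v) {
            chunks[(int) (i >>> CHUNK_BITS)][(int) (i & CHUNK_MASK)] = v;
        }
    }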

Although it is standard in high-level systems to call out to a BLAS library [1], for some inexplicable reason it seems that both R and NumPy use the reference BLAS by default, which is quite slow – around 4x slower than better BLASes. Matlab ships with Intel's proprietary MKL, which includes a very fast BLAS implementation, while Julia ships with OpenBLAS, which is a similarly fast open source BLAS implementation derived from the legendary GotoBLAS [2]. Since all BLAS implementations share a common Fortran ABI, it's easy to swap them out, but it's not quite true that all of these systems are using the exact same Fortran code.

[1] https://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprogra...

[2] https://en.wikipedia.org/wiki/GotoBLAS


There is a lot of work going into designing Java arrays "2.0", including support for "long" arrays, immutable arrays, and arrays of structs.


These are points that need to be addressed for any implementation of R, or any other domain specific language for which numerical guarantees nearly always outweigh performance concerns.

GNU R, for example, is implemented in C, and the implementations of R's basic arithmetic functions are actually quite complicated because they take care of so many of the edge cases mentioned in the cited post. For example, the round() function first casts its argument to a 64-bit type before calling the C library's rint() function, to preserve precision. [1]

[1] http://svn.r-project.org/R/trunk/src/nmath/fround.c


The semantics of Java need not restrict what other languages on the JVM do, though it may make the generated byte code a little larger from the inclusion of f2d and d2f instructions.



The Relite mention at the end looks very worthwhile. We were just about to begin a large rewrite of analysis code from R to CUDA; Relite has the potential to save the effort of rewriting our existing code.


Relite author here. I'd love to learn more about your use cases, feel free to get in touch!


Definitely, thank you for the offer. I'm running through the install now but will send you an email once I've run a couple tests. I see your contact info is listed on your home page.


Worth noting is that FastR is a JVM implementation of R that uses Truffle and Graal.

http://openjdk.java.net/projects/graal/

https://wiki.openjdk.java.net/display/Graal/Publications+and...


Are there any tools to convert R to JS or Python, or... any other common language that doesn't require a 60mb runtime distribution?


That's a really bizarre reason not to use R. Python's distribution is about 30MB if I recall. It doesn't really make much sense to convert R to JS or Python, since the semantics are so different.

If you don't want to use R, Pandas in Python provides very powerful data frames (which are likely faster for many cases). However, it depends on NumPy, matplotlib, and a few other libraries, which probably total more than 60 MB.


Python 3's installed size is 90MB here; Python 2's is 60MB.


The good thing is that Python is pretty ubiquitous, as is virtualenv :)


I already use Python, so having it in the same codebase would be perfect. However, I have some code in R that is pretty important, so I have to include that with the application (a model) as well.


I have done some work converting R to C++ by hand. The problem, as this presentation discusses, is that a 95% automated conversion is easy, but that last 5% often involves diving deep into the weirdness of the R interpreter.


I am using Rcpp for interfacing; I didn't see anything automatic for conversion, though -- it's just an interface.


A guy named Jony Hudson has gotten R to kind of compile to JS using emscripten.

See http://r.789695.n4.nabble.com/R-in-the-browser-td4667985.htm...

But the JS blob ends up being like ~15mb!


more java -_- I need less java in my life, not more.


Which platforms are you targeting?



