GraalVM 19.0 (graalvm.org)
195 points by tosh on May 11, 2019 | 52 comments



For anyone with a bit of an academic bent, the ideas behind GraalVM are really really cool -- basically, they compile by partially evaluating an interpreter for X with a program in language X. Simple idea at a high level but they actually made it work well, with some clever tricks like annotated partial-eval boundaries in the interpreter and an API for first-class speculative assumption objects.

Their paper is a good read: https://dl.acm.org/citation.cfm?id=3062381
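
To make "partially evaluating an interpreter with a program" concrete, here is a minimal Python sketch. It is not Graal or Truffle code: the toy accumulator language and the names `interpret`, `specialize` and `PROGRAM` are invented for illustration, and the specializer is written by hand to show the kind of residual code a partial evaluator would derive automatically from the interpreter.

```python
def interpret(program, x):
    """Plain interpreter: `program` is a list of (op, arg) pairs applied to an accumulator."""
    acc = x
    for op, arg in program:
        if op == "add":
            acc = acc + arg
        elif op == "mul":
            acc = acc * arg
        else:
            raise ValueError(f"unknown op: {op}")
    return acc


def specialize(program):
    """Hand-written specialization of `interpret` to one fixed program.

    The loop and the dispatch on `op` depend only on the (static) program,
    so they run now; only the arithmetic on the (dynamic) input is emitted.
    """
    lines = ["def compiled(x):", "    acc = x"]
    for op, arg in program:
        if op == "add":
            lines.append(f"    acc = acc + {arg!r}")
        elif op == "mul":
            lines.append(f"    acc = acc * {arg!r}")
        else:
            raise ValueError(f"unknown op: {op}")
    lines.append("    return acc")
    source = "\n".join(lines)
    namespace = {}
    exec(source, namespace)  # turn the residual source into a callable
    return namespace["compiled"], source


# Hypothetical example program: ((x * 3) + 4) * 2
PROGRAM = [("mul", 3), ("add", 4), ("mul", 2)]
compiled, source = specialize(PROGRAM)
```

Printing `source` shows straight-line code with no loop and no dispatch left, which is the sense in which specializing an interpreter to a program "compiles" that program.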


Yes! While not quite mature yet, Graal is now an industrial-strength implementation of some heretofore-theoretical compsci ideas (Futamura projections [1]). You give it an interpreter and it gives you an optimizing compiler. It is, IMO, the biggest technological breakthrough in compilers in the last decade. I picture it as automatically specializing or "templatizing" (as in C++ templates), on a use-site basis, and doing so speculatively, even when proving that the transformation is correct is impossible. Other JITs do so as well, but Graal does it generally for any language expressed as a Truffle interpreter.

BTW, the amount of groundbreaking technological innovation in the Java space these days -- be it with optimizing compilers like Graal, GCs like ZGC, and low-overhead production profiling (JFR) -- going on at Oracle (where I work) is quite phenomenal.

[1]: https://en.wikipedia.org/wiki/Partial_evaluation


Note that PyPy does something similar, but based on (meta-)tracing instead of partial evaluation.

See https://stefan-marr.de/papers/oopsla-marr-ducasse-meta-traci... for a discussion.
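
As a rough illustration of the tracing idea (and only that: this is not how PyPy or RPython is implemented), the sketch below records the operations an interpreter actually performs on one run, together with guards for the branch decisions it took, and later replays that straight-line trace. The toy language and the names `run_interpreter`, `run_trace` and `PROGRAM` are assumptions made for the example; a real tracing JIT would compile the trace to machine code rather than replay it in a loop.

```python
def run_interpreter(program, x, trace=None):
    """Interpret `program`; if `trace` is a list, record executed ops and branch guards."""
    acc = x
    for op, arg in program:
        if op == "add":
            acc += arg
            if trace is not None:
                trace.append(("add", arg))
        elif op == "halve_if_even":
            even = acc % 2 == 0
            if trace is not None:
                # Record which way the branch went on this particular run.
                trace.append(("guard_even", even))
            if even:
                acc //= 2
                if trace is not None:
                    trace.append(("halve", None))
        else:
            raise ValueError(f"unknown op: {op}")
    return acc


def run_trace(trace, program, x):
    """Replay a recorded trace; if a guard fails, fall back to the interpreter.

    Returns (result, stayed_on_trace).
    """
    acc = x
    for op, arg in trace:
        if op == "guard_even":
            if (acc % 2 == 0) != arg:
                return run_interpreter(program, x), False  # side exit
        elif op == "add":
            acc += arg
        elif op == "halve":
            acc //= 2
    return acc, True


# Hypothetical program: add 1, halve if the result is even, add 10.
PROGRAM = [("add", 1), ("halve_if_even", None), ("add", 10)]
recorded = []
first_result = run_interpreter(PROGRAM, 3, recorded)
```

The contrast with partial evaluation is that the trace depends on the observed run (here, input 3 took the "even" branch), so inputs that take another path fail a guard and leave the trace.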


There was a lot of work on partial evaluation in the 1990s, including getting compilers via Futamura. But that research mostly withered. Speaking to some of the old-timers, I get the impression that the core problem was that compilers obtained from interpreters by Futamura didn't perform well. Indeed, in his 2010 PEPM keynote, Augustsson asked: "O, partial evaluator, where art thou?" [1]

Is there a brief summary of what Graal etc. are doing differently to overcome the 1990s-style partial evaluation performance problems?

[1] https://dl.acm.org/citation.cfm?id=1706356.1706357


Futamura projections were not theoretical-only before Graal.

There exist several partial evaluators and benchmarks for the use case of Futamura projections.

I will admit that I know too little about the internals of Graal but I haven't seen any papers describing how the traditional issues with partial evaluation have been solved for the general case, so my guess is that all this still only works and yields good results under certain assumptions and designs, i.e. just throwing any generic interpreter code at the partial evaluator alone will not necessarily yield a compiler that is particularly good. [1]

The idea of general partial evaluation and the Futamura projections is, well, rather general: take code and static input, and produce a program specialized to that input. When the code is an interpreter and the static input is a program, the result is compiled code, and so on.
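
A minimal sketch of "take code and static input and produce a specialized program", assuming a made-up expression language (the `Const`, `Var`, `BinOp`, `evaluate` and `partial_eval` names are illustrative and not taken from any of the systems discussed here). Subexpressions that depend only on statically known variables are folded away; everything else is left in the residual expression.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class BinOp:
    op: str  # "+" or "*"
    left: "Expr"
    right: "Expr"


Expr = Union[Const, Var, BinOp]

OPS = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}


def evaluate(expr, env):
    """Ordinary evaluation: every variable must be bound in `env`."""
    if isinstance(expr, Const):
        return expr.value
    if isinstance(expr, Var):
        return env[expr.name]
    return OPS[expr.op](evaluate(expr.left, env), evaluate(expr.right, env))


def partial_eval(expr, static_env):
    """Specialize `expr` to the variables known in `static_env`.

    Returns a residual expression mentioning only the remaining (dynamic) variables.
    """
    if isinstance(expr, Const):
        return expr
    if isinstance(expr, Var):
        return Const(static_env[expr.name]) if expr.name in static_env else expr
    left = partial_eval(expr.left, static_env)
    right = partial_eval(expr.right, static_env)
    if isinstance(left, Const) and isinstance(right, Const):
        # Both sides are known now, so do the work at specialization time.
        return Const(OPS[expr.op](left.value, right.value))
    return BinOp(expr.op, left, right)


# (a * b) + x, with a and b static and x dynamic.
expr = BinOp("+", BinOp("*", Var("a"), Var("b")), Var("x"))
residual = partial_eval(expr, {"a": 2, "b": 3})
```

Even this toy hints at the difficulty raised in this comment: once the code being specialized has loops, recursion or side effects, deciding what to unfold and when to stop is the hard part.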

It's rather hard to deliver on the promise that this always works and in particular yields performant results on arbitrary code and inputs. And I don't think Graal delivers this either (not saying it has to).

It's possible that Graal can be considered the first industry compiler for a full mainstream language that does a high degree of partial evaluation and allows tooling around it (i.e. Truffle for implementing language interpreters that can be optimized/become compilers). However, it would be disingenuous to claim it's the first compiler/partial evaluator to yield practical results of applying the Futamura projections. (You didn't quite claim that, but the earlier "theoretical" part kind of goes there.)

[1] > Writing language interpreters for our system is simpler than writing specialized optimizing compilers. However, it still requires correct usage of the primitives and a PE mindset. It is therefore not straightforward to convert an existing standard interpreter into a high-performance implementation using our system.

> Our experience shows that all code that was not explicitly designed for PE should be behind a PE boundary. We have seen several examples of exploding code size or even non-terminating PE due to infinite processing of recursive methods. For example, even the seemingly simple Java collection method HashMap.put is recursive and calls hundreds of different methods that might be processed by PE too.

https://chrisseaton.com/truffleruby/pldi17-truffle/pldi17-tr...


Instead of theoretical I should have said "research-grade". People sometimes don't realize how big the gap between a research finding and a product that actually delivers market benefits is. Often the discovery is only 5% of the work, and the rest isn't just grunt programming, but deep applied research with big technical challenges. That Graal has shown that the idea can actually yield competitive benefits in practice, running on real applications, and at acceptable costs (that are being continuously reduced) is a great achievement. Graal/Truffle is, obviously, not magic, and cannot convert any interpreter to a competitive optimizing compiler (it doesn't even beat C2 on some important workloads, which is why it is not replacing it just yet), but as a first approximation for what it does, I think that interpreter->optimizing compiler is a fair description, and even from a full-picture point of view, Truffle languages are competitive at a significantly reduced cost to other approaches we see in industry. Applied research that yields significant bottom-line benefits is precisely what a technological breakthrough is.


> I picture it as automatically specializing or "templatizing" (as in C++ templates), on a use-site basis, and doing so speculatively, even when proving that the transformation is correct is impossible

In isolation, this sounds like a description of how HotSpot works, no? (For the uninitiated: HotSpot's JITs make dangerous assumptions, optimise accordingly, and discard compiled objects if those assumptions ever break.)
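
The "assume, optimise, discard when the assumption breaks" pattern can be sketched in a few lines of Python. This is a toy model, not HotSpot or Truffle code: the `Assumption` and `SpeculativeAdder` classes are hypothetical, and a real JIT would throw away compiled machine code on invalidation rather than flip a flag.

```python
class Assumption:
    """A first-class assumption that optimized code may depend on."""

    def __init__(self, name):
        self.name = name
        self.valid = True

    def invalidate(self):
        self.valid = False


class SpeculativeAdder:
    """Speculates that operands are always ints; gives up for good when that fails."""

    def __init__(self):
        self.ints_only = Assumption("operands are ints")
        self.deopt_count = 0

    def _generic_add(self, a, b):
        # Slow path: handles every case the toy language supports.
        if isinstance(a, str) or isinstance(b, str):
            return str(a) + str(b)
        return a + b

    def add(self, a, b):
        if self.ints_only.valid:
            if type(a) is int and type(b) is int:  # cheap guard
                return a + b  # fast path, valid only under the assumption
            # Guard failed: discard the speculation ("deoptimize").
            self.ints_only.invalidate()
            self.deopt_count += 1
        return self._generic_add(a, b)


adder = SpeculativeAdder()
```

The fast path cannot be proven correct for all inputs, which is the point: it only has to be correct while the assumption holds, and the guard detects when it stops holding.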


> In isolation, this sounds like a description of how HotSpot works, no?

First of all, Graal is a compiler that serves as a HotSpot compiler when running as a Java bytecode JIT (HotSpot is the name of OpenJDK's JVM: it includes two compilers, with Graal serving as a third, an interpreter, several GCs and various other runtime features). But yes, HotSpot's default optimizing compiler, called C2, also does speculative optimizations and deoptimizes on "mistakes", but Graal is more general in the sense that it's easier to teach it various optimizations for many languages. C2 is very good, but because Graal is believed to be easier to maintain, it may match and surpass it one day (it already surpasses it for some important workloads). Project Metropolis investigates the possibility of eventually making Graal the default optimizing compiler in HotSpot (https://openjdk.java.net/projects/metropolis/).


I like Graal a lot but a lot of the performance benefits are in the EE version. That’s not in and of itself an issue except we are talking Oracle here.

Several months ago I emailed them to ask what the pricing would be for my SaaS service to license GraalEE. I got a reply stating they’re working on it and will get back to me.. haven’t heard a peep since.

So the EE version is available for evaluation but not even Oracle can tell you how much it costs or how to get a license :-/


I was at a Java conference recently, and sat in a session with an Oracle guy showing this off and answering a lot of questions about this topic.

It's really not that nefarious. The bottom line is that they haven't come up with pricing yet, because it's not at a point where the people who would pay are interested yet.

Oracle's target audience is large companies that will pay out the wazoo for something enterprise-grade and fully-baked. And it's just not there yet. It doesn't support the current version of Java. It doesn't run on Windows, which is 95+% of the developer workstations in those environments. The debugger story is pretty weak (i.e. do all your debugging in a regular JVM, and hope for the best after you compile to native). For that crowd, it truly is at the stage of, "Check it out at conferences, and be aware that it's coming, but don't plan any project work around it yet".

At that stage of a product's lifecycle, the architects might be tinkering with it, but the management and business stakeholders with budget authority don't even want to hear about it.

Spoiler alert: it's probably going to be expensive. If you're hung up on this question, then you probably aren't going to get to use it. How many HN and Reddit commenters use any paid Oracle products at all? But for the startup/indie/student crowd that dominates here... there's plenty of interesting benefit to the open-source version, and there's no "catch" to its licensing beyond that of OpenJDK itself.


I’d say it is nefarious to put out an EE version that’s marked ‘production use requires a license’ and then make it impossible to figure out how to get a license or what one would cost.

Even enterprises with deep budgets would not let their devs near such an arrangement - they could be held to ransom by Oracle!

If it’s not ready or ‘fully baked’ as you say then mark it as beta and say that future versions will require a license.


Well, yes. Devs in that environment would not go near it (at least not for anything beyond a POC). In those environments, you have to go through a review board process in order to use some new open source JSON-parser library from Maven Central. Devs aren't just cowboy'ing a new compiler into their workflow.

"Production use requires a license" is NOT an indicator that it's ready for production use (there are a number of warnings against this). The fact that you need a license, yet can't get one, is a clear signal that it isn't ready for production use. The people in Oracle's target audience see this sort of thing a lot with preview software, and aren't confused by it.


> The fact that you need a license, yet can't get one, is a clear signal that it isn't ready for production use.

Everything you said is reasonable except this. If it isn’t ready for production use then no need to beat about the bush by advertising a commercial license you can’t get? And if one doesn’t exist by the way how are Twitter using it in production?


I honestly don't know how to explain this more clearly. They are not "advertising a commercial license". They are saying NOT to use the Enterprise version in production, because licensing is not yet available.

The only reference I've ever seen to Twitter using GraalVM in production is in this discussion thread. And that commenter clearly states that Twitter is using the open-source community edition.

No one who would ever conceivably be a paying Oracle customer would be confused by any of this. When that much money is on the table, there are experienced people in the room with reading comprehension.


> They are not "advertising a commercial license".

https://blogs.oracle.com/graalvm/announcement "GraalVM Enterprise is available for purchase and is free on Oracle Cloud."

> They are saying NOT to use the Enterprise version in production

[citation needed]

> The only reference I've ever seen to Twitter using GraalVM in production is in this discussion thread.

Considering that Twitter has been talking about this in public for some time, this statement is unlikely to make the point you were trying to make: https://www.youtube.com/watch?v=pR5NDkIZBOA https://www.youtube.com/watch?v=PtgKmzgIh4c https://www.youtube.com/watch?v=ZbccuoaLChk


Twitter uses the free GPL Community Edition in production. EE is faster but a lot of the magick is in the open source one.


> a lot of the performance benefits are in the EE version

I think Twitter get something like 13% with the CE version. That’s absolutely massive in terms of JVM performance increase for the same code.


Have you benchmarked anything to see any difference between CE and EE, or are you just going by their download page? I am super interested in meta-compilation approaches and have been reading a bit of the literature on RPython and Graal.


We have compared both versions: for most use cases performance is similar, with EE a bit faster, but for some even CE is faster.

I know projects that are 100% open source would be nice, but the reality is that it is not a bad idea to have a business model attached to an open source project; IMO it increases its sustainability.


Here is more information about the EE version: https://docs.oracle.com/en/graalvm/enterprise/19/guide/overv...


> Several months ago I emailed them [...] haven’t heard a peep since.

Stupid question: Why don't you ask again? Several months ago it was in release candidate mode, while now it is a real product. You should probably be able to get more concrete information now.


Profile guided optimization is part of the EE version, that is probably the biggest difference.

I also saw that the EE version is free of charge on Oracle's cloud, smart!


Seriously though.

Anyone have any idea what it might cost?


How come it seems like none of the smaller, general-purpose VMs took off? Did .Net and the JVM win?

I recall there being several contenders 6-10 years ago. Like the Parrot VM [1], which was driven by Perl but aimed to support a large variety of languages.

[1] https://en.wikipedia.org/wiki/Parrot_virtual_machine


I estimate that developing OpenJDK + Graal/Truffle costs ~$100M a year (hundreds of full-time developers). That's a lot of money.


Making a VM isn’t too difficult. Making a VM with an optimising JIT is a lot more work, and adding things like tooling, instrumentation and debugging increases the work needed.


Nobody did the hard work to add advanced optimising backends to those projects.

They were all new ideas for the frontend - great but someone somewhere has to do the work to get it to perform well, and few people know how to do that so it never got done.


Parrot never reached a point where it didn't suck. GraalVM has the futurama projection tech to make the universal vm dream possible.


futamura? :)



For those interested in this sort of thing, this video showcases Graal's ability to compile JVM bytecode to native binaries in a really impressive way: https://youtu.be/topKYJgv6qA


I just watched the entire video. Thanks.

TLDR for others: GraalVM lets you compile your code to native, which makes startup times 1000x faster and memory footprint 10x lower. This is great for things like "java/clojure based command line tools", and maybe even micro services. Long running processes on the JVM are about 2x faster than a GraalVM compiled program, so it won't make your normal backend any faster. This is because JIT is better since it can understand how your program is used when selecting which types of optimisations to apply. Also GraalVM doesn't support everything in the dynamic class loading and reflection space.


I started using the zprint code formatter for Clojure. The binary is built with GraalVM. I invoke it from the editor and the result is instantaneous! No startup, no warmup, just nicely formatted code where the jungle used to be. I'm really excited about this technology.


NOTE: You are describing the native-image feature of GraalVM. GraalVM is primarily a full featured JVM so you can run your normal backend and it will probably be a bit faster, and it does support everything in the dynamic class loading and reflection space.


For people like me who've never heard of this before:

> GraalVM is a universal virtual machine for running applications written in JavaScript, Python, Ruby, R, JVM-based languages like Java, Scala, Kotlin, Clojure, and LLVM-based languages such as C and C++.


It's worth noting that the project is also owned by Oracle. Regardless of how great GraalVM is, that fact may give some people pause...


OK, but four of the five FAANG companies are betting very heavily on OpenJDK (even Facebook uses it, but not as heavily as the others), which is also owned by Oracle. Oracle has been good to OpenJDK over the decade they've owned it (disclosure: I work on OpenJDK at Oracle).


Also worth noting is that Oracle has completely open sourced the Community Edition and is actively developing it: https://github.com/oracle/graal (I work at Oracle)


Questions,

Native Image is basically SubstrateVM?

Is Sulong part of GraalVM? I have never seen it mentioned in those release notes. I am just wondering if running C extensions is even possible in the real world.

Timeline on Project Loom?

And I know this is annoying: TruffleRuby, when can it run Rails?


I'm the lead for Project Loom. We don't commit to a timeline on any OpenJDK projects, partly because the question of "is it good enough to release?" depends on feedback. I can say that early access builds will be available in a couple of months, and that I don't think it's too soon to take Loom into consideration if you're now planning a new long-term project.

Native image is, indeed, Substrate VM, but I'll let Graal people address the Graal questions.


> I can say that early access builds will be available in a couple of months, and that I don't think it's too soon to take Loom into consideration if you're now planning a new long-term project.

That's great to hear. Can anyone contribute to the project to help out?


Sure. What we need most is for people to try the API and give us feedback, try running various frameworks with fibers and give us feedback, etc.


Hoping to see some Scala speedups with tail call optimisations finally possible :)


Tail calls will come later. We're focusing on fibers, first (there's a cost/benefit consideration).


Sulong is GraalVM's LLVM bitcode interpreter, so it's mentioned under LLVM in the release notes. Its code base has been merged into GraalVM's main repository at https://github.com/oracle/graal.


If Oracle can make this project the heart of a cloud offering, I'd guess they will have an advantage over Amazon, Google and so on.


You can run Graal VM on Oracle cloud today: https://docs.oracle.com/en/graalvm/enterprise/19/guide/overv...


This seems useful; I need to look into it. Could it mean having a convenient runtime for calling Java DB drivers?


Can someone provide a simple explanation of Graal for those of us who are completely oblivious of all of the concepts of Graal?



Is there a good tutorial on contributing to the language implementations like graalpython?


There is an example simple language which may be a good place to start. It’s also worth asking on gitter as most language teams either have some good introductory tasks they can give you, or can come up with something.

It’s very different from working with a bytecode compiler and a normal vm and interpreter, but lots of fun to work with.



