The Deletion of gcj

rekado · on Oct 3, 2016

This is a little sad because this means we will have to keep around old versions of GCC with GCJ in order to be able to bootstrap the JDK from source (without a binary bootstrap JDK). Only OpenJDK6 (with the extensive IcedTea patches) can be bootstrapped with GCJ alone, so we also have to keep OpenJDK6 around in order to be able to build the latest JDK.

Bootstrapping becomes harder and more legacy software needs to be accumulated in order to actually build new software from source.

(I'm working on GNU Guix, a functional package manager, and I wrote the package definitions for the JDKs.)

josteink · on Oct 3, 2016

As someone who worked on porting Microsoft .NET Core to new platforms, that's definitely a pain I felt.

At first, very few assumptions was made about the Unix bootstrapping process, and the mantra was that you should build the managed components on a Windows machine which already had .NET.

Just to get something going, the Unix community decided to use mono to bootstrap the process on non-Windows platforms. This effectively meant you could bootstrap .NET Core on almost any platform where mono was around. That gave quite some platform reach out of the box, and on overall an improvement much welcomed by Microsoft.

But later, after Unixes like Linux and OSX become better supported, the bootstrapping method changed to downloading a previous build of .NET Core for the current platform and using that to build the latest sources. I.e. going self-bootstrapping, which sounds fancy on paper.

The cost of that? Suddenly it just wasn't doable (without considerable knowledge and effort) to bootstrap .NET Core from source any more on platforms not currently considered first class.

I'm not just talking about "strange" outlier OSes like FreeBSD, NetBSD or Alpine Linux or whatever. Just trying to build for Fedora 24 (which doesn't have published packages) fails. This is because the Fedora 23 packages are linked against library-versions which are no longer in Fedora 24.

You'd think that be easy to fix, but now, even with Fedora 25 betas starting to ship, there's still no official Fedora 24 package exists from Microsoft, so Fedora users (and developers, and porters, etc) are SOL.

So now basically just trying to bootstrap for a previously supported mainstream Linux-distro is seriously PITA, next to impossible. In my honest opinion that's a major regression just for the sake of being self-bootstrapping.

So yeah. You should never underestimate the value of having secondary platforms available for bootstrapping your own. You may come to depend on it.

snuxoll · on Oct 3, 2016

Compilers that bootstrap themselves should try to eliminate external dependencies needed to just get the bootstrap going for this exact reason. Hell, Ruby uses itself as part of the build process and one of the first things it does is compile `miniruby` which is a small enough variant of the runtime and libraries that it can run `rake` and a couple other tools.

This is actually really sad to hear, because I've been waiting to see proper RPM's for the CoreCLR in Fedora for a long time, but having to constantly bring in new binaries to bootstrap the package is a huge headache that few package maintainers would be willing to take. Maybe provide the option to build a statically linked build that could be shoved into a different package and used to bootstrap a new dynamic version?

int_19h · on Oct 3, 2016

That sounds like it's partly due to poor componentization and/or dependency graph of the framework - correct me if I'm wrong here...

When I think of bootstrapping something like .NET, I would assume that most high-level stuff (i.e. large parts of the standard library, compiler and other tooling etc) are all 100% managed. So to run those, all you need is the VM and the part of stdlib that is implemented in native - the rest is just bytecode, and shouldn't have any OS dependencies, so you could easily redistribute and reuse the older build.

On the other hand, VM and native parts of stdlib, I would expect to be written in C or C++, and so, while that's where all those pesky dependencies are, that's also the code that you should be able to build with just as C/C++ compiler.

So the process would work like so: 1. Build VM. 2. Build minimal stdlib (native parts). 3. Use the preceding to run an older build of full stdlib and compiler, precompiled into bytecode. 4. Use the preceding to rebuild the compiler. 5. Use the preceding to rebuild the stdlib.

Which is also something that a fairly simple script should be able to tackle, such that the dev only needs to do "make bootstrap" or something similar.

So, where does this picture break?

josteink · on Oct 4, 2016

> 1. Build VM. 2. Build minimal stdlib (native parts). 3. Use the preceding to run an older build of full stdlib and compiler, precompiled into bytecode. 4. Use the preceding to rebuild the compiler. 5. Use the preceding to rebuild the stdlib. ... So, where does this picture break?

It breaks at point 3. There's 2 distinct things which breaks here, much due to what you call poor componentization.

To build the stdlib, you need Nuget (which is written in .NET) to resolve binary dependencies and the Roslyn compiler (also written in .NET). These two components are wrapped up in a simple to use all-in-one command line utility, much like Clojure's leininginen: Basically one tool to rule them all. This tool is also written in .NET.

This means you need a prebuilt .NET version, including the base stdlib, to run the toolchain and compiler needed to build the stdlib itself. In theory this should not have to present any issues. Basically so far so good.

The problem is that what you need to download will contain 1. corerun/corehost runtimes used to bootstrap the process (platform native binaries) and 2. The managed stdlib, which also contains platform specific binaries (Win32 system-calls on Windows, glibc on Linux, another Libc on FreeBSD, etc). These are not all bundled in one big package where binaries for all platforms are included, but as separate packages per supported platform (Windows, OSX, Fedora 23, Ubuntu 14, Ubuntu 16, etc)

So on a previously unbootstrapped platform you can run into any of the following issues:

1. Native components are dynamically linked, and may fail to resolve libraries if they have had their soname/version bumped, even though you can build these components yourself just fine. This is obviously a structural issue with the build process, which IMO should be fairly doable to solve.

2. Because the compiler itself is .NET -based, it relies on the managed stdlib, and this this library itself must be built for your platform (FreeBSD's implementation differs slightly from Linux, etc).

So on a platform where nobody has ever done any work before, the download for step 2 will fail and the build will crash.

To rectify this you will basically need to get acquainted with dotnet's multitude of individual repos, how they tie together, in what order they should be built, how their build-outputs should be combined, and all kinds of un/under-documented stuff, and do that manually... Phew! ... To package a base "SDK" which can be used to initiate the normal coreclr build-process, which previously failed. You may need to fake this SDK for several dependent repos.

That single step is incredibly complex, and I think only a handful of people on the whole internet knows how this is done.

Compare this to how things were previously: The process was bootstrapped automatically using mono, and every single coreclr developer knew how to get things up and running.

That this was greenlit without any further objections is something I find hard to believe.

int_19h · on Oct 4, 2016

I'm still not quite getting this part:

> The problem is that what you need to download will contain 1. corerun/corehost runtimes used to bootstrap the process (platform native binaries)

if they're native, why not just build them right there and then? Or is that where the dependency resolution tool (that is managed) creates a circular dependency?

The other problem, if I understand correctly, is that the managed code has a bunch of #ifdefs for various platforms. If, instead, it selected one codepath or another depending on the platform at runtime (e.g. for something like Process, which has to invoke radically different APIs depending on the platform, make it a very thin wrapper around Win32Process/UnixProces/..., and pick the appropriate factory in the static constructor), the same managed code bundle could be used everywhere. Except for a brand new platform, of course, but that is a problem that the Mono solution also has (if you have to target something that doesn't run Mono).

josteink · on Oct 5, 2016

> if they're native, why not just build them right there and then?

You typically do... But then the next build step says "DL this pre-made SDK kit" and doesn't overwrite it with what you've just built. So you will need to patch the build-process (in several repos) to overcome this.

> The other problem, if I understand correctly, is that the managed code has a bunch of #ifdefs for various platforms. If, instead, it selected one codepath or another depending on the platform at runtime (e.g. for something like Process, which has to invoke radically different APIs depending on the platform, make it a very thin wrapper around Win32Process/UnixProces/..., and pick the appropriate factory in the static constructor), the same managed code bundle could be used everywhere.

Correct. And so is your analysis.

But for several reasons that was not the solution which was chosen. Having an extra vtable or stackframe for every platform action invokable throughout the entire .NET framework was considered something to avoid as far as possible, with regard to performance and memory efficiency. And that's absolutely a valid concern to have, since it effectively will affect all .NET applications.

Remember: The goal here is for .NET Core to server as a base for .NET as delivered on the main, commercial Windows platform (including Azure!) with as little changes as possible. And in the cloud-space you don't want to impede your own performance.

Disclaimer: Things may have changed by now, but that at least was the state last time I poked into things.

lmm · on Oct 3, 2016

> I'm not just talking about "strange" outlier OSes like FreeBSD, NetBSD or Alpine Linux or whatever. Just trying to build for Fedora 24 (which doesn't have published packages) fails. This is because the Fedora 23 packages are linked against library-versions which are no longer in Fedora 24.

This sounds like more of an issue with Fedora than anything else. If Fedora 24 can't run programs for Fedora 23 (not ancient programs that happened to work on Fedora 23, but programs that were specifically built for Fedora 23), that's pretty ridiculous.

snuxoll · on Oct 3, 2016

Fedora maintains binary compatibility within a specific release, but major versions of software are updated every 6 months and if the ABI is different a soname bump will happen. Unless software in the Fedora repositories flat out will not build or run with a newer ABI-release of a dependency older versions of the library aren't typically packaged.

In all honesty it's virtually never an issue, most libraries maintain API compatibility and everything gets a mass-rebuild before alphas start to ship (though everyone tries to notify affected maintainers when they are going to do a soname bump on the mailing list) - so as long as you don't depend on old libraries for bootstrap of new package versions it's fine. There's plenty of compilers in the repositories that bootstrap themselves every release without giving grief, so Microsoft needs to either provide an external bootstrap mechanism again or reduce the dependencies needed to get it bootstrapped to preferably nothing more than glibc.

wtallis · on Oct 3, 2016

> If Fedora 24 can't run programs for Fedora 23 (not ancient programs that happened to work on Fedora 23, but programs that were specifically built for Fedora 23), that's pretty ridiculous.

As a general-purpose application framework/runtime, .NET probably has far broader dependencies than any one application. Maintaining binary compatibility for a dynamically-linked .NET framework is a tall order.

josteink · on Oct 3, 2016

> This sounds like more of an issue with Fedora than anything else. If Fedora 24 can't run programs for Fedora 23 (not ancient programs that happened to work on Fedora 23, but programs that were specifically built for Fedora 23), that's pretty ridiculous.

I've seen issues like this for significantly smaller packages on Ubuntu.

One example being exactimage, which for some reason weren't included in one single release around 14.04. Trying to copy a single binary across versions here, would cause similar breakage because of the versions of linked dependencies.

I think this is "symptomatic" of the Linux world and actually fairly common in software where source is assumed available and recompiling using that source a feasible task. Things like this is usually just not problem in the real world when things are properly packaged.

Again: for .NET the problem was not the dependency itself, but the root bootstrapping process.

masklinn · on Oct 3, 2016

Wouldn't a good cross-compilation story work for bootstrapping? Once you've got the system to run on platform A, you can cross-compile for B and use those artefacts to bootstrap B.

Nullabillity · on Oct 3, 2016

That doesn't really work for a system like Nix (or derivatives like Guix), where binary packages are only considered an optimization, and packages are always rebuilt without having the old version, or other user state, available.

kryptiskt · on Oct 3, 2016

How do they handle GHC or Rust which must have a compiler to build?

Ericson2314 · on Oct 3, 2016

Binaries. But we could in principle go back to them being written in other languages.

masklinn · on Oct 3, 2016

The initial installation becomes ridiculously painful though, and even that might not work if the original language as itself self-bootstrapped, or cross-compiled with something not available anymore (e.g. something on a different architecture)

oneplane · on Oct 3, 2016

But Nix and Guix need a base compiler to start with, right?

vertex-four · on Oct 3, 2016

Indeed, though the goal is to keep the closure of base dependencies as small as possible. They don't really want to be depending on anything other than a C/C++ compiler and a shell.

Ericson2314 · on Oct 3, 2016

Actually, criss compiling is a solution here. Hydra should be able to deal with packages on one platform depending on packages from another.

davexunit · on Oct 3, 2016

How does cross compiling help? How do you build the first Java from source?

JoshTriplett · on Oct 3, 2016

Cross-compiling allows you to just have one binary of the compiler, for a fast architecture like x86-64, and then bootstrap binaries for every architecture from there. Without cross-compiling, you need to have a bootstrap binary of the compiler for every architecture you want to support.

davexunit · on Oct 3, 2016

That's a different and less important problem.

mitchty · on Oct 3, 2016

As someone that has done this to get ghc building on an unsupported platform, it really is the only sane way to approach this chicken/egg problem. I'm slightly surprised that .net doesn't have a cross compilation option.

davexunit · on Oct 4, 2016

The sane way is to write a bootstrapping compiler, like some language implementations do.

mitchty · on Oct 4, 2016

ghc can compile to c to do bootstrapping, or a more standard 3 stage bootstrap.

There are pros/cons to both approaches.

davexunit · on Oct 4, 2016

Machine-generated C is not source code, so this doesn't solve the reproducibility problem.

mitchty · on Oct 5, 2016

So for languages like Haskell, or lets have fun and throw Idris in the mix too as its implemented in Haskell, what/how should they be changed to not have to bootstrap the way they do now so that reproducibility is preserved?

For languages that have effectively abandoned that route, what would satisfy your requirements? Which languages specifically handle things in a way you consider correct? Is the effort involved worth it if only reproducibility is the end goal?

I don't think its safe to say that a bootstrapping compiler is the only sane way. Even if System F is a fairly simple language and what GHC compiles to, having to keep a bootstrapping compiler around just for bootstrapping purposes seems like mostly wasted effort IMO.

justincormack · on Oct 3, 2016

Yes that's very useful. Go has that, NetBSD has that, its getting more common at last. It doesn't solvevall the problems but has other benefits.

Scaevolus · on Oct 3, 2016

Maybe releasing a container with .net core included and using that to bootstrap on other platforms could work? You would run the container with the source code and build directories from the host mounted and run the appropriate command.

It's not perfect, since now you have a dependency on Docker (or equivalent), but maybe it's still an improvement?

There's also some efforts like [snap](http://snapcraft.io/) to make it easier to release binaries with all their dependencies packaged together, but I expect making it work for a complex project like .NET is difficult.

loeg · on Oct 3, 2016

We call them "statically linked binaries". Yeah, that would work well for bootstrapping.

Scaevolus · on Oct 3, 2016

Statically linking binaries doesn't help when there's a bunch of auxiliary resources (bytecode/class libraries) that the compiler needs to run.

You can bake that into one binary in additional segments, but it can be fiddly and require more extensive code changes to make various fopen() calls read from the binary instead of a standalone file.

loeg · on Oct 3, 2016

Fortunately, this is not the problem described above.

danudey · on Oct 3, 2016

Likewise, in order to build Go from source you need to already have Go 1.4 installed. In an OS where you by default download source and compile it (e.g. Gentoo, MacPorts on OS X), this means downloading an old, unsupported, and unmaintained version of Go just to build it to run a newer version. I had an issue recently where, on my laptop, Go 1.4 refused to build, and now I have no versions of Go on my laptop. I could install from binaries from the website, but that wouldn't satisfy dependencies so I'm stuck for now.

civodul · on Oct 3, 2016

Seconded. With GCJ, Java had a bootstrapping path from C/C++, unlike all the current single-implementation languages: Rust, GHC, etc.

Having this path is important from a security and user freedom viewpoint: we could build OpenJDK from source.

Not having such a bootstrap path means relying on huge opaque binary blobs, which is a net loss.

pcwalton · on Oct 3, 2016

Well, for Rust the binary blob isn't opaque: you can go back through the source code history and find the exact source for every revision of the blob, all the way back to the OCaml bootstrap implementation. There's no loss of "user freedom" at any step, just a lot of complexity (but compared to the complexity of, say, GCC it's a drop in the bucket).

lmm · on Oct 3, 2016

Was bootstrapping GCJ from source any easier than bootstrapping OpenJDK would be? GCC itself is very complex and while it has a bootstrap process that still relies on having a C compiler AIUI.

davexunit · on Oct 3, 2016

GCC is in the set of bootstrap binaries, which needs to be kept as small as possible. It's imperative to raise awareness of this issue so that language implementors provide a way to bootstrap their compilers. For example, Guile Scheme can bootstrap itself from source using an interpreter written in C.

zck · on Oct 3, 2016

This is completely off-topic, but I've been having a hell of a time trying to figure out how to, on Guix, install the JDKs and then set JAVA_HOME properly. Is there documentation about how to do that? I'm sure I'm just missing something.

I'm happy to talk either here, or my email's in my profile, if you prefer.

rjsw · on Oct 3, 2016

I would have thought that Jikes was a better bootstrap compiler than GCJ.

mickronome · on Oct 3, 2016

I think Jikes RVM needed some kind of bytecode interpreter to build itself, but I don't think it would need to be very advanced. Slow would also be OK since the part getting compiled was not terribly big. Don't know if it could help in any practical way, but it's a fun piece of technology!

nn3 · on Oct 3, 2016

Isn't jikes written in Java? You would need to bootstrap it first.

correktlogic · on Oct 3, 2016

I was a contributor to Jikes back in the day. The open sourced IBM Jikes compiler was written in C++. But it didn't produce native code. It produced .class files from .java files.

rjsw · on Oct 3, 2016

You need to be able to produce .class files, as well as C headers, to bootstrap the Sun JDK. You don't need to compile Java to native code.

correktlogic · on Oct 3, 2016

Jikes didn't have any javah-like functionality to generate .h files to my knowledge.

lsd5you · on Oct 3, 2016

It was used for creating the native builds of XWT [1] 15 years ago. At the time I found it amazing to see Java programs becoming native (a lot snappier and generally faster too). It seemed (naively) to be beyond what was possible.

However, the main problem with GCJ was it used conservative garbage collection - boehm GC - and as such was fundamentally broken for long running programs (what java is good for). Without wanting to denigrate it too much, it seems to me that a lot of 'good' work has been put into this approach which was essentially a cul-de-sac.

I had a related rather general thought recently about documenting features in projects. Sometimes the hardest thing to know is whether something is in fact a good idea or not (both in theory and in practice) and the amount of effort/quality of documentation is probably not a good indicator of this. So in the case of the GCJ garbage collector, there was always a lot of talk about how clever the garbage collector was and all the technical aspects, but unless you were already an expert, how would you know if it was infact a good idea or not.

[1] That is this one, there are several now doing similar things (!). http://www.xwt.org/

The code lives on here https://sourceforge.net/projects/vexi/

correktlogic · on Oct 3, 2016

Anyone know if gcj ever experimented with a precise GC? Or did the design of GCC simply not allow for a non-conservative GC implementation?

gumby · on Oct 3, 2016

Historical. We implemented the Bohm GC at Cygnus, funded by Metaphor back around 92/93 if I remember. GCC is / was very conservative.

dragonquest · on Oct 3, 2016

This is why you should always create something, even if you think it will never be accepted by a large audience. It is worth it for the journey itself.

Reading this post made me wish I had spent more time with 'gcj'. Though in all honesty I know I gave up on it because I didn't want to go the tougher way - the the normal JDK was just there. Creation is beautiful, and its ending bittersweet for the creator and the witnesses.

TheCondor · on Oct 3, 2016

I live in the moment! Enjoy and cherish those magical times when the work is right, the team is right, and you're having fun. Don't let the stupid stuff bother you.

You can rebuild teams like that, find those projects, etc.. but it feels like an eternity when you've had it and then don't have it for a while.

aardvark179 · on Oct 3, 2016

http://openjdk.java.net/jeps/8166089

One Java AOT option goes away, and a new one is arriving.

bcg1 · on Oct 3, 2016

There is Avian too

https://readytalk.github.io/avian/

iso-8859-1 · on Oct 3, 2016

Hopefully the new one can replace RoboVM as well...

chrisseaton · on Oct 3, 2016

I'm afraid the job of the new AOT is not to produce standalone binaries - it's code to be loaded into the JVM, not to run without it.

okket · on Oct 3, 2016

GCJ was deemed obsolete in 2010, amazing that it was still around until now...

http://stackoverflow.com/questions/3032727/java-jre-vs-gcj

Did anyone used GCJ for something in production?

marktangotango · on Oct 3, 2016

I evaluated gcj to use in a shareware desktop application in the 2000's. The goal was to compile java + SWT into native binaries for distribution. The goal was a slim installer that didn't require the user to futz with installing java, or bunding a jre. Ended up going with Excelsior JET which is a fantastic project.

In the end, the full java spec doesn't lend itself to aot, because if you claim to be "real" java you have to include support for runtime classloading, which means there has to be a bytecode interpreter embedded in there somewhere. My thoughts are now, if you really want a java/c# style aot language, vala is the way forward.

amyjess · on Oct 3, 2016

> My thoughts are now, if you really want a java/c# style aot language, vala is the way forward.

What about D?

my123 · on Oct 3, 2016

C# fully supports AOT since a while.

swift · on Oct 3, 2016

I'm pretty sure that C# AOT does not eliminate the need for a JIT in the runtime system. Just as one example, how could expression trees be implemented without one?

pjmlp · on Oct 3, 2016

You cannot use them.

In .NET Native you have to explicitly inform the compiler of any kind of reflection behavior you might require.

zgramana · on Oct 4, 2016

Not quite right. Mono has the lovely mkbundle tool which can do what the poster wants: AOT your intermediate/byte code and build a statically linked executable containing your app, the runtime, and all your deps (except for key system dynamic libraries like libc or Security.framework, similar to how go works).

No restrictions on reflection, unlike .NET Native (which has slightly different design goals).

I've used it many times for just this reason with great results. Plus, the runtime was re-licensed to MIT/X11.

pjmlp · on Oct 5, 2016

My .NET experience is only on Windows, thanks for the clarification.

lmm · on Oct 3, 2016

I was part of a student society that used GCJ for the code that ran everything (including generating the website). I have not-so-fond memories of the porting effort to get it working on Sun Java and MySQL 5 (it had originally used 4). It was a different time to be sure.

krylon · on Oct 3, 2016

IIRC, Fedora at one point shipped some Java programs compiled with GCJ, for example Eclipse.

int_19h · on Oct 3, 2016

One nice thing that GCJ had was fairly smooth integration with C++ via CNI - much higher level than JNI, and easier to use.

https://gcc.gnu.org/onlinedocs/gcj/About-CNI.html#About-CNI

Could be used for scripting and extensibility purposes.

correktlogic · on Oct 3, 2016

What a shame about gcj being dropped. I remember using GCJ with CNI to interface with native C++ code was a much more pleasant experience than using JNI.

https://gcc.gnu.org/onlinedocs/gcj/About-CNI.html#About-CNI

wodencafe · on Oct 3, 2016

It seems the most beloved feature of GCJ was its ability to compile Java to native code.

Is this something that could be implemented in the Java JDK / OpenJDK? Understandably a lot of work, but Oracle is looking at AOT now, for performance reasons.

The performance improvements could be taken even further by something like this, Java to native code compilation.

myhf · on Oct 3, 2016

Google cache: http://webcache.googleusercontent.com/search?q=cache:zFK8UNR...

akx · on Oct 3, 2016

gcj being killed means someone needs to reimplement pdftk in... well, something else. D:

LukeShu · on Oct 3, 2016

Indeed, that was my concern as well.

The best solution is probably to replace the C++ bits with Java, and go full JVM.

akx · on Oct 4, 2016

Yep. I recently had to get pdftk running on an old Centos-or-whatever system, and that ended up being a big fat no-go, so I reimplemented the functionality we needed from pdftk (filling forms) with just Java: https://github.com/akx/foofdf

pritambarhate · on Oct 4, 2016

Not directly related to topic, has anybody used Excelsior JET to compile Java apps to native code? How was your experience? Did it improve the start up time of the app? Did you find it any faster than using a JVM?

fake-name · on Oct 3, 2016

I skimmed the article, and all the comments here, and I still have no fucking idea what "gcj" is. It's apparently related to java. Maybe at least define your acronym once?

tombert · on Oct 3, 2016

Yeah, I mean, it's just incredibly hard to go on Wikipedia and look up "gcj". All I got was this extremely informative page about an open source java compiler: https://en.wikipedia.org/wiki/GNU_Compiler_for_Java.