That sounds like it's partly due to poor componentization and/or a poor dependency graph in the framework - correct me if I'm wrong here...
When I think of bootstrapping something like .NET, I would assume that most of the high-level stuff (i.e. large parts of the standard library, the compiler, and other tooling) is 100% managed. So to run those, all you need is the VM and the part of the stdlib that is implemented natively - the rest is just bytecode and shouldn't have any OS dependencies, so you could easily redistribute and reuse an older build.
On the other hand, the VM and the native parts of the stdlib I would expect to be written in C or C++, and so, while that's where all those pesky dependencies are, that's also the code that you should be able to build with just a C/C++ compiler.
So the process would work like so:
1. Build VM.
2. Build minimal stdlib (native parts).
3. Use the preceding to run an older build of full stdlib and compiler, precompiled into bytecode.
4. Use the preceding to rebuild the compiler.
5. Use the preceding to rebuild the stdlib.
Which is also something that a fairly simple script should be able to tackle, such that the dev only needs to run "make bootstrap" or something similar.

So, where does this picture break?
> 1. Build VM. 2. Build minimal stdlib (native parts). 3. Use the preceding to run an older build of full stdlib and compiler, precompiled into bytecode. 4. Use the preceding to rebuild the compiler. 5. Use the preceding to rebuild the stdlib. ... So, where does this picture break?
It breaks at point 3. There are two distinct things which break here, largely due to what you call poor componentization.

To build the stdlib, you need NuGet (which is written in .NET) to resolve binary dependencies, and the Roslyn compiler (also written in .NET). These two components are wrapped up in a simple-to-use, all-in-one command-line utility, much like Clojure's Leiningen: basically one tool to rule them all. This tool is also written in .NET.

This means you need a prebuilt .NET version, including the base stdlib, to run the toolchain and compiler needed to build the stdlib itself. In theory this shouldn't present any issues. So far, so good.

The problem is that what you need to download will contain 1. corerun/corehost runtimes used to bootstrap the process (platform native binaries) and 2. the managed stdlib, which also contains platform-specific binaries (Win32 system calls on Windows, glibc on Linux, another libc on FreeBSD, etc.). These are not all bundled in one big package containing binaries for all platforms, but shipped as separate packages per supported platform (Windows, OSX, Fedora 23, Ubuntu 14, Ubuntu 16, etc.).
So on a previously unbootstrapped platform you can run into any of the following issues:
1. Native components are dynamically linked, and may fail to resolve libraries if they have had their soname/version bumped, even though you can build these components yourself just fine. This is obviously a structural issue with the build process, which IMO should be fairly doable to solve.
2. Because the compiler itself is .NET-based, it relies on the managed stdlib, and this library itself must be built for your platform (FreeBSD's implementation differs slightly from Linux's, etc.; see the sketch below).
So on a platform where nobody has ever done any work before, the download for step 2 will fail and the build will crash.
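To make the second issue above concrete, here is a deliberately simplified, hypothetical sketch of the kind of compile-time platform split that makes the managed stdlib platform-specific. The class name, the WINDOWS symbol and the exact P/Invokes are made up for illustration; the real corefx sources are organized differently, but the effect is the same: the platform choice is baked into the compiled assembly.

```csharp
using System.Runtime.InteropServices;

// Hypothetical simplification -- not actual corefx source.
// The platform decision is made when the assembly is compiled, which is
// why each platform needs its own build of the managed stdlib.
public static class PlatformFile
{
#if WINDOWS
    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern bool DeleteFile(string path);

    public static void Delete(string path) => DeleteFile(path);
#else
    [DllImport("libc", SetLastError = true)]
    private static extern int unlink(string path);

    public static void Delete(string path) => unlink(path);
#endif
}
```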
To rectify that failure, you basically need to get acquainted with dotnet's multitude of individual repos, how they tie together, in what order they should be built, how their build outputs should be combined, and all kinds of un- or under-documented stuff, and do all of that manually... Phew! ... All to hand-package a base "SDK" which can be used to initiate the normal coreclr build process, the one which previously failed. You may need to fake this SDK for several dependent repos.

That single step is incredibly complex, and I think only a handful of people on the whole internet know how this is done.

Compare this to how things were previously: the process was bootstrapped automatically using Mono, and every single coreclr developer knew how to get things up and running.
That this was greenlit without any further objections is something I find hard to believe.
> The problem is that what you need to download will contain 1. corerun/corehost runtimes used to bootstrap the process (platform native binaries)
if they're native, why not just build them right there and then? Or is that where the dependency resolution tool (which is managed) creates a circular dependency?
The other problem, if I understand correctly, is that the managed code has a bunch of #ifdefs for various platforms. If, instead, it selected one codepath or another depending on the platform at runtime (e.g. for something like Process, which has to invoke radically different APIs depending on the platform, make it a very thin wrapper around Win32Process/UnixProcess/..., and pick the appropriate factory in the static constructor), the same managed code bundle could be used everywhere. Except for a brand new platform, of course, but that is a problem that the Mono solution also has (if you have to target something that doesn't run Mono).
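A minimal sketch of that shape, assuming hypothetical names (PortableProcess, IProcessImpl, Win32ProcessImpl and UnixProcessImpl are made up for illustration; this is not how corefx is actually structured):

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical sketch of the runtime-dispatch idea -- not actual corefx code.
// The same managed assembly would ship on every platform; the concrete
// implementation is chosen once, when the type is initialized.
internal interface IProcessImpl
{
    int Start(string fileName, string arguments);
}

internal sealed class Win32ProcessImpl : IProcessImpl
{
    // Would P/Invoke CreateProcess and friends.
    public int Start(string fileName, string arguments) =>
        throw new NotImplementedException();
}

internal sealed class UnixProcessImpl : IProcessImpl
{
    // Would P/Invoke fork/execve and friends.
    public int Start(string fileName, string arguments) =>
        throw new NotImplementedException();
}

public static class PortableProcess
{
    private static readonly IProcessImpl s_impl;

    // The factory decision happens exactly once, in the static constructor,
    // so no per-platform compile-time split is needed in the shipped IL.
    static PortableProcess()
    {
        s_impl = RuntimeInformation.IsOSPlatform(OSPlatform.Windows)
            ? (IProcessImpl)new Win32ProcessImpl()
            : new UnixProcessImpl();
    }

    // Every call goes through one extra interface dispatch -- that extra
    // indirection is the overhead discussed in the reply below.
    public static int Start(string fileName, string arguments) =>
        s_impl.Start(fileName, arguments);
}
```

The trade-off is that single extra level of indirection on every platform call, which is exactly the performance concern raised in the reply below.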
> if they're native, why not just build them right there and then?
You typically do... But then the next build step says "DL this pre-made SDK kit" and doesn't replace that with what you've just built. So you will need to patch the build process (in several repos) to overcome this.
> The other problem, if I understand correctly, is that the managed code has a bunch of #ifdefs for various platforms. If, instead, it selected one codepath or another depending on the platform at runtime (e.g. for something like Process, which has to invoke radically different APIs depending on the platform, make it a very thin wrapper around Win32Process/UnixProcess/..., and pick the appropriate factory in the static constructor), the same managed code bundle could be used everywhere.
Correct. And so is your analysis.
But for several reasons that was not the solution which was chosen. Having an extra vtable lookup or stack frame for every platform call throughout the entire .NET framework was considered something to avoid as far as possible, for performance and memory-efficiency reasons. And that's absolutely a valid concern to have, since it will effectively affect all .NET applications.
Remember: The goal here is for .NET Core to serve as a base for .NET as delivered on the main, commercial Windows platform (including Azure!) with as few changes as possible. And in the cloud space you don't want to impede your own performance.
Disclaimer: Things may have changed by now, but that at least was the state last time I poked into things.