The .NET Native Tool-Chain

jbevain · on May 10, 2014

It's pretty cool to see .NET catching up with Mono on that front ;)

Fun fact, MonoTouch applications are built with a very similar pipeline, except that LLVM does the compilation in the end.

Steps 4 and 5 in the blog post map quite exactly to the Mono.Linker and the Mono.Tuner. Both tools use an XML file format for whitelisting pieces of code that need to be included. That's useful if you're calling code through reflection at runtime, since those calls are not statically discoverable at build time.
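To make the whitelisting point concrete, here is a minimal sketch of what a linker pass does, assuming a toy call graph. The method names and the `link` function are made up for illustration, and this is not the Mono.Linker API or its XML format. The idea is just reachability: keep everything reachable from the entry point, plus anything listed as preserved because it is only reached through reflection.

```python
from collections import deque

# Hypothetical call graph: method -> methods it references statically.
CALL_GRAPH = {
    "App.Main": ["App.Run"],
    "App.Run": ["Json.Parse"],
    "Json.Parse": [],
    "Plugin.Load": ["Plugin.Init"],  # only reached via reflection at runtime
    "Plugin.Init": [],
    "Legacy.Unused": [],
}


def link(call_graph, entry_points, preserved=()):
    """Return the set of methods a tree-shaking linker would keep."""
    keep = set()
    queue = deque(list(entry_points) + list(preserved))
    while queue:
        method = queue.popleft()
        if method in keep:
            continue
        keep.add(method)
        # Everything this method references statically must be kept too.
        queue.extend(call_graph.get(method, []))
    return keep


# Without a whitelist, the reflection-only code is stripped.
without_list = link(CALL_GRAPH, ["App.Main"])
# With "Plugin.Load" whitelisted, it and its dependencies survive.
with_list = link(CALL_GRAPH, ["App.Main"], preserved=["Plugin.Load"])

print(sorted(without_list))
print(sorted(with_list))
```

In the sketch, `Plugin.Load` is dropped unless it is whitelisted, which is exactly the failure mode you hit when reflection targets are invisible to static analysis.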

dochtman · on May 10, 2014

Yeah, the MDIL thing sounds like it could have been an LLVM IR derivative. Too bad they always have to NIH things (though I guess in this case there might be legitimate reasons to do something different than the level of LLVM IR).

bunderbunder · on May 10, 2014

> Too bad they always have to NIH things (though I guess in this case there might be legitimate reasons to do something different than the level of LLVM IR).

I'm going to guess that in this case, their reason to not go with LLVM was the most compelling one possible: When Microsoft was designing the .NET toolchain, LLVM didn't exist yet.

Given the timing of when it came out, I wouldn't even be surprised if LLVM was at least partially envisioned as an open-source answer to .NET. In which case there's a hint of NIH behind LLVM, with Mono being the non-NIH open source option.

xyziemba · on May 10, 2014

MDIL is designed to solve a different problem than LLVM IR. LLVM IR is an exchange format between a compiler frontend and backend. MDIL is designed to solve certain fragility problems in binaries.

.NET Native uses MDIL to create a looser coupling between NUTC and the runtime. This means NUTC doesn't have to understand the precise binary layout of the runtime. The binder takes care of fixing that up!

MDIL has more advanced capabilities too. For more background on MDIL, I highly recommend watching the video I linked to in the blog post: http://channel9.msdn.com/Shows/Going+Deep/Mani-Ramaswamy-and...
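As a rough illustration of the looser coupling described above, here is a toy sketch. It is not the real MDIL encoding: the instruction tuples, the `field_offset` token, the `Customer` type, and the fixed 8-byte field size are all assumptions made up for the example. The compiler emits a symbolic token instead of a hard-coded field offset, and a separate bind step resolves it against whatever layout the runtime actually has.

```python
# Hypothetical compiled method: ordinary instructions plus a symbolic token
# where a hard-coded field offset would normally appear.
COMPILED = [
    ("load", ("field_offset", "Customer", "name")),
    ("ret", None),
]

# Field order as two hypothetical runtime versions might define it.
RUNTIME_V1 = {"Customer": ["id", "name"]}
RUNTIME_V2 = {"Customer": ["id", "flags", "name"]}

FIELD_SIZE = 8  # assumption: every field occupies 8 bytes


def bind(code, layouts):
    """Replace symbolic field tokens with concrete byte offsets."""
    bound = []
    for opcode, operand in code:
        if isinstance(operand, tuple) and operand[0] == "field_offset":
            _, type_name, field = operand
            operand = layouts[type_name].index(field) * FIELD_SIZE
        bound.append((opcode, operand))
    return bound


bound_v1 = bind(COMPILED, RUNTIME_V1)
bound_v2 = bind(COMPILED, RUNTIME_V2)
print(bound_v1)
print(bound_v2)
```

The same compiled input binds to different offsets for the two layouts, so a layout change in the runtime only requires re-running the bind step, not recompiling.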

pjmlp · on May 10, 2014

> Yeah, the MDIL thing sounds like it could have been an LLVM IR derivative.

It comes from the Singularity OS compiler toolchain, which is older than LLVM.

It is also used to deploy .NET applications as native code on Windows Phone 8 devices.

gsnedders · on May 10, 2014

LLVM goes back to 2003, and Sing# is based on Spec# which is only 2004?

pjmlp · on May 10, 2014

Wikipedia mentions 2003, and we can assume there was already work going on before then.

http://en.wikipedia.org/wiki/Singularity_%28operating_system...

Although the first published papers are indeed from 2004.

gsnedders · on May 10, 2014

Work on LLVM started in 2000, FWIW.

gsnedders · on May 10, 2014

Competition isn't bad per se, and if we're assuming they aren't using LLVM, what gains are there to using the IR? More front-ends, I suppose. But I wouldn't be surprised if they had more machine-dependent behaviour than LLVM IR (which, let's be honest, includes a fair bit), especially for things like overflow detection. There are also pretty bad limitations to LLVM IR's ability to deal with GC roots, which for MDIL will be especially important.

stephengillie · on May 10, 2014

MS is one of the very few organizations to whom I'll give an NIH pass, if for no other reason than merely to have someone exploring alternate implementations.

macca321 · on May 11, 2014

I don't know if there would be any point to Microsoft if they didn't NIH.

zvrba · on May 10, 2014

LLVM IR is machine-independent, so the NIH accusation is not really fair.

MDIL is described as the platform's native machine code plus some extension tokens (to avoid direct encoding of pointers, for example).

A similar LLVM IR derivative would no longer resemble LLVM IR by a long shot, so the property of being "derivative" would not buy you anything.

gsnedders · on May 10, 2014

LLVM IR is machine-independent? What? There's plenty that isn't machine-independent, from the obvious (e.g., x86_fp80), to the less obvious (to create IR in the first place one has to have knowledge of the target ABI), what integer types are supported (i64 isn't supported everywhere!), etc.

barrkel · on May 10, 2014

LLVM IR is not machine-independent, and is documented specifically as not designed to enable portable IR.

Sephr · on May 10, 2014

PNaCl's LLVM IR is machine independent and can be used standalone without Chrome.

gecko · on May 10, 2014

While true in a narrow sense, it does so by targeting the lowest common practical denominator—in practice, a chip that looks a lot like a 386 or 32-bit ARM with no SIMD pipeline. That puts it as little more than a better GNU Lightning.

Sephr · on May 17, 2014

It has SIMD: https://developer.chrome.com/native-client/reference/pnacl-c...

logicchains · on May 10, 2014

Hopefully it's not like the CIL (CLR intermediate language), which is competitive with Java in verbosity. E.g.

.class public Foo
{
    .method public static int32 Add(int32, int32) cil managed
    {
        .maxstack 2
        ldarg.0 // load the first argument;
        ldarg.1 // load the second argument;
        add     // add them;
        ret     // return the result;
    }
}
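For anyone unfamiliar with stack-based bytecode, here is a minimal sketch of how the body of that Add method evaluates, assuming a toy interpreter that understands only these four opcodes. The `run` function is hypothetical and ignores int32 overflow wrapping, so it is not how the CLR executes CIL.

```python
def run(instructions, args):
    """Evaluate a tiny subset of CIL on an explicit evaluation stack."""
    stack = []
    for op in instructions:
        if op.startswith("ldarg."):
            # ldarg.N pushes argument N onto the evaluation stack.
            stack.append(args[int(op.split(".")[1])])
        elif op == "add":
            # add pops two values and pushes their sum.
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "ret":
            # ret returns the value on top of the stack.
            return stack.pop()
        else:
            raise ValueError(f"unsupported opcode: {op}")
    raise ValueError("method ended without ret")


ADD_BODY = ["ldarg.0", "ldarg.1", "add", "ret"]
result = run(ADD_BODY, [2, 3])
print(result)
```

The stack never holds more than two values here, which is what the `.maxstack 2` directive declares.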

If you ever look at a disassembled CLR executable, expect to see a heap of lines like:

valuetype A* modopt([mscorlib]System.Runtime.CompilerServices.CallConvThiscall) 'A.{ctor}'(valuetype A* modopt([mscorlib]System.Runtime.CompilerServices.IsConst) modopt([mscorlib]System.Runtime.CompilerServices.IsConst))

jbevain · on May 10, 2014

CIL, just like LLVM IR, may be human-readable, but it is not really meant to be written by hand, unlike Java. Not sure what you mean here.

logicchains · on May 11, 2014

LLVM IR is much less verbose than CIL, isn't class-based, and on the whole is much simpler. The CIL's complexity makes it less flexible than something like LLVM IR or JVM bytecode; this is one of the reasons alternative languages have flourished on the JVM, whereas the CLR only has C# and F#, and F# had to rely on the developers' influence at Microsoft to get the runtime to support some of its features.

memsom · on May 12, 2014

Err... there's more than C# and F# targeting the CLR. Even just from Microsoft (VB.Net and Managed C++), but there are a heap more.

logicchains · on May 13, 2014

I said flourished. Are there any non-MS languages on the CLR that have achieved equal popularity to that of Clojure, Scala or Groovy on the JVM?

vorg · on May 13, 2014

> achieved equal popularity to that of Clojure, Scala or Groovy on the JVM

Clojure and Scala seem to be tied for 2nd place in popularity behind Java, but Groovy's lagging far behind. Were you looking at the bintray-maven download stats at https://bintray.com/groovy/maven/groovy/view/statistics for Groovy when you put it in with Scala and Clojure? Even though those stats show 660,000 downloads of Groovy over the past month, click on country and you'll see 625,000 of them came from a server in China, only 12,000 from the US, and 2000 from Germany, the 3 biggest countries. Obviously the China stats are fabricated. Groovy's true popularity is far behind Scala and Clojure.

logicchains · on May 13, 2014

I was just listing what I assumed were the top 3 most popular non-Java languages on the JVM. But perhaps it's already been overtaken by Kotlin or Ceylon.

mattgreenrocks · on May 10, 2014

I'm curious what the minimum size of a .NET-native executable is.

sz4kerto · on May 10, 2014

Wonder how this handles dynamic code generation (System.Reflection.Emit and the rest).

danbruc · on May 10, 2014

...or how easy it is to break by doing a lot of reflection.

adandy · on May 10, 2014

In one of the panels at Build somebody mentioned that expression trees would be interpreted at run time rather than JIT compiled so at least you have that.

x0n · on May 12, 2014

Yep, I asked the question live and the answer was that dynamic code would use an interpreter where AOT was not feasible.
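To show what "use an interpreter" means in practice, here is a minimal sketch of evaluating an expression tree by walking it, with no code generation at all. The tuple-based node format and the `interpret` function are assumptions for illustration, not the System.Linq.Expressions API or whatever .NET Native actually ships.

```python
import operator

# Supported binary operators for this toy tree format.
OPS = {"+": operator.add, "*": operator.mul}


def interpret(node, env):
    """Evaluate an expression tree by walking it; no code is generated."""
    kind = node[0]
    if kind == "const":
        return node[1]
    if kind == "param":
        return env[node[1]]
    if kind == "binary":
        _, symbol, left, right = node
        return OPS[symbol](interpret(left, env), interpret(right, env))
    raise ValueError(f"unknown node kind: {kind}")


# Tree for the expression (x + 1) * y
TREE = ("binary", "*",
        ("binary", "+", ("param", "x"), ("const", 1)),
        ("param", "y"))

value = interpret(TREE, {"x": 2, "y": 5})
print(value)
```

The trade-off is the usual one: every evaluation pays the tree-walking cost, but nothing needs writable and executable memory at runtime.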

CurtHagenlocher · on May 10, 2014

It doesn't, or at least doesn't yet.

_wmd · on May 10, 2014

It's worth noting that if this approach becomes popular, it raises the bar for running code from Windows on top of Mono without any changes, although this often never worked anyway except for simpler apps with few deps.

At least for me, this was a small part of the original appeal of C#/Mono.

skrebbel · on May 10, 2014

I don't see how: it always goes via MSIL and Mono runs MSIL. Admittedly, apps or libraries published solely as "native" binaries won't run on Mono, but I strongly suspect that those are typically native end-user apps which are very often Windows-specific already. Windows Store stuff, and similar.

(I also strongly suspect that this entire "native" move was fueled purely by the fact that top-of-the-line Windows 8 computers take 5 seconds to start up the damn Weather app)

On the library side, there's a strong move towards the opposite, actually: you can now make a (MSIL) DLL that runs on Windows, Windows Phone, iPhone, Android, Mono, etc etc without having to distribute different versions of the binary (basically, what Java always had). It's overdue, but it's really nice.

I'm not worried (and I run a dev team that runs a C# backend on mono/linux).

bunderbunder · on May 10, 2014

> I also strongly suspect that this entire "native" move was fueled purely by the fact that top-of-the-line Windows 8 computers take 5 seconds to start up the damn Weather app

Bingo. Especially on 64-bit machines, the time it takes for apps to launch can be downright embarrassing.

Locke1689 · on May 10, 2014

Kind of.

If you fully understand Microsoft's portfolio it makes more sense.

Desktop Windows 8 is a scenario, but not a very good representative. While 64-bit JIT time can take a while and CLR startup time can be really long, both of these things can be addressed in a variety of ways in vanilla Windows machines, ranging from NGen improvements to CLR pre-loading strategies.

The real killer is alternate platforms. Every time you JIT in a mobile device it wastes time, CPU power, and battery life, all of which are much more limited than in a desktop machine. CLR startup can be mitigated somewhat, but resident memory is an especially precious commodity on mobile.

The last real kicker is that security concerns mean that many platforms (e.g. Xbox) mark a lot of their pages as NX, making JIT-ing in these pages very difficult or impossible. Moreover, better security infrastructure means doing this on more platforms more of the time, so this will become increasingly problematic.

All together, this makes a very compelling case for AOT compilation.

Maarten88 · on May 10, 2014

The Weather App, as I understand it, is built by the Bing team, who use WinJS (JavaScript) for their apps. This native compiler backend is for .NET, so they won't benefit from this at all.

skrebbel · on May 10, 2014

Damn, there goes my theory. But then why the hell so slow?

chinpokomon · on May 11, 2014

It was vastly improved with the 8.1 Update 1, but I'll grant you it was depressingly slow on Windows 8 and the 8.1 upgrade.

pjmlp · on May 10, 2014

> It's worth noting that if this approach becomes popular

I hope so. In the 90's we had strongly typed languages with native compilers, but somehow we got into this VM detour because everyone wanted to create applets.

And those languages[1], for multiple reasons, faded away.

[1] Oberon(-2), Object Pascal, Modula-2, Modula-3, ...

mbel · on May 10, 2014

Very true. It's not exactly rational, but it always makes me sad when I realize that the popularity of a given language depends only slightly on the quality of its design (Haskell seems to be a counterexample: it survived years in obscurity and now seems to be gaining popularity).

alok-g · on May 10, 2014

Could someone tell me what the pros and cons of MSIL vs. native are? The article introduction (I did not read the rest) only talks about the benefits of native. But if there were only benefits, why would Microsoft have gone to MSIL in the first place? Apologies if I am missing something obvious.

bunderbunder · on May 10, 2014

When .NET first came out, the "write once, run everywhere" idea presented a lot more value. There were a lot more CPU architectures being used in business. (The old Unix vendors and their proprietary RISC architectures were waning by then, but were still a force in the industry - see Java.) Macs were still running on PowerPC. And the PC platform itself was looking like it might take a turn for the heterogeneous. IA-32 was still dominant, but the need to move toward 64-bit had been recognized. Intel's answer, Itanium, was a completely different architecture that came out less than a year before .NET hit the market. AMD64 had been announced and was expected to hit the market fairly soon. With Windows computers having a choice among three different architectures that weren't binary-compatible, and the possibility that enterprise customers might still end up sticking with SPARC or MIPS for the long haul, the bytecode option probably was the most strategic option for Microsoft at the time.

Even now, having an intermediary binary format's still fairly valuable. I could conceivably get a component from a vendor and put it into production on a slew of different platforms - Intel chips running Windows, Linux and OS X, phones with A5 or ARM chips, etc. without having to mess around with source code distributions, other people's funky build configs, etc. The ones going on smartphones would eventually be native-compiled, but getting bytecode binaries means I don't need to fiddle with separate native binaries for different target platforms.

It's also worth noting that what the Java and .NET platforms have achieved in terms of language interoperability hasn't really been matched by interpreted and native-compiled languages. That's arguably a result of culture rather than any sort of technical limitation. OTOH, the technical limitation that the .NET and Java virtual machines define object formats, function signatures, and whatnot does go a long way in facilitating interoperability.

Finally, the .NET and Java assembly languages are dead easy to target, especially compared to those of complex architectures like x86. Even if shipping native binaries becomes dominant in the .NET space, MSIL won't be going away. And it makes everyone's life easier to not have to reinvent the "native code generation" wheel for every possible programming language/CPU architecture pairing. LLVM's a big deal for the same reason, and to some extent the only really big difference between it and .NET or Java is that it doesn't encourage people to package bytecode up into executables.

pjmlp · on May 10, 2014

Because Microsoft wanted to compete with Java. The initial versions of .NET offered their own version of Applets as successor of ActiveX.

Actually, the early designs of .NET were COM based and known as COM+ VOS. If you read the design document you will find many similarities with the new Windows Runtime.

stephengillie · on May 10, 2014

How much will this be able to work with old ASP code?

We've got boat-loads of it, and while we're slowly retiring it, we can't even diagnose trouble-spots because most monitoring tools (like New Relic) don't instrument it.

bananas · on May 10, 2014

It's dead. Start again.

stephengillie · on May 10, 2014

Dead or not, it's all we've got to serve some of the biggest US residential real estate corporations.

We are starting again, the mobile site for 1 of our 33,000 domains is in testing, but it will be over a year before we have most of our customers using it.

bananas · on May 10, 2014

The end has been marked for over a decade, you realise?

stephengillie · on May 10, 2014

I haven't quite been working there a year yet. We're basically cannibalizing the entire company (codebase, data import methods, data storage, website and image hosting, MLS data warehousing, server and network infrastructure) to relaunch the company under the same name.

We're in the painful process of upgrading off ASP, DefaultHttpHandler code that requires 15-min recycle limits, 32-bit DLLs with a GAC refresh process, 32-bit IIS 6.0, MS NLBs, SQL 2000 DTS and undocumented built-in-house data-import processes, SQL 2005 main database servers, PowerEdge 2850s, and sites that are neither cached nor accelerated.

bananas · on May 12, 2014

That's a living hell. I wish you good fortune.

tom_jones · on May 10, 2014

Great post! Can't wait to see where .NET Native will go.

tonyedgecombe · on May 11, 2014

Hopefully into drivers.

escaped_hn · on May 10, 2014

Are people still able to decrypt the MDIL exe to get the readable source code?

xyziemba · on May 10, 2014

MDIL is mostly native machine code (e.g. ARM or x64 instructions). Transforming MDIL back to source code would require significant analysis and effort. It would be more like transforming a compiled C++ application back into C++.

Also, note that end users wouldn't be trying to transform MDIL back to source code. MDIL goes through the 'binding' step that transforms it to native machine code.

jonathanoliver · on May 10, 2014

Kinda sounds like Golang, except not as platform independent.

seabrookmx · on May 10, 2014

Not really. I've written a bit of Go and a lot of C#, and can tell you from my experience that they are very different tools. C# is a very feature rich language like C++, while avoiding the majority of the 'gotchas' in C++ (in obvious areas such as memory management). Generic types, Abstract Classes, LINQ etc. all can be very productive if you know how/when to use them.

Golang is much more minimal. It does have some very nice concurrency features that could be a big win for some users. C# is fine at concurrency, but Golang's way of integrating it into the language is definitely more streamlined.

IMO pick the best tool for the job. In many cases, large enterprise pieces of software that want the mature tooling of C# will benefit from .NET Native. /rant

leaveyou · on May 10, 2014

I wonder why they downvoted you. I'm a long time C# dev and recently used Go for some projects and when I saw this I was thinking exactly the same.