First of all, we do care very much about supporting compilers for managed languages like Java and C#, but we're starting with this first version that only supports zero GC and atomic primitive types. We have plans to grow outwards from there to support structured binary data, based on the ES6 binary data API, and controlled interaction with the GC. Luke has ideas about how to do this without losing the predictable performance for lower-level compilers like Emscripten.
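For anyone who hasn't seen what "zero GC and atomic primitive types" looks like in practice, here is a rough sketch of an asm.js-style module. The module name `SumModule` and the function `sum` are made up for illustration. Everything is an int or a read from one pre-allocated typed array, so nothing is allocated while `sum` runs.

```javascript
// Illustrative asm.js-style module (names are hypothetical).
// It still runs as ordinary JavaScript in engines without asm.js optimizations.
function SumModule(stdlib, foreign, heap) {
  "use asm";
  var HEAP32 = new stdlib.Int32Array(heap);

  // Sum the first n 32-bit ints in the heap; `| 0` marks values as int.
  function sum(n) {
    n = n | 0;
    var i = 0, total = 0;
    for (i = 0; (i | 0) < (n | 0); i = (i + 1) | 0) {
      total = (total + (HEAP32[i << 2 >> 2] | 0)) | 0;
    }
    return total | 0;
  }

  return { sum: sum };
}

// One fixed-size buffer is the module's entire memory.
const heap = new ArrayBuffer(0x10000);
new Int32Array(heap).set([1, 2, 3, 4]);

const mod = SumModule(globalThis, {}, heap);
console.log(mod.sum(4)); // 10
```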
We do have plans for startup time. I hope to pitch a very small, simple API for a kind of opaque compiled Function. Internally we've been calling it FunctionBlob (we'll bikeshed the name later). The idea is that `new FunctionBlob(src)` is almost identical to `new Function(src)` except the object is (a) an opaque wrapper that can be converted to a function via `fblob.toFunction()` and (b) entirely stateless and therefore compatible with transfer between Workers as well as offline storage. This would make it possible to do things like background compilation of asm.js on a Worker thread, and caching of the results of compilation in offline storage. That way, the next time you start up you don't have to download or optimize the source. (This could work especially well with the app experience, where you could perform these download/optimize steps as part of the installation process.)
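To make the shape of that API concrete, here is a toy stand-in. No browser ships `FunctionBlob`; this mock only wraps the source text, where a real one would presumably hold optimized code. The `_src` field and the eager parse in the constructor are my assumptions, not part of the proposal.

```javascript
// Hypothetical mock of the proposed FunctionBlob. It is NOT a real browser API.
class FunctionBlob {
  constructor(src) {
    // Assumption: parse eagerly so syntax errors surface here,
    // the same way they do with `new Function(src)`.
    new Function(src);
    this._src = String(src);
    Object.freeze(this); // stateless: nothing mutable hangs off the wrapper
  }

  // Convert the opaque wrapper into a callable function.
  toFunction() {
    return new Function(this._src);
  }
}

const fblob = new FunctionBlob("return 6 * 7;");
const f = fblob.toFunction();
console.log(f()); // 42
```

Because the wrapper carries no state beyond the code itself, handing it to a Worker or writing it to storage would not need to drag along any heap references.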
As for the use of JS, this is purely pragmatic. The code works today in browsers, so people can start using it right away -- and it is even quite fast; Emscripten is already shockingly performant in Firefox and Chrome -- but over time it'll see better and better optimization.
Presumably, if it were a blob, it could also be stored in local storage/IndexedDB/filesystem API? Is the internal format supposed to be architecture-neutral or architecture-dependent? I can see arguments for either, but if it were neutral, then the blob conversion could be done offline/statically on the server, and downloaded by the client dynamically (e.g. XHR to fetch function blobs). If it were architecture-dependent, then I could see advantages as well, letting the browser vendor choose the optimal form of the blob. This would potentially yield better performance, but you wouldn't be able to host blobs on the server.
I was only thinking that it would be internal. Your server point is good, but I think that's way, way harder, and it kind of puts the whole project back at the beginning: how to design a standard, efficient, optimized bytecode format. So I think it's probably not really feasible.
Not the same, but an additional optimization you can do is incremental compilation. Because you have JavaScript's eval, you can download the code a bit at a time and optimize (and cache via FunctionBlob) each piece.
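A rough sketch of the incremental idea, assuming the program has been split into independently compilable chunks. The `chunks` array and `onChunk` are hypothetical stand-ins for pieces arriving over the network. I use `new Function` rather than `eval` here, but the effect is the same: each piece is compiled as soon as it shows up.

```javascript
// Hypothetical chunks standing in for pieces of a larger program
// that arrive one at a time over the network.
const chunks = [
  "return function add(a, b) { return (a + b) | 0; };",
  "return function mul(a, b) { return Math.imul(a, b); };",
];

const compiled = {};

// Compile one chunk as soon as it arrives, without waiting for the rest.
function onChunk(src) {
  const fn = new Function(src)();
  compiled[fn.name] = fn;
}

for (const src of chunks) onChunk(src);

console.log(compiled.add(2, 3), compiled.mul(4, 5)); // 5 20
```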
Storing it and transferring it on the server is one thing; serializing it locally in the browser itself might be a more reasonable goal? That is, it wouldn't be expected to be portable to anything but that very same browser, but it would allow you to cache the compiled result. (I would expect the serialized string to be signed by the browser itself, to prove that it was created by the browser – and for the deserialization to fail on some browser upgrades).
Care would have to be taken to ensure that these blobs cannot be mutated via other means, and that a non-function blob cannot be converted to a function blob without passing through validation. Otherwise it might be an attack vector: someone could figure out the internal representation and manipulate storage to forge a blob that trips up some fast-path code which assumes its input is correct.
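One way to picture the signing idea is an HMAC over the serialized bytes, checked before anything is deserialized. This is only a sketch, assuming Node's `crypto` module as a stand-in for whatever the browser would do internally. The names `browserKey`, `seal`, and `unseal` are made up, and a real engine would likely validate much more than a tag.

```javascript
const crypto = require("crypto");

// Hypothetical: a key known only to the browser, never exposed to web code.
const browserKey = crypto.randomBytes(32);

function sign(payload) {
  return crypto.createHmac("sha256", browserKey).update(payload).digest();
}

// Serialize: keep the payload together with its tag.
function seal(payload) {
  return { payload, tag: sign(payload) };
}

// Deserialize: refuse anything whose tag does not match the payload.
function unseal(sealed) {
  const expected = sign(sealed.payload);
  if (sealed.tag.length !== expected.length ||
      !crypto.timingSafeEqual(sealed.tag, expected)) {
    throw new Error("blob failed validation");
  }
  return sealed.payload;
}

const sealed = seal("compiled-code-bytes");
console.log(unseal(sealed)); // compiled-code-bytes

// A forged entry reuses a valid tag with different contents.
const forged = { payload: "tampered-bytes", tag: sealed.tag };
try {
  unseal(forged);
} catch (e) {
  console.log(e.message); // blob failed validation
}
```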
Right, that's the idea of FunctionBlob. It wraps a browser-internal representation of the optimized compiled code. The web code can instruct the browser to store that offline, without exposing its implementation-specific details to the web code. The web code can then, in a later session, retrieve that optimized code from storage as another FunctionBlob, which it can then convert into a Function. This is no different from just storing the asm.js source code in offline storage, except it avoids redoing the work of compiling and optimizing the source. (It'd still have to be stored in position-independent format and there might be some back-patching necessary when reloading it.)
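The store-then-reload flow could look roughly like this. A `Map` stands in for offline storage and the "compiled" entry is just the source wrapped in an object, so `storage`, `compile`, and `loadOrCompile` are all hypothetical. The point is only that the expensive step runs once.

```javascript
// Hypothetical: a Map stands in for offline storage.
const storage = new Map();
let compileCount = 0;

// Stand-in for the expensive compile/optimize step.
function compile(src) {
  compileCount++;
  return { src }; // opaque stand-in for a FunctionBlob
}

// Reuse the stored entry when there is one; compile and store otherwise.
function loadOrCompile(key, src) {
  if (!storage.has(key)) storage.set(key, compile(src));
  const blob = storage.get(key);
  return new Function(blob.src); // stand-in for blob.toFunction()
}

const first = loadOrCompile("app-v1", "return 1 + 2;");  // compiles and stores
const second = loadOrCompile("app-v1", "return 1 + 2;"); // served from storage
console.log(first(), second(), compileCount); // 3 3 1
```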
I think GC interaction should be the highest priority. As a way to run C/C++ in the browser at native speed, asm.js is awesome, but as a way to run Python, Java, Go, etc in the browser at native speed, asm.js would be world-changing.
Very cool. My position for years has been that "JavaScript is the de facto bytecode of the web" is, on balance, a very good thing. However, it has been a bit of a hack.
What you're proposing is: "JavaScript is the bytecode of the web. How about we make it a much better bytecode?"