> This seems very much targeted at emscripten and not to cross-compilers that start with GC'ed languages like GWT, Dart, ClojureScript, et al.
That's correct, although one could implement a garbage collector on top of the typed array heap (we have working examples of this). There are some limitations to GCing the typed array manually, though, such as not being able to take advantage of the browser's ability to better schedule GCs.
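To make that concrete, here is a minimal, hypothetical sketch (not one of the working examples mentioned above) of the layer such a collector sits on: object storage managed by hand inside a single typed-array heap, with "pointers" represented as byte offsets.

    // Hypothetical illustration: a bump allocator over one typed-array heap.
    // A compiler-emitted GC would also record object headers here so it can
    // trace and reclaim dead objects itself; the JS engine only ever sees
    // one big live ArrayBuffer.
    var heap = new ArrayBuffer(1 << 20);   // 1 MiB linear heap
    var HEAP32 = new Int32Array(heap);
    var bumpPtr = 0;                       // next free byte offset

    function alloc(nWords) {
      var addr = bumpPtr;
      bumpPtr = bumpPtr + (nWords << 2);   // advance by nWords * 4 bytes
      if (bumpPtr > heap.byteLength) throw new Error("out of linear heap");
      return addr;                         // "pointer" = byte offset
    }

    // Field access is just index arithmetic on the heap view.
    var obj = alloc(2);                    // an object with two int fields
    HEAP32[(obj >> 2) + 0] = 42;           // obj.field0 = 42
    HEAP32[(obj >> 2) + 1] = 7;            // obj.field1 = 7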
Looking further in the future, though, it would be completely reasonable to extend asm.js to allow the super-optimizable use of the upcoming BinaryData API [1] in the style of JVM/CLR-style objects. Again, though, this is speculative; BinaryData isn't even standardized yet.
> It's also unclear to me how this solves the problem of startup time on mobile.
We have several strategies to improve this. The "use asm" directive, in addition to letting us produce useful diagnostic messages for developers when there is an asm.js type error, allows us to confidently attempt eager compilation, which can happen on another thread while, e.g., the browser is downloading art assets. Looking further into the future again, we could make some relatively simple extensions to the set of Transferable [2] objects that would allow efficient, programmer-controlled caching of asm.js code (including JIT-compiled code) in the IndexedDB object store.
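To be clear, the JIT-code caching is the speculative part; caching the asm.js source text is already possible today with plain IndexedDB. A rough, hypothetical sketch (the function and database names are made up):

    // Store the (large) module source in IndexedDB so repeat visits skip
    // the network. Caching the engine's *generated* code would need the
    // Transferable extension described above; this only caches the text.
    function getCachedModuleSource(url, callback) {
      var open = indexedDB.open("asmjs-cache", 1);
      open.onupgradeneeded = function () {
        open.result.createObjectStore("modules");
      };
      open.onsuccess = function () {
        var db = open.result;
        var get = db.transaction("modules").objectStore("modules").get(url);
        get.onsuccess = function () {
          if (get.result !== undefined) return callback(get.result);
          var xhr = new XMLHttpRequest();           // cache miss: fetch it
          xhr.open("GET", url);
          xhr.onload = function () {
            var src = xhr.responseText;
            db.transaction("modules", "readwrite")
              .objectStore("modules").put(src, url);
            callback(src);
          };
          xhr.send();
        };
      };
    }

The returned source can then be evaluated (e.g. via the Function constructor) and linked against the stdlib and heap as usual.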
> In all likelihood, the majority of asm.js outputs would be actually be non-human readable output of optimizing cross-compilers, so there isn't much benefit from having a readable syntax that humans could read, so what's the real justification for using JS as an intermediate representation over say, a syntax specifically designed for minimum network overhead and maximum startup speed?
Before minification, asm.js is fairly readable once you understand the basic patterns, assuming your compiler keeps symbolic names (Emscripten does). The primary benefit is that asm.js runs right now, rather efficiently, in all major browsers. It also requires zero standardization effort (no new semantics) and rather low implementation effort (the asm.js type system can be implemented with a simple recursive traversal over the parse tree that generates IR using the JS VM's existing JIT backend). This increases the chances that other engines will adopt the same optimization scheme. A solution for native performance is only a solution if it is portable, and we want to maximize that probability.
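For a sense of what "the basic patterns" look like, here is a small hand-written module (illustrative, not compiler output): parameter and return types are spelled as coercions, and heap accesses are shifted index expressions.

    function PatternDemo(stdlib, foreign, heap) {
      "use asm";
      var HEAP32 = new stdlib.Int32Array(heap);
      var imul = stdlib.Math.imul;

      function dot(p, q, n) {
        p = p | 0;                       // p: int (byte offset into heap)
        q = q | 0;                       // q: int
        n = n | 0;                       // n: int
        var i = 0, acc = 0;
        for (i = 0; (i | 0) < (n | 0); i = (i + 1) | 0) {
          acc = (acc + imul(HEAP32[(p + (i << 2)) >> 2] | 0,
                            HEAP32[(q + (i << 2)) >> 2] | 0)) | 0;
        }
        return acc | 0;                  // int return type
      }

      return { dot: dot };
    }

    // Linking: PatternDemo(window, null, new ArrayBuffer(1 << 16)).dot(p, q, n)

Once you read `x|0` as "x is an int", `+x` as "x is a double", and `HEAP32[ptr >> 2]` as a load, the output reads much like annotated C.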
> The usual response is minify + gzip, but it's not a panacea.
In addition to minify+gzip, one can also write a decompressor in asm.js that unpacks the larger program. Also, see [3] for how minified gzipped Emscripten code is comparable to gzipped object files.
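A hedged sketch of the decompressor approach (inflateToString stands in for a zlib inflate routine compiled to asm.js, e.g. by Emscripten; its name and calling convention are assumptions):

    // Download the deflated program, unpack it with an asm.js-compiled
    // decompressor, then evaluate the recovered source text.
    var xhr = new XMLHttpRequest();
    xhr.open("GET", "app.js.deflate");
    xhr.responseType = "arraybuffer";
    xhr.onload = function () {
      var compressed = new Uint8Array(xhr.response);
      var source = inflateToString(compressed);   // hypothetical asm.js inflate
      (new Function(source))();                   // or inject a <script> tag
    };
    xhr.send();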
[1] http://wiki.ecmascript.org/doku.php?id=harmony:binary_data
[2] http://www.whatwg.org/specs/web-apps/current-work/multipage/...
[3] http://mozakai.blogspot.com/2011/11/code-size-when-compiling...