Sitting about 10 feet away from Luke Wagner right now. He has told me that asm.js is Mozilla's response to NaCl. You compile code with Clang (or another compiler) into the asm.js subset of JavaScript, which they know how to optimize well, and their JIT compiler will offer you performance very close to that of native C (they claim a slowdown of at most 2x). They use special tricks, like (a+b)|0, to force results into the int32 range and avoid overflow checks. Heap views use multiple typed arrays pointing to the same data to give asm.js code a typed heap of aligned values, avoiding the JS garbage collector (you can manage this heap as you wish).
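Both tricks are observable in plain JS today; a minimal sketch (the variable names are mine):

```javascript
// 1. (a + b) | 0 forces the result into the int32 range, letting the
//    JIT use integer arithmetic without overflow checks:
var big = 0x7fffffff;          // INT32_MAX
var wrapped = (big + 1) | 0;   // wraps to -2147483648 instead of
                               // promoting to a double

// 2. Several typed-array views over one ArrayBuffer form a typed,
//    manually managed heap that the GC never scans or moves:
var heap = new ArrayBuffer(16);
var i32 = new Int32Array(heap);   // the same bytes viewed as int32s...
var f64 = new Float64Array(heap); // ...and as doubles
i32[0] = 42;                      // writes are visible through every view
```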
They already have sample apps, including some using libraries like Qt (I believe OpenGL as well), compiling to asm.js. I believe it has potential, so long as they have good support for standard and common C libraries (i.e.: porting to asm.js is almost effortless).
Because backwards compatibility is really important. x86, the Windows API, TCP, IPv4, and Unix (compared to Plan 9) are all examples of things that are full of cruft, but they persist because they have the right survival characteristics.
Besides, all it really comes down to is that the bytecode format has a weird encoding when sent over the wire. That's OK. You can view it as isomorphic to a format with a more natural encoding if you'd like; such a "disassembler" would be trivial to write.
> Because backwards compatibility is really important. x86, the Windows API, TCP, IPv4, and Unix (compared to Plan 9) are all examples of things that are full of cruft, but they persist because they have the right survival characteristics.
All of those things were also incredibly successful at what they did.
JavaScript and the DOM have not been incredibly successful at turning the browser into a first-class application development platform.
This is for many reasons; network performance, CPU performance, the difficulty of composing rich APIs, the lack of cleanly defined re-usable widgets and libraries (eg, the non-suitability of the DOM), the difficulty of interacting with the host.
Given that, why not start fresh for targeting applications? Leave JS in place, let your system target it as an output format for backwards compatibility, and -- finally -- clean up the massive cruft of the web as an app platform. Discarding decades of experience in producing effective consumer applications on the desktop (and now mobile) is foolish.
> JavaScript and the DOM have not been incredibly successful at turning the browser into a first-class application development platform.
Seriously? From where I stand, it looks like browser-based applications are destroying the desktop-based software market with alacrity, and are coming for mobile. And also that major vendors like Google and Microsoft are shipping desktop platforms where JS (and sometimes the DOM) is a primary way of developing applications.
The browser as application development platform is one of the two most important developments in software development in the last 20 years. That looks "incredibly successful" to me.
> To what? What scenario do you envision where Apple, Google, Microsoft and Mozilla are all willing to implement a single new language for the web?
Well hopefully we wouldn't make the same mistake again by going with a language and we would just implement a VM on which JS is implemented. It would be much easier to validate the correctness of a VM anyway, we could achieve better speed than was possible before, and we wouldn't tie people down to a (painful) language. On a personal note, I would actually start to view the web browser as a platform instead of a bunch of tangled strings, cans, and tape.
EDIT: I didn't answer your question. But, as a developer, I will only move to the web for my default programming language when I have bytecode or assembly to look at, not some convoluted scripting language. Hopefully, Asm.js is a minor, minor bump on the way towards the browser as a usable platform for arbitrary development.
> Well hopefully we wouldn't make the same mistake again by going with a language and we would just implement a VM on which JS is implemented.
A VM still needs some kind of well-defined input format, and that's a language. Whether it's human-readable text or binary doesn't make that big of a difference. Either way, designing language semantics is hard.
> Whether it's human-readable text or binary doesn't make that big of a difference.
To people like the grandparent, that is the only thing that matters at all. It's not a technical problem they have but an emotional one. They do not want the code they write to be a second-class citizen to the code someone else writes, and they will absolutely be ecstatic to see us throw away the last 15 years of progress and start from scratch in order to alleviate that problem they have.
Funny you should mention that, given that I just this week had to use NEON intrinsics to eke out better user-visible wall-time performance in a native app.
I guess it depends on the program, but for 99% of cases you should not have to. The very few times you would have to go for it, it's usually already part of some library you can use.
I'm not a fan of a future in which the only people that can do interesting things (including the use of SIMD intrinsics) are the platform vendors (eg, Mozilla), while the rest of us live in a JavaScript sandbox.
Maybe Mozilla should try writing their entire browser (VM included) in JavaScript/asm.js and let us know how that goes.
Large parts of the Firefox browser (as distinct from Gecko+SpiderMonkey) are written in JS. Have a look at the code some time. Or, just open chrome://browser/content/browser.xul in Firefox to get a taste.
>I'm not a fan of a future in which the only people that can do interesting things (including the use of SIMD intrinsics) are the platform vendors (eg, Mozilla), while the rest of us live in a JavaScript sandbox.
What you describe as bleak is a much better future than what we have now. At least with Mozilla's proposal we will have a well defined low-level optimizable javascript "assembly", whereas now we just have Javascript itself.
We never had access to the use of SIMD intrinsics in browsers in the first place, anyway.
>Yes, exactly. I want it all: native performance, security, open platform.
The problem is by trying to have "all" we might get less than what we have now.
NaCl, for example, is a horrible "standard" as far as specifications go.
And if companies are allowed to build whole native closed-source castles in the web browser, we might return to the era of ActiveX and Flash. Maybe not in the sense of less security (a common ActiveX issue), but surely in the sense of less interoperability, transparency and end-user control.
You would basically just be running native apps in the browser. Why not do it in the desktop or mobile and let the internet be the open, not opaque, platform that it mostly is?
> A VM still needs some kind of well-defined input format, and that's a language.
I don't really care about the input format, that's incidental. I'm protesting using javascript semantics to implement operations that don't at all require javascript semantics (and would be better off without them).
Don't forget Flash, which did actually work relatively well, and pretty much powered the development of the web as a video and gaming platform... but which we've eventually ditched for HTML/CSS/JavaScript + native web APIs anyway.
That is fair enough. Though that wasn't really Flash's original ambition. And to be even fairer, the fact that it wasn't their explicit ambition is possibly what saved it, while Java suffered a doomed fate from Microsoft's embrace-and-extend strategy.
Flash was fine as a format for vector images (like SVG). Then it was fine for animations. Then simple interactive graphics. Then it gets a scripting language. And as it gets more and more sophisticated and more used as an application platform (something Flash was not originally designed to do), it becomes more and more like Java in its shortcomings: binary blobs, long load times, serious security holes, slow as molasses.
But flash always had one good thing going for it: Anti-aliasing.
"Java applets were introduced in the first version of the Java language in 1995"
Remember why Javascript is called Javascript: so it wouldn't be seen as competing with Java as "the language of the web".
Java and its VM were designed for the web, to run in a browser. It was given every conceivable chance, and it utterly failed because, ultimately, it's fundamentally a terrible idea. But somehow fellows like you are too blinkered to see history for what it is, and you are doomed to repeat it.
I remember when most developers would be surprised if you told them you could make programs in Java that are not applets. And Java had a reputation for being "that really shitty slow painful web language".
September 18, 1995: Netscape 2.0 released with Java and Javascript support. This has the DOM level 0-. Java is marketed as the language for "Big boy apps" and javascript is merely the scripting "glue" that lets java access DOM level 0, which is limited really only to reading form data and changing the .src attribute on some images. [1]
1998: "The initial DOM standard, known as "DOM Level 1," was recommended by W3C in late 1998. About the same time, Internet Explorer 5.0 shipped with limited support for DOM Level 1. DOM Level 1 provided a complete model for an entire HTML or XML document, including means to change any portion of the document. Non-conformant browsers such as Internet Explorer 4.x and Netscape 4.x were still widely used as late as 2000." [2]
August 2001: Internet Explorer 6 is released, still with only "partial support of DOM level 1" [3]
In the same month (August 2001), Netscape 6.1 is released [1]
Netscape 6 was the first Netscape browser based on the "Mozilla Application Suite", which is what Firefox is based on today. It's hard to find detailed information on this, but I would also guess that Netscape 6 had "partial support" for DOM Level 1.
February 6, 2002 (6 months later): Java's J2SE 1.4 runtime is released, and has, again, Partial DOM level 1 support, along with the ability to directly manipulate the page that an applet is on without using Javascript as "glue"
Which of course is the whole point of the DOM, as a "language agnostic" API that needed to be used not just from Javascript, but from Java, C++, VBScript and whatever other language.
It was, to be fair, shitty, but at this time (2002) people are still using IE6 and Netscape 4. Browser support for the DOM matured around 2006, and so did Java and its applets, keeping pace right along with the browsers. People generally don't really understand that Java and Javascript are two separate languages. It's all just "Java, that really shitty slow web language".
I'm not sure what you're trying to prove; javascript runtimes are VMs too.
Java was designed poorly, and it performed poorly. It just so happens that its design was well-suited to long-running servers, however, so that's where it's used.
It is not a VM that accepts binary bytecode as input, which is what the person I am replying to wanted. Context matters. And you could have read that yourself.
Edit to respond to ninja edit:
One could argue quite successfully that one of the chief reasons Java (applets) in a browser is a bad design is because of its "standardised" bytecode format, which is what everyone in this discussion thread is screaming for. My point is: look, we've already done this. Twice, in fact, because Flash works the same way, and does it much better than Java ever did. And yet, it's still a failed concept in both cases. Flash was able to get by better by virtue of having a monopoly instead of a standard, and thus had the freedom to change its swf format and bytecode format.
> It is not a VM that accepts binary bytecode as input, which is what the person I am replying to wanted. Context matters. And you could have read that yourself.
Er, so?
> One could argue quite successfully that one of the chief reasons Java (applets) in a browser is a bad design is because of its "standardised" bytecode format, which is what everyone in this discussion thread is screaming for.
Then please, reasonably argue it. I don't understand how the argument applies.
Java applets perform poorly in the browser for a number of reasons, none of which have anything to do with bytecode:
- Java's generational GC is designed around reserving a very large chunk of RAM, and performs poorly if insufficient RAM is reserved. This is a terrible idea for desktop software.
- Java's sandboxing model is broken and insecure, as it exposes an enormous amount of code as an attack surface. A bug in just about any piece of code in the trusted base libraries can result in a total sandbox compromise.
- Java is slow to start and slow to warm up, and applets more so. It historically ran single-threaded in the browser and blocked execution as it did start up.
- Swing doesn't look native, and doesn't look like the web page, either. Applets can't actually interface with the remainder of the DOM in any integrated fashion (eg, you can't have a Java applet provide a DOM element or directly interface with JS/DOM except through bridging), so applets are odd-men-out for both the platform and the website they're on.
> Flash was able to get by better by virtue of having a monopoly instead of a standard, and thus had the freedom to change its swf format and bytecode format.
That doesn't even make sense. Flash was better because it didn't lock up your browser when an applet started, and didn't consume huge amounts of RAM due to a GC architecture that was poorly suited to running on user desktops.
Flash sucked because of its extremely poor implementation and runtime library design.
Actually I missed this before. Javascript runtimes are /not/ VMs -- I don't think I've ever seen a javascript engine use a virtual machine. Ever. Do you have evidence of this? (Unless you mean in the sense that asm.js is treating the javascript runtime as though it were a VM.)
Basically it comes down to this: It's easier and way more efficient to secure an untrusted program using a language grammar rather than a "bytecode verifier"
and a few other things.
I saw your post before you deleted it. I didn't get a chance to respond before. I just wanted to say that you probably know a lot more about VMs than I do, and I'll concede that. I don't really know for sure whether switching to a bytecode VM would be great or not for the browser. I know that making /any/ change to "the web" is a huge uphill battle, and so the ecmascript committee has to make a lot of compromises for pragmatism. In any case, worse is better, and in the real world we can't have the perfect computer system. Haven't you seen Tron: Legacy?
The beauty of this approach is that browser vendors don't have to execute these quirky expressions optimally. They are standard JS, so they will already work.
Yeah, but that's not hard. And there's a spec that makes it extremely clear which tricks are necessary -- no need to reverse-engineer someone else's implementation, and no risk of tricks that work today breaking tomorrow.
No one is saying (a+b)|0 is pretty. If you care about syntax, you would probably use something like sweet.js to make a macro for it or something like that.
asm.js is not concerned with pretty syntax. It just takes the existing type of code emscripten and mandreel have generated for a long time now, and writes a formal spec for it. That's useful to get code to run faster. That's it.
It's not moving away from, it's addressing shortcomings.
As you should know, JS has only one type of numeric representation in userland, Number, which serves both floats and ints. Not surprisingly, operations on ints are much more efficient than on floats, and if an optimizer can introspect your code and determine that some value will only ever be an int, internally it will represent it as an int. Otherwise, it's forced to use a float representation.
Thus, coercing all values to values that can be internally represented as ints can yield a significant performance boost.
And importantly, still work on browsers that don't specifically support asm.js. They'd be slower, because they wouldn't be taking advantage of the available optimization, but the code would still run.
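A minimal sketch of these coercions (the function itself is my own illustration, using the forms the spec gives: `x|0` for int32, `+x` for double):

```javascript
// In asm.js these annotations declare types to the ahead-of-time
// compiler; in any other engine they are ordinary operators that
// leave well-typed values unchanged, so the code runs either way.
function scale(i, d) {
  i = i | 0;              // parameter typed as int
  d = +d;                 // parameter typed as double
  return +(d * +(i | 0)); // result typed as double
}
var r = scale(3, 0.5);    // 1.5
```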
It really doesn't, though. For the first 12 or so years of its existence it was basically unusable. It wasn't until jQuery hit the scene that it started to meet even the bare minimum level of usability. Even then, it still took a huge amount of effort and investment from Google, Apple and Mozilla to make it what it is today, which admittedly isn't very good at all.
Even now, it generally only looks like a "good" option to those who don't have experience with other, better programming languages (PHP excluded; it's nearly as bad as JavaScript is). Developers who do have more experience with other languages usually use JavaScript very reluctantly, and this is especially true when it comes to the best developers.
The fact that CoffeeScript, TypeScript and Dart get so much attention just further shows how bad JavaScript is. Developers don't want to use it, and so they try to hide it to a large extent (CoffeeScript), or modify it to be more like languages like C++, Java and C# (TypeScript and Dart).
JavaScript should have been replaced many years ago, if it ever should have even been included in Navigator so many years ago.
> For the first 12 or so years of its existence it was basically unusable. It wasn't until jQuery hit the scene that it started to meet even the bare minimum level of usability.
jQuery didn't address problems with the JavaScript language -- to the contrary, it's a pretty good example of the power JS has and what can be accomplished with it. The problem jQuery addressed was the DOM API (and some other points of interface with the browser), which was probably the most significant pain point, and which probably wouldn't have been much different had another language been chosen.
> Developers who do have more experience with other languages usually use JavaScript very reluctantly
Yes, by and large, one of the biggest problems with JavaScript might be that people seem to approach it as if they don't or shouldn't have to learn it instead of whatever else they're already familiar with.
I fit that description from 1999-2003 or so -- along with the description of someone who had/has experience with other "better" programming languages -- but after a few key pointers about scope and functions from someone who had taken the time to learn to use it, I don't share your apparent view at all. My experience has been that it's serviceable and more or less on par with Perl, Python, and Ruby... to the point where my suspicion is that most developers for whom the language itself is a significant obstacle wouldn't do appreciably better were JS magically replaced with something like any of those three.
> Developers don't want to use it, and so they try to hide it to a large extent (CoffeeScript)
CoffeeScript is JavaScript semantics with different tokens and a handful of shortcuts/sugar. More power to people who enjoy using it, but to the extent anybody's arguing that it's good enough, they're also arguing that JS really wasn't that bad all along.
It's plugins done right with the gigantic caveats that it duplicates a large swath of low-level DOM APIs, requiring a large amount of effort to write two implementations of essentially the same things, and has no direct access to the DOM or external JavaScript, meaning it lacks support for some APIs (WebRTC) and cannot really be used as a drop-in replacement for JS or to take advantage of the existing HTML renderer (it's all or nothing).
That is, it's plugins done right if you insist on living in a separate process, despite the adequacy of both NaCl and JS engines' ability to keep code within the inner sandbox/VM, and what should be the adequacy of the sandbox of renderer processes even if someone manages to run native code in it.
Me, I'd prefer to just run emscripten on the Python interpreter, take advantage of a very small VM that nevertheless is apparently able to run code at half native speed, and start using a new language on top of all the nice existing stuff.
> has no direct access to the DOM or external JavaScript, meaning it lacks support for some APIs (WebRTC) and cannot really be used as a drop-in replacement for JS or to take advantage of the existing HTML renderer (it's all or nothing).
Is there some fundamental reason why these won't all get solved in the future?
The fundamental reason is that Pepper plugins live in a separate process, so latency is high - a security measure, but one that I'm arguing is unnecessary and forces highly unfortunate design choices.
They really want the performance of asm.js code to be predictable, and as a compiler writer, I believe this is entirely possible. The subset of JS they use is made to have good performance. The people here don't seem to like some of the implementation details of NaCl. I think they also might just not want to let Google impose something. Also, the inventor of JavaScript is the Mozilla CTO (shhh!).
I'm partial to PNaCl too. I think it's probably better, in the long term, to have a standard bytecode format to compile to... But, asm.js might win this war, just because the road to asm.js support is much shorter. You basically already can compile C programs to asm.js and have them run cross-browser. Worse is better?
There are good and bad things about PNaCl. But overall, the resistance to PNaCl goes beyond Mozilla -- it's a really hard standardization/synchronization problem. Our goal with asm.js was to produce something that could work today (because it's 100% compatible with existing JS semantics) and work even better tomorrow as engines optimize it even better.
Edit: Re: predictable performance, that's exactly what asm.js is about. The performance model is much closer to that of C or PNaCl/LLVM.
1) It's nonexistent so far, as far as I can tell, after years of research+work (unlike asm.js, which took one engineer a few months to implement a compiler for in Mozilla). Or can you actually use PNaCl in Chrome now? Or is there a concrete timeframe for when it'll be ready?
2) PNaCl involves pulling in all of Pepper, which is undocumented and very much tied to Chrome's architecture last I checked. Other browsers don't want to get on the "chase Chrome's Pepper implementation" treadmill. asm.js is just running in the JS VM, so talks to the DOM (and GL and networking, etc) in the normal way, which browsers already implement.
This is lower-level. It is JavaScript, but really it is a subset that implements low-level operations. In fact it is lower-level than PNaCl; this is past the LLVM IR level.
I bet if you rewrote the spec to use a different syntax, and didn't mention that it (syntax aside) is a subset of JS, people would be falling over themselves to say "finally, proper bytecode for the web!"
So will there be code to turn LLVM IR into asm.js stuff? That would be pretty awesome (and a nice way to make Emscripten more accessible to other languages?).
> So will there be code to turn LLVM IR into asm.js stuff?
There already is, that is precisely what emscripten does when run with the ASM_JS=1 flag set. Emscripten's input is LLVM IR, not C or C++. (Although, it has been mostly tested on LLVM IR generated from clang that was given C or C++.)
The author's specific concern there is about control over memory layout. In that regard asm.js should stack up well, because its memory model is a large linear array of bytes. It's hard to imagine more fine-grained control than that, except if perhaps mmap() and munmap() were supported.
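A sketch of what that control looks like in practice; the bump allocator is my own toy stand-in for the malloc/free implementation that compiled C code would actually bring along:

```javascript
// The entire "address space" is one ArrayBuffer; pointers are just
// byte offsets into it, viewed through typed arrays.
var HEAP = new ArrayBuffer(0x10000);
var HEAPU8 = new Uint8Array(HEAP);
var HEAP32 = new Int32Array(HEAP);

var heapTop = 8; // skip offset 0 so a pointer of 0 can mean NULL

function bumpAlloc(nBytes) {
  var ptr = (heapTop + 7) & ~7; // keep allocations 8-byte aligned
  heapTop = ptr + nBytes;
  if (heapTop > HEAPU8.length) throw new Error("out of linear memory");
  return ptr;
}

var p = bumpAlloc(8);
HEAP32[p >> 2] = 42; // *(int32_t *)p = 42, in C terms
```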
This seems very much targeted at emscripten and not at cross-compilers that start with GC'ed languages like GWT, Dart, ClojureScript, et al. If you are cross-compiling Java or C# to asm.js, you don't really want to manage memory manually. I work on GWT, and asm.js as an output target is very interesting to me (I've worked on a number of high-performance Web games using it, GwtQuake, AngryBirds, etc), but the starting assumption is GC, so I want to leverage non-boxed numerics, and all of the other nice stuff, but don't want to stuff everything into a TypedArray.
It's also unclear to me how this solves the problem of startup time on mobile. A giant glob of Javascript takes a non-trivial amount of time to load even on today's fastest smartphones. The spec seems to argue that Browser VMs could recognize asm.js and if I read between the lines, employ snapshotting the app and caching it for later quick startup?
In all likelihood, the majority of asm.js outputs would actually be non-human-readable output of optimizing cross-compilers, so there isn't much benefit to a human-readable syntax, so what's the real justification for using JS as an intermediate representation over, say, a syntax specifically designed for minimum network overhead and maximum startup speed? Seems like it might be worthwhile for Mozilla to also work on efficient representations of asm.js that minimize this overhead. The usual response is minify + gzip, but it's not a panacea.
First of all, we do care very much about supporting compilers for managed languages like Java and C#, but we're starting with this first version that supports no GC and only atomic primitive types. We have plans to grow outwards from there to support structured binary data, based on the ES6 binary data API, and controlled interaction with the GC. Luke has ideas about how to do this without losing the predictable performance for lower-level compilers like Emscripten.
We do have plans for startup time. I hope to pitch a very small, simple API for a kind of opaque compiled Function. Internally we've been calling it FunctionBlob (we'll bikeshed the name later). The idea is that `new FunctionBlob(src)` is almost identical to `new Function(src)` except the object is (a) an opaque wrapper that can be converted to a function via `fblob.toFunction()` and (b) entirely stateless and therefore compatible for transfer between Workers as well as offline storage. This would essentially make it possible to do things like background compilation of asm.js on a Worker thread, and caching of the results of compilation in offline storage. That way next time you startup you don't have to download or optimize the source. (This could work especially well with the app experience where you could perform these download/optimize steps as part of the installation process.)
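To make the proposed shape concrete, here is a toy stand-in; FunctionBlob does not exist anywhere yet, and a real implementation would wrap the engine's compiled, optimized code rather than the source text:

```javascript
// Toy sketch of the proposed FunctionBlob API, mimicking only the
// described shape: the constructor takes source, toFunction() yields
// a callable, and the object is stateless (so it could be transferred
// to Workers or cached in offline storage).
function FunctionBlob(src) {
  this._src = src; // stateless: just data
}
FunctionBlob.prototype.toFunction = function () {
  return new Function("return (" + this._src + ");")();
};

// Intended usage: compile once (possibly on a Worker), cache, reuse.
var blob = new FunctionBlob("function add(a, b) { return (a + b) | 0; }");
var add = blob.toFunction();
```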
As for the use of JS, this is purely pragmatic. The code works today in browsers, so people can start using it and it works -- and even quite fast; Emscripten is already shockingly performant in Firefox and Chrome -- but over time it'll see better and better optimization.
Presumably, if it was a blob, it could also be stored in local storage/IndexedDB/filesystem API? Is the internal format supposed to be architecture-neutral, or architecture-dependent? I can see arguments for either, but if it were neutral, then the blob conversion could be done offline/statically on the server and downloaded by the client dynamically (e.g. XHR to fetch function blobs). If it were architecture-dependent, then I could see advantages as well, letting the browser vendor choose the optimal form of the blob. This would potentially yield better performance, but you wouldn't be able to host blobs on the server.
I was only thinking that it would be internal. Your server point is good, but I think way, way harder, and kind of starts the whole project back at the beginning: how to design a standard, efficient, optimized bytecode format. So I think it's probably not really feasible.
Not the same, but an additional optimization you can do is incremental compilation. Because you have JavaScript's eval, you can download the code a bit at a time and optimize (and cache via FunctionBlob) each piece.
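A toy sketch of that incremental approach, assuming chunk boundaries fall on whole function definitions:

```javascript
// Compile each chunk as it "arrives" instead of waiting for the whole
// download; a real loader would also cache each compiled piece rather
// than just keeping the resulting function in memory.
var chunks = [
  "function add(a, b) { return (a + b) | 0; }",
  "function mul(a, b) { return (a * b) | 0; }"
];
var compiled = {};
chunks.forEach(function (src) {
  // Evaluate each chunk as a function expression and keep the result,
  // keyed by the function's name.
  var fn = (0, eval)("(" + src + ")");
  compiled[fn.name] = fn;
});
```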
Storing it and transferring it on the server is one thing; serializing it locally in the browser itself might be a more reasonable goal? That is, it wouldn't be expected to be portable to anything but that very same browser, but it would allow you to cache the compiled result. (I would expect the serialized string to be signed by the browser itself, to prove that it was created by the browser – and for the deserialization to fail on some browser upgrades).
Care would have to be taken to ensure that these blobs cannot be mutated via other means, and that a non-function blob cannot be converted to a function blob without passing through validation; otherwise it might be an attack vector: someone could figure out the internal representation and manipulate storage to forge a blob that trips up some fast-path code which assumes its input is correct.
Right, that's the idea of FunctionBlob. It wraps a browser-internal representation of the optimized compiled code. The web code can instruct the browser to store that offline, without exposing its implementation-specific details to the web code. The web code can then, in a later session, retrieve that optimized code from storage as another FunctionBlob, which it can then convert into a Function. This is no different from just storing the asm.js source code in offline storage, except it avoids redoing the work of compiling and optimizing the source. (It'd still have to be stored in position-independent format and there might be some back-patching necessary when reloading it.)
I think GC interaction should be the highest priority. As a way to run C/C++ in the browser at native speed, asm.js is awesome, but as a way to run Python, Java, Go, etc in the browser at native speed, asm.js would be world-changing.
Very cool. My position for years has been that "Javascript is the de facto bytecode of the web" is, on balance, a very good thing. However, it has been a bit of a hack.
What you're proposing is: "Javascript is the bytecode of the web. How about we make it a much better bytecode?"
"I work on GWT, and asm.js as an output target is very interesting to me (I've worked on a number of high-performance Web games using it, GwtQuake, AngryBirds, etc), but the starting assumption is GC, so I want to leverage non-boxed numerics, and all of the other nice stuff, but don't want to stuff everything into a TypedArray."
I'll let dherman elaborate in more detail, but he is working on it. :)
"The spec seems to argue that Browser VMs could recognize asm.js and if I read between the lines, employ snapshotting the app and caching it for later quick startup?"
This is really a problem with any portable code delivery format that wants to run compiled code. You either have to stick to native code, in which case you aren't portable, or you have to compile on the client, which adds startup time. Caching of compiled artifacts is going to be necessary in any portable solution, and I don't see any reason off the top of my head why this would be particularly different for asm.js.
"Seems like it might be worthwhile for Mozilla to also work on efficient representations of asm.js that minimize this overhead. The usual response is minify + gzip, but it's not a panacea."
I agree, and I've talked to Alon about this. I think that it would be an interesting project to develop a format that compresses asm.js down as small as possible. Then you can simply uncompress and eval on the client side to ensure backwards compatibility.
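As a toy illustration of the decompress-then-eval idea (the dictionary here is invented; a real format would be far more sophisticated):

```javascript
// Replace the most common asm.js tokens with single-byte codes, and
// expand them back on the client before eval.
var dict = { "\x01": "function ", "\x02": "return ", "\x03": "|0" };
function expand(s) {
  return s.replace(/[\x01-\x03]/g, function (c) { return dict[c]; });
}
var packed = "\x01add(a,b){a=a\x03;b=b\x03;\x02(a+b)\x03;}";
var src = expand(packed);
// eval(src) now defines `add` exactly as the uncompressed source would.
```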
Java went through a similar issue with their pack200 format. I am not privy to where all the time is spent in mobile JS startup (parsing, etc). It seems like using some kind of special JS precompressor would both benefit entropy coding and allow a faster parser, but I've seen it turn out to be a wash in the past. :( However, even if you could only reduce network bandwidth, it would still be a win for people's data plans and batteries. :)
Yes, this has downsides, like having to compile your own GC. For GC-heavy code it might be slow. But for raw computation you would probably get faster results than with source-to-source approaches like GWT's - you get LLVM optimizations, and you get asm.js, which is easier for JS engines to run quickly.
> It's also unclear to me how this solves the problem of startup time on mobile.
Yeah, that is a problem. It's a problem on normal JS too, and also for things like PNaCl. Only shipping final native binaries can fully avoid that (but that is nonportable).
> The spec seems to argue that Browser VMs could recognize asm.js and if I read between the lines, employ snapshotting the app and caching it for later quick startup?
Yes, that is one possibility. It could work for normal JS too, I'm not sure why it hasn't been done yet. Worth investigating.
> In all likelihood, the majority of asm.js outputs would actually be non-human-readable output of optimizing cross-compilers,
Yes.
> so there isn't much benefit from having a readable syntax that humans could read, so what's the real justification for using JS as an intermediate representation over, say, a syntax specifically designed for minimum network overhead and maximum startup speed?
The justification is that asm.js code will work, right now, in all browsers. (And it can fairly easily be optimized by them to run much faster than compiled code in JS.) Any new format would require standardization and take a long time. asm.js is just JS.
For network transmission, we should implement a special minifier for it (written in JS of course).
This needs to be shouted from the rooftops. This is the single most important constraint of the project and also its greatest strength.
So asm.js isn't the magical perfect browser bytecode that everyone on HN wants -- and which would have all manner of flaws if it actually existed in concrete form, rather than a platonic ideal in people's heads -- but that's a completely unrealistic goal anyway.
But asm.js is usable in all browsers immediately. asm.js is just JS.
> But asm.js is usable in all browsers immediately. asm.js is just JS.
A magic JS-based bytecode that's usable in all browsers immediately isn't useful if it isn't fast. Which it isn't, because it's a JS-based bytecode executing under existing JS engines.
So now we have apps that perform incredibly poorly when run on a browser without "asm.js" support, and a rather ridiculous bytecode format that will have to be parsed natively to run reasonably quickly, with a fair bit more complexity for every layer in the development and runtime stack because they insist on keeping it as valid JS syntax.
Our numbers show asm.js can be 2x slower than native, or better. That's not "not fast". And even without asm.js optimizations, the same code is 4x slower than native, which is as good as or better than handwritten JS anyhow - which is not "incredibly poorly".
If you have other numbers or results, please share.
For a desktop and/or mobile app, where the consumer is waiting and you are burning battery (laptop/phone) or just plain CPU cycles, 2x-4x slower is 'not fast'. You're simply wasting the end-user's time and resources for what amounts to ideological reasoning.
We're always making a trade-off between performance and ease of programming, but when your competition is coming in at 2x faster than your optimal case, and 4x in the standard case, you're going to lose for all but the simplest apps.
How does PNaCl compare in terms of performance to "native" code? It still has the compilation overhead, it still has a lot of the bounds checking… It's not clear to me that PNaCl will actually be much quicker than asm.js.
I believe the ideal (for users) would be to target NaCl natively, with a fallback to server-side PNaCl compilation, and an absolute fallback to client-side PNaCl compilation/execution.
>Python, for one, is plenty usable, and is not fast.
In any environment where I might currently choose to use Python I also have the option to use something else for parts of the project where Python proves to be too slow. Will FirefoxOS provide such an escape hatch?
>JS engines are close to Java/C in speed
'close' is a pretty vague term. For a lot of tasks Ruby is 'close' enough to C that the difference doesn't matter. For a different set of tasks Java is not 'close' enough to C (or C+asm) to be a viable choice and neither is Javascript.
I'd also like to point out that battery life does matter, and using at least twice the CPU cycles for most tasks isn't conducive to good battery life.
>In any environment where I might currently choose to use Python I also have the option to use something else for parts of the project where Python proves to be too slow. Will FirefoxOS provide such an escape hatch?
Seeing that Python is 10-20 times slower than V8 for most native Python/JS operations, you shouldn't have that problem much. Especially considering that the purpose of asm.js is to give you an even greater boost in speed. And seeing that NaCl never got anywhere, not only is this your best bet, but it's far better than anything else out there at the moment.
asm.js IS a bytecode format. That it is human readable or that it accepts some tradeoffs because of JS doesn't matter. The end result (after the JIT pass) would not be any slower for it. The only problem with a readable "asm" would be slower load times, but that can be taken care of in the future by providing some pre-compiled format or more control over caching if asm succeeds.
Please be careful with the "close to C in speed" claim.
There are only a very small number of languages that can legitimately claim that (C++, Fortran, and sometimes Ada). Java is not one of them. JavaScript is surely not one of them, even with the latest versions of V8.
The only time we see performance remotely close (which still usually means several times slower, at best) to C is for extremely unrealistic micro-benchmarks that have been very heavily optimized to a state where they don't at all resemble real-world code.
We have had word processors and spreadsheets in JavaScript (Google Docs), 3D and 2D games, and even an H.264 decoder and a PDF renderer. Heck, they have ported Qt to JavaScript, and the example applications run at a very acceptable speed. None of the above are slow.
So, no, it's not true that V8 is only fast in selected "microbenchmarks".
You might not do scientific applications or NLE video editing with it, but for everything else it should be just fine.
>'Very acceptable' speed isn't what consumers are looking for when comparing battery life and wall-clock performance between competing platforms.
Where does the idea come from that the very acceptable speeds of V8 and co. come at the expense of battery life and wall-clock performance?
Not to mention that people are using far less capable web apps in the mobile and desktop space now (i.e., pre-asm.js JavaScript), so the increase in speed from asm.js/optimisation standardisation would only make battery life and wall-clock performance better.
> This seems very much targeted at emscripten and not to cross-compilers that start with GC'ed languages like GWT, Dart, ClojureScript, et al.
That's correct, although one could implement a garbage collector on top of the typedarray heap (we have working examples of this). There are some limitations to GCing the typed array manually, though, such as not taking advantage of the browser's ability to better schedule GCs.
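For illustration, a garbage collector (or plain malloc) built on top of the typed-array heap starts from something like this toy bump allocator (a sketch of the idea, not the working examples mentioned above):

```javascript
// Toy bump allocator over a typed-array "heap": an illustration of
// managing memory manually in JS, not Mozilla's actual working examples.
var HEAP = new Uint8Array(1 << 16);  // 64 KB "address space"
var heapTop = 0;

function malloc(nBytes) {
  // Round up to 8-byte alignment, like a typical C allocator.
  var aligned = (nBytes + 7) & ~7;
  if (heapTop + aligned > HEAP.length) throw new Error("out of memory");
  var ptr = heapTop;               // "pointers" are just integer offsets
  heapTop += aligned;
  return ptr;
}

var p = malloc(12);  // 0
var q = malloc(4);   // 16 (12 rounded up to 16)
```

A real collector would add a free list and a marking pass over these integer offsets, all of it invisible to the browser, which is exactly why the engine can't help schedule such a GC.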
Looking further in the future, though, it would be completely reasonable to extend asm.js to allow the super-optimizable use of the upcoming BinaryData API [1] in the style of JVM/CLR-style objects. Again, though, this is speculative; BinaryData isn't even standardized yet.
> It's also unclear to me how this solves the problem of startup time on mobile.
We have several strategies to improve this. The "use asm" directive, in addition to allowing us to produce useful diagnostic messages to devs when there is an asm.js type error, allows us to confidently attempt eager compilation which can happen on another thread while, e.g., the browser is downloading art assets. Looking farther in the future again, we could make some relatively simple extensions to the set of Transferable [2] objects that would allow efficient programmer-controlled caching of asm.js code including jit code using the IndexedDB object store.
> In all likelihood, the majority of asm.js outputs would actually be non-human-readable output of optimizing cross-compilers, so there isn't much benefit from having a readable syntax that humans could read, so what's the real justification for using JS as an intermediate representation over, say, a syntax specifically designed for minimum network overhead and maximum startup speed?
Before minification, asm.js is fairly readable once you understand the basic patterns, assuming your compiler keeps symbolic names (Emscripten does). The primary benefit is that asm.js runs right now, rather efficiently, in all major browsers. It also poses zero standardization effort (no new semantics) and rather low implementation effort (the asm.js type system can be implemented with a simple recursive traversal of the parse tree that generates IR using the JS VM's existing JIT backend). This increases the chances that other engines will adopt the same optimization scheme. A solution for native performance is only a solution if it is portable, and we want to maximize that probability.
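As a rough illustration of that recursive traversal (a toy sketch, not OdinMonkey's actual implementation), a checker can assign each expression node an asm.js-style type from its coercion annotations:

```javascript
// Toy sketch of an asm.js-style type check: walk an expression tree and
// assign each node "int" or "double" from its coercion annotations.
// Purely illustrative -- a real validator works on the engine's own parse
// tree and implements the full asm.js type system.
function typeOf(node) {
  switch (node.op) {
    case "intlit":    return "int";
    case "doublelit": return "double";
    case "|0":                       // `e|0` coerces e to int32
      typeOf(node.arg);
      return "int";
    case "+u":                       // unary `+e` coerces e to double
      typeOf(node.arg);
      return "double";
    case "add": {
      var l = typeOf(node.left), r = typeOf(node.right);
      if (l !== r) throw new Error("asm.js type error: " + l + " + " + r);
      return l;
    }
    default:
      throw new Error("unknown op: " + node.op);
  }
}

// Corresponds to source like `(a + b)|0` with a and b known ints:
var expr = { op: "|0", arg: { op: "add",
  left:  { op: "intlit" },
  right: { op: "intlit" } } };
console.log(typeOf(expr)); // "int"
```

On a type error the checker can fall back to running the code as plain JS, which is what makes the scheme a pure optimization rather than a new language.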
> The usual response is minify + gzip, but it's not a panacea.
In addition to minify+gzip, one can also write a decompressor in asm.js that unpacks the larger program. Also, see [3] for how minified gzipped Emscripten code is comparable to gzipped object files.
https://bugzilla.mozilla.org/show_bug.cgi?id=840282 has some measurements on how fast the implementation of asm.js (called OdinMonkey) already is.
"sm" is SpiderMonkey, that is the normal JS engine. v8 is Chrome's JavaScript engine.
Current results on large Emscripten codebases are as follows, reported as factor slowdown compared to gcc -O2 (so 1.0 would mean "same speed")
I kinda hope that someone at Mozilla keeps a count of just how many monkeys have gone into the code base. Off the top of my head, there's SpiderMonkey, TraceMonkey, JägerMonkey, IonMonkey, ScreamingMonkey, IronMonkey, and perhaps you should count Tamarin. It would be neat to see mascot-like versions of them all... :)
IronMonkey was apparently for "mapping IronPython and IronRuby (and maybe IronPHP) to Tamarin". I just vaguely recalled seeing the name pop up in a Mozilla related context -- I certainly couldn't have told you what it was without looking it up. :)
The cool thing is that those of us who have small performance-critical javascript routines e.g. game engines have a whole new cheatsheet of 'optimal' javascript. I can't wait for box2d, octrees and matrix libraries to adopt it; a whole new generation of hand-optimised assembler!
Reminds me a bit of efforts like C--, which aimed to provide a simpler pseudo-assembly as an intermediate language for compilers. But those efforts never gained much traction, whereas I think some transpilers actually exploited a few features beyond even normal, full-fledged C -- namely GCC extensions -- to make some features easier and/or faster (it's been a while, but it was probably some trampolining optimization).
Let's see how it turns out.
On a completely different note, I've always wondered what it would be like to program a webapp in a rather different language than JS -- most transpiled languages aren't that fundamentally different from JS. And Emscripten seems mostly used to port code that "runs in a box". I wonder how far one could get doing some stereotypically Web 2.0 things in, e.g., Pascal.
Elm and Roy are quite different in that they abandon normal JS semantics. As for going to a more blub language, please do report your experimental results!
If C-- had been a subset of C, then it would be more comparable. Having immediate compatibility with existing implementations makes adoption vastly easier.
I've become increasingly convinced that a standardized VM in the browser that other languages could target would be the best idea. And we could forget that the whole JS thing ever happened.
Seriously, asm.js already comes pretty close to a standardized VM at a low level. But we intend to grow it to include integration with structured binary data and GC, to the point where it'll provide very similar functionality to VM's like the JVM or CLR.
The big real-world benefits of asm.js over a boil-the-ocean VM are that (a) the content is already 100% compatible and quite highly optimized in existing browsers and engines, and (b) it's far easier for engines to implement on top of their existing technology.
The biggest downside is that the syntax is weird. But that's just not a big deal for compilers. They can generate whatever weird syntax they want. You could even implement your preferred bytecode format as an IR and compile that to asm.js if you wanted, but I'm not sure compiler-writers would even care that much. They'd do just as well to have IR data structures in the internals of their compiler, without needing a special surface syntax for it.
There was a company in the '90s that was working on a bytecode representation of programs designed to be “optimally” small by encoding more at the AST level than the ASM level. There were papers, but they were acquired by Microsoft (I think) and vanished. Does anyone remember them? The papers would be interesting in light of this, and conveniently near the end of patent lifetimes if patented, and solid prior art if not.
Edit: Maybe I'm thinking of Michael Franz and Thomas Kistler's work on “Slim Binaries”. It matches the content if not the ambiance that I remember. It's been a while. My brain tapes could have print-through.
I really can't tell what you're advocating here. It seems as though you're against JS in general, but I think you'll have a tough time arguing that JS isn't: 1) standardized, 2) a VM, 3) in the browser, or 4) a target for other languages.
JS is standardized. JS VM's are not. Also I'm not against JS per se, I've grown to appreciate the language for what it is, it's more the JS monopoly that I kind of dislike.
Nobody is going to agree to standardize the internals of their JS implementation. Competition for JS performance remains high, engines change their implementations all the time. Engine implementors do not want to, and should not, expose their implementation internals.
Instead, what people really want is a standard semantics they can target that isn't tied to any one vendor's implementation details. That's the crux of a VM, and that's what asm.js is providing.
Sorry, I did not explain my position accurately. When I said that JS VMs are not standardized, I was not arguing that they should be (that is in fact a terrible idea), I was responding to the poster above who said that since we (a) already have a JS standard and (b) browsers have JS VMs, that we already pretty much have what I was arguing for.
I also agree that standard semantics is what people really want. Asm.js seems like an interesting project but one of the issues that (I like to think) the VM approach would solve is that the semantics could be more low-level. Why do I think that that is a good idea? Well ideally, one would not have to wait for different browsers to implement say websockets, but the websocket functionality could be simply implemented by the website as a library which would use (presumably secure) lower level primitives of the VM to achieve the websocket functionality. (If that sounds vaguely familiar to the microkernel idea, it's not just you :-) ).
What you're asking for is drastically harder to achieve from a security perspective. If we didn't care about security, we could just standardize the Linux syscalls or something and call it a day. :) But what we're doing here is providing the low-level computational model while still only providing attenuated access to system facilities like the network through standardized web APIs.
I would not say 'drastically'. I guess arguments from the monolithic vs microkernel discussion can be recycled. Basically, security is one of the main arguments for the latter [1] and a lot of OSs that require high security are in fact microkernels. But yeah, implementing the Linux API in the browser does in fact sound like a rather sub-par idea but I was not arguing for that.
I did not say that it would be simple; it would be pretty hard, actually. The question is whether it is worthwhile for people to invest so much time in making something do what it was not intended to do (I'm not talking just about asm.js, but about all the other languages and tools that target JS as well; I feel like these don't really add anything new to the table). The question is also not whether it is hard in absolute terms, but whether it is hard compared to similar projects. For example, I'm not convinced that it would be any harder than building a new browser (again, I realize that this is super hard, but I'm speaking comparatively).
As to arguing, I prefer the word 'debating'. Also I was under the impression that that is what comments were for. Correct me if I'm wrong.
You have to define what you mean by "intent" for that to make any sense. Here is Dave Herman, one of the authors of JavaScript, telling you: here is asm.js, you can use compilers to compile to it. Is that not intendy enough for you?
> Asm.js seems like an interesting project but one of the issues that (I like to think) the VM approach would solve is that the semantics could be more low-level.
Can you elaborate? asm.js semantics are already low-level, in fact as I said in another comment, they are lower than some VMs (e.g. PNaCl).
Which part of asm.js do you find to be too high-level?
You're too narrow in your definition here. Think of it this way: Dalvik and HotSpot are both implementations of the Java Virtual Machine (JVM). In the same vein, V8 is just an implementation of the Javascript Virtual Machine (JSVM).
"Ahh, I see, Clojure SHOULD HAVE been compiled to Java, not the JVM" said no one ever. This is a blatantly horrible idea. Horrible ideas are great for hacks, but it actually looks like Mozilla is serious about this (which scares me).
I want my bytecode to look like this: `iadd`, NOT `(a+b)|0`. Maybe I'm just old fashioned, but running bytecode through a text interpreter, which is then compiled again to machine code (possibly via another bytecode) seems like a horrific hack.
I would be much happier with a standardized VM which Mozilla could then implement via javascript or whatever if they wanted.
"I want my bytecode to look like this: `iadd`, NOT `(a+b)|0`."
That isn't enough of a reason to forego backwards compatibility. It is a hack, to be sure. But there are all sorts of suboptimal things that we deal with in order to make sure that technologies survive in the real world. Especially when it's really just a question of what the encoding looks like, which is not something that's that important.
(Besides, it turns out that making a multi-language VM that performs as well as a custom VM for dynamic languages is an unsolved problem. Many language implementers, Mike Pall of LuaJIT fame for example, don't believe in it. The closest thing is probably PyPy...)
> I want my bytecode to look like this: `iadd`, NOT `(a+b)|0`.
So build a front-end! I actually thought about doing this myself, because I knew we would get people confused about the difference between syntax and semantics. The syntax is not important to the computer, and it's generally not important to a compiler either (codegen can output whatever format it wants).
Note that the bytecode for the JVM and CLR, for example, don't look like `iadd` either. They're binary encoded formats. The format is just not relevant.
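To make the syntax-versus-semantics point concrete: the `(a+b)|0` idiom is int32 addition spelled in JS syntax, and a front-end or "disassembler" could map it 1:1 to an `iadd`-style mnemonic. A minimal sketch:

```javascript
// (a + b)|0 truncates the result to a signed 32-bit integer, which is
// exactly the semantics of a typical `iadd` opcode. A hypothetical
// front-end could render this one pattern as the mnemonic `iadd`.
function iadd(a, b) {
  return (a + b) | 0;
}

// Wraps around on overflow instead of promoting to a double:
console.log(iadd(1, 2));           // 3
console.log(iadd(0x7fffffff, 1));  // -2147483648 (int32 wraparound)
```

Because the `|0` pins the result to int32, a JIT can emit a plain 32-bit machine add with no overflow check, regardless of what the surface syntax looks like.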
> Maybe I'm just old fashioned, but running bytecode through a text interpreter, which is then compiled again to machine code (possibly via another bytecode) seems like a horrific hack.
Never has to pass through an interpreter at all. It does not pass through an interpreter in our implementation.
You seem to be very hung up on surface syntax. The importance of a VM is not the syntax of its language but the semantics.
> You seem to be very hung up on surface syntax. The importance of a VM is not the syntax of its language but the semantics.
I'm protesting that there is a syntax to this 'VM'. It's rather indicative of Mozilla's approach to life, namely, write everything in JS out of some pseudomasochistic desire for backward compatibility hacks.
I would really, really like to see some benchmarks comparing NaCl vs asm.js, and I won't buy this as a viable compilation target until there's data to back up these (dubious) claims. If they do, then perhaps a frontend would be useful.
> The importance of a VM is not the syntax of its language but the semantics.
> I'm protesting that there is a syntax to this 'VM'.
That isn't a very reasonable thing to protest. Every language has a syntax. That includes every ASM variant. Syntax is an inherent aspect of language. What it sounds like you're actually offended by is that its syntax is very different from most ASM syntaxes.
> The semantics of JS are pretty horrible too.
Are you talking about asm.js, or are you talking about the superset that is not relevant to this discussion?
> I would really, really like to see some benchmarks comparing NaCL vs asm.js, and I won't buy this as a viable compilation target until there's data to back up these (dubious) claims.
The current numbers are that asm.js is around 2x slower than native code. I didn't compare to NaCl (which would be apples-to-oranges since it is non-portable) nor PNaCl (which I am not sure is ready yet for benchmarking? Please correct me if not).
We expect to improve on the 2x later this year, this is just the first iteration. I do think 2x is quite promising already though - it's in the range of Java and C# (on a fast VM for them).
I'll be putting up some slides with more specific numbers tomorrow after I finish giving a talk on it.
> It's rather indicative of Mozilla's approach to life, namely, write everything in JS out of some pseudomasochistic desire for backward compatibility hacks.
I'm just trying to make progress in a messy world. JS is not my ideal programming language (even though I've worked hard to make it better for 7 years now (!) and counting). I simply believe asm.js has a good adoption/evolution story, and I want the web to continue growing and competing.
> The semantics of JS are pretty horrible too.
A cheap shot, and missing the point. The subset of JS, while being completely equivalent to JS semantics, is also equivalent to a low-level machine model. IOW, it gives you the semantics of a low-level but safe VM.
Don't think that I'm not sympathetic here; I agree that readable machine code is generally undervalued. At some point your code generator will bug out, and someone will be forced to sift through generated code and mentally interpret stuff like `(a+b)|0` to figure out where it's gone wrong. I wonder if the eventual asm.js proposal would be regular enough to define a 1:1 mapping between `(a+b)|0` and `iadd` (or some other more-conventional ASM format), to be used for human-reading and debugging rather than machine interpretation. I doubt it's possible in the general case, but I could be wrong.
Yes, absolutely. When we first started talking about the project I seriously considered designing a "disassembly" syntax, to make it easier. We can still do that pretty easily, it's just taking a bit of a back seat to getting v1 of the spec and Odin implementation out the door.
Let's assume, for a moment, that your position is basically right - that we need to come up with a language-independent VM, and that JavaScript isn't going to cut it. We'll even imagine that we have the ideal spec sitting in front of us, which does everything that anyone will ever need (which isn't going to happen).
How can we possibly get there from where we are now? Basically, we can't, for various reasons.
Anything that doesn't work on existing deployed web browsers is dead on arrival. If the VM won't work on the vast majority of deployed browsers, then I can't use it. It would take years for JavaScript-only browsers to die off enough that I can use the VM, especially Internet Explorer and various phone / tablet browsers (Android being an obvious problem, since most manufacturers do not release updates, or release them years late).
So, if I wanted to use the VM before 2020, I'd have to have a JavaScript-only fallback. I just don't have any other option, because I certainly can't use PNaCl (Chrome-only) or Flash (effectively desktop OS only). If I have a fully functional JavaScript-only solution, which works right now, in all existing browsers, what do I need the VM for?
That's assuming that we could possibly get any agreement on a VM. That's never going to happen, for a number of reasons.
The only browser manufacturer actually interested in a VM is Google. Hence, they're working on PNaCl. Microsoft, Apple and Mozilla are all against the idea for various reasons, ranging from pragmatic to idealistic.
There are far, far too many different ways to do a VM, which depends largely on what languages you want it to run. A VM designed for C / C++ code (which asm.js pretty much is, as is LLVM) is very different from something like the JVM or .NET CLR, which is different from the kind of VM a JavaScript implementation has. Different groups of people want completely different things, which are often diametrically opposed to one another. The kind of VM a game developer would want is completely different than the kind of VM you'd want for developing business applications, and either VM would be essentially useless for the other group.
Besides that, what's the point? Modern JavaScript JITs are stupidly fast, even when they have virtually no information about the code they're running (types, in particular). You can already cross-compile a number of languages to JavaScript, which often treat it as a low-level target.
Remember, the web is a huge collection of different systems (browsers, servers, websites, development tools, and so on) which really can't just be upgraded in one pop. It has to evolve, and it has to work at every step of the way. If something doesn't work, or isn't usable, it tends to die, even if it was a really good idea.
There's one possibility, though:
People start using JavaScript as a compilation target. Improvements in JavaScript engines make this approach viable, and reasonably fast. Compilers tend towards using JavaScript in a certain way, which is kind of predictable. Browser manufacturers notice this, examine the subset, and come up with a spec describing it, and a plan for optimizing it. Compilers start producing the standardized subset, and JavaScript VMs gradually start adding optimizations for the subset. Browser vendors start competing with one another for performance, pushing the optimized VM close to native speeds.
Then, eventually, someone might take what those browsers are already doing, write a bytecode implementation of it, and a JavaScript shim that turns the bytecode back into JavaScript to support existing browsers. If there's any advantage to it at all, this bytecode would be trivial to adopt in a browser (it's close to what they're already doing), so there's no reason for browser manufacturers to resist it. If there are no advantages to it, nobody will bother.
It'll probably end up being a perfectly reasonable feature, based on a design that would be completely insane if you designed it from scratch, but works anyway.
Forcing everybody to use a ubiquitous, homogeneous VM implementation is probably not going to work. Browser makers like being able to improve their implementations on their own at a brisk clip. Also, while standardization is a good thing, competing implementations are also a good thing.
You actually said "standardized VM," which I took to mean "one VM upon which everybody standardizes."
But even now, it's unclear to me what you mean to be standardized about this VM if not its input and output (as we already have that with JavaScript VMs) and not the implementation. What are you asking for?
A VM with standardized input and output is exactly what I meant. Nevertheless, the underlying implementation of said VMs can still differ but they would all implement the same bytecode standard (similarly as both HotSpot and Dalvik both implement the JVM bytecode standard).
But JavaScript VMs already have standardized input and output. It is not a bytecode, but it is standardized input with standardized output. So we already have what you want except that the input format is different than you envisioned. It seems to me that's the whole idea behind asm.js — they're defining a subset of the language that can be implemented very efficiently with precise low-level semantics so that we get most of the benefit of a bytecode without throwing out all that we have now.
I thought we had made a standardized VM language called JavaScript so we could forget this whole ActiveX thing ever happened..?
Not that I disagree that a VM would be very awesome, but it should take as many cues from the collected knowledge of JavaScript security as possible; obviously just hooking your 'sandboxed' VM into the browser doesn't cut it. As demonstrated by Sun's Java applets :)
LLVM byte code in an OS secured jail/sandbox? The JIT is already there and BSD licensed so all the players can use it. You'd really have to trust your sandbox though.
The API for what you can do outside of your sandbox would be the hard part. Every capability you add to the API is also a lurking attack vector in each implementation.
Microkernels in the browser to the rescue :-). But yeah, I had something like that in mind. I do realize that getting security right would be tricky, but I'm not sure if it would be that much trickier than say the security of any given JS engine. Since the semantics of said VM would be probably simpler, I would make the argument that getting the security right would be easier to do than the security of said JS engine.
I agree... I don't know who thinks having JavaScript as "bytecode for the web" is any good: even other languages (Fay, ClojureScript) need to compile to JS.
Wouldn't it be better if we had a standardized, fast language that other languages, like the aforementioned, could target?
Compilers like Emscripten and Mandreel, which already generate code similar to asm.js, can be modified (we already have it implemented for Emscripten, it's not a big change) to generate valid asm.js. Then engines that recognize it can optimize the code even further than existing optimizations. Some of the technical benefits:
* ahead-of-time compilation and optimization instead of heuristic JIT compilation
* heap access can be made extremely efficient, close to native pointer accesses on most (all? still experimenting) major platforms
* integers and doubles are represented completely unboxed -- no dynamic type guards
* absolutely no GC interaction or pauses
* no JIT guards, bailouts, deoptimizations, etc. necessary
But the bottom line here is: asm.js can be implemented massively faster than anything existing JS engines can do, and it's closing the gap to native more than ever. Not only that, it's significantly easier to implement in an existing JS engine than from-scratch technologies like NaCl/PNaCl. Luke Wagner has implemented our optimizing asm.js engine entirely by himself in a matter of a few months.
As the site says, the spec is a work in progress but it's nearly done. Our prototype implementation in Firefox is almost done and will hopefully land in the coming few months. Alon Zakai is presenting some of the ideas at http://mloc-js.com tomorrow, including an overview of the ideas and some preliminary benchmarks. We'll post his slides afterwards.
The code has to be explicitly marked as asm.js, using a pragma similar to ES5 strict mode. This way if the code fails to validate, the errors can be reported to the browser's developer tools.
As for debugging, the story is the same as Emscripten. I believe there's plenty of work to do to make it better, but it's no different than the existing state of affairs. All we're doing is formalizing existing practice so that engines can capitalize on it and optimize better.
It lets a cross-compiler like Emscripten (and theoretically more weird ones like JSIL, GWT, etc) generate JavaScript that can be jitted into code that has performance (and semantics) closer to native code.
So, for example, you can provide clear hints that value x should be an int32, value y should be a float, etc. And if you create a virtual heap out of a Typed Array, asm.js lets you ensure that the JIT maps uses of that array to direct memory accesses where possible (instead of bounds-checked array lookups).
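As a sketch of those hints (the function name is made up for illustration; the coercion idioms themselves are how asm.js declares types, and they are also valid plain JS):

```javascript
// Hypothetical illustration of asm.js type annotations:
// x|0 declares an int32, +y declares a double, and a leading +
// on the return expression declares a double return type.
function scale(x, y) {
  x = x | 0;               // x: int32
  y = +y;                  // y: double
  // int -> double conversion is written +(expr|0)
  return +(+(x | 0) * y);  // return type: double
}
```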
It's basically a way to ship native code on the Web that's compatible with existing browsers, with the same performance as native code in engines with full support for it. (Getting to complete parity with native code will take time, but the language has been carefully designed to allow that — and we're close already.)
I like the idea of this, but it bugs me that it still uses doubles and only simulates integers via optimizations in the JavaScript compiler. Why has no JavaScript extension arisen to supply real integers?
One notable side effect of this: asm.js only seems to support 32-bit integers, not 64-bit or larger integers.
JS has 'real integers'; they're not being simulated. If you put the appropriate hints in your JS, the JITted output will never use a float anywhere. Your complaint is more that all the operators (with the exception of the bitwise ones) operate on floats, and yes, that is kind of a problem.
64-bit integer support is being worked on for JS elsewhere; asm.js probably doesn't offer it yet since you can't use 64-bit ints in any browser yet.
That's true, but JoshTriplett has a reasonable point. In point of fact, we are discussing custom value types like int32 and uint32, as well as compound value types like immutable records, for the future of ECMAScript:
But standardization takes time, and we wanted to get asm.js working now.
In the future if ECMAScript gains these other features we'll happily incorporate whichever ones make sense. For example, if having more straightforward syntax can help decrease code size that's a clear win. (Though gzipping source tends to mask a multitude of sins.)
Now we have Google's Dart & Closure compilers, Microsoft's TypeScript, and Mozilla's asm.js, all of which essentially add types back to JavaScript, not to mention about two dozen other statically typed JavaScript compilers: https://github.com/jashkenas/coffee-script/wiki/List-of-lang...
If types were approved 13 years ago, javascript apps could have been made to run much faster (fast enough for games even), perhaps negating a need for 'native' mobile apps that we have today, and either hastening the demise or spurring the optimization of competitors like Flash and Java and .NET/Silverlight.
(I'm already aware of arguments against static typing, and against having a VM in the browser or treating javascript like one.)
It's a lot harder to add types to a general purpose programming language. Your types have to match actual programming idioms, and if you care about them being safe (which, to be fair, recent languages like Dart and TypeScript don't), you have to consider every possible loophole that could lead to a dynamic error -- and the legacy language fights you, because all of its dynamism was designed back when nobody was thinking about respecting some not-yet-existent type system.
The type system for asm.js is a far more restricted problem, which is why we were able to come up with a solution so quickly (we only started this project in late 2012). The type hierarchy is extremely spartan, and it's just designed to map a low-level language (intended for compilers to be writing in) to the low-level machine model of modern computers.
Adding a static type system to JavaScript would have, IMO, been a losing effort. You can take a look at ActionScript 3 for an example of the challenges that come up.
Once you've got a static type system, you need a GOOD static type system, and imposing one of those on JS without breaking backwards compatibility would have been an ordeal and complicated code tremendously.
The fact that you can't represent function pointers (not closures, just plain old C-style function pointers) in asm.js severely limits its usability as a target language for even C-like languages.
Oh! Closure pointers would absolutely suffice. But as far as I could tell, asm.js doesn't support closure pointers either. Obviously I could use a value of "unknown" type to pass them in, but section 2.1.9 says "Since asm.js does not allow general JavaScript values, the result must be immediately coerced to an integer or double."
So I'm under the impression that asm.js can't really deal with closures or functions as values. Am I reading the docs wrong? I hope I am.
I mean, with Emscripten, Firefox OS, the Firefox browser, and now asm.js... they're about to force everyone onto their own terms.
This is clearly a big move and the beginning of a major victory for Mozilla and all users and developers.
Serving native apps, in the browser, with JS. And everyone is going to have no choice but to follow them, because all other browsers will fall back to their regular JS interpreter/JIT if they don't optimize for asm.js.
We're talking about games and applications that will run an order of magnitude faster on Firefox than on other platforms out of the gate. But they'll still run everywhere, just much more slowly.
Any thoughts on whether there could be a JS -> asm.js compiler? Might be a handy way to get rid of GC pauses - and maybe even improve performance - for existing JS code.
Or even a JS -> asm.js compiler written in JS... so you can feature-detect and enable on demand :)
Yes, you could compile JS to asm.js, but then, unless you change the JS semantics, you'll have to implement a GC, and your GC might have pauses.
Note that an interesting possibility would be to be able to generate asm.js at run-time for a domain-specific language. You could easily implement this as a JS library.
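As a toy sketch of that idea (illustrative only — the expression must come from trusted input, and a fully validating module would need Math.imul for integer multiplication):

```javascript
// Compile a tiny arithmetic DSL over x and y into an asm.js-style
// function at run time, via the Function constructor. The generated
// body carries the "use asm" pragma and int32 coercions; in engines
// without asm.js support it just runs as ordinary JavaScript.
function compileExpr(expr) {
  var src =
    '"use asm";\n' +
    'function run(x, y) {\n' +
    '  x = x | 0;\n' +
    '  y = y | 0;\n' +
    '  return (' + expr + ') | 0;\n' +
    '}\n' +
    'return run;';
  return new Function(src)(); // the module function's return exports run
}
```

For example, `compileExpr("x + y * 2")` hands back a specialized function the engine can optimize like any other asm.js code.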
It would be very hard to compile regular JS to asm.js: It would be exactly as hard as compiling regular JS to C, in fact. The problem is you don't know the types of everything.
However, some regular JS might be easy to compile to asm.js. For example a matrix multiplication library.
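A matrix multiply is a good illustration, since numeric kernels like this already fit the asm.js style almost verbatim (this is a plain-JS sketch; a fully validated module would read its data through heap views and use Math.imul for the index arithmetic):

```javascript
// asm.js-style kernel: integer loop counters are kept in int32 range
// with |0, doubles are marked with a leading +, and the n x n matrices
// live in flat Float64Arrays. Computes out = a * b (row-major).
function matMul(a, b, out, n) {
  n = n | 0;
  var i = 0, j = 0, k = 0, sum = 0.0;
  for (i = 0; (i | 0) < (n | 0); i = (i + 1) | 0) {
    for (j = 0; (j | 0) < (n | 0); j = (j + 1) | 0) {
      sum = 0.0;
      for (k = 0; (k | 0) < (n | 0); k = (k + 1) | 0) {
        sum = sum + +a[i * n + k] * +b[k * n + j];
      }
      out[i * n + j] = sum;
    }
  }
}
```

Because every value already has an obvious static type, a compiler could translate code like this to valid asm.js mechanically.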
Can it still be considered a 'subset' if it adds new traits like 'int' and 'intish'? If it both adds and removes things, wouldn't that make it more of a variant, like Scheme is to Lisp?
Yes, the observable semantics is 100% identical to running the same code in an existing JS engine. That's the genius behind Emscripten (i.e., the genius of Alon Zakai) -- he figured out that you can effectively simulate the semantics of machine integer operations by exploiting the various parts of the JS semantics that internally do ToInt32/ToUint32 on their arguments.
What asm.js does is simply formalize those tricks, in order to guarantee that you're only using those parts of the JS semantics, so that it's sound for an optimizing engine to directly implement them with machine arithmetic and unboxed integers. But the behavior is absolutely identical to a standard JS interpreter/JIT/etc.
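Concretely, the trick being formalized is that the bitwise operators apply ToInt32 to their operands, so arithmetic wrapped in |0 behaves exactly like C's wrapping int32 arithmetic (the function name here is just for illustration):

```javascript
// ToInt32 in action: |0 truncates any numeric result to a signed
// 32-bit integer, so addition wraps just like machine arithmetic --
// in a plain interpreter and in an optimizing asm.js engine alike.
function addInt32(a, b) {
  a = a | 0;
  b = b | 0;
  return (a + b) | 0; // wraps modulo 2^32
}
```

For instance, `addInt32(0x7fffffff, 1)` wraps around to -2147483648, exactly as a machine int32 would, in any conforming JS engine.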
Have you guys thought about memory management in the ArrayBuffer "heap"? One can decommit pages from a real heap, which can be a pretty important optimization.
I've had similar ideas back when BlackBerry was popular with its Java-only SDK in order to port our huge C++ app, and also when Windows Phone was C#-only.
Eventually it's worked itself out, as C++ has become available everywhere.
Nevertheless, if they succeed, the same thing can be easily made for Java and C#, so we can make our C++ app ultra-portable.
It will be an absurd toolchain though.
People are remarkably bad at communicating and adopting standards.