> So how does this work? Before the code is deployed, as part of a build step, we run the JS code using the JS engine to the end of initialization.
I also did basically this exact same thing for Go [0] when I realized their initialization in WASM was very heavy[1]. Basically I ran up until the real Go main started which includes a ton of runtime package/data initialization, and took a snapshot of the data and baked it back into the WASM and removed all the pre-main code. Granted this was years ago so I don't know if it still works on generated code today, but the idea is the same.
I think languages compiling to WASM, if they can, should run their initialization code and snapshot the data. A lot of people don't realize the number of init instructions to bootstrap a runtime these days. Go alone has thousands of instructions just to initialize the unicode tables. Over 4 million instructions before main were removed from hello-world Go WASM init by pre-processing.
You want to be careful with this technique when payload size dominates your application's performance, like on a webpage. I was looking into improving Emscripten's pre-execution behavior, but found that ended up making perf worse when factoring in standard download speeds. The functions it removes are usually smaller than the data they add, usually with worse entropy. So while JS is faster to execute, it's usually not faster than the transfer time.
It gets a bit worse if your initialization code is doing a lot of dynamic allocation, as you're then shipping the entire allocated heap as data in the snapshot.
“Format compilation” in TeX also appears to be inspired by a similar snapshotting mechanism built into TOPS-20[1].
Generally, I’d say that dumping VM state (if not in the host executable format) is a perfectly ordinary thing for an isolated VM to do. It’s the default way Smalltalk systems operate, for example, being very close ideologically to Lisp machines in that they try to be an OS centered around a programming language more than a programming language implemented inside an OS.
(I should really look into that Oberon thing one of these days...)
... Aa-and I’m being stupid, because there were of course also “persistent” OSes like KeyKOS[1], where a memory image (swapped in, modified, and swapped out as necessary) is essentially the only way for a program to exist. (Good news: capability access control is easy to program to and has entirely intuitive dynamic behaviour. Bad news: no one has worked out how to bootstrap this dynamic behaviour from a static set of rules in a reasonably modular way, to the point that making a system which can hibernate but never shut down actually starts to look like a reasonable solution.)
This is also how Emacs optimized its startup: by loading its initial Emacs Lisp code and dumping itself together with a heap snapshot into a new executable (there is a special API in glibc to do just this).
Recently Emacs has moved to a "portable dump" approach which saves the heap snapshot separately.
> Nick Fitzgerald — Hit the Ground Running: Wasm Snapshots for Fast Startup
>
> Don't make your users wait while your Wasm module initializes itself! Wizer instantiates your WebAssembly module, executes its initialization functions, and then snapshots the initialized state out into a new, pre-initialized WebAssembly module. Now you can use this new module to hit the ground running, without waiting for any of that first-time initialization code to complete. This talk will cover the design and implementation of Wizer; discuss its performance characteristics and the scenarios in which it excels and when it isn't the right tool; and finally, in the process of doing all that, we'll take a closer look at what makes up the guts of a WebAssembly module: memories, globals, tables, etc.
One reason to use WASM to run JS, even in a browser, is to allow for interruptible execution of untrusted JS code. I built a library for this (1.2 MB) that uses Emscripten to compile QuickJS to WASM. I know of a company using it to implement "plugin"/user-scripting support.
Awesome project. Curious question: why not just use an embedded V8 to sandbox untrusted code? I remember some databases used this V8 approach to let users write untrusted db plugins. Any specific advantages to using WebAssembly instead of a V8 instance?
Webassembly is getting a lot of adoption in non-browser environments. Be it extensions for larger applications (like Envoy) or "serverless" style microservice or server deployments.
There are already plenty of stand-alone WASM implementations (wasmtime, wasmer, wasm3, SSVM, ...).
The gamedev industry will no doubt be shocked to learn that they could have just been running native with zero overhead instead of implementing and embedding various forms of scripting.
I think WASM will have a great future in the cloud, since cloud providers can offer their services in any programming language their users want, running in a secure sandbox.
But I'm not so sure about personal devices. It would be very hard to beat JavaScript code on the Web in general. So I don't know about the general-purpose target future in that case; I think it will flop, except as a fancy accelerator or as a way to virtualize things that were once outside the web.
While there will always be cases where highly optimized JavaScript is more performant, the flexibility and performance of WASM on the web and in browsers mean that a significant number of high-impact products and services have already made the jump to WASM, in part or in full.
As far as I understand, HotSpot is such an excellent VM that some languages have been adapted to it as a compilation/execution target; it's not that the JVM is particularly well suited for running other languages. Dynamic languages especially have been struggling with inefficiencies imposed by some of the Java-oriented paradigms.
WASM does not provide a garbage collector, as another example; this probably makes non-garbage-collected languages behave more predictably.
> Especially dynamic languages have been struggling with inefficiencies imposed by some of the Java-oriented paradigms.
I don’t think that is anywhere close to the truth. They are very well suited, as the JIT compiler can specialize on dynamic types (and optionally deoptimize when the type changes). There are also Clojure, JRuby, and a Python implementation, and Java itself can be written with significant use of reflection.
And then there is GraalVM, built on top of the JVM, which has TruffleRuby (the fastest Ruby implementation), GraalJS (very comparable performance to V8 with far fewer man-hours invested), etc., all very dynamic languages.
Was that true even before `invokedynamic`, which as far as I know was specifically added to make these non-Java languages easier to port and more performant?
All of the examples you've mentioned don't seem like trivial ports, at least from an outsider's point of view.
The JVM itself has definitely adapted to these use cases, but it wasn't designed with them in mind.
You are right about the reason invokedynamic was added, but as far as I know the JVM has always supported dynamic class loading (and thus class creation as well), so while not necessarily in a very performant manner, it could always be used as a runtime for even very dynamic languages. (And I think I left out Groovy, which is a quite old dynamic language on the JVM.)
> dynamic languages have been struggling with inefficiencies imposed by some of the Java-oriented paradigms
If you think the jvm is bad for dynamic languages, wait'll you hear about wasm!
In fact, I would expect the jvm to work much better for dynamic languages; not only does it already have a[1] gc, it has inline caching built in, which is frequently crucial for getting good performance in dynamic languages. (Though granted, as the niblings hint at, this requires type inference without invokedynamic.)
1. Arguably 'the'. I don't know of a platform with a better one.
Yes, but this is what .NET does as well; that is not a reason not to add another one. WASM is lighter and has a security model that previous generations of VMs do not provide. I don't know if this will be enough, but so far the experiment looks interesting.
Depends on whether your JS is limited by interpreter / JIT startup time or throughput. If it's the former, then the wasm module with wizer might be significantly faster.
As someone who will shortly need to do two very compute-intensive tasks in the browser (filtering large datasets (100k+ rows) via client-specified queries, and parsing large string blobs), what is my best option?
I’m aware of server-side rendering, but I suspect doing the work in the client may give better responsiveness, with a failover to the server if it’s faster to provide a query response. Is there anything I should _really_ consider for client JS performance at that level?
A WebWorker can free up the main thread so the UI doesn't grind to a halt. Unfortunately they might not help you here. Data in/out of the worker is string only[0], so the extra parsing/serializing usually ruins any perf benefit.
You can access indexedDB from the worker, though. So you load the DB from the main thread, then you can do the filtering from the worker. You could return indexes, then fetch those rows from indexedDB on the main thread.
None of this will help with parsing large strings, though; you are probably stuck doing that on the main thread. And it is also not going to reduce the total time to perform the work. But hopefully it will avoid pinning the main thread, because that is bad.
I really enjoy this stuff, so feel free to contact me if you wanna talk it through. My email address is in my profile.
I don't think this is correct. You can pass lots of types in via postMessage(), including objects, ArrayBuffers, even a canvas context. Some objects are transferable as well, that is, zero copy.
As far as your point about still being stuck on the main thread, that may indeed be the case depending on the work.
For the specific case of doing a map or filter operation on a really huge list, I've found it helpful to yield the main thread every once in a while with a setTimeout(0) call. Usually this means setting your code up so it can do the processing in batches of, say, 10K items each. That way, you can do your processing without locking up other UI stuff on the page (like scrolling or animated loading spinners).
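A minimal sketch of that batching pattern (the helper name and batch size are my own, not from the comment):

```javascript
// Filter a large array in batches, yielding the main thread between batches
// so UI work (scrolling, animated loading spinners) gets a chance to run.
async function filterInBatches(items, predicate, batchSize = 10000) {
  const out = [];
  for (let i = 0; i < items.length; i += batchSize) {
    const end = Math.min(i + batchSize, items.length);
    for (let j = i; j < end; j++) {
      if (predicate(items[j])) out.push(items[j]);
    }
    // setTimeout(0) hands control back to the event loop before continuing.
    await new Promise((resolve) => setTimeout(resolve, 0));
  }
  return out;
}

// Example: filter a 100k-row dataset without locking up the page.
const rows = Array.from({ length: 100000 }, (_, i) => i);
filterInBatches(rows, (n) => n % 2 === 0).then((evens) => {
  console.log(evens.length); // 50000
});
```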
Since they are working with SpiderMonkey (Firefox JS engine), I wonder if that opens the door to a Gecko based browser on iOS with decent enough performance. That would be a game changer!
No, that's forbidden by the App Store rules, with a possible exception where the HTML is supplied by the app itself rather than being downloaded. [0] From Section 2.5.6:
> Apps that browse the web must use the appropriate WebKit framework and WebKit Javascript.
Also Section 4.7:
> only uses capabilities available in a standard WebKit view (e.g. it must open and run natively in Safari without modifications or additional software)
This is like a car company that also owns a tire company and provides the tires. So far so good, except the company that makes the car only allows you to use its own tire brand; no others are allowed.
How can legislators so easily spot problems in bad company practices that force monopolies on material objects, yet be so oblivious to everything digital?
I see even tech veterans here struggling to see that there's a big problem in all this..
It's like Microsoft not only shipping IE with Windows, but making web sites slow in other browsers on purpose, so that at least on Windows nobody would use IE's competitors.
Why Apple has been getting away with such shady practices for more than 10 years is beyond comprehension.
This sounds tempting. I'm building an interpreter (https://tablam.org), and one of the reasons is that interpreters are better for exploration/REPLs and such.
This looks like it will fix a major pain point: I could compile everything that is typed and leave the rest for runtime.
I'm wondering how much easier life could be if I compile to WASM to run on my host...
I think the fact that you can save WASM memory/state will lend itself to very powerful applications in the future.
I think there's a world where true time-travel debugging is possible because you execute your Python program inside of a WASM VM & with WASM, you can save the memory/local registers/etc. and do true backwards/forwards execution.
Is there any reason to believe that this will be faster or more powerful than how reverse debugging works today? rr is already pretty fantastic and handles pretty complex cases surprisingly well with decent performance. I believe the way it works is that it only records memory changes of side-effect APIs (i.e. I/O, kernel, etc.) and then just re-runs the program between those points. I’m not sure that snapshotting the entire RAM for an application will prove to be more efficient/powerful (copying large amounts of RAM is generally expensive). You might try to do crazy tricks like making the pages COW (while flushing them to disk) to reduce the actual cost of how many pages need to be copied. I think that’s still tough to compete with vs just recording the little bit of state around function calls that have side effects.
I've contemplated building a reverse debugger for wasm - rr is amazing, but frustrates me with touchy dependence on hardware features (sometimes doesn't work in a VM, doesn't support some CPUs, etc) and lack of ability to fork the timeline. I'd love to be able to back up, change a variable, and then proceed forwards - tree-style undo and redo, as it were.
I haven’t tried, but I’m pretty sure you can change variable values. Aside from it being Linux-only, I haven’t encountered the CPU challenges. VM incompatibilities are a different story, as I generally only ever run directly on metal when debugging. That being said, it looks like it does work (at least if you have sufficiently new software bits), so it may be worth giving it another go: https://github.com/rr-debugger/rr/issues/2806
Sorry, this issue is what I was referring to as what I would like to be able to do: https://github.com/rr-debugger/rr/issues/1622
I might have been too down on rr, it is a very cool tool! Wasm doesn't have great debugging support right now, so I was thinking of a way to fix that and get the features I wanted out of rr in one swoop.
Changing a variable and resuming execution isn't the hard part. The hard part is what happens when the resumed execution wants to, e.g., read from a socket. That socket doesn't exist in the replay. What would you do?
Reminds me of Omega: you wrote scripts to operate autonomous tanks and pitted your tanks against others' in an obstacle-filled arena. Omega didn't have the realtime scripting component, though.
If anyone out there is involved with this I am very interested and very motivated to figure out if I can get a JavaScript engine running in a DFINITY canister. I just need a compiled wasm and a candid file to try it out. I think. The wasm might need to be compiled in a particular way.
From what I saw, that's just about exploiting WASM memory within the sandbox. Like they say, "at worst WASM can make a mess of its own memory", and they show some implications of that. That's mildly interesting to me; you still need to go through the DOM to do anything system-related. You can't just read random files off of disk or use some system API to exploit other parts of the system. If API security is broken for WASM, it's broken for JS as well. That's completely different from having a separate sandbox running as a native extension.
Yes, that is more or less the goal, with the major difference being cross-industry collaboration: every major player is included, and even the community has its say.
If you mean the Oracle v. Google fiasco, it was a legal battle over whether it is legal to copy a significant part, but not exactly the whole, of what is called Java™.
But Java is one of the few major languages that is standard-specified rather than specified by its reference implementation, the reference implementation is completely open source, and the ecosystem is blooming. So, no? It is a huge platform running a very significant chunk of all backend servers.
If it is actually free of backwards compatibility, then it will be instantly abandonware. I think you meant to write that it is new and shiny so it doesn’t yet have a need to decide between breaking change or backwards compatibility, but it will happen. We’ll see how they handle it.
And safety has nothing to do with backwards compatibility; the JVM is not unsafe at all against the attack vectors it is designed for.
Ok, why do you care anyway, though? Java on the web failed, and Wasm might too, but we won't know without trying. It's not the same thing; there are many differences, especially in relation to lessons learned from Java and the past 15 years of the web platform.
I'm hoping this will be useful for improving react native performance on iOS. Due to restrictions on that platform the JS is currently interpreted and performance is terrible.
I'm mostly overwhelmed by excitement & fascination. An amazing effort.
A few little notes jump out at me. First:
> WebAssembly doesn’t allow you to dynamically generate new machine code and run it from within pure Wasm code.
This is a surprise to me. For some reason I thought wasm code could be a SharedArrayBuffer. And I thought that would be mutable. You might maybe have to round trip back to JS to modify this? But I thought it was read-write. I'm probably wrong but this was quite a surprise to me & a bit of a shocker to hear! Although there's good security things you get from this, I didn't expect it to be hard set.
Second thing that jumps out at me is a little fear. I look at the various projects out there intent on replacing the common DevTools debuggable/extension-able web page with a bunch of animated pixels, via use of Canvas, like Google's CanvasKit renderer for Flutter. To me, this is scary territory, because it de-empowers the user. This project here has uses far beyond the web, and that's its real purpose I think (who wants to load the spidermonkey engine compiled to wasm to start running their js on their site), but it still just makes me a little scared of the common DevTools experience fracturing & shattering into many pieces. This project isn't uniquely scary, versus Go on WASM, or Rust on WASM, but it's still something I'm nervous about, and this article made me think of how easy it would be to start making the JS we run considerably harder to wrangle by people writing extensions or enhancing their user agent.
Third,
> Because the JS engines used to create isolates are large codebases, containing lots of low-level code doing ultra-complicated optimizations, it’s easy for bugs to be introduced that allow attackers to escape the VM and get access to the system the VM is running on.
Color me a little skeptical about the virtue of adding another security layer into the virtual machine. I really hope we can get good at isolates, to build blisteringly fast systems, and that we don't need to have multiple (or sometimes 1) processes on a computer, each running multiple isolates, each running multiple wasm containers. It's a cool capability, but I feel like ideally we'd be better off with fewer levels of nesting to get down to the real "process" (process->isolate->wasm-container), & better able to trust & leverage & use isolates really well to sandbox work. It seems to be working very very well for those with neat edge tech like CloudFlare Workers.
That said, I just did a quick search & a year ago someone was saying V8 isolates take ~3MB of memory each, which is far from insignificant. Developing AOT js->wasm tech could potentially have a huge memory-saving benefit. Ah, ok, good, the lightweight nature of wizer'ed containers is emphasized fairly well in this post! It didn't leap out at me the first time but it's definitely there! This is the core possible advantage, besides simply enabling better exploration of the problem space, which is also key.
Fourth,
> We started with making initialization fast with a tool called Wizer. I’m going to explain how, but for those who are impatient, here’s the speed up we see when running a very simple JS app.
This is a huge focus of this post, and it's a great technical marvel. On the other hand, the post talks about the advantages of how Wizer can use snapshots to skip initialization & start with a pre-booted application... the thing is: v8 Isolates can do that too. Deno has been doing quite a lot of work figuring out how to wrangle & manage & make effective use of V8 Isolates snapshots, for example, and that's one of the core reasons I think it is an impressively promising advancement & upcoming core technology. Deno is really trying to wrap & better expose the V8 runtime in a better managed fashion, and I think we'll see many of the advantages claimed by Wizer start to get reported out the world via Deno as well, over time.
> For some reason I thought wasm code could be a SharedArrayBuffer. And I thought that would be mutable.
No, this was an explicit design decision for security reasons. Data is mutable, but you cannot exec it.
> You might maybe have to round trip back to JS to modify this?
Yes, this is technically possible, but you're unlikely to see savings. I'd suspect a "JIT" that instantiates a WASM module via the JS API using generated bytes from another WASM call won't help much (the "exports" would have to be JS calls too, or you'd have to re-instantiate all the WASM modules together to link exports).
> WebAssembly doesn’t allow you to dynamically generate new machine code and run it from within pure Wasm code.
I think the focus here is on RUNNING the wasm. If you are the host, I think you can generate wasm inside it and pre-cache the "std library" or something; this is what I infer from this idea (from other users who want to implement it).
Because you can certainly generate code if you have control of the host:
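For instance, a host can build Wasm bytes at runtime and instantiate them through the JS API; here's a minimal hand-assembled module exporting an add function (the byte layout follows the Wasm binary format; the example is mine, not from the thread):

```javascript
// A hand-assembled Wasm module exporting add(a, b) -> a + b. The "codegen"
// happens in the host (JS), not inside already-running Wasm code.
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic + version
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // function section
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export "add"
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section header
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,                   // local.get 0/1, i32.add, end
]);

const wasmModule = new WebAssembly.Module(bytes); // synchronous compile
const instance = new WebAssembly.Instance(wasmModule);
console.log(instance.exports.add(2, 3)); // 5
```

In practice you'd generate the bytes with a library rather than by hand, but the point stands: the host controls compilation, while pure Wasm code cannot mark its own memory executable.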
btw, this could also be interesting for SPA applications that do not use node.js on the backend.
I wonder when there will be code examples.
I would really like to look more into SpiderMonkey + JS in wasm and try to get SPA output.
I find it interesting they compiled SpiderMonkey to WASM to run JS in its interpreter mode when it already has iOS support built-in via the interpreter. I would've thought all of the performance enhancements could be done without involving WASM at all. As far as I'm aware the only reason why Firefox for iOS doesn't use SpiderMonkey is App Store restrictions.
[0] https://github.com/cretz/go-wasm-bake
[1] https://github.com/golang/go/issues/26622