You can now run WebAssembly on Cloudflare Workers (cloudflare.com)
190 points by kentonv on Oct 1, 2018 | hide | past | favorite | 47 comments



Each person can take their own meaning away from this, but for me the most impressive part will always be @kentonv's hand-rolled libc replacement: https://github.com/cloudflare/cloudflare-workers-wasm-demo/b...


Everyone who's in CS at UIUC has to write their own libc replacement:

http://cs241.cs.illinois.edu/malloc.html

There is (or at least was) a scoreboard you can compare against. By the end of the MP, there are usually a handful of implementations beating libc.

The thing is, libc is battle-tested, so I'd always trust it over a hand-rolled solution. It's always interesting to see improvements, though (usually at a cost).


True enough, but if you follow the link in the GP, it's not really a libc replacement. For example, free() is a no-op. It's a drastically reduced subset of libc.


> libc is battle tested

I'm confused; isn't libc an interface and not an implementation?


OP probably means glibc


  // Really trivial malloc() implementation. We just allocate bytes sequentially from the start of
  // the heap, and reset the whole heap to empty at the start of each request.
  extern byte __heap_base;   // Start of heap -- symbol provided by compiler.

  byte* heap = NULL;         // Current heap position.
  void* last_malloc = NULL;  // Last value returned by malloc(), for trivial optimizations.

  void* malloc(size_t n) {
    last_malloc = heap;
    heap += n;
    return last_malloc;
  }
Am I missing something? How can this work? __heap_base is never used, so the first malloc() returns NULL.

Shouldn't it be:

  byte *heap = &__heap_base;
or something to that effect?

I haven't read the rest of the code, nor do I know how WebAssembly works, but I can't make sense of that snippet.


https://github.com/cloudflare/cloudflare-workers-wasm-demo/b...

    // init() is called from JS to allocate space for the image file.
    byte* init(size_t image_size) {
      // Reset the heap to empty. (See malloc() implementation in bootstrap.h.)
      heap = &__heap_base;


Ah, I just had to look harder. Thanks


TBH I did not expect people would actually be looking that closely. :)


You're no C programmer, Zack. Anyone who's done any real C programming has implemented their own malloc.


(Is this an in-joke I'm missing or something?)


It's polite ribbing amongst work colleagues where an engineer at Cloudflare (@zackbloom) offers praise to a colleague (@kentonv) for a mighty project (libc replacement) and Cloudflare's CTO (@jgrahamc) jokes that the project isn't hardcore enough ;)

tl;dr: it appears very much to be an in-joke between work colleagues at Cloudflare :)


Correct. Not so much an in-joke. I was just surprised Zack thought implementing your own libc was amazing. But I'm an old fart C/assembly programmer so YMMV.


The top comment doesn't lie, THE PEOPLE ARE WITH ME!

Also, is there a badge or sash I can get somewhere saying "Not a Real C Programmer"?


How about a badge for "I got schooled publicly by my CTO."


Ha. I wasn't really schooling him, just ribbing him. He knows damn well he could run circles around me in JavaScript.


Someone in marketing probably has a button maker...


I concur on the malloc sentiment.


There is such a steady stream of excellent work and innovation coming from Cloudflare these days - it's pretty amazing.


Documentation is sparse on why Emscripten-compiled WASM modules are incompatible. Please provide more information.


It's a very minor incompatibility:

Traditionally, all WebAssembly modules are essentially eval()ed. You need JavaScript to download the module into an ArrayBuffer or the like, then pass that to the WebAssembly API to compile it.

However, in Cloudflare Workers, we didn't want you to have to fetch WebAssembly remotely at startup. Instead, you upload your WASM module to the Cloudflare configuration UI/API together with your JavaScript code. At startup, the WASM is compiled and the resulting `WebAssembly.Module` appears as a global variable in your script, which you can then instantiate.
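In code, the model looks roughly like this (a sketch: the global binding name `WASM_MODULE` is hypothetical, and a minimal empty module stands in for your uploaded one so it runs locally):

```javascript
// In Workers, the uploaded WASM arrives pre-compiled as a global
// WebAssembly.Module. To sketch that locally, we compile a minimal
// empty module ourselves; WASM_MODULE is a hypothetical binding name.
const emptyModule = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // "\0asm" magic number
  0x01, 0x00, 0x00, 0x00, // binary format version 1
]);
const WASM_MODULE = new WebAssembly.Module(emptyModule);

// Your script never fetches or compiles anything -- it just instantiates
// the already-compiled module:
const instance = new WebAssembly.Instance(WASM_MODULE, { /* imports */ });
```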

Emscripten normally automatically generates JavaScript for you to load your WASM. But Emscripten's generated script doesn't understand this delivery model where the module shows up as a global variable. It should be trivial to add support, but it would be awkward for us to try to submit a patch upstream without the functionality being public yet.


Actually, Emscripten has a built-in flag (SINGLE_FILE) to embed the WASM code as base64 (incidentally, I'm the author of that flag).


May I recommend also supporting hex encoding the WASM? I think you'll find that while your file size is larger, it's actually significantly smaller once gzipped.

Edit: Just tested this on https://public.tableau.com/vizql/v_public-release1809140800/...

  runtimeweb.wasm       2,484,043
  runtimewebwasm.b64    3,355,640 
  runtimewebwasm.hex    4,968,086
  runtimewebwasm.b64.gz   974,065
  runtimewebwasm.hex.gz   701,052
  runtimewebwasm.b64.br   718,918
  runtimewebwasm.hex.br   466,221
So with gzip -9, hex encoding is 72% of the size; with brotli (defaults), it's 65% of the size.


Hmm, thanks for the tip, and thanks for testing that out so I don't have to! It hadn't occurred to me that that would be the case, but it makes some sense after reading through https://stackoverflow.com/q/38124361/459881. I'll go ahead and open an issue with emscripten about this.


Yep, it's one of those counterintuitive things. Another benefit of using hex is that the decoder is a lot simpler and easier for the JIT to vectorize.


That actually doesn't quite solve the problem, as we disallow dynamic code evaluation. This is for a couple of reasons:

* Security: In case of an attack we need to be able to call up the code to do forensics, which is hard if it was downloaded dynamically at runtime.

* Optimizations: We want to keep the ability to pre-compile code e.g. into V8 code cache format before distributing it to the edge.


Ah, I thought that might've been part of it since you brought up the comparison to eval; just wanted to clarify that there are other options for avoiding reliance on external subresources.

> which is hard if it was downloaded dynamically at runtime

Not sure if you mean something other than what this sounds like (I haven't used CF Workers), but just to clarify further, the option I mentioned causes emscripten to emit one single JS file with no external code downloaded at run time.

(In any case, that performance optimization alone certainly sounds like a good reason for the separation.)


Understood. While the patch is being mainlined, it would benefit Emscripten users targeting Cloudflare to use whatever you have working with Emscripten's build workflow. Please consider releasing this repo in the interim.


We haven't actually tried making any changes to Emscripten yet. We've only used lower-level tools to build things more manually. There is a working example of this here:

https://github.com/cloudflare/cloudflare-workers-wasm-demo


What about Module.instantiateWasm()? That should let you provide a pre-compiled or even pre-instantiated module.
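For reference, that hook looks roughly like this (a sketch: `WASM_MODULE` stands in for a pre-compiled global, built here from a minimal empty module so the example runs locally):

```javascript
// Emscripten lets you take over instantiation via Module.instantiateWasm.
// WASM_MODULE stands in for a pre-compiled WebAssembly.Module global;
// here we build one from a minimal empty module so the sketch runs.
const WASM_MODULE = new WebAssembly.Module(
  new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00])
);

var Module = {
  instantiateWasm(imports, successCallback) {
    // Synchronous instantiation of an already-compiled module.
    const instance = new WebAssembly.Instance(WASM_MODULE, imports);
    successCallback(instance);
    return instance.exports;
  },
};

// Emscripten's runtime calls the hook itself; invoked here to demonstrate.
const exports = Module.instantiateWasm({}, () => {});
```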


Really interesting development, but I do wonder how well this fits with a 5-50ms CPU time limit.

Are there plans to change this? Or allow charging on higher use or something?


Indeed, introducing WebAssembly paradoxically creates more demand to increase the CPU time limits -- because with WebAssembly, it's actually reasonable to imagine doing signal processing in Workers, whereas in pure JavaScript it made less sense. We'll probably need to go back and re-evaluate the CPU time limits in the near future. In the meantime, feel free to contact us if you have a specific use case in mind that doesn't seem to be fitting the limits, and we'll figure something out.


Yes. We plan to increase the CPU limits. In the meantime, if you're hitting limits, reach out to us and we can usually turn them up for you.


The web was supposed to be open, but if the technologies are only well implemented privately, it's not open at all. Props to Cloudflare, certainly, but this is a further nail in the coffin of open software. WASM was supposed to be usable outside the browser, and to date there are no well-made libraries that deliver on it. It's all experimentation in the open. Technology that I cannot use well in my own applications is effectively proprietary.


You can use WASM from Node. There is even an open source module called “isolated-vm” which implements a similar security scheme to the one employed by Cloudflare. You could install that on whatever you want and run essentially the same thing (minus the rest of Cloudflare, obviously).


The original Workers announcement (https://blog.cloudflare.com/introducing-cloudflare-workers/) explains why they don't use Node and instead build on V8 directly.

Node is not a cure-all; it's for prototyping with WASM. You need significantly more integration with the execution environment to have DoS protection. How Cloudflare does this, I don't know. They'd have to remove features like multithreading and continually patch around V8, AFAIK. Even SpiderMonkey's API is unusable in a server context.


I know what Cloudflare workers are. Did you look at isolated-vm? It uses the V8 isolate API exactly how Cloudflare describes it being used, except it is exposed as a Node module so you can write the “parent” process in JavaScript instead of C++. It exposes memory limits and wall & cpu time limits. It is sponsored by the Fly.io CDN from what I can tell for exactly the same use case as Cloudflare workers.


I only found mention of V8 isolates when searching for them specifically, and only in https://blog.cloudflare.com/serverless-performance-compariso... and a few comments on HN. I don't see any description of their use beyond a summary of 'we use V8 isolates', which begs a /r/restofthefuckingowl. Even an interview with KV doesn't reveal much: https://softwareengineeringdaily.com/wp-content/uploads/2018... Isolates also don't do anything for sandboxing, from what I can tell. Cloudflare says 'That said, we have added additional layers of our own sandboxing on top of V8.' I do not see any actual technical information on how they're using V8.

V8, and by extension isolated-vm, does not by itself cover things like fetch if we're talking JS. https://github.com/laverdet/isolated-vm/issues/63 This is an immediate killer, as JS browser features are part of why a JS engine is wanted.

Having Node handle these things is a debugging time bomb, where instead of debugging V8 you're debugging Node. There are Node version stability problems, as mentioned in the isolated-vm readme.

DoS goes beyond just CPU time and memory. WASM is being treated as an environment of its own with some lower-level accessibility. Unless the API allows feature whitelisting (I can't find any docs relating to that), I don't see anything stopping thread creation. I'd be interested to know how Cloudflare prevents these features. isolated-vm doesn't appear to interact with the V8 API using options relating to this.

Fly.io has their fly engine meant for local development so you can deploy onto their servers. https://github.com/superfly/fly


What's the limit on code size? I imagine we'll have to be careful about bringing in large existing libraries.


The limit on the code (the Worker + WASM) after compression is 1MB. Please reach out if you run into limitations with it (rita at cloudflare).


Heh Heh Heh

The minimum size of WASM files generated by Go (so far) looks to be 2MB. That should improve over time.

Go's WASM output is currently hard-coded to target browser environments too. Another thing which should improve down the track.


I believe the 2MB figure is before compression. After, it's 500k?


Good point. I've not tried compressing them personally, but people have indeed mentioned the 500k-after-compression thing before.


Thank you. Is that documented somewhere?



Ah, OK. That doesn't mention that the 1 MB limit is after compression, though.


The limit being post-compression is actually new, and it currently only applies if you use WASM, though we'll extend it to everyone soon (and document it).



