Show HN: Compiling C in the browser using WebAssembly (wasmer.io)
172 points by syrusakbary 3 days ago | 70 comments





Couldn’t a tcc or similarly simple C compiler be used instead of a 100MB Clang? Where’s the C to wasm compiler hiding?

One issue with Wasm is you essentially can't target it with a single-pass compiler, unlike just about any real machine. Wasm can only represent reducible control flow, so you have to pass your control-flow graph through some variation of the Relooper[1,2]. I don't know if upstream tcc can do that (there are apparently some forks?..).

[1] http://troubles.md/why-do-we-need-the-relooper-algorithm-aga...

[2] https://medium.com/leaningtech/solving-the-structured-contro...
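
For illustration, here is a small C function whose control-flow graph is irreducible: the cycle between `top` and `middle` can be entered at either label, which is the shape that Wasm's `block`/`loop`/`if` cannot express without restructuring. The function name and its logic are made up for this sketch.

```c
#include <assert.h>

/* Hypothetical example: sums the integers 1..n, but can optionally
 * enter the loop in the middle, skipping the first addition.
 * The cycle {top, middle} has two entry points, so the
 * control-flow graph is irreducible. */
int sum_with_two_entries(int n, int skip_first_add)
{
    int i = 1, total = 0;
    assert(n >= 0);

    if (skip_first_add)
        goto middle;          /* second entry into the cycle */

top:                          /* first entry: normal fallthrough */
    if (i > n)
        return total;
    total += i;
middle:
    i++;
    goto top;
}
```

A compiler targeting Wasm has to rewrite this, for example by introducing a dispatch variable, before it can emit structured control flow.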


> you essentially can't target it with a single-pass compiler,

That might be true if your source language has goto, but for other languages that start with structured control flow, it's possible to just carry the structure through and emit Wasm directly from the AST.
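
A minimal sketch of what "carry the structure through" can look like, assuming a toy AST with only leaf instructions, sequences, `if` and `while`. The node layout and the `emit` function are hypothetical, and the output is a flat, space-separated instruction list in Wasm text syntax rather than a complete module.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum kind { K_LEAF, K_SEQ, K_IF, K_WHILE };

struct node {
    enum kind kind;
    const char *text;             /* K_LEAF: raw Wasm instructions */
    const struct node *a, *b, *c; /* children, meaning depends on kind */
};

/* Append s plus a trailing space to the NUL-terminated buffer out. */
static void put(char *out, size_t cap, const char *s)
{
    size_t used = strlen(out);
    if (used < cap)
        snprintf(out + used, cap - used, "%s ", s);
}

/* One visit per node: structured source maps directly onto
 * Wasm's block/loop/if, so no control-flow graph is needed. */
void emit(const struct node *n, char *out, size_t cap)
{
    assert(out != NULL);
    if (n == NULL)
        return;
    switch (n->kind) {
    case K_LEAF:
        put(out, cap, n->text);
        break;
    case K_SEQ:
        emit(n->a, out, cap);
        emit(n->b, out, cap);
        break;
    case K_IF:                    /* a = condition, b = then, c = else */
        emit(n->a, out, cap);
        put(out, cap, "if");
        emit(n->b, out, cap);
        put(out, cap, "else");
        emit(n->c, out, cap);
        put(out, cap, "end");
        break;
    case K_WHILE:                 /* a = condition, b = body */
        put(out, cap, "block loop");
        emit(n->a, out, cap);
        put(out, cap, "i32.eqz br_if 1"); /* leave the outer block when false */
        emit(n->b, out, cap);
        put(out, cap, "br 0 end end");    /* jump back to the loop header */
        break;
    }
}
```

Statements are assumed to leave the operand stack balanced; a real compiler would also track types and locals.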


Sure, I was speaking in the context of C specifically. (In non-simplistic compilers, you may not want to preserve the source structure anyway—e.g. in Scheme or Lua with tail calls all over the place.)

Presumably C's `switch` is also a problem.

Yes, I don't recall offhand all the confusing elements and technicalities of what's allowed in `switch` statements in C, but here are a few brainfscks:

https://old.reddit.com/r/C_Programming/comments/16kg48y/mind...

https://old.reddit.com/r/programminghorror/comments/ylc7f3/w...


I went down a rabbithole and wow.

Found a comment from the author of https://github.com/stclib/STC apparently and then came across this example:

https://stackoverflow.com/a/76887723

  int coro_a(struct a* g)
  {
   cco_routine (g) {
    printf("entering a\n");
    for (g->i = 0; g->i < 3; g->i++) {
     printf("A step %d\n", g->i);
     cco_yield();
    }
    cco_final:
    printf("returning from a\n");
   }
   return 0; // done
  }
gcc -E -ISTC/include co.c

After running it through a preprocessor, it gives me this.

  int coro_a(struct a* g)
   {
    for (int* _state = &(g)->cco_state; *_state != CCO_STATE_DONE; *_state = CCO_STATE_DONE) _resume: switch (*_state) case 0: {
     printf("entering a\n");
     for (g->i = 0; g->i < 3; g->i++) {
      printf("A step %d\n", g->i);
      do { *_state = 14; return CCO_YIELD; goto _resume; case 14:; } while (0);
     }
     *_state = CCO_STATE_FINAL; case CCO_STATE_FINAL:
     printf("returning from a\n");
    }
    return 0;
   }
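
The same trick, stripped of the macros, can be sketched in a few self-contained lines. This is not STC's implementation; the struct, the enum values and `counter_resume` are hypothetical names chosen for illustration. The loop counter lives in the struct because locals do not survive the `return`.

```c
#include <assert.h>
#include <stddef.h>

enum { CORO_DONE = 0, CORO_YIELD = 1 };

struct counter {
    int state;   /* 0 = not started, 1 = suspended in the loop, 2 = finished */
    int i;       /* loop counter, kept across calls */
};

/* Each call runs until the next yield point. The switch jumps
 * straight back to the case label inside the for loop. */
int counter_resume(struct counter *c)
{
    assert(c != NULL);
    switch (c->state) {
    case 0:
        for (c->i = 0; c->i < 3; c->i++) {
            c->state = 1;
            return CORO_YIELD;
    case 1:;                  /* resume point: continues with c->i++ */
        }
    }
    c->state = 2;             /* later calls match no case and fall out */
    return CORO_DONE;
}
```

Starting from a zero-initialized `struct counter`, the function yields three times and then reports completion.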

I don’t want to become the switch-statement guy, but neither can I resist, apparently. There are no technicalities in what is allowed in a switch statement: the same things are allowed as with bare gotos. That is, a switch statement is a fancy goto, and case labels are just labels that look a bit funny. Except for the case labels being restricted to inside of the switch body, nesting doesn’t really come into it.

So then the question becomes, which things are you allowed to jump over? In C++, I don’t really know, the restrictions seem fairly stringent. In C, you can jump over anything except a declaration using a variably modified type (i.e. a variable-length array, a pointer to one, etc.), but keep in mind that the variables whose declarations you’ve jumped over will be uninitialized even if the declaration does have an initializer.
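
A small sketch of the "jumped-over initializer" point, using a made-up function `pick`. The declaration sits between the `switch` and its first case label, so it is in scope for every case, but its initializer never runs. Some compilers warn about this.

```c
#include <assert.h>

/* Hypothetical example: `scratch` is declared inside the switch body
 * before any case label. Every jump to a label skips the initializer,
 * so the variable must be assigned before it is read. */
int pick(int which)
{
    assert(which >= 0);
    switch (which) {
        int scratch = 42;     /* in scope below, but never initialized */
    case 0:
        scratch = 1;          /* assign first: the 42 was jumped over */
        return scratch;
    default:
        scratch = 2;
        return scratch;
    }
}
```

Reading `scratch` before assigning to it would yield an indeterminate value, not 42.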


This is true. In Theta (https://github.com/ThetaLang/Theta) this is exactly what we do -- no need for more than one pass for the WASM codegen.

If all you want to do is compile and run C code in the browser, you could run tcc in the blink x86_64 emulator, running in wasm. It would take ~300KB, less than the JS & CSS used in the average webpage

The whole LLVM toolchain is a bit big. I think we can reduce the size much further. We actually looked into using tcc, but unfortunately tcc doesn’t have a wasm backend (for generating wasm output). It would be awesome if they added it!

Check out https://github.com/tyfkda/xcc, I've only used the native backend, but it's small and fast.

Nice! I didn’t know the project. Thanks for sharing!

This project is also very much worth checking out.

https://cranelift.dev/

From the page:

Cranelift is a fast, secure, relatively simple and innovative compiler backend. It takes an intermediate representation of a program generated by some frontend and compiles it to executable machine code. Cranelift is meant to be used as a library within an "embedder".

It is in successful use by the Wasmtime WebAssembly virtual machine, for just-in-time (JIT) and ahead-of-time (AOT) compilation, and also as an experimental backend for the Rust compiler.

Cranelift is an optimizing compiler, but it aims to take a fresh look at which optimizations are necessary. We have explicitly avoided features -- such as advanced alias analysis or use of undefined behavior -- that have historically led to subtle miscompilations in other compilers. Cranelift consists of about 200 thousand lines of code; in contrast, e.g. LLVM consists of over 20 million lines of code, a hundred times larger. This difference also allows Cranelift to be relatively approachable to developers, researchers, auditors and others who wish to understand how it works.


I recently wanted to use tcc for a homebaked programming side project and was surprised to find it's no longer supported, at least not by Fabrice Bellard. Upstream git still has some light activity but no releases. I wasn't sure how good an idea it is to rely on it as a code generator.

It's alive and kicking my friend https://repo.or.cz/tinycc.git/shortlog

We wait for grischka to decide when to announce a new release https://lists.nongnu.org/archive/html/tinycc-devel/2024-10/m...


I see thanks, that's great.

clang can target wasm already.

100MB on every page refresh just to compile C is a pretty bold direction to go in.

Except if/when it's cached.

I don’t want my cache requirements ballooning by 100mb.

Very cool! I've been watching the "toolchains in Wasm" landscape for a while, and seeing a Clang/LLVM toolchain running in Wasm is awesome!

YoWASP has also had an LLVM toolchain working in Wasm for a while too[1], although it seems like this version solves the subprocess problem by providing an implementation of `posix_spawn` whereas the YoWASP one uses some patches to avoid subprocesses altogether

My biggest question marks around this version are about runtime/platform support. As I understand it, this toolchain uses WASIX, which (AFAICT) works with Wasmer's own runtime and with a browser shim, but with none of the other runtimes. Are there plans to get WASIX more widely adopted across more runtimes, or to get WASIX caught up to the latest WASI standard (preview2)? Or maybe even better, bring the missing features from WASIX to mainline WASI like `posix_spawn`[2]? I'd love to be able to adopt this toolchain, but it doesn't seem like WASIX support has really caught on across the other runtimes

[1]: https://discourse.llvm.org/t/rfc-building-llvm-for-webassemb... [2]: https://github.com/WebAssembly/WASI/issues/414


Cling (the interactive C++ interpreter) should also compile to WASM.

There's a xeus-cling Jupyter kernel, which supports interactive C++ in notebooks: https://github.com/jupyter-xeus/xeus-cling

There's not yet a JupyterLite (WASM) kernel for C or C++.


It's pretty misleading not to mention the performance overhead. That's an obvious downside and quite easy to benchmark. Skipping any discussion of performance feels like sweeping it under the marketing rug :/

> Skipping any discussion of performance feels like sweeping it under the marketing rug

Expecting performance while compiling C in the browser feels redundant right now though.


A few weeks ago, I tried to compile Clang to WebAssembly, but got several different errors, and tried fixing a lot of them, but some of them seemed kind of impossible to fix, so I thought I would try again at a later date. However it seems I will not need to try again. I feel angry that someone made a convenient solution before I did, but also happy, because this probably implies that they made a consistent process to compile Clang for WASM.

Didn't Gary Bernhardt do this in 2014? /sarcasm


Is it possible (or does it already exist) to have interactive C++ lessons where the user's C++ code is compiled and run client-side in a web page?

Absolutely! You can even run clang in wasm targeting x86_64, and then emulate the resulting program using the blink x86_64 emulator.

I'm working on something similar, where students can compile intel assembly and run it client-side: https://github.com/robalb/x86-64-playground



Thanks, I see, but the documentation is so scarce and I'm not a proficient C programmer.

What syntax can be used to run emception? Thank you.


It’s sadly a bit more of a proof of concept than a hackable project. The docker build in the readme did work last time I tried, and there is a demo site at https://jprendes.github.io/emception/, but I’ve failed to modify it in the past to do other things

There is a fork at https://github.com/emception/emception that is trying to make it more production ready, but it looks like that may have stalled


I guess the search for compiling C++ on some kind of bytecode continues. Thanks a bunch for the links and details, much appreciated.

Definitely been possible for at least 5 years now. Would probably be a weekend project now.

GCC? That's easy! :-) What about a complete system? https://webvm.io

Shameless plug: we are hosting a WebVM Hackathon next week (11-14 October) over Discord. For more information: https://cheerpx.io/hackathon


What about something that doesn't even require WebAssembly and is faster? https://bellard.org/jslinux/

very unscientific benchmark of `clang hello.c`, after a few runs to make sure the code is downloaded/cached:

jslinux: 4.7s

wasmer: 1.3s

webvm: 1.2s


Nice, thanks for the benchmark Yuri!

It is a bit unfair to Wasmer, because it incurs the (presumed) overhead of `wasmer run ...`, but I could not figure out whether the actual clang binary is directly available after it is downloaded the first time.

Can I compete with exaequOS?

No, the competition is explicitly based around CheerpX: an X86 virtualization technology built on top of WebAssembly

You can compile C using JavaScript and target DOS if you are hard core enough. https://github.com/Mati365/ts-c-compiler

If what I want is not an executable but a shared library, does this get me anything?

I currently have a use case that uses a server running an emscripten build (using `-sMODULARIZE` and some exports; I suppose it’s not a true dylib)


Importing a wasm module from a wasm module is (non)surprisingly impossible to do -- you have to have a linker, abi and all that.

It is possible with some care. I was looking into this with WAForth, which compiles the wasm and loads it via a host function (i.e. it is the host's responsibility to make it available). I wanted to enable dynamic loading of words from disk, which requires some bookkeeping and shuffling a bunch of bytes around during compilation to write out the bits necessary to have the host do that linking. It isn't impossible to do, just tedious, and in my case having to write it in WAT is a pain.

Yep, you need to do the nasty bits by hand, that's what I mean.

Not really, on Firefox

    panicked at /Users/syrusakbary/Development/wasmer/lib/api/src/js/instance.rs:62:84:
    called `Result::unwrap()` on an `Err` value: JsValue(Function(bound 846))

    Stack:

    fe/_.wbg.__wbg_new_abda76e883ba8a5f@https://wasmer.sh/assets/index-CgFg6VHw.js:17:6582
    @https://wasmer.sh/assets/wasmer_js_bg-BruS15W0.wasm:wasm-function[1125]:0x2b4276
    @https://wasmer.sh/assets/wasmer_js_bg-BruS15W0.wasm:wasm-function[2888]:0x3ab373
    @https://wasmer.sh/assets/wasmer_js_bg-BruS15W0.wasm:wasm-function[8254]:0x435ed3
    @https://wasmer.sh/assets/wasmer_js_bg-BruS15W0.wasm:wasm-function[4825]:0x3fa7de
    @https://wasmer.sh/assets/wasmer_js_bg-BruS15W0.wasm:wasm-function[517]:0x1af753
    @https://wasmer.sh/assets/wasmer_js_bg-BruS15W0.wasm:wasm-function[294]:0xbed03
    @https://wasmer.sh/assets/wasmer_js_bg-BruS15W0.wasm:wasm-function[2039]:0x34b10e
    @https://wasmer.sh/assets/wasmer_js_bg-BruS15W0.wasm:wasm-function[2896]:0x3abaa1
    @https://wasmer.sh/assets/wasmer_js_bg-BruS15W0.wasm:wasm-function[9393]:0x43fde1
    S@https://wasmer.sh/assets/index-CgFg6VHw.js:11:424
    c@https://wasmer.sh/assets/index-CgFg6VHw.js:11:264

> note: it requires a 100MB download

Is this how big a clang toolchain usually is?


The Clang WASI SDK weighs about 100Mb compressed. We optimized things a bit but still have a way to go (we are not yet compressing over the network). I believe we can serve everything in about 30Mb

MB, right? Mb is megabit

I only have to bring this up because network providers still insist on measuring bits


They insist on it because it is the proper way to measure data rates on serial bit streams where out-of-band encoding doesn't divide up on octet boundaries.

They insist on it because big numbers sell better.
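
To make the factor of eight concrete, here is a small helper (the name `transfer_seconds` is made up) that converts a download size in megabytes and a link rate in megabits per second into a transfer time. It assumes decimal units and ignores protocol overhead.

```c
#include <assert.h>

/* Seconds needed to transfer `megabytes` (MB, 10^6 bytes) over a link
 * rated at `megabits_per_second` (Mbit/s). One byte is eight bits. */
double transfer_seconds(double megabytes, double megabits_per_second)
{
    assert(megabits_per_second > 0.0);
    return megabytes * 8.0 / megabits_per_second;
}
```

Under these assumptions a 100 MB download on a 100 Mbit/s line takes 8 seconds, not 1.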

What's the use case?

Like most things in software the use cases are the limits of one's imagination. The browser has always been a Turing complete development environment so this is just another demonstration.

I was also asking exactly the same question.

Every few years, new progress reminds me of this talk by Gary Bernhardt:

https://www.destroyallsoftware.com/talks/the-birth-and-death...


Yeah, mostly because WebAssembly is the new kid in bytecode town.

Now all this needs is a simple OS running in a browser, that can edit and compile itself, post the resulting binary onto a WebDAV somewhere, and reload itself from there.

Then it becomes a fully self-sustaining OS that can live forever in a browser.


Something like exaequOS? https://exaequos.com

Check out Jeff Lindsay's Wanix project: https://wanix.sh/

And then use webrtc (or ideally someone can revive a webtransport-p2p please!) to serve itself from a page to other people.

Ideally http3 over webtransport-p2p!

Then add some network discovery so we can advertise & find what's available on our networks!


Do you have a proper link to the webtransport-p2p idea? I've done a few searches but I think there's some mix of current implementation and deprecated implementation somehow.

What is it that needs reviving?


The spec is inactive, afaik, no implementations. Got it backwards, pardon, p2p-webtransport. https://github.com/w3c/p2p-webtransport

I don't know why it's fallen off, to be honest, or what was raised against it. Highly desirable to a lot of p2p folk, a very promising webrtc data transport replacement.


Very interesting idea but I have to say that those goals are not possible with a simple OS, at least by OS definitions of simple :P

The old https://webassembly.sh/ and the new https://wasmer.sh/ came a long way already.

All you need is a virtual filesystem of some sort, a way to download, a way to upload, an editor, a compiler, and a VT100 JS library. We already have WASI for the rest.

If JS is too undesirable, then perhaps go with the old framebuffer graphics mode (e.g. a region of the WASM memory that is interpreted as an ASCII screen, or maybe even as a full bitmap buffer). Then the JavaScript side just needs to forward keyboard/mouse input into memory and that screen region out of memory.
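
A rough sketch of the module side of that ASCII-screen idea, assuming an 80x25 text buffer in linear memory. All names here are hypothetical, and how the functions get exported from the Wasm module is toolchain-specific and not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SCREEN_COLS 80
#define SCREEN_ROWS 25

/* The "framebuffer": one byte per character cell, row-major. */
static char screen[SCREEN_ROWS * SCREEN_COLS];

/* The host would read SCREEN_ROWS * SCREEN_COLS bytes from this
 * address in the module's memory and draw them. */
char *screen_base(void)
{
    return screen;
}

void screen_clear(void)
{
    memset(screen, ' ', sizeof screen);
}

/* Write text at (row, col), clipping at the end of the row.
 * Returns the number of characters actually written. */
size_t screen_put(int row, int col, const char *text)
{
    assert(text != NULL);
    if (row < 0 || row >= SCREEN_ROWS || col < 0 || col >= SCREEN_COLS)
        return 0;
    size_t room = (size_t)(SCREEN_COLS - col);
    size_t len = strlen(text);
    if (len > room)
        len = room;
    memcpy(&screen[row * SCREEN_COLS + col], text, len);
    return len;
}
```

Keyboard input could go the other way through a second fixed region that the host writes and the module polls.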


The framebuffer idea is used in this wasm doom port: https://github.com/diekmann/wasm-fizzbuzz/tree/main/doom

WASIX already does all the other stuff you mentioned, including in the browser. The one thing it's missing is GUI, mainly because there's no standard GUI interface in POSIX.


It is possible. I already embedded chibicc in exaequOS. I will continue with xcc and clang

"Yeah, yeah, but your scientists were so preoccupied with whether or not they could that they didn't stop to think if they should." ....

"We do what we must

Because we can"



