llama.cpp typically needs to be compiled separately for each operating system and architecture (Windows, macOS, Linux, etc.), which makes it less portable.
Also, the article mentions the use of hardware acceleration on devices with heterogeneous hardware accelerators. This implies that the Wasm-compiled program can efficiently utilize different hardware resources (like GPUs and specialized AI chips) across various devices. A direct C++ implementation might require specific optimizations or versions for each type of hardware to achieve similar performance.
> Wasm-compiled program can efficiently utilize different hardware resources (like GPUs and specialized AI chips) across various devices
I do not buy it, but maybe I am ignorant of progress being made there.
> A direct C++ implementation might require specific optimizations or versions for each type of hardware to achieve similar performance.
Because I do not buy the previous claim, I also do not buy that similar performance can be achieved there painlessly (without extra developer time), or that a Wasm runtime is capable of achieving it.
So the magic (or sleight of hand, if you prefer) seems to be in
> You just need to install the WasmEdge with the GGML plugin.
And it turns out that all these plugins are native & specific to the acceleration environment as well. But this installation has to happen after the application lands in its environment, so your "portable" application is now only portable in the sense that, once it starts running, it will bootstrap itself by downloading and installing native, platform-specific code from the internet. Whether that is a reasonable thing for an "edge" application to do, I am not sure.
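For reference, the install step the article hand-waves over looks roughly like this (a sketch based on WasmEdge's published install script; the plugin name and script URL may differ between versions):

```shell
# Fetch and run the WasmEdge installer, asking it to also pull the
# GGML (wasi_nn) plugin. The installer detects the host OS, CPU
# architecture, and available accelerators, and downloads a NATIVE,
# platform-specific plugin build -- which is exactly the point above:
# the acceleration code is not part of the portable .wasm binary.
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install.sh \
  | bash -s -- --plugins wasi_nn-ggml
```

So the .wasm file itself stays portable, but it is only useful on a host that has already installed the matching native plugin.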
Where have I seen this WORA before, including for C and C++?
WASM does not provide access to hardware acceleration on devices with heterogeneous hardware accelerators, even its SIMD bytecodes are a subset of what most CPUs are capable of.
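To make the "subset" point concrete: WASM SIMD defines a single fixed-width vector type, `v128` (4 f32 lanes), while AVX-512 on recent x86 CPUs operates on 512-bit vectors (16 f32 lanes). A trivial sketch of what that width gap means for a hot loop (illustrative arithmetic only, not an actual SIMD benchmark):

```c
#include <stdio.h>

// Vector operations needed to process n f32 elements at a given SIMD
// width in bits. WASM SIMD is fixed at 128-bit v128 vectors; AVX-512
// processes 512 bits per instruction.
static int vector_ops(int n, int vector_bits) {
    int lanes = vector_bits / 32;      // f32 lanes per vector op
    return (n + lanes - 1) / lanes;    // ceil(n / lanes)
}

int main(void) {
    int n = 4096;  // e.g. one row of a model weight matrix
    printf("wasm v128: %d vector ops\n", vector_ops(n, 128));
    printf("AVX-512:   %d vector ops\n", vector_ops(n, 512));
    return 0;
}
```

All else being equal, the fixed 128-bit width means a 4x instruction-count handicap against AVX-512 before the runtime does any clever recompilation.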