By the way, if you’re interested, I’m the guy behind Microvium. It’s been my pet project for the last few years. It’s very motivating to see all the positive and encouraging feedback. Thank you all!
Electron is kind of a non-starter because the hard part isn't the JS interpreter, it's the renderer. A Node alternative? Fairly trivial, but then you're missing NPM and pretty much everything else that makes Node not a toy language.
Calling Microvium an "engine" without a runtime parser is a bit inaccurate imo. The entire purpose of Javascript is that it's a "scripting" language, so having an interactive runtime is kind of the point. And as @dmitrygr already pointed out, not using malloc/free is actually a feature, not a bug[1].
> The entire purpose of Javascript is that it's a "scripting" language, so having an interactive runtime is kind of the point.
If your use case is plugging a keyboard into your MCU or connecting a terminal prompt on the UART, then the MCU needs to be running a parser, yes. But there may be many real-world use cases where your "scripts" will be going through some kind of deployment and release process, and adding a step to compile them to bytecode is easy. But YMMV depending on what you need.
> not using malloc/free is actually a feature, not a bug
Microvium maintains its own managed heap which uses a compacting garbage collector, so heap allocations are fast O(1) and have no fragmentation. But it needs to occasionally get chunks of memory from the host/OS to expand its managed heap, which is where it uses malloc.
A UART REPL is always the use case that advocates of Python and JS on microcontrollers bring up as one of the main reasons to use it. I haven't tried it myself but it does sound compelling.
So I think excluding that use case is a bit odd.
Maybe you could have two configurations - one with a parser and one without. Most microcontrollers are not restricted to 16kB of RAM these days.
The main concern seems to be memory usage. Dynamic allocation reduces the memory footprint, the author compares it to the best alternative which requires pre-allocation of a large chunk of memory.
You’d be compiling the majority of other options too for embedded, so that’s not really a drawback.
> Dynamic allocation reduces the memory footprint, the author compares it to the best alternative which requires pre-allocation of a large chunk of memory.
Architecturally speaking, it makes almost no sense to have dynamic memory allocation in Microvium. You could theoretically even do static analysis on whatever chunk of code you're about to run and figure out exactly what your upper bound for memory usage would be. (Remember, we don't have an interactive runtime.)
So now that I actually think about it more, not pre-allocating in this particular case is actually an antipattern.
The static analysis in Microvium does do this for the stack variables, but the heap is not bounded. It's trivial to write a for-loop or something that creates N new objects.
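For example, something as simple as the following has a heap footprint that depends on runtime data rather than on the code itself, so no static bound exists:

```js
// The stack usage here is statically bounded, but the heap grows with `n`,
// which isn't known until runtime.
function makeObjects(n) {
  const items = [];
  for (let i = 0; i < n; i++) {
    items.push({ id: i });
  }
  return items;
}
```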
Impressively small, and the trade-offs seem pretty reasonable (namely the lack of a runtime parser and minimal built-ins).
Initially I thought the lack of built-ins would be annoying, but it looks very easy to simply "re-implement" them given how easy the C interop is to use.
JS isn't my favourite, but with such a small VM implementing what is fundamentally a very modern scripting language, I would be tempted to use it on my next embedded project.
The other seemingly killer feature from my perspective is the ability to snapshot the VM state to stable storage and resume on reboot rather than starting from a new execution context.
> Impressively small, and the trade-offs seem pretty reasonable (namely the lack of a runtime parser and minimal built-ins).
Not really. If you remove the runtime parser, then you're effectively just a VM and there are lots of VMs significantly smaller than this.
Anytime you attempt to implement a language in a tiny amount of memory, a ridiculous chunk of it goes to implementing the parsing system. Infix parsing is particularly expensive to implement, which is why the only two languages in the extremely tiny space are Forth and Lisp.
Fair, I should have qualified my statement to "for JS".
I don't think it's reasonable to compare to Forth/Lisp (or even Lua, despite similar language capabilities) because of how pervasive JS is from a runtime perspective. Especially because this implements a sufficiently good subset that you can run isomorphic code on device, backend, and browser.
This audience is probably a little more used to measuring things like RAM in gigabytes and storage in tera/petabytes. It's nice to be reminded that not all our devices are like that, but high-level languages are still possible.
How about an (actually small but ES6 feature complete) JS engine for general computers? QuickJS showed so much promise (especially given the pedigree) but it's been dead for a year and it never really delivered on the promise of being small (I guess the code is small, but at runtime it takes more memory than engines with full JIT like V8 or Chakra).
Hermes looked interesting, but it's clear it will never be an actual JavaScript engine, it's more like a React Native engine that interprets some subset of JavaScript.
Duktape struggles with larger bundles and especially with deep call stacks (same as QuickJS) and the C API gives me strong Lua vibes (which may be a positive for some).
The last commit on QuickJS is from March. Do you have a source for that memory use comparison with V8? I haven’t had first-hand experience with it.
I do have experience using Duktape and haven’t had any issues with large call stacks, but also haven’t tried to run anything like a React app in it. The C API was indeed very enjoyable to use.
> never really delivered on the promise of being small
I wonder why you say this. I just ran both qjs and node's REPL and qjs is taking 2.4MB vs 21.5MB for node. Are you referring to QuickJS running with a specific workload? Do you happen to have more info on this measurement?
I don't have anything open-source for this, but I ran an internal test on a large bundle a while back (7MB of minified JavaScript with a fairly complex execution pattern), and after the test setup idles, I was seeing 67MB with V8, 94MB with Chakra, and 109MB with QuickJS of used memory (also, QuickJS needed a stack increase because it's passing large arguments as values). This was on x64 Windows; it's possible results differ on other platforms.
I worked on Hermes and I would say it's absolutely an actual JavaScript engine, targeting the ES spec. Its focus on startup performance means it makes different tradeoffs than other engines.
I respectfully disagree. In order to debug Hermes, you need to build React Native (the inspector / CDP component is not part of the Hermes Git repo). There are React Native features inside Hermes (like drain microtasks, etc.). And you need special Metro bundler configs to produce bundles that are compatible with Hermes (somewhat ES5, definitely not ES6).
Nice! A few years ago I took a stab at this problem space with https://github.com/cesanta/v7 ; with fun tricks like in-place compacting GC, the stdlib JS object graph "frozen" in ROM, etc.
I totally appreciate the effort, but without a runtime parser you're essentially compiling the JS to another language, and then running it with a non-standard engine?
Couldn't we simply compile a JS file through e.g. LLVM to a normal binary, in that case?
Not really. It's using a bytecode IR (intermediate representation).
An IR is effectively a mid-point between JS and machine code. The benefit of using it is a) it's generally executable by the interpreter with minimal processing, and b) it's still architecture-independent, so you can ship the same bytecode to multiple different MCUs. Fundamentally, though, it's just JS in a different representation rather than a "different language".
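To illustrate the idea (this is a toy stack machine, not Microvium's actual bytecode format or instruction set), a bytecode IR is just the same program pre-parsed into instructions that a small interpreter can step through directly on the target:

```js
// Hypothetical opcodes for a toy stack machine.
const PUSH = 0, ADD = 1, RET = 2;

// Roughly what a compiler might emit for `return 2 + 3;`
const bytecode = [PUSH, 2, PUSH, 3, ADD, RET];

// A minimal interpreter: no parsing on the device, just a dispatch loop.
function run(code) {
  const stack = [];
  let pc = 0;
  while (pc < code.length) {
    switch (code[pc++]) {
      case PUSH: stack.push(code[pc++]); break;
      case ADD:  stack.push(stack.pop() + stack.pop()); break;
      case RET:  return stack.pop();
    }
  }
}

console.log(run(bytecode)); // 5
```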
You probably could generate machine code directly. You can do this already (but generating large binaries unsuitable for embedded) with tools like GraalVM/SubstrateVM. It's actually a fairly difficult problem though, due to the nature of escape analysis and other things needed to compile highly dynamic languages efficiently.
> Microvium on the other hand uses malloc and free to allocate when needed and free when not needed
on micros which lack an MMU (all of the ones allegedly targeted), this is how you fragment your memory to hell and eventually cause hard-to-debug crashes
The malloc/free thing is weirder and weirder the more I think about it. JS is a managed language. Instead of pointers you can use handles (double pointers, via an indirection table), and with a custom allocator you could compact the heap and thus avoid fragmentation. (This is what I remember doing in my tiny Java VM a decade ago.) Malloc/free actively preclude this and force the fragmentation on you.
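A rough sketch of the handle idea (illustrative only, not how Microvium or any particular VM actually implements it): references are indices into a table, so a compacting collector can move objects and only has to fix one table slot per object.

```js
const heap = [];          // simulated heap cells: { value, live }
const handleTable = [];   // handle -> index into `heap`

function alloc(value) {
  heap.push({ value, live: true });
  handleTable.push(heap.length - 1);
  return handleTable.length - 1;  // caller keeps the handle, never a raw index
}

function deref(handle) {
  return heap[handleTable[handle]].value;
}

// Compaction: slide live cells down over the dead ones and update the
// indirection table. Code holding handles never notices the move, and the
// free space ends up in one contiguous block, so no fragmentation.
function compact() {
  let dst = 0;
  for (let h = 0; h < handleTable.length; h++) {
    const src = handleTable[h];
    if (heap[src].live) {
      heap[dst] = heap[src];
      handleTable[h] = dst++;
    } else {
      handleTable[h] = -1;  // dead object; handle no longer valid
    }
  }
  heap.length = dst;
}
```

(A real GC would determine liveness by tracing rather than a `live` flag; this just shows why the indirection makes compaction cheap for the mutator.)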
But the malloc/free thing referenced in the article is mainly about the stack space and core registers, which would just be wasting space if the VM was idle.
So what's the allocation pattern like (regarding malloc/free)? Last time I used a scripting language in constrained environments it was Duktape, and one of the issues was wildly varying allocations ranging from a few bytes to kilobytes depending on usage. We ended up implementing a TLSF-like allocator, which worked well in the end.
It calls malloc once or twice when you resume a snapshot. The first malloc includes core persistent state (e.g. memory stats counters, etc), and global variables in the script. If there is any heap in the snapshot, the second malloc is for the initial heap.
Then it calls malloc when you call from C into JS -- about 300B by default. This is to allocate the core registers and virtual stack. It frees this when the JS has fully completed and control returns to the host.
Then it also calls malloc each time it runs out of space in the managed heap and needs to expand. By default these will be 256B chunks. If you use a lot of heap space, maybe you want to increase this chunk size to reduce overhead here.
And it calls malloc about once or twice each time there is a GC collection, regardless of heap size.
In a project like this, it seems it would be possible to combat fragmentation by simply moving data to be contiguous. The engine should be able to find all necessary pointers and update them. Yes, that's a slow operation, but for many use cases speed doesn't matter much.
Maybe it has support for typed arrays. Even with high-level languages, the main loop on a micro usually doesn't use any memory, or at least doesn't allocate.
I am not sure of the practicality of a JS engine on those tiny microcontrollers. But I absolutely salute the author for squeezing so many features into so little space. This requires some very serious programming skills.
Working with J2ME many years ago, I ran into a form of scalability issues I hadn’t seen that often before, which is how well can you scale something down, rather than up. I kept having to reimplement things that the JDK had, and with such a small payload size that hurt my budget quite a bit.
At the time it was starting to become popular to try to make self-hosted VMs, where parts of the JIT were written in the language instead of C code. We also see languages cycle through a fast interpreter to a fast but sloppy JIT and a slow but effective one. There is still a question from that era that I haven't been able to explore, and that is what the distance is between a VM that is suitable for embedded work and one that is suitable as the kernel for a JIT. My suspicion is that it's 10%, perhaps extending to 20% once you start trying to attain very high performance.
The FFI in mJS is something that would need to be a built-in in most engines, but Microvium's unique snapshotting approach makes it possible to implement the FFI as a library just like any of the other functions.
Is there a description of this interplay between snapshotting and FFI?
I think a proper FFI library should be distributed with Microvium for convenience, but I just haven't gotten around to it yet. But let me try to summarize how snapshotting relates..
The snapshotting paradigm means that your JS code starts executing at "compile time" and has access to the compile-time host. Microvium by default provides access to the built-in node modules while the script is executing at compile time (these become detached at runtime), so the JS code can do things like fs.writeFileSync to code-generate C code at compile time.
This means you can have a higher-level library that wraps the built in vmExport/vmImport functions while simultaneously generating the corresponding glue code.
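A minimal sketch of what that could look like, assuming node-style require is how the compile-time modules are exposed and using the vmExport built-in mentioned above; the wrapper name, export IDs, and output file here are made up for illustration:

```js
// Runs at compile time (snapshot time), so it still has access to the host's
// filesystem via Node's fs module.
const fs = require('fs');

const glueLines = ['/* generated at snapshot time -- do not edit */'];

// Hypothetical wrapper: export a JS function to the C host *and* record the
// glue the firmware will need to refer to it by numeric ID.
function exportWithGlue(id, name, fn) {
  vmExport(id, fn);  // Microvium built-in: make `fn` callable from the host by ID
  glueLines.push(`#define EXPORT_ID_${name.toUpperCase()} ${id} /* JS function "${name}" */`);
}

exportWithGlue(1, 'sayHello', () => console.log('Hello from JS'));
exportWithGlue(2, 'tick', (ms) => { /* ... */ });

// Emit a header the firmware can include; this only works because the script
// executes at compile time, before the snapshot is taken.
fs.writeFileSync('exports.h', glueLines.join('\n') + '\n');
```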
So what's the real use case here? Microcontrollers are generally pretty single purpose, you aren't exactly swapping apps in and out of them like you are with phones. Anything that needs that functionality is using an embedded CPU (MPU) with an MMU rather than an MCU. There are plenty of languages that are better for MCU development than JS so your main use case would be interop with something JS is good for...like a web page served on the device. This is actually not super uncommon but you'd need to be able to interface with some really awful Wifi/Eth drivers that MCUs tend to use so you're still knee deep in C.
- If you have an IoT device and want to share logic between the device and cloud. A concrete example might be a smart parking meter where you can pay for parking on the meter itself or online or by a mobile app, and you want common business logic, tariff calculations, parking rules, user workflow, etc. You could represent the logic in C but arguably something like JS is better suited.
- You want downloadable behavior that's separate from the firmware. Maybe you're making a product with one base version of C firmware but customizable by downloading scripts. Or maybe you're an enterprise company and your customers all want behavior customization and you don't want the nightmare of managing separate firmware for each of them. In C, maybe you could do this with position-independent-code modules, but JS might just be more friendly.
- Maybe you just like JS. JS has garbage collection and closures, for example. Callback-based async logic is easier in JS. Functional-style code is easier to write. Depends what you feel comfortable with.
Perhaps not to replace C firmware, but to be able to script the higher-level behavior in a language that's more comfortable and expressive.
> If you have an IoT device and want to share logic between the device and cloud ... You could represent the logic in C but arguably something like JS is better suited
Is this ever realistically the case? JavaScript aside, I would expect there to be an interface layer in the form of some compact protocol (e.g. CoAP) for communication and for the IoT device to be doing as little as humanly possible.
Yes, the parking meter example is an exact case I ran into in real life. But Microvium didn't exist back then.
But in general, an IoT solution may have some business logic or rules that transcend the specific device, and some may prefer to represent that logic in a high-level language like JS, and some situations may benefit from being able to access that logic from multiple places -- e.g. front-end, back-end, and device.
Is it not enough for people to simply like coding in JS? I like the future that we are moving towards where people can code in the languages they like, irrespective of the platform they are targeting. Just look at the possibilities that WASM has opened up.
> Is it not enough for people to simply like coding in JS?
Irrespective of whether people _like_ a language, there are other considerations when evaluating whether something is the best tool for the job. JavaScript has an entire class of bugs in the form of countless unexpected type coercions and equality behaviours that simply don't exist in the more compact statically typed languages that would typically be used to target the average microcontroller. When deploying something to the edge, where it becomes significantly harder to maintain, I would want as little surprising behaviour in the language as possible.
> JavaScript has an entire class of bugs in the form of countless unexpected type coercions
Contra this tired sentiment, JS does not have "unexpected type coercions". If you're going to call what happens when, for example, adding an object and a boolean together "unexpected", the question becomes "What were you expecting when you wrote that code?" JS _will_ evaluate these kinds of expressions, but the results are well-defined and predictable, and (crucially) you are not _forced_ into writing any programs that use these dubious patterns. (Although, if you want to, that's your prerogative.)
There's a double standard here, in which people who don't do any operations on wacky combinations of disjoint types in their preferred language will gleefully jump into JS and start doing those things seemingly just so they can declare that JS is broken.
Take any of these cases of "unexpected type coercions" and show me some production code from a program in your preferred language where you're actually doing something similar with disjoint types and describe its superior behavior instead.
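To be concrete, the classic "gotcha" expressions each have exactly one specified result (these are standard ECMAScript semantics, whether or not anyone should write code like this):

```js
console.log({} + true);    // "[object Object]true" (object -> string, then concatenation)
console.log(1 + true);     // 2                     (true coerces to 1)
console.log("5" - 2);      // 3                     (- always coerces both sides to numbers)
console.log("5" + 2);      // "52"                  (+ concatenates when either side is a string)
console.log([] == false);  // true                  (loose equality coerces both sides to 0)
```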
> behaviours that simply don't exist in more compact statically typed languages
You're confusing matters. If you want static type checking when writing your program, then use a static type checker. Refusing static type checking and then complaining about it is more than a little bit silly.
> JS _will_ evaluate these kinds of expressions, but the results are well-defined and predictable
Well-defined maybe, but predictable? Not really. Intuitive? Absolutely not.
> There's a double standard here, in which people who don't do any operations on wacky combinations of disjoint types in their preferred language will gleefully jump into JS and start doing those things seemingly just so they can declare that JS is broken.
On the contrary, I don't believe that this is really the case. People can and often do run into these bugs in the wild because they take some input from either a user who does something unexpected, an API that returns something unexpected or a library that acts in an unexpected way and, rather than failing a logical assertion, the program continues with effectively junk data. Sanitising inputs can be surprisingly complicated too, especially when you don't know or understand the conditions you are supposed to guard against.
> Take any of these cases of "unexpected type coercions" and show me some production code from a program in your preferred language where you're actually doing something similar with disjoint types and describe its superior behavior instead.
My point is not that the alternative is superior, for whatever definition of "superior" you are choosing, but instead that many statically typed languages will force you to think about that behaviour rather than proceed in ignorance.
> If you want static type checking when writing your program, then use a static type checker.
If you have to rely on what is effectively a linter to type-check for you to make sure your program won't do surprising things, that doesn't reflect terribly well on dynamic languages as a whole.
> If you have to rely on what is effectively a linter to type-check for you to make sure your program won't do surprising things, that doesn't reflect terribly well on dynamic languages as a whole.
Using TS or Flow definitely goes beyond what linters accomplish. That being said, your criteria for language quality sound squarely rooted in static typing. Dynamic languages trade upfront taxonomic activity for rapid prototyping, and then you come back post-prototype to analyze and ensure correctness. It's just a different approach. There are tradeoffs both ways.
You can find me @microvium on twitter