One compelling use case is getting a REPL on any microcontroller with a UART and, say, ≥64K RAM (I remember lbForth[1] being particularly portable, and there are other implementations you could use as well).
I don’t know if you’d want that to be your first experience with Forth itself, though: there’s inherent fiddliness involved in bringing up hardware; the win is that Forth doesn’t really add any of its own once you’re vaguely familiar with the internals. If you can get it to boot and send and receive bytes, you can get an interactive Forth on it, or, if the available resources don’t permit that, on an imaginary machine spanning it and your PC (a “tethered” Forth).
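To make the tethered idea a bit more concrete, here is a minimal sketch in Python, not taken from any particular Forth. It assumes the target runs only a tiny monitor that understands two made-up one-byte commands (store and fetch), while the host keeps the stack and dictionary and does all the parsing. The `Target` class stands in for a microcontroller at the far end of a UART; every name and the wire format are hypothetical.

```python
class Target:
    """Stand-in for the microcontroller: a little RAM and a tiny monitor."""

    STORE, FETCH = 0x01, 0x02  # hypothetical command bytes, not a real protocol

    def __init__(self, ram_size=256):
        self.ram = bytearray(ram_size)

    def handle(self, packet: bytes) -> bytes:
        """Process one command packet, as a monitor loop on the chip might."""
        cmd, addr = packet[0], packet[1]
        if cmd == self.STORE:
            self.ram[addr] = packet[2]
            return b""
        if cmd == self.FETCH:
            return bytes([self.ram[addr]])
        raise ValueError("unknown command")


class Host:
    """The PC side: owns the data stack and dictionary, sends bytes to the target."""

    def __init__(self, target):
        self.target = target
        self.stack = []
        self.words = {
            "c!": self.store,  # ( value addr -- ) write a byte into target RAM
            "c@": self.fetch,  # ( addr -- value ) read a byte from target RAM
            "+": lambda: self.stack.append(self.stack.pop() + self.stack.pop()),
        }

    def store(self):
        addr = self.stack.pop()
        value = self.stack.pop()
        self.target.handle(bytes([Target.STORE, addr, value & 0xFF]))

    def fetch(self):
        addr = self.stack.pop()
        reply = self.target.handle(bytes([Target.FETCH, addr]))
        self.stack.append(reply[0])

    def interpret(self, line: str):
        """Run one line of input: known words execute, anything else is a number."""
        for token in line.split():
            if token in self.words:
                self.words[token]()
            else:
                self.stack.append(int(token, 0))
        return self.stack
```

With this, `Host(Target()).interpret("42 16 c! 16 c@ 1 +")` stores 42 in the simulated target's RAM, reads it back over the fake link, and leaves 43 on the host's stack. On real hardware `Target.handle` would be replaced by writes to and reads from a serial port.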
I didn’t, no, but perhaps this merits a clarification.
The native habitat of a traditional Forth is an 8-bit microcomputer, and compared to modern microcontrollers those had not a lot of compute but fairly abundant RAM (not to mention permanent rewritable storage). So to run a Forth organized along the usual lines, with a flat address space and a singular dictionary space and code loaded from textual blocks and so on, you do want 64K of RAM, I think. You could push that down to maybe 16K, with some limitations[1].
But on a modern 8-bitter with 2K or 4K, you are going to need a system that can compile things offline, and then you can flash the resulting image and do your interactive work within a small in-RAM layer on top of that. That’s absolutely a thing people do, but it’s not what Starting Forth, Thinking Forth, and various other sources about Forth describe.
I looked into tiny Forth and Lisp interpreters as the foundation for bootstrapping bigger things.
I expected Forth to be better for this because it seems like a smaller language, minimal even, that would have the smallest interpreter.
Forth feels lower level and simpler. After all, on the face of it, Forth is mostly a flat sequence of tokens and simple commands, run one after another, like assembly language.
Whereas Lisp has obvious tree data structures up front in a prettier syntax, like higher level languages, and you're encouraged to use them. Even lexical scope, closures and macros if you want them. That seemed like it must be bigger and heavier.
But I found the small Forth and Lisp interpreters came out about the same size.
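A rough illustration of why the two can come out close in size, as a sketch in Python rather than a measurement of any real implementation: the Forth-style evaluator is a loop over tokens with a stack, and the Lisp-style evaluator is a small reader plus a recursive walk over nested lists. The function names and the four-operator vocabulary are made up for this example.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.floordiv}


def forth_eval(source: str) -> list:
    """Run a flat, postfix token sequence against a data stack."""
    stack = []
    for token in source.split():
        if token in OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[token](left, right))
        else:
            stack.append(int(token))
    return stack


def lisp_read(source: str):
    """Parse one s-expression into nested Python lists."""
    tokens = source.replace("(", " ( ").replace(")", " ) ").split()

    def parse(pos):
        token = tokens[pos]
        if token == "(":
            items = []
            pos += 1
            while tokens[pos] != ")":
                item, pos = parse(pos)
                items.append(item)
            return items, pos + 1  # skip the closing paren
        return (token if token in OPS else int(token)), pos + 1

    expr, _ = parse(0)
    return expr


def lisp_eval(expr):
    """Evaluate a prefix expression, recursing into subtrees."""
    if isinstance(expr, int):
        return expr
    op, *args = expr
    values = [lisp_eval(arg) for arg in args]
    result = values[0]
    for value in values[1:]:
        result = OPS[op](result, value)
    return result
```

Here `forth_eval("2 3 + 4 *")` and `lisp_eval(lisp_read("(* (+ 2 3) 4))"[:-1]))` compute the same thing; the Lisp side pays for a reader and recursion, the Forth side pays for explicit stack handling, and neither is dramatically longer than the other.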
This is backed up by the existence of 512-byte boot sector implementations of both Forth and Lisp.
So I decided for applications needing a tiny interpreter, or for bootstrapping, to not pursue Forth any more for those things, as a tiny Lisp does the job with (in my opinion) better ergonomics and versatility, and negligible cost difference.
Forth is known for an extremely compact representation of code and for being very portable, because the standard assumes very few registers and works on a stack machine.
Unfortunately, all this comes at the cost of slow execution, and it is not easy to build serious projects with it (the chaos of a large codebase is hard to deal with, but read on).
So it is definitely a platform for slow embedded applications: think Arduino, maybe Raspberry Pi.
Sure, compactness is a huge advantage in some other cases. For example, a well-known boot loader shell for UEFI is written in Forth, so you could write applications for it (and yes, for this use case that means any modern motherboard except Apple's).
For work on large codebases, some Forth people learned Scheme and in effect switched to the Lisp programming paradigm. Yes, it is an interesting phenomenon: Lisp techniques work well enough in Forth if the developer is disciplined enough. And sure, Lisp will open a whole new world for you.
Lots of languages are very easy to embed, why not something more common and popular? Seems generally worth the effort to pick an easier to use language.
Unless you know ahead of time your target audience probably likes it, or you're on a tiny embedded system.
According to ChatGPT, there's GForth, SwiftForth, Forth-83, eForth and picoForth, but I cannot really tell how true it is. I use my own implementation, which I wish to publish someday but that's extra work for me to "anonymize" it.
Probably the closest to it among ChatGPT's suggestions is eForth, but it's unfortunate IMHO that they use C++.
Forth-83 is not an implementation but a standard, superseded by the ANSI standard in 01994, though sometimes people confuse it with F83, which is Laxen and Perry's public-domain implementation of the Forth-83 standard. It came out in 01984. It runs on CP/M and MS-DOS, but the MS-DOS version is limited to a single 64-kibibyte segment. It has no facilities to support embedding it in a program written in another language.
eForth, if by that you mean Bill Muench's eForth, is written in, mostly, Forth, on top of a small assembly-language core. It also has no facilities to support embedding it in a program written in another language.
GForth is pretty usable and can be embedded, as documented in https://gforth.org/manual/Integrating-Gforth.html (which is down at the moment but will probably be back up soon along with forth-standard.org.)
I don't know anything about SwiftForth and picoForth, but so far ChatGPT is 1 for 3, so I wouldn't bet much on those either.
The eForth I commented on indeed derives from Muench and Dr Ting [1].
About GForth, it doesn't look convincing. Actually, the feature looks like an afterthought (like those libraries that have been torn out of a program, e.g. libCurl).
This use case isn't well covered for Forth, it seems, because the focus has always been mostly on the other embedding. That's why I'd again recommend the DIY route.
To me it seems like one of the least useful languages out there, but a lot of people love it.
The one use case where it seems good, as I think people were saying, is writing a compiler that you could use to bootstrap some other language, because it's so small and presumably easy to audit at the assembly-language level to avoid Trusting Trust problems.
Everything else seems rather like suckless: it appeals to people who like simplicity for its own sake and want maximum control and understanding of the system, not so much to those who want to support every file format ever made, handle every possible user error, hardware failure, and last-minute use case change, and reuse as much well-known black-box code as possible.
In terms of compelling, though, I think it's mostly interesting as a study of how small and simple you can make a computing system—and what you give up when you do.
Forth lets you get an eminently hackable REPL up and running in a couple of thousand lines of code and a few thousand bytes of memory.
I have a side project to try to make an extremely minimal compiler for a Forth-inspired language and implementation. It's a compiler that generates a Forth-like interpreter and bytecode. The bytecode is Huffman encoded. Nothing works yet, but it's an idea.
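For readers wondering what Huffman-encoded bytecode might look like, here is a small sketch in Python. It is not the project described above, whose details aren't given; it just builds a Huffman code from token frequencies in a made-up Forth-like program, packs the program into a bit string, and decodes it again. All function names are hypothetical.

```python
import heapq
from collections import Counter


def build_code(tokens):
    """Return {token: bitstring} for a Huffman code over the token frequencies."""
    counts = Counter(tokens)
    if not counts:
        return {}
    if len(counts) == 1:  # a single distinct token still needs one bit
        return {next(iter(counts)): "0"}
    # Heap entries are (weight, tiebreaker, {token: code-so-far}); the unique
    # tiebreaker keeps heapq from ever comparing the dictionaries.
    heap = [(weight, i, {token: ""}) for i, (token, weight) in enumerate(counts.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w0, _, zeros = heapq.heappop(heap)
        w1, _, ones = heapq.heappop(heap)
        merged = {t: "0" + c for t, c in zeros.items()}
        merged.update({t: "1" + c for t, c in ones.items()})
        heapq.heappush(heap, (w0 + w1, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


def encode(tokens, code):
    """Concatenate the code words for each token into one bit string."""
    return "".join(code[t] for t in tokens)


def decode(bits, code):
    """Walk the bit string, emitting a token whenever the bits match a code word."""
    by_bits = {c: t for t, c in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in by_bits:  # safe because Huffman codes are prefix-free
            out.append(by_bits[current])
            current = ""
    return out
```

For example, with `program = "dup * swap dup * + dup dup drop".split()`, the frequent word `dup` gets a one-bit code and the whole nine-token program fits in 19 bits. The bit-at-a-time decoder is the simplest thing that works, not a fast one.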
Are you planning to decompress the bytecode before execution? You might be interested in file `ingram-token-threading.md` in http://canonical.org/~kragen/sw/pavnotes2.git/, which describes a possible way to efficiently interpret Huffman-encoded bytecode on CPUs with wide registers.