IIRC, in the '90s there were several Forth implementations that worked as described and appear to have been somewhat successful in the embedded market. Those were before my time, though.
However, with the amount of RAM/flash on modern MCUs you can implement the "compiler" code in Forth. I used a similar scheme for a while during experimental phases, but have moved away from it, as Forth code becomes tedious to modify once you haven't looked at it for a while (e.g. a few weeks). It's just easier to set up OTA firmware updates.
I assume Forth, Inc.'s stuff works and I assume it's still around, but I've never used it. Forth is lots of fun compared to languages with curly braces, so it's too bad it was already on its way out back then.
Modern MCUs have so much RAM and flash that you could probably run a whole 1980s-style development environment on them (think, say, Turbo Pascal and a collection of tools) on a console, TUI and all.