I hate to be pedantic here, but I see the term TCO thrown around all the time. The Scheme standard requires that an unbounded number of active tail calls execute in constant space.
This is a semantic difference, not an optimization. Turning off optimizations should never cause your program to crash.
I wrote TCO in the title because with "proper tail calls" the title exceeded Hacker News's length limit! Moreover, most people will know the term TCO, and fewer the more precise term "proper tail calls". You are of course correct that proper tail calls are a required aspect of the Scheme specification, even though many implementations that call themselves "Scheme" don't implement this fully.
If we're being pedantic, it's really a matter of degree: any space-saving optimization, like the padding and ordering of struct members, can affect whether a program runs out of stack or heap. In that case, turning optimizations off (or on!) may cause a program to crash.
If (let x () (x)) is allowed to take more than a constant amount of space, then for any finite memory there is always some finite runtime after which the program will exhaust the memory.
With proper tail calls, there is some finite amount of memory that the program will never exceed, even given an arbitrarily long runtime.
So, while it is a problem of degrees, one degree is infinite while the other is finite.
I am not sure if I understand correctly. So this is a Scheme running on a Scheme? You need e.g. Chicken to compile/minify it and then run it via Gambit... Why would I do such a thing, except for learning purposes of course? I am clearly missing something.
You need a Scheme interpreter (Gambit, Chicken and Guile have been tested) to run the Ribbit AOT compiler. The Ribbit AOT compiler will compile a Scheme program to one of the supported target languages: C, JavaScript, Python, Scheme or POSIX shell script. Ribbit has been designed for a small footprint, so the generated target code is quite compact (as small as 2-4 KB for small programs). This is much more compact than any other existing Scheme system. Moreover, the system is very portable to other target languages because Ribbit is implemented with a tiny virtual machine. There are ports underway to Java, Scala, Lua, Rust, Go, Ruby, Haskell, and Idris. This means a larger program in any of these languages can embed the Ribbit VM for scripting or other purposes.
Ribbit is also bootstrapped (it can compile itself). This means that you only need another Scheme system for the initial step of the bootstrap, to compile the Ribbit compiler to one of the supported target languages. After this step the Ribbit compiler only needs an implementation of the target language to run. For example, you can use the Gambit interpreter to compile the Ribbit compiler to a POSIX shell script (about 64 KB). After this is done you only need a POSIX shell to compile any Scheme program to C, JavaScript, Python, or POSIX shell script.
In its current state the POSIX shell target is mostly a proof of concept to demonstrate that the Ribbit VM is extremely portable. One area of practical use is bootstrapping compilers for other languages using minimal tools (see the Mes project, https://bootstrappable.org/projects/mes.html, which bootstraps gcc to get a reproducible build process). With Ribbit's POSIX shell target it would be possible to build gcc on any system that has a POSIX shell (without needing other build tools like a C compiler, linker, etc.).
The Go implementation of the RVM is not yet finished. I would say the reference implementations are the JavaScript, Python and Scheme ones. They were written in that order and build on various ways of expressing things compactly. Originally there was no minification step, so the code was written with short identifiers and few comments (the size of the source code is the metric for the "footprint" of the system, and comments count). To understand the code it helps to read the following paper that explains the Ribbit VM: http://www.iro.umontreal.ca/~feeley/papers/YvonFeeleyVMIL21.... .
I’m no compiler guy but I’ve written a fair amount of shell script and I’m going to take an educated guess.
My guess is that in order to do proper text processing and transformation in shell script you have to call many external tools (from the POSIX-compliant toolchest) many times, meaning that you’ll be paying a lot for fork+exec.
An important design goal for the POSIX shell version of the RVM is to depend on the fewest external tools possible. Each dependency brings with it a portability risk, because Unix tools can be subtly different between environments (for example macOS vs Linux, between Linux distributions, and even between versions of the same distribution).
So the core of the RVM has no dependencies on external tools (i.e. it is in "pure" shell script). The only dependencies are for I/O, namely for the "putchar" RVM primitive and the "getchar" RVM primitive (both of which do byte-at-a-time binary I/O on stdin and stdout). For putchar the shell "printf" command (which is usually a builtin) is used, specifically
printf \\$((byte/64))$((byte/8%8))$((byte%8))
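For illustration, here is a minimal putchar wrapper around that printf line (the function and variable names are mine, not Ribbit's actual code):

    # illustrative putchar: write one byte (0-255) to stdout
    putchar() {
      byte=$1
      # build an octal escape \ooo from the three octal digits of the byte
      printf \\$((byte/64))$((byte/8%8))$((byte%8))
    }

    putchar 72; putchar 105; putchar 10    # prints "Hi" and a newline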
For getchar it is not possible to use the shell "read" command because it does not handle null bytes on stdin (and we want Ribbit to support binary I/O). This is admittedly somewhat of an artificial constraint given that Ribbit will probably mostly be used for text processing, like writing compilers that map source code text to target code text. So getchar is implemented with
set `sed q | od -v -A n -t u1`
and then the next byte on stdin is available using $1 and "shift" is used to move to the next byte. The call to sed is to read each line separately (useful to implement a REPL where you enter a command which is executed before having to enter the next command). Both "sed" and "od" are standard Unix tools. Moreover, we checked that this set of options has existed for a long time (since at least the early 1990s). It would be interesting to test many POSIX shells to verify Ribbit's portability (we used bash, dash, zsh and ksh for our testing).
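For illustration, the sed/od trick can be wrapped into a self-contained getchar roughly like this (the buf and char names are mine; the real RVM keeps the pending bytes in the positional parameters as described above):

    # illustrative getchar: returns the next byte of stdin in $char, -1 on EOF
    buf=
    getchar() {
      if [ -z "$buf" ]; then
        buf=`sed q | od -v -A n -t u1`   # read one line, convert to decimal bytes
        if [ -z "$buf" ]; then char=-1; return; fi   # end of file
      fi
      set -- $buf    # word-split the pending bytes into $1 $2 ...
      char=$1        # next byte
      shift
      buf=$*         # keep the remaining bytes for the next call
    }

    getchar; echo $char    # with "A" on stdin, prints 65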
The low performance of the POSIX shell RVM is due to two things. First, the shell's interpreter is not fast. The simple loop
n=10000000; while [ $n -gt 0 ]; do n=$((n-1)); done
runs about 5000 times slower with bash than the equivalent JavaScript code run with nodejs, which has a JIT. Second, the POSIX shell does not have arrays, so it is necessary to use the shell "eval" command to do indexing. For example a[i]=0 becomes
eval a$i=0
As you can imagine, the RVM implementation uses arrays all over the place to implement access to the RVM's stack and Scheme objects (both of which are implemented using a garbage collected heap, which is conceptually an array of 3-field records and is implemented in the shell as 3 "arrays").
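To make the eval trick concrete, here is a tiny sketch of the pseudo-array emulation (the names are for illustration only):

    # element $i of pseudo-array "a" is stored in the shell variable a$i
    i=7
    eval a$i=42      # a[7]=42
    eval v=\$a$i     # v=a[7]
    echo $v          # prints 42

The heap's three rib fields then become three such families of variables, indexed by the rib's position in the heap.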
The POSIX shell version of Ribbit was designed to be as portable as possible, only relying on a POSIX shell to run Scheme code. This can be useful in a context where you only have a POSIX shell at hand and you want to bootstrap a more efficient set of development tools (C compiler, editor, etc) and you don't mind waiting a day or two for the build to complete. It is also useful when you want a reproducible build pipeline and the only tool you trust is the shell.
Ribbit achieves a 4K footprint, but that is just the size of the Ribbit VM plus the size of the compacted compiled code. The uncompacted code is stored in RAM in the form of linked "ribs" (3-cell objects, i.e. 6/12/24 bytes depending on the word size) and this is not very memory efficient compared to a bytecode representation. So it takes on the order of 64K of RAM to run the REPL, which is about 1000 lines of Scheme code. A rule of thumb is thus about 64 bytes of RAM per line of Scheme code. You can use that ratio to determine how large a Scheme program will fit on your specific device.
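As a quick back-of-the-envelope use of that rule of thumb (made-up numbers):

    # roughly how many lines of Scheme fit in 256 KB of free RAM?
    ram=262144
    echo $(( ram / 64 ))    # 4096, so ~4000 lines of Scheme code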
The Ribbit design is optimized for a setting where the program is communicated, say over the Web, before it is executed, so it is important to minimize the footprint and hence the transfer time.