I have a few questions about the garbage collection. One of the hard parts of implementing a garbage collector is making sure everything is properly rooted (especially with a moving collector). You have the `do_garbage_collection` method marked unsafe[1], but you don't explain what the calling code needs to do to ensure it is safe to call. How do you ensure all references to the heap are rooted? This is not a trivial problem[2][3][4].
Also note that I cloned the repo and tried to run `cargo test`, and every test fails with 'should be able to add entries to the classpath: InvalidEntry(".../vm/rt.jar")' vm/tests/integration/real_code_tests.rs:15:10
It's pretty straightforward. Their VM maintains its own notion of a callstack instead of using the native callstack. That lets them iterate over it and find all of the parameters and locals on the VM's callstack and use them as roots.
There is a performance cost for a VM having its own virtual callstacks like this, but it makes GC tracing much simpler. (It also makes implementing interesting concurrency and control flow primitives like coroutines or continuations much easier too.)
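To make that concrete, here is a minimal sketch of root discovery over a VM-managed callstack. This is not rjvm's actual code: `ObjRef`, `Value`, `Frame`, and `CallStack` are hypothetical names, and a real moving collector would also need mutable access to these slots so it can rewrite them after objects move.

```rust
/// Hypothetical handle to a heap object (an index into the VM heap).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ObjRef(pub usize);

/// Hypothetical VM value: only the `Object` variant points into the heap.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Value {
    Int(i32),
    Object(ObjRef),
    Null,
}

/// One activation record on the VM's own callstack.
pub struct Frame {
    pub locals: Vec<Value>,
    pub operand_stack: Vec<Value>,
}

/// The VM-managed callstack, separate from the native one.
pub struct CallStack {
    pub frames: Vec<Frame>,
}

impl CallStack {
    /// Walks every frame and returns each object reference found in
    /// locals or on the operand stack. These are the GC roots.
    pub fn roots(&self) -> Vec<ObjRef> {
        self.frames
            .iter()
            .flat_map(|f| f.locals.iter().chain(f.operand_stack.iter()))
            .filter_map(|v| match v {
                Value::Object(r) => Some(*r),
                _ => None,
            })
            .collect()
    }
}
```

Because the VM owns this data structure, the collector never has to scan the native stack or guess which machine words are pointers.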
Seems like that would take care of roots for the bytecode itself, but not for "native" functions[1]. Allocating a new object could trigger a GC[2], and native functions use the native callstack. It seems like it would be easy to allocate in a native function and have any unrooted references invalidated. In fact I see a case like that here[3]. That method creates a reference with `expect_concrete_object_at` and then calls `new_java_lang_class_object`, which can trigger a GC. It avoids UB by not using `arg` after the call that GCs, but there is nothing stopping you from using `arg` again (and having an invalid reference).
Indeed you are right, this is definitely a bug and could cause errors.
I guess the solution would be to add an explicit API to create a GC root, invoked by native methods (which is a bit complicated by the fact that I use a moving collector).
Many years ago I was using SpiderMonkey in a C++ project and I seem to remember there were some APIs that native callbacks could invoke to root values. Same problem and similar solution. :-)
> I guess the solution would be to add an explicit API to create a GC root, invoked by native methods (which is a bit complicated by the fact that I use a moving collector).
This is what I do in the Wren VM. Any time a native C function has the only reference to a GC-managed object and it's possible for a collection to occur, it calls a function to temporarily add the object to a list of known roots.
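A rough Rust sketch of that temporary-root idea, adapted for a moving collector. None of these names come from Wren or rjvm: `TempRoots`, `RootSlot`, and `relocate` are hypothetical. The assumption is that native code holds a slot index rather than a raw reference, so it can re-read the (possibly moved) object after any call that might collect.

```rust
/// Hypothetical handle to a heap object (an index into the VM heap).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ObjRef(pub usize);

/// Position of an entry in the temporary root list.
#[derive(Clone, Copy, Debug)]
pub struct RootSlot(usize);

/// List of objects that native code has asked the GC to keep alive.
#[derive(Default)]
pub struct TempRoots {
    slots: Vec<ObjRef>,
}

impl TempRoots {
    /// Registers `obj` as a root and returns the slot that tracks it.
    pub fn push(&mut self, obj: ObjRef) -> RootSlot {
        self.slots.push(obj);
        RootSlot(self.slots.len() - 1)
    }

    /// Reads the current location of a rooted object.
    /// Native code should call this again after anything that may GC.
    pub fn get(&self, slot: RootSlot) -> ObjRef {
        self.slots[slot.0]
    }

    /// Removes the most recently pushed root (stack discipline).
    pub fn pop(&mut self) -> Option<ObjRef> {
        self.slots.pop()
    }

    /// Called by the collector after moving objects: rewrites every
    /// root using the old-to-new mapping supplied as `forward`.
    pub fn relocate(&mut self, forward: impl Fn(ObjRef) -> ObjRef) {
        for r in &mut self.slots {
            *r = forward(*r);
        }
    }
}
```

A native method would `push` its argument before allocating, call `get` afterwards instead of reusing the old reference, and `pop` before returning. Forgetting any of those steps is still possible, which is why the linked posts explore ways to have the type system enforce it.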
[1] https://github.com/andreabergia/rjvm/blob/be9c54066c64a82879...
[2] https://manishearth.github.io/blog/2021/04/05/a-tour-of-safe...
[3] https://without.boats/blog/shifgrethor-iii/
[4] https://coredumped.dev/2022/04/11/implementing-a-safe-garbag...