LLVM code generation passes can introduce any number of register-to-register moves and perform arbitrary arithmetic that we don't track. We have to be able to find all derived pointers so we can write those into the stack map and update register/stack locations if we move an object.
The right solution is to have MachineInstr-level getelementptr instructions, so that we can discover all derived pointers and copies from a GC root after codegen passes are done. Right now, getelementptr instructions get lowered into "add"/"sub"/"mul"/"lea"/etc instructions too early and we can't track them.
What we can do relatively easily (with the series of patches I've been working on) is to track one instance of each GC root across a call (basically with a fake instruction after each call for each root that represents a virtual "use" of that root). That we can at least find the pointers.
Okay, so you can precisely identify all objects pointed-to by roots on the stack or in registers, but you can't identify all such roots so you can update them. From what you describe, mostly-copying collection is perfect for your use-case.
Basically, mostly-copying collection allows you to perform copying collection in a situation where certain objects cannot be moved. The original use case was for conservative stack scanning. In that situation, you cannot move objects pointed-to by an ambiguous root because you cannot ensure it is an actual root. So while you're scanning the root set, you're "promoting" the target objects to to-space instead of copying them into to-space. But the mechanism is more general than just conservative root scanning. E.g. I'm using it because I need to keep potentially global objects from being moved during a local collection.
In your case, you would walk the stack/register roots precisely using maps, but promote target objects to-space rather than copying them, so derived pointers would not have to be updated. Then you could copy the remaining objects.
The right solution is to have MachineInstr-level getelementptr instructions, so that we can discover all derived pointers and copies from a GC root after codegen passes are done. Right now, getelementptr instructions get lowered into "add"/"sub"/"mul"/"lea"/etc instructions too early and we can't track them.
What we can do relatively easily (with the series of patches I've been working on) is to track one instance of each GC root across a call (basically with a fake instruction after each call for each root that represents a virtual "use" of that root). That we can at least find the pointers.