
Summary:

Generational garbage collectors need to know about pointers in the old generation that point to objects in the new generation. Without that information, those new-generation objects can only be collected correctly together with the old generation. In a mutable language you need write barriers to detect when an object in the old generation stores a pointer into the new generation. This has been difficult in Ruby because C-extensions get direct access to the memory locations. If a C-extension modifies the Array-pointer directly, there's no way to detect it.
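
To make the write-barrier problem concrete, here is a minimal toy model in Python. It is not how Ruby implements this; `Obj`, `write_ref`, `raw_write` and `remembered` are names made up for illustration. A barriered write records old-to-young pointers, while a raw write (standing in for a C-extension touching memory directly) leaves no trace.

```python
class Obj:
    """Toy heap object with a generation flag and outgoing references."""
    def __init__(self, name):
        self.name = name
        self.old = False
        self.refs = []

# Hypothetical remembered set: old objects known to point at young objects.
remembered = set()

def write_ref(parent, child):
    """Barriered write: record old -> young pointers as they are created."""
    if parent.old and not child.old:
        remembered.add(parent)
    parent.refs.append(child)

def raw_write(parent, child):
    """Unbarriered write, standing in for a C-extension poking at memory."""
    parent.refs.append(child)

old_a, old_b = Obj("old_a"), Obj("old_b")
old_a.old = old_b.old = True
young_1, young_2 = Obj("young_1"), Obj("young_2")

write_ref(old_a, young_1)  # the collector learns about this pointer
raw_write(old_b, young_2)  # the collector never hears about this one

print(sorted(o.name for o in remembered))  # only old_a was recorded
```

A minor collection that trusts `remembered` would wrongly treat `young_2` as garbage, which is exactly the case the sunny/shady flag is there to cover.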

RGenGC solves this by adding another flag on objects: an object is either sunny or shady. All objects start as sunny. If, for example, a C-extension accesses the Array-pointer (using the RARRAY_PTR-macro), the object becomes shady. So: an object is sunny if we know write barriers are used for all updates; an object is shady if we do not know that. A shady object may or may not have been modified; all we know is that a C-extension has access to the internal pointers and can do whatever it wants.

In addition, Ruby can't use a copying collector (because a C-extension might store the memory location somewhere). Instead it stores both new and old objects in the same space, only distinguished by a flag. When it does a minor collection (new generation) it will traverse/mark all roots, but ignore objects that are in the old generation. All traversed sunny objects will be promoted to the old generation (causing them to be ignored in the next collection). All traversed shady objects will stay in the new generation and also be added to the shady-set. This shady-set is a part of the roots that are traversed in the minor collection.
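
The minor collection described above can be sketched as a toy simulation in Python. This follows only the summary's description, not the real implementation; `Obj`, `minor_gc` and `shady_set` are hypothetical names, and objects are never moved, only flagged.

```python
class Obj:
    """Toy object: generation and sunny/shady are flags, nothing is copied."""
    def __init__(self, name, shady=False):
        self.name = name
        self.shady = shady
        self.old = False
        self.marked = False
        self.refs = []

def minor_gc(heap, roots, shady_set):
    """One minor collection over a non-moving heap; returns the survivors."""
    for o in heap:
        o.marked = False
    # The shady set is treated as part of the roots.
    stack = list(roots) + list(shady_set)
    while stack:
        o = stack.pop()
        if o.old or o.marked:
            continue              # old objects are ignored, not traversed
        o.marked = True
        if o.shady:
            shady_set.add(o)      # stays young, rescanned next time
        else:
            o.old = True          # sunny survivor is promoted in place
        stack.extend(o.refs)
    # Sweep: only unmarked young objects are freed by a minor collection.
    return [o for o in heap if o.old or o.marked]

a, b, c = Obj("a"), Obj("b", shady=True), Obj("c")
garbage = Obj("garbage")
a.refs.append(b)
b.refs.append(c)

shady_set = set()
heap = minor_gc([a, b, c, garbage], [a], shady_set)
print([o.name for o in heap])   # garbage has been swept
print(a.old, b.old, c.old)      # sunny a and c promoted, shady b stays young
```

On the next minor collection `a` and `c` are skipped entirely, while `b` is traversed again because it sits in `shady_set`.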

If a sunny object is in the old generation and becomes shady (e.g. RARRAY_PTR gets called), it will be demoted to the new generation and added to the shady-set.
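
A rough Python sketch of that demotion step, under the same toy assumptions as before: `raw_pointer` is a made-up stand-in for what happens when RARRAY_PTR is used, not the actual macro or its implementation.

```python
class Obj:
    """Toy object with generation and sunny/shady flags."""
    def __init__(self, name):
        self.name = name
        self.old = False
        self.shady = False
        self.elements = []

# Hypothetical shady set, rescanned on every minor collection.
shady_set = set()

def raw_pointer(obj):
    """Hand out direct access to the storage, as a C-extension would get."""
    if not obj.shady:
        obj.shady = True           # writes can no longer be tracked
        if obj.old:
            obj.old = False        # demote to the new generation
            shady_set.add(obj)     # make sure minor collections rescan it
    return obj.elements

arr = Obj("arr")
arr.old = True                     # pretend arr was promoted earlier

ptr = raw_pointer(arr)
ptr.append(Obj("young"))           # unbarriered write through the raw pointer

print(arr.shady, arr.old, arr in shady_set)
```

Because `arr` is now rescanned on every minor collection, the young object stored through `ptr` stays reachable even though no write barrier ever fired.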

See the attached PDF for more details (pros/cons, comparisons to other Ruby implementations, some performance numbers, internal API details).

All in all: A decent improvement of the garbage collector without breaking compatibility with C-extensions.




Thanks for the excellent summary! For anyone else: I highly recommend reading the paper itself. It's very interesting.

Correct me if I'm wrong, but my interpretation is that this trades memory for speed, so the average memory consumption of Ruby programs would increase. If memory is exhausted, a full 2.0.0-style mark and sweep GC would occur.

Programs that are heavily dependent on C extensions would not see much of a GC speed increase.

Most would, however, and they could continue to improve GC performance piecemeal by moving Ruby classes (starting with common classes like Array or String) to use the new write barrier methods.


I think what you're assuming will be relevant for the old generation. Most object allocations should be short-lived, so that memory should clear up.



