New Garbage Collector (luajit.org)
190 points by dochtman on June 1, 2012 | hide | past | favorite | 69 comments



This is very interesting, because it's extremely similar to the design that we need for Rust. We need:

* A generational GC, preferably bump-allocating so that we can win on allocation-heavy workloads.

* A non-thread-safe GC, because all GC'd data is local to a single task in Rust (and we enforce this in the type system).

* A precise GC, because (a) we don't want accidental or malicious leaks due to misinterpreting non-pointers as pointers; and (b) because we need precise GC metadata to clean up shared data and run destructors when tasks fail.

* An incremental GC, because minimizing pause times is absolutely critical for browsers.

* A non-copying GC, because LLVM doesn't currently support copying GC satisfactorily (i.e. when allowing pointers to live in registers). Fixing this would require a significant portion of all target code to be rewritten.

Also of particular interest is the ability for individual allocation sites to choose to bump allocate only, in the interests of performance (but at the expense of fragmentation). This could potentially allow Rust programmers to fine-tune their allocation behavior and trade performance for fragmentation when it makes sense to do so.

Very cool, and I'll be sure to keep an eye on this.


> A non-copying GC, because LLVM doesn't currently support copying GC satisfactorily

It's a shame that this is so badly documented. I only stumbled upon it after having written the whole code generator of my Z3 project (https://github.com/raph-amiard/Z3 - an OCaml bytecode -> LLVM compiler). The fact that you have to explicitly handle live roots with stack instructions defeats a lot of the purpose of LLVM IR.


I've got a bunch of code in a branch to allow register roots: https://github.com/pcwalton/llvm/tree/noteroots-ir

It just automatically roots live values that are pointers in a nonzero addrspace. Still doesn't support copying GC, and only works in the fast instruction selector at the moment, but it goes a long way toward making GC performant in LLVM.


The fast instruction selector only supports some constructs, whereas others are handled by fallback to the standard SelectionDAG instruction selection. In order to use this, do you have to ensure that you never trigger any of these fallbacks?


Yes.

This is just for now, as fast isel is much easier to work with. I have other branches that implement most of the SelectionDAG support needed as well, but they aren't updated to the most recent work.


This sounds like a perfect fit for you guys: http://www.hpl.hp.com/techreports/Compaq-DEC/WRL-91-8.pdf

It's the same algorithm used in SBCL (in non-incremental form), and in Dalvik. It's a good fit for LLVM because it's mostly precise. I.e. it scans the stack and registers conservatively, and prevents objects referred-to from those roots from moving, but scans everything else precisely.

It's also quite simple to implement, as far as garbage collectors go. I'm working on an extension of the algorithm to support local heaps that can be collected independently of each other. The flexibility of the mostly-copying algorithm is hard to beat for stuff like this.


Unfortunately, we need precise stack and register maps because we have pointers into the exchange heap, which are collected eagerly. If the GC tries to follow those, then it'll crash. At the same time, we have to store a liveness-accurate stack map of those pointers so that they can be collected during unwinding.


Interesting... If you have precise stack and register maps, why does the collector need to be nonmoving?


LLVM code generation passes can introduce any number of register-to-register moves and perform arbitrary arithmetic that we don't track. We have to be able to find all derived pointers so we can write those into the stack map and update register/stack locations if we move an object.

The right solution is to have MachineInstr-level getelementptr instructions, so that we can discover all derived pointers and copies from a GC root after codegen passes are done. Right now, getelementptr instructions get lowered into "add"/"sub"/"mul"/"lea"/etc instructions too early and we can't track them.

What we can do relatively easily (with the series of patches I've been working on) is to track one instance of each GC root across a call (basically with a fake instruction after each call, for each root, that represents a virtual "use" of that root). That way we can at least find the pointers.


Okay, so you can precisely identify all objects pointed-to by roots on the stack or in registers, but you can't identify all such roots in order to update them. From what you describe, mostly-copying collection is perfect for your use case.

Basically, mostly-copying collection allows you to perform copying collection in a situation where certain objects cannot be moved. The original use case was for conservative stack scanning. In that situation, you cannot move objects pointed-to by an ambiguous root because you cannot ensure it is an actual root. So while you're scanning the root set, you're "promoting" the target objects to to-space instead of copying them into to-space. But the mechanism is more general than just conservative root scanning. E.g. I'm using it because I need to keep potentially global objects from being moved during a local collection.

In your case, you would walk the stack/register roots precisely using maps, but promote target objects to to-space rather than copying them into it, so derived pointers would not have to be updated. Then you could copy the remaining objects.

EDIT: David Detlefs' PhD thesis contains a pretty good description of the algorithm (starting on pg. 16): http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.38....


Have you or Mike Pall looked at the Memory Pool System from Ravenbrook: http://www.ravenbrook.com/project/mps/ ? This used to be the GC at Harlequin and is still used today in some commercial projects (and by OpenDylan on some platforms).

I've been talking to them about re-licensing under something more liberal and there's hope that that might happen. (If others expressed a similar interest, it might help.)

It is in need of some love:

* Support x86_64

* Support ARM

* Support threads on OS X

* Compilation and/or crash issues on modern Linux

I'd personally like to see some other improvements to the build system and have started on a reworking of the documentation into something more cohesive ...


You might want to look at how GHC does copying GC via its LLVM backend.


According to what I've read, in GHC there are only four STG registers that are mapped to fixed registers in the x86 backend (and only one of them can contain a value). All roots are guaranteed to be either in this STG register or in the STG stack, which is kept separate from the C stack and is tagged with pointer/nonpointer info. This is certainly an option, but it seems suboptimal to me; we want LLVM to have the freedom to assign registers as it likes.


This is great news! I don't know if Mike Pall reads Hacker News, but a couple of ideas come to mind:

- Might it be a good idea to create a Kickstarter page? I know LuaJIT has its own donation mechanism, but it might help it gain further exposure. I know I'd be willing to donate for a better GC in either case.

- Mike mentions it has to be a non-moving collector, which is usually known for having worse performance than moving collectors. Also, a non-moving generational collector (and incremental, since this is also a requirement) is hard to get right.

I tried implementing a GC along the lines of this paper: http://www.pllab.riec.tohoku.ac.jp/papers/icfp2011UenoOhoriO... which presents a fast non-moving collector. Incidentally, it has ideas similar to those presented in this article. Maybe it could be a good source of inspiration? I realize the ideas are maybe not production-grade, since there is only one language implementation with a GC of this sort.


Yes, I thought about crowdfunding. But I fear that a) the Lua community is way too small to gather enough interest and b) it looks like the whole crowdfunding idea is rapidly deteriorating into an arms race of marketing experts. So many people are now jumping on that bandwagon. You'll never make it, unless you stay on the frontpages somehow.

Alas, I'm not good at marketing and a garbage collector is a very technical and very unsexy project (for most people, anyway). I should make up a silly name for it, that bears no relation to what it does. Yeah, that would do ...

Thank you for the link to the paper! I'll check it out.

I have good evidence that it's possible to create a non-moving GC that doesn't suck. It's a bit like marrying the best-of-breed of malloc implementations with the best-of-breed of incremental mark & sweep collectors. Plus some crazy ideas I'll have to experiment with first ...


The paper I linked outlines a GC that uses bitmap marking, which I believe is the same technique you use in arenas.

The main innovation is basically having separate heaps for different object sizes. It starts with a size i, and has heaps for sizes i^1, i^2, ..., i^n for bounded n. An object of size m > i^n goes into the i^(n+1) heap.

It thus wastes memory at the end of object blocks, but they seem to indicate this is on par with the memory wasted by fragmentation (this of course needs to be proven).

The very big advantage is essentially costless deallocation and cheap maintenance of free lists (you just mark the block as unused). It also suggests several optimization ideas regarding the lost space at the end of objects.

I don't know if you'd consider it battle-proven enough, but I really think it's an interesting idea! And the benchmarks seem to indicate very good performance.

Anyway, congrats on the four-color algorithm, and thank you for this article; it is actually quite a thorough explanation of mark & sweep techniques. I enjoyed reading it!


Any ideas why you get more sponsorship from gaming companies? Seems like this would be really useful for mobile devices.

Also, has anyone tried to pre-jit then load an image on iOS?


Err, I think you mean "why DON'T you get ...". :-/

I guess this has to do with the way the gaming industry works. The smaller game companies have maybe two or three big hits until they go under and/or the talent is bought out. The bigger ones are just wrappers for financial purposes nowadays. Yes, there are a few exceptions (no point in listing them). But overall, there's little continuity.

Sponsoring is an investment into the future. Most game companies drown in their daily business and never get to look beyond that. Making the next deadline is all that counts. Après nous le déluge.

Ok, so maybe this is just my personal impression and I'm all wrong. In fact, I'd love to be corrected ...


Unfortunately, I think most mobile devices still forbid JIT (aside from integrated javascript implementations). The technical limitation was removed in iOS 5.0, but I believe whether Apple will accept apps using JIT is still an open question. Where I learned this: http://news.ycombinator.com/item?id=3818994


Right, that's why you JIT ahead of time and then load the result into executable memory when the app starts (vs. while it's running).


I didn't know that was possible - thanks for the insight!


Permit me a shameless plug for my new crowdfunding site, BountyOSS: https://bountyoss.com/

It works quite differently from Kickstarter; I think it's more appropriate for Open Source software projects.

It's still a bit rough -- I even hesitate to mention it here on HN just yet -- but I think it's usable.


Kickstarter (last I checked) requires you to be American, so it won't fly for Mike (who is German).


The more I read about modern garbage collector design, the more it seems reminiscent of filesystem design. It makes me wonder if there are deep parallels between them that could be used advantageously to benefit both. Things that people have learned work well in filesystems might apply to garbage collection, and vice versa.


I don't see the technical reasoning there, but I agree that there's a thematic parallel: they're both reasonably straightforward ideas with very clean user-facing stories. They can both be implemented very simply (literally as part of an undergraduate course -- I remember writing both).

And they both have undesirable performance problems in certain regimes, the solution to which has been the unending quest of generations of incredibly smart programmers and academics. They have led to staggering complexity, weird bugs, huge near-unmaintainable codebases, and uncounted PhD theses. Whole careers have been aimed at this stuff.

And frankly it's been, from many perspectives, mostly a waste. ZFS/btrfs are, for 95% of problems, indistinguishable from FFS/ext2 implementations that are a tiny fraction of their size. Modern Java and .NET VMs still have latency issues under GC pressure that were equally visible in Emacs Lisp a quarter century ago.

Applications which have hard requirements that fly in the face of these systems don't use them. Serious storage servers don't use the filesystem except vestigially (i.e. they do manual sync on very large files, or they use a block device directly). Serious realtime apps don't do GC and manage memory themselves. And that's not going to change no matter how many PhD's we throw at the problem.


I am afraid that is not even close to accurate. If you define "95% of problems" to be "reading and writing data such that data is read and written" sure, great.

However, there are two little, minor things that filesystems care about: performance and safety. FFS/ext2 have neither of these things.

Neither ext2 nor FFS contains a journal, nor do they use copy-on-write metadata. Heck, if you "upgrade" to ext3, you get the journal but nothing that protects you from bit-rot.

If you look at most drives on the market, you will see devices capable of corrupting data for certain after three years. Your journaled filesystem does jack in this case; all it can do is ensure proper ordering of writes to metadata, with no guarantee that the data will be any good once written.

How about performance? Well, if you look at FFS/ext2 they are essentially terrible: block-at-a-time allocators with no extent support. Good luck getting the most out of your storage media with a block-tree data structure. Granted, ZFS suffers from the same issue, but btrfs's extent-tree design certainly does not. IIRC, state-of-the-art ext[23] implementations use readahead to ameliorate the problem but do not fundamentally cure it. If you look at ext4, they have adopted extent trees.

A filesystem like ZFS/btrfs is pretty immune to bitrot: they can easily mirror their metadata, and they avoid overwriting metadata in place the way FFS/ext[234] do, making torn writes non-issues. They avoid the many pathologies of your "simpler" filesystems, and the complexity buys you not needing an fsck mechanism in the face of data corruption; one should only be needed in the face of implementation bugs (note that ZFS has no fsck, and btrfs only recently obtained one).

Oh, and if any of you think soft updates work, they don't. It would be great if they really did, but in a world where drives actively reorder writes and do not respect SCSI/SATA/misc transport commands to flush their internal caches, you do not get safety. This set of drives is considerably large.

tl;dr You are oversimplifying the complex.


Clearly I hit a nerve.

>If you define "95% of problems" to be "reading and writing data such that data is read and written"

Pretty much. You have a competing definition? With the added point that "95% of problems" are mostly I/O bound reads of unfragmented write-once files and won't see benefit from extent allocation. And of the remaining 5% most of them are database systems which are doing their own journaling and block allocation.

Does btrfs have nice features (I like snapshots myself)? Sure. Do I use it in preference to ext4? Yes. But be honest with yourself: it's only incrementally better than ext2 for pretty much everything you use a computer for. And yet it sits at the pinnacle of 40 years of concerted effort.

And garbage collection is much the same: a staggeringly huge amount of effort for comparatively little (but not zero, thus "worth it" by some metric) payout.

Edit: just to throw some gas on the^H^H^H^H point out the narrowness of vision that I think is endemic in this kind of thought:

> If you look at most drives on the market, you will see devices capable of corrupting data for certain after three years.

If you look at most actual filesystems on the market, you'll find they're embedded in devices which will be thrown out in two years when their contract expires. They'll also be lost or stolen with vastly higher frequency than that at which they will experience NAND failure. If you look at most high-value storage deployments in the real world, they have redundancy and backup regimes in place which make that filesystem feature merely a downtime/latency improvement.

Basically, if someone waved a magic wand and erased all fancy filesystems and GC implementations from the world... how much would really change? Apple has deployed a pretty good mobile OS without GC, after all. Oracle made a business out of shipping reliable databases over raw block devices 20 years ago. Try that same trick with other "difficult" software technologies (video compression, say) and things look much more grim.


My competing definition includes safety and performance: not just how the system behaves a single epsilon after an operation, but reading data that we wrote last month or last year, and doing it using the hardware's resources efficiently.

Clearly either you or the ext[234] developers are mistaken, as they have gone ahead and implemented extent-based allocation and file management.

btrfs's design is inherently safe while ext2's design is inherently unsafe. To make the analogy work, I would say that ext2 does not fit the design spec for a file system; it would be like a garbage collector that does not actually collect garbage.

The payout for data integrity, which no filesystem really handled until the WAFL era, is huge. If you cannot see that, this discussion has no point.

The only thing that I must be honest about is that I would not trust ext2 to store my data any more than I would trust you to implement a scheme to store my data.


It's really not worth continuing the argument. But I'll just point out that again you're arguing about features in isolation (e.g. "safety" instead of the value of what you want to keep safe or dangers from things other than filesystem failure). It's easy to justify one filesytem as better than another. Our disconnect is that I'm talking about whether or not that filesystem is so much better as to justify the effort spent to develop it.

To make the point again: without btrfs (or really "the last 30 years of filesystem research") we'd all be OK. Likewise if V8 or LuaJIT had to do mark & sweep, or if everyone had to write in a language with manual storage allocation, things would look mostly the same.

Without video compression probably two thirds of the market for semiconductor devices and half the internet would simply evaporate (I pulled those numbers out of you-know-where, but I frankly doubt they're all that wrong). That tells me something about the relative value of these three applications of software engineering talent.


The question is, would we have known any of this if that research had not taken place? It is easy to have 20/20 hindsight. Even with it, though, I am pretty sure we still would have expended the effort to get where we are today.


I disagree with you on both points.

You can't solve all (FS) problems for all people simultaneously. Custom FS solutions by Apple (recently), Oracle (20 years ago), and countless other vendors in the last 3 decades are about choosing different trade-offs, learning from mistakes, and growing the technology with the hardware (fat-12 -> fat-16 -> fat-32 was all about growing with increasing drive size). But if you truly believe that nothing substantial has been gained, why did you bother upgrading to btrfs?

Older filesystems suffered from a variety of problems that could and did ruin data for a lot of people. Modern filesystems like NTFS5, ZFS, btrfs, etc., make their best effort to ensure data can actually be consistently read and written. While that lies outside the most common path of reading non-fragmented files sequentially (I'd even say 99.99% of actual I/O in non-server configurations), it is still a use case that is important for the vast majority of computer users. You know, people who do business, homework, take pictures, ...

On the GC side of things, the Azul VM has shown to be quite effective at addressing many of the performance issues with garbage collection in Java (which gets even better when you stop doing stupid crap in Java). And there are garbage collected languages that are not Java that don't suffer the same stuttering problem of the Java GC (that have also been around for 20+ years, and have steadily gotten better over time).


FreeBSD FFS with softupdates has been doing metadata copy on write since about 1998. Recent versions have a journal too, to deal with the cases not covered by softupdates.

ZFS also relies on drive write barriers for safety. There is no hope if your disk lies.


Modern Java and .NET VMs still have latency issues under GC pressure that were equally visible in Emacs Lisp a quarter century ago.

Aren't you jumping to conclusions a little bit too fast? That there are still issues in some applications means all the work on Garbage Collection was wasted? I would like to remind you that modern GC-ed Java programs are often close in performance to C++ code with hand-written memory management, this clearly was not "visible in Emacs Lisp a quarter century ago".


I believe that ZFS clocks in at fewer lines of code than competing filesystem + volume manager combos. That certainly was touted as a benefit of ZFS's "rampant layering violation" when first introduced.

End point: ZFS packs a huge quantity of functionality into 80K lines of code (probably closer to 90 or 100 now), which is quite a bit smaller than the combined size of the (less featureful) filesystem + volume manager implementations that it replaces.

See [1] for a comparison of ZFS vs. UFS+SVM. See [2] for Jeff Bonwick's discussion of the complexity win in ZFS.

[1] http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg0...

[2] https://blogs.oracle.com/bonwick/entry/rampant_layering_viol...


Convergence.

When we all have persistent, solid state drives, your filesystem will be your heap and GC...


To what solid state drive technology are you referring that would make good use as a heap? I wouldn't think that NAND flash would be suitable due to the write/erase constraints.


NAND flash is going away. Right now it loses half its write endurance, and its eraseblocks double in size, with every shrink. Eraseblocks are already at 4MB and write endurance is below 1,000 writes; pretty soon additional shrinks will bring no value. This happens because of fundamental physical limits, and no one believes they can be overcome -- even the biggest proponents are talking about delaying the inevitable for a few years instead of avoiding it.

Right now, there is a hundred-billion-dollar stampede going on to find its successor. Players include:

MRAM (Toshiba, Hitachi, Hynix, IBM, Everspin (Freescale spinoff), Samsung, NEC)

FeRAM (Ramtron, IBM, TI, Fujitsu, Samsung, Matsushita, Oki, Toshiba, Infineon, Hynix, Symetrix)

ReRAM (HP, ITRI, IMEC, Panasonic, Rambus)

CBRAM (NEC, Sony, Axon, Micron)

PRAM (Intel, IBM, ST Micro, Samsung, Numonyx)

I might have missed a few backers. Also, the companies listed after a technology are not all working together; especially in FeRAM there are multiple competing approaches.

An interesting commonality about these technologies is that they all aim to be universal memories. That is, they intend to eventually replace both DRAM and Flash, and some are even aiming for the last cache levels in cpus. This will probably lead to some changes in system design. Although I think the people who are calling for the death of filesystems are going about it all wrong -- I expect the role of filesystems to expand, not shrink. All they need is new block layers.

Note that while everyone says memristors (ReRAM), they aren't even the most likely candidate (if I'd have to pick, I'd say PRAM), HP just has the best marketing.


Leon Chua also considers PRAM to be a memristor technology [1].

  All 2-terminal non-volatile memory devices based on
  resistance switching are memristors, regardless of the
  device material and physical operating mechanisms.
[1] http://www.springerlink.com/content/f41r8m054x550430/


IMFT got NAND down to 20nm with 3K-5K write endurance. Hynix plans to produce 15nm soon (unclear on endurance). Many thought getting NAND to its current level would be impossible; the same could be said about hard drives.

Write amplification isn't much of an issue with smart firmware.

The new memory techs aren't market-viable as we speak, and it's questionable whether first-generation commercial devices featuring them will even be available within a few years, and at what cost. Many of them also have drawbacks which hamper their viability or parallel the drawbacks of NAND.

I still think a real replacement for NAND is probably a decade out at least.



Was thinking of this article earlier this week: http://lwn.net/Articles/498289/ (which I'm presuming is some MRAM thing)


1.) If capacity increases, for a given workload, overwriting updates save less capacity. So even without new technology, there's a trend towards less overwrites of a given block for a fixed workload.

2.) Memristor Memory


Memristors?


Multics already implemented "single-level storage" in the 1960s. Effectively, the disk was all swap space for a huge RAM disk.

https://en.wikipedia.org/wiki/Multics#Novel_ideas


The research group I was part of worked on single address space OS's from the early 90s to the late 00's, http://www.ertos.nicta.com.au/research/old/mungi//manifesto.... , and yep, these ideas have been around for a long time.


Looks like IBM i is still/currently shipping with a single address space.

Wouldn't surprise me if Oracle and HP also have something similar.


'Filesystems' as in arbitrary data serialized to an arbitrary backing are absurdly stupid when said data has structure, as all data does. A semantic system that understands the data needs to be implemented at some point - why have n parsing implementations for a single data format? With a semantic 'object system', data becomes a part of programming.


How so? Could you explain in more detail?


A lot of what I've seen in the structures they use to keep track of things efficiently seems very similar. Though filesystems add a bit of extra data compared to a GC (filenames, locations, etc.), a number of the design constraints seem very, very similar. The approaches to tracking what's free and doing allocation seem to have very similar solutions, even given the different constraints. I think part of this might also be because some of the more modern filesystems are designed with SSDs in mind, which are even closer to normal memory. The way that (at least at a layman's understanding) btrfs handles COW looks very similar to how this new garbage collector is being designed for Lua. And the arenas look very much like the extents in ext4 (not sure about that in btrfs).

This also makes me wonder if a design that takes caching of the drive into account could make some small improvements on things.

EDIT: added small snippet about the arenas.


Filesystems and memory allocation algorithms do have something in common. The problem of allocating memory and managing free space etc exists also without GC.

When hierarchical filesystems behave as pure trees, it's always possible to simply deallocate the storage of a deleted file, because the file cannot be pointed to by more than one entry in the tree.

Unix filesystems, however, do have a way of creating several references to the same file, namely hard links. In order to handle these multiple references (like having a program with multiple variables pointing to the same object) a kind of GC algorithm is used, which is simply a "reference counting GC".

Reference counting is a simple technique which is appropriate in several cases, but it has its drawbacks. One of these drawbacks can be illustrated by this quote:

"One day a student came to Moon and said: “I understand how to make a better garbage collector. We must keep a reference count of the pointers to each cons.”

Moon patiently told the student the following story:

“One day a student came to Moon and said: ‘I understand how to make a better garbage collector... ”"


This drawback is why sensible UNIX filesystems don't allow you to create hardlinks to directories.


>It must be non-copying, due to various constraints in the Lua/C API.

I don't see this. The Lua GC has to know what's visible from C, in order to avoid freeing those objects. If it knows what's visible from C, then it strictly just needs to avoid copying those objects. If it moves them to an uncollected area when C references them, and back when C dereferences them, then it can use a copying collector on the Lua-only objects.

This actually might work very well, if C-referenced objects are going to tend to be much longer lived. Of course, copying the objects into and out of the uncollected region will itself be some work.

I have no idea if the ideal algorithm is in this space, I just think that either they discarded copying collectors prematurely, or there's something I'm missing, so either I get to learn something or they do; is there a constraint I've missed?


There's no explicit API to pin objects from the C side. Ok, so there's an implicit contract on which pointers to internal VM objects are still valid, once the C side has gotten them (const char *lua_tostring() is the biggest offender). But this contract a) does not lead to a practical implementation and b) is violated by plenty of C modules for Lua, because Lua didn't have a moving GC, so nobody bothered to follow it.

Now ... the interesting objects, that benefit most from the defragmentation effects of a moving GC, are the variable-sized objects. Sadly, userdata cannot be moved at all and strings are the only other interesting object type. Well ... see above.

[Table objects are not variable-size. Only their array and hash parts are. But these are singly-linked from the table object and can easily be moved around, anyway. The addresses of these memory blocks are never exposed by the VM.]


If you had a bit free, you could track which objects escape to C. But it's probably not worth going that way. Well, maybe. Programs that generate tons of strings may only occasionally escape them, so maybe moving them as part of the C call would be a good idea.


I would argue in recent years people have been more aware of the non-guarantee of static pointers in the lua api; and a moving GC would be feasible. There was talk that the next lua version (after 5.2) would have a moving GC...


This document uses a bunch of jargon that I couldn't find good explanations/definitions for. Is there a good GC survey out there? If I wanted to write a modern GC and knew nothing about it, where would I look to get started? What should I read to get up to speed?


Wilson's "Uniprocessor Garbage Collection Techniques" (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.47.2...) is what you want: an excellent survey paper of fundamental garbage collector designs.

I highly recommend the Jones/Hosking/Moss book, but it assumes familiarity with the basics. Pick it up if the Wilson paper intrigues you. The previous edition of the book (Jones & Lins) is good, too, but the new one adds a lot of important material on real-time GC, concurrent GC, and interactions between GC and runtime systems.


"Garbage Collection: Algorithms for Automatic Dynamic Memory Management" by Richard Jones and Rafael D Lins.

Very readable and approachable. I read the entire book, which I almost never do with technical books.


There's a newer book (2011), "The Garbage Collection Handbook: The Art of Automatic Memory Management" by Richard Jones, Antony Hosking, and Eliot Moss. I haven't read the new one, but it seems to cover a lot of the recent activity around Java GC.


The recent followup "The Garbage Collection Handbook: The Art of Automatic Memory Management" is much more up-to-date, but it has a bit less coverage of the more elementary techniques.


Thanks. I'll add it to the queue :-)


Each language is rolling out its own GC. Why not share some efforts in a common library?


There's Boehm, MPS (http://www.ravenbrook.com/project/mps/), Apple's AutoZone, and others.

As mentioned above, I'm regularly talking with Ravenbrook about liberalizing the license on MPS and doing some work on modernizing it. I have hopes that it might happen this year.

And some of the language specific ones are pretty good ... SGen in Mono seems interesting. The GC in ClozureCL is interesting as well.


Garbage collectors are tightly integrated with the programming language and runtime. Different languages, architectures, etc. have different requirements of a GC...


libgarbagecollector - http://www.dekorte.com/projects/opensource/libgarbagecollect...

libgarbagecollector is a plug-in garbage collector library designed to be used by high level language implementors. It's used in the Io programming language.


It's called JVM and .NET.


Can't really use that with Lua and Ruby



It would be easier to follow if you used actual colours rather than several shades of grey in your description of your 4-colour design.



