Apple's Leopard Garbage Collector Now Open Source

icey · on Nov 12, 2008

AutoZone? What a terrible name for a piece of software.

http://maps.google.com/maps?hl=en&um=1&ie=UTF-8&...

Hexstream · on Nov 12, 2008

I have some questions about garbage collectors...

If you decided to make a whole OS in say, Lisp, with a garbage collector, wouldn't that mean you couldn't make games because of the garbage collection pauses incurred by the OS, even if the game process itself uses manual memory management?

And is it possible to make a garbage collector that can somehow run "continually" (it would detect garbage at fairly regular intervals), so that you get lots of micro-GC-pauses that are unnoticeable to the user instead of a few ones that break the gameplay?

edit: One last question: is it feasible to make some kind of funky Lisp dialect with manual memory management or Lisp fundamentally requires a GC? Sorry for sounding like a CS professor asking rhetorical questions again ;P

rcoder · on Nov 13, 2008

Short answer(s): no, and yes.

Longer answer:

Memory management is hard in soft real-time systems like video games, no matter whether you use GC or not. If you do any dynamic memory allocation, you're going to run the risk of eventually hitting an malloc() or free() call that blocks your main thread.

However, avoiding memory leaks is especially important for gaming applications, especially if you're targeting console hardware. Game code tends to create very large numbers of tiny objects during each session, and need to be able to re-use the memory taken by those objects quickly as the game context changes. GC is a great way to protect yourself from leaks, if you can avoid long GC pauses during gameplay.

There are a large number of strategies for keeping GC from blocking time-sensitive threads in your program. The solution you describe is a little like generational garbage collection (which AutoZone, incidentally, does). In generational GC, rather than scan the entire heap on every pass, you focus on recently-created objects, as they're most likely to be garbage.

Also, using GC for some parts of your application doesn't mean you have to use it everywhere. You might divide your application such that real-time threads use a more deterministic memory-management system, while less performance-critical code gets the productivity and stability advantages of GC.

procrastitron · on Nov 13, 2008

I have a Lisp OS that I'm developing, and it has a real-time garbage collector.

The way it does this is that the garbage is detected using the weizenbaum variant of reference counting instead of some form of mark-and-sweep. There are a couple reasons that people tend to prefer mark-and-sweep over reference counting:

1. The reference counts have to be maintained on every assignment operation.

2. Reference counting can break if the heap-allocated memory has cycles.

The first issue is basically what you described when you asked 'is it possible to make a garbage collector that can somehow run "continually"'. The answer is yes, and the way you do that is with reference counting. However, this extra work means that you have a performance hit when compared to mark-and-sweep GCs. It's an example of the classic tradeoff between low latency and high throughput; and most applications are better off with high throughput.

The second issue is the more fundamental one, because cycles in heap allocated memory mean that the reference counts on those memory cells will never drop to zero; hence they will never be reclaimed. Most GCs that use reference counting fix this problem by also using a mark-and-sweep pass to collect cyclic data structures. Of course, this would still not be a real-time collector, so it doesn't fit my needs. Instead I changed the language to not allow cyclic data structures to be created in the first place. The way I did this was to simply make my Lisp dialect purely functional.

jdvh · on Nov 13, 2008

Nope, lisp without garbage collection isn't feasible. Tried it - didn't work out. Let's start with an ordinary lisp implementation, and replace garbage collection with stacklocal and manual allocation. Then using a couple of examples I'll show that you end up with a monstrosity.

You can do a lot with stack-local allocation, but not everything. I think it's fair to say that a Lisp without closures is a useless Lisp.

A clever compiler can optimize something like

    (print (reduce (list 1 2 3 4) (lambda (x y) (+ y x))))

such that it works without any garbage collection. The list you create is a local variable, the lambda function is a pure function without state (no closure necessary) and reduce is also a pure function.

You might think that if you can do this, you can express almost everything without garbage collection. Unfortunately, not really.

Suppose you have `revere-list` function. E.g.

    (reverse-list (list 1 2 3 4)) # '(4 3 2 1)

We previously assumed that the list was a local variable, created as a function argument. Now reverse-list must return a copy of the list, as the original list ceases to exist when the function returns. That's a problem, because it means you have to make deep-copies of every object on function calls. No big deal, you say: a lot of languages call-by-value. Well, yes, but it _is_ a big deal. Because it means you have to introduce & and * so you can describe when you want a by-copy and by-reference assignment.

So you get a lisp with pointers. A lot of fun, but not all that practical.

   (= *(elt my-list 3) (+ *(elt my-list 3) 1))

Yes, it's exactly what you think it is. 'elt now returns a reference to a list element, that you have to dereference before assignment. Ugly, but it works right?

Oh no. It gets worse. What if you create a function that returns a reference to a function argument (stack-local!)? Segfault.

A lisp with pointers that can segfault at any point? That's it. I'm going back to C.

I've skipped a lot of steps here (I could go on for hours), so you have to read really carefully to see how every problem is a logical consequence of each assumption. Sorry for that...

I've been trying to find the perfect balance between stack-allocation (RAII) and garbage collection, and it's a lot tougher than it seems.

warwick · on Nov 13, 2008

To answer your second question, you might want to look at concurrent garbage collection.

Here's the obligatory wiki link: http://en.wikipedia.org/wiki/Garbage_collection_(computer_science)#Stop-the-world_vs._incremental_vs._concurrent

maximilian · on Nov 13, 2008

Its almost like you are some sort of CS professor who wanted to ask rhetorical questions about garbage collection, but then didn't answer them.

Hexstream · on Nov 13, 2008

Any sufficiently ignorant programmer is indistinguishable from a CS professor asking rhetorical questions?

maximilian · on Nov 13, 2008

Its like he was walking through a question a professor might ask in front of his class before pausing for that slightest of moments before pressing a button to get the next slide and the inevitable answer.

Most of my ignorant questions are more ignorant than his question. Its like he's thought of it before and was wondering aloud.

sahaj · on Nov 13, 2008

just get a really fast proc with a lot of mem, and you won't have to worry about shit.