Meanwhile an OS uses the filesystem for just about everything and it is also a g...

troutwine · on Aug 21, 2024

I'm not tracking how your question follows. If by garbage collection you mean a system in which resources are cleaned up at or after the moment they are marked as no longer being necessary then, sure, I guess I can see a thread here, although I think it a thin connection. The conversation up-thread is about runtime garbage collectors, which are a mechanism with more semantic properties than this expansive definition implies and possessing an internal complexity that is opaque to the user. An allocator does fit the more expansive definition I think you might be operating with, as does a filesystem, but it's the opacity and intrinsic binding to a specific runtime GC that makes it a challenging tool for systems programming.

Go, for instance, bills itself as a systems language and that's true for domains where bounded, predictable memory consumption / CPU trade-offs are not necessary _because_ the runtime GC is bundled and non-negotiable. Its behavior also shifts with releases. A systems program relying on an allocator alone can choose to ignore the allocator until it's a problem and swap the implementation out for one -- perhaps custom made -- that tailors to the domain.

amelius · on Aug 21, 2024

An OS has the job of managing resources, such as CPU, disk and memory.

It is easy to understand how it has grown historically, but the fact that every process still manages its own memory is a little absurd.

If your program __wants__ to manage its own memory, then that is simple: allocate a large (gc'd) blob of memory and run an allocator in it.

The problem is that the current view has it backwards.
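To make the "allocate a large blob and run an allocator in it" idea concrete, here is a minimal sketch of a bump allocator carving slices out of one preallocated buffer. The class name `BumpArena` and its methods are hypothetical, and a real allocator would also need per-object frees and thread safety; this only illustrates the shape of the idea.

```python
class BumpArena:
    """Hands out slices of one preallocated blob (hypothetical helper)."""

    def __init__(self, size: int) -> None:
        self.blob = bytearray(size)  # the single large allocation
        self.top = 0                 # offset of the next free byte

    def alloc(self, nbytes: int, align: int = 8) -> memoryview:
        # Round the current offset up to the requested alignment.
        start = (self.top + align - 1) // align * align
        end = start + nbytes
        if end > len(self.blob):
            raise MemoryError("arena exhausted")
        self.top = end
        return memoryview(self.blob)[start:end]

    def reset(self) -> None:
        # Everything is freed at once; individual frees are not supported.
        self.top = 0


arena = BumpArena(64)
a = arena.alloc(5)    # occupies offsets 0..5
b = arena.alloc(16)   # aligned up, occupies offsets 8..24
a[:] = b"hello"
```

The program decides when `reset()` happens, which is exactly the control a bundled runtime GC does not hand over.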

201984 · on Aug 22, 2024

An OS would have a very hard time determining whether a page is "unused" or not. Normal GCs have to know at least which fields of a data structure contain pointers so they can find unreachable objects. To an OS, all memory is just opaque bytes, and it would have no way to know if any given 8 bytes is a pointer to a page or a 64-bit integer that happens to have the same value. This is pretty much why C/C++ don't have garbage collectors currently.
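The ambiguity can be shown with a toy model. The sketch below assumes a made-up set of page addresses and a scanner that treats every aligned 8-byte word as a possible pointer; none of this is how any particular OS or collector is implemented.

```python
import struct

PAGE_SIZE = 4096
# Hypothetical base addresses of pages handed to the process.
live_pages = {0x10000, 0x11000, 0x12000}


def conservative_scan(memory: bytes, pages: set) -> set:
    """Treat every aligned 8-byte word that lands inside a known page as a pointer."""
    retained = set()
    for offset in range(0, len(memory) - 7, 8):
        (word,) = struct.unpack_from("<Q", memory, offset)
        base = word - (word % PAGE_SIZE)
        if base in pages:
            retained.add(base)
    return retained


# One genuine pointer into page 0x10000, plus a plain integer (say, a counter)
# whose value happens to equal the address of page 0x12000.
memory = struct.pack("<QQ", 0x10008, 0x12000)
retained = conservative_scan(memory, live_pages)
```

The scanner keeps page `0x12000` alive even though nothing points to it, because the bytes alone cannot say whether the second word is an address or a number.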

amelius · on Aug 22, 2024

> To an OS, all memory is just opaque bytes, and it would have no way to know if any given 8 bytes is a pointer to a page or a 64-bit integer that happens to have the same value.

This is like saying to an OS all file descriptors are just integers.

201984 · on Aug 22, 2024

That's because they are :P

I doubt GC would work on file descriptors either. How could an OS tell when scanning through memory if every 4 bytes is a file descriptor it must keep alive, or an integer that just happens to have the same value?

Not to mention that file descriptors (and pointers!) may not be stored by value. A program might have a set of fds and only store the first one, since it has some way to calculate the others, e.g. by adding one.
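A toy illustration of the "only store the first one" case, assuming invented descriptor numbers and a scanner that looks for 4-byte words matching an open fd. It is a model of the argument, not of any real kernel mechanism.

```python
import struct


def scan_for_fds(memory: bytes, open_fds: set) -> set:
    """Treat every aligned 4-byte word equal to an open descriptor as a reference."""
    found = set()
    for offset in range(0, len(memory) - 3, 4):
        (word,) = struct.unpack_from("<i", memory, offset)
        if word in open_fds:
            found.add(word)
    return found


open_fds = {5, 6, 7}        # hypothetical descriptor numbers
first_fd, count = 5, 3      # the program stores only the base fd and a count
memory = struct.pack("<ii", first_fd, count)

in_use = {first_fd + i for i in range(count)}  # what the program actually uses
visible = scan_for_fds(memory, open_fds)       # what a scanner can see
missed = in_use - visible
```

Here the scanner sees only fd 5, so fds 6 and 7 would look like garbage even though the program still relies on them.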

gpderetta · on Aug 22, 2024

A garbage collector need not be conservative. Interestingly, Linux (and most POSIX-compliant Unices, I guess) implements, as a last resort, an actual tracing file descriptor garbage collector to track the lifetime of file descriptors: as they can be shared across processes via Unix sockets (potentially recursively), arbitrary cycles can be created and reference counting is not enough.
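For anyone who hasn't seen descriptor passing, here is a small sketch using Python's `socket.send_fds` / `socket.recv_fds` (Python 3.9+, Unix only). The function names `pass_fd` and `make_in_flight_cycle` are made up for illustration, and the comment about the cycle reflects my understanding of the mechanism rather than anything this code can observe directly.

```python
import os
import socket


def pass_fd(payload: bytes) -> bytes:
    """Send a pipe's read end across a Unix socket pair, then read via the copy."""
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    read_end, write_end = os.pipe()
    try:
        socket.send_fds(left, [b"fd"], [read_end])
        _msg, fds, _flags, _addr = socket.recv_fds(right, 16, 1)
        os.close(read_end)            # the received copy keeps the pipe open
        os.write(write_end, payload)
        try:
            return os.read(fds[0], len(payload))
        finally:
            for fd in fds:
                os.close(fd)
    finally:
        os.close(write_end)
        left.close()
        right.close()


def make_in_flight_cycle() -> None:
    """Queue a socket's own descriptor on itself, then drop every user handle."""
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    # right's descriptor now sits, unreceived, in right's own receive queue.
    socket.send_fds(left, [b"x"], [right.fileno()])
    left.close()
    right.close()
    # No process can reach the socket any more, yet the queued message still
    # references it: a reference count alone would never drop to zero.


echoed = pass_fd(b"hello")
make_in_flight_cycle()
```

The second function is the simplest cycle of the kind described above; cleaning it up is left entirely to the kernel.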

LegionMammal978 · on Aug 21, 2024

The OS already does that, though? Your program requests some number of pages of virtual memory, and the OS uses a GC-like mechanism to allocate physical memory to those virtual pages on demand, wiping and reusing it soon after the virtual pages are unmapped.

It's just that programs tend to want to manage objects with sub-page granularity (as well as on separate threads in parallel), and at that level there are infinitely many possible access patterns and reachability criteria that a GC might want to optimize for.

PaulDavisThe1st · on Aug 21, 2024

AFAIK, no OS uses a "GC-like mechanism" to handle page allocation.

When a process requests additional pages be added to its address space, they remain in that address space until the process explicitly releases them or the process exits. At that time they go back on the free list to be re-used.

GC implies "finding" unused stuff among something other than a free list.
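The request/release lifecycle described here can be sketched from user space with Python's `mmap` module, assuming an anonymous mapping (`-1` as the file descriptor). It shows only the explicit map and unmap steps plus the zero-filled contents of fresh pages; what the kernel does with the physical pages afterwards is not visible from this code.

```python
import mmap

PAGE = mmap.PAGESIZE

region = mmap.mmap(-1, 4 * PAGE)               # ask the OS for four anonymous pages
fresh_is_zero = region[:] == bytes(4 * PAGE)   # fresh anonymous memory reads as zeros
region[0:5] = b"dirty"                         # the process dirties one page
first_bytes = region[0:5]
region.close()                                 # explicit release by the process
```

Nothing goes looking for the mapping: it stays in the address space until `close()` (or process exit) hands it back.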

LegionMammal978 · on Aug 22, 2024

I was mainly thinking of the zeroing strategy: when a page is freed from one process, it generally has to be zeroed before being handed to another process. It looks like Linux does this as lazily as possible, but some of the BSDs allegedly use idle cycles to zero pages. So I'd consider that a form of GC to reclaim dirty pages, though I'll concede that it isn't as common as I thought.

the-smug-one · on Aug 21, 2024

> An OS has the job of managing resources, such as CPU, disk and memory.

The job of the OS is to virtualize resources, which it does (including memory).

jcelerier · on Aug 23, 2024

> Meanwhile an OS uses the filesystem for just about everything and it is also a garbage collected system ...

So many serious applications end up reimplementing their own custom user-space / process-level filesystem for specific tasks because of how SLOW OS filesystems can be, though.