Level 1 systems programmer: "wow, it feels so nice having control over my memory and getting out from under the thumb of a garbage collector"

Level 2 systems programmer: "oh no, my memory allocator is a garbage collector"



The answer is clear: just don’t have a malloc implementation in your process' address space!


Welcome to embedded! It’s no heaps of fun!


I'm always surprised how much I don't miss dynamic allocation. :)


> no heaps

Angry upvote


pg needs to build that. Hold upvote icon for 5 secs=angry upvote


“I’m doing this as hard as I can”


A bump allocator is all anyone really needs
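
For anyone who hasn't rolled one: a minimal bump-allocator sketch in C++. The class name, buffer size, and alignment handling are illustrative, not from any particular codebase.

    #include <cstddef>
    #include <cstdint>

    // A bump (arena) allocator: allocation is a pointer increment, and
    // everything is released at once by resetting the cursor.
    class BumpAllocator {
    public:
        BumpAllocator(void* buffer, std::size_t size)
            : start_(static_cast<std::uint8_t*>(buffer)),
              cursor_(start_),
              end_(start_ + size) {}

        // align must be a power of two.
        void* allocate(std::size_t size,
                       std::size_t align = alignof(std::max_align_t)) {
            std::uintptr_t p = reinterpret_cast<std::uintptr_t>(cursor_);
            std::uintptr_t aligned = (p + align - 1) & ~(align - 1);
            if (aligned + size > reinterpret_cast<std::uintptr_t>(end_))
                return nullptr;                      // arena exhausted
            cursor_ = reinterpret_cast<std::uint8_t*>(aligned + size);
            return reinterpret_cast<void*>(aligned);
        }

        // No per-object free: drop the whole arena in O(1).
        void reset() { cursor_ = start_; }

    private:
        std::uint8_t* start_;
        std::uint8_t* cursor_;
        std::uint8_t* end_;
    };

    int main() {
        alignas(std::max_align_t) static char storage[64 * 1024];
        BumpAllocator frame_arena(storage, sizeof(storage));

        // Per-frame / per-request allocations...
        int* xs = static_cast<int*>(frame_arena.allocate(128 * sizeof(int)));
        (void)xs;

        frame_arena.reset();    // ...and one cheap "free" at the end
    }

It only pays off when lifetimes end together (per frame, per request, per phase).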


"Eh, it'll crash before it runs out of memory."


In some cases in a very literal sense (cue story about missiles)


At the very bottom of everything is a garbage collector..


Soil is just the biggest swap meet in the world. Where every microbe, invertebrate and tree is just looking for someone else’s trash to turn into treasure.


Market forces: the ultimate garbage collector


"stackoverflow please help me how do i fix memory fragmentation"


Level 3 system programmer: "get me out of this straitjacket and give me my garbage collector back so I can get stuff done"


That's not how system programmers think..


I agree. If we were to pin a thought process on an additional level of systems programmer, it'd involve writing an allocator that's custom to your domain. The problem with garbage collection in the systems case is that you're opting into a set of undefined and uncontrolled runtime behaviors, which is okay until it catastrophically isn't. An allocator is the same but with less surface area, and you can swap it out at need.
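
As one hedged illustration of an allocator that's "custom to your domain": a fixed-capacity object pool in C++ with an intrusive free list. The FixedPool and Connection names and the interface are invented for this sketch.

    #include <cstddef>
    #include <new>
    #include <utility>

    // Fixed-capacity pool for one object type: O(1) allocate/free, no
    // fragmentation, no heap traffic after construction, and a failure
    // mode (pool exhausted) that is local and easy to reason about.
    template <typename T, std::size_t Capacity>
    class FixedPool {
        union Slot {
            Slot* next;                               // valid while the slot is free
            alignas(T) unsigned char storage[sizeof(T)];
        };

    public:
        FixedPool() {
            for (std::size_t i = 0; i + 1 < Capacity; ++i)
                slots_[i].next = &slots_[i + 1];
            slots_[Capacity - 1].next = nullptr;
            free_list_ = &slots_[0];
        }

        template <typename... Args>
        T* create(Args&&... args) {
            if (!free_list_) return nullptr;          // exhausted: caller decides what to do
            Slot* slot = free_list_;
            free_list_ = free_list_->next;
            return new (slot->storage) T(std::forward<Args>(args)...);
        }

        void destroy(T* obj) {
            obj->~T();
            Slot* slot = reinterpret_cast<Slot*>(obj); // storage sits at offset 0 of Slot
            slot->next = free_list_;
            free_list_ = slot;
        }

    private:
        Slot slots_[Capacity];
        Slot* free_list_ = nullptr;
    };

    // Usage sketch: a pool of connection objects sized for the domain's known limit.
    struct Connection { int fd = -1; };

    int main() {
        FixedPool<Connection, 1024> pool;
        Connection* c = pool.create();
        pool.destroy(c);
    }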


Meanwhile an OS uses the filesystem for just about everything and it is also a garbage collected system ...

Why should memory be different?


I'm not tracking how your question follows. If by garbage collection you mean a system in which resources are cleaned up at or after the moment they are marked as no longer necessary then, sure, I guess I can see a thread here, although I think it a thin connection. The conversation up-thread is about runtime garbage collectors, which are a mechanism with more semantic properties than this expansive definition implies, and which possess an internal complexity that is opaque to the user. An allocator does fit the more expansive definition I think you might be operating with, as does a filesystem, but it's the opacity and intrinsic binding to a specific runtime GC that makes it a challenging tool for systems programming.

Go, for instance, bills itself as a systems language, and that's true for domains where bounded, predictable memory-consumption/CPU trade-offs are not necessary, _because_ the runtime GC is bundled and non-negotiable. Its behavior also shifts with releases. A systems program relying on an allocator alone can choose to ignore the allocator until it's a problem and then swap the implementation out for one -- perhaps custom-made -- that is tailored to the domain.


An OS has the job of managing resources, such as CPU, disk and memory.

It is easy to understand how it has grown historically, but the fact that every process still manages its own memory is a little absurd.

If your program __wants__ to manage its own memory, then that is simple: allocate a large (gc'd) blob of memory and run an allocator in it.

The problem is that the current view has it backwards.
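
The in-process analogue of that idea is easy to sketch with C++17's std::pmr: grab one big blob up front and run a local allocator inside it. The buffer size is arbitrary, and in the scheme described above the blob would come from (and be collected by) the OS rather than a static array.

    #include <cstddef>
    #include <memory_resource>
    #include <vector>

    int main() {
        // The "large blob": one up-front reservation.
        static std::byte blob[1 << 20];

        // A local allocator that carves allocations out of the blob.
        // Passing null_memory_resource() as upstream means we fail loudly
        // instead of silently falling back to the global heap.
        std::pmr::monotonic_buffer_resource arena(
            blob, sizeof(blob), std::pmr::null_memory_resource());

        // Containers use the arena through the memory_resource interface
        // and never touch malloc directly.
        std::pmr::vector<int> values(&arena);
        for (int i = 0; i < 1000; ++i)
            values.push_back(i);

        // Destroying the arena releases everything at once; no per-object frees.
    }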


An OS would have a very hard time determining whether a page is "unused" or not. Normal GCs have to know at least which fields of a data structure contain pointers so it can find unreachable objects. To an OS, all memory is just opaque bytes, and it would have no way to know if any given 8 bytes is a pointer to a page or a 64-bit integer that happens to have the same value. This is pretty much why C/C++ don't have garbage collectors currently.


> To an OS, all memory is just opaque bytes, and it would have no way to know if any given 8 bytes is a pointer to a page or a 64-bit integer that happens to have the same value.

This is like saying to an OS all file descriptors are just integers.


That's because they are :P

I doubt GC would work on file descriptors either. How could an OS tell, when scanning through memory, whether any given 4 bytes are a file descriptor it must keep alive or an integer that just happens to have the same value?

Not to mention that file descriptors (and pointers!) may not be stored by value. A program might have a set of fds and only store the first one, since it has some way to calculate the others, eg by adding one.


A garbage collector need not be conservative. Interestingly, Linux (and most POSIX-compliant Unices, I'd guess) implements, as a last resort, an actual tracing garbage collector to track the lifetime of file descriptors: since they can be shared across processes via Unix sockets (potentially recursively), arbitrary cycles can be created and reference counting is not enough.
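
For the curious, a C++ sketch (Linux/POSIX APIs, error handling mostly omitted, helper name invented) of how such a cycle arises: pass one end of a socketpair across itself with SCM_RIGHTS, then close both ends. Reference counting alone can never reclaim the socket, so the kernel's fd garbage collector has to.

    #include <cstring>
    #include <sys/socket.h>
    #include <unistd.h>

    // Send one file descriptor over a Unix-domain socket via SCM_RIGHTS.
    static bool send_fd(int sock, int fd_to_send) {
        char dummy = 'x';
        iovec iov{&dummy, 1};

        alignas(cmsghdr) char ctrl[CMSG_SPACE(sizeof(int))] = {};
        msghdr msg{};
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;
        msg.msg_control = ctrl;
        msg.msg_controllen = sizeof(ctrl);

        cmsghdr* cmsg = CMSG_FIRSTHDR(&msg);
        cmsg->cmsg_level = SOL_SOCKET;
        cmsg->cmsg_type = SCM_RIGHTS;
        cmsg->cmsg_len = CMSG_LEN(sizeof(int));
        std::memcpy(CMSG_DATA(cmsg), &fd_to_send, sizeof(int));

        return sendmsg(sock, &msg, 0) == 1;
    }

    int main() {
        int sv[2];
        if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return 1;

        // The in-flight message now sitting in sv[1]'s receive queue holds
        // a reference to sv[1] itself: a cycle refcounting cannot break.
        send_fd(sv[0], sv[1]);

        close(sv[0]);
        close(sv[1]);
        // Both sockets are now unreachable from user space but still
        // referenced by the queued message; the kernel's tracing GC
        // eventually reclaims them.
    }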


The OS already does that, though? Your program requests some number of pages of virtual memory, and the OS uses a GC-like mechanism to allocate physical memory to those virtual pages on demand, wiping and reusing it soon after the virtual pages are unmapped.

It's just that programs tend to want to manage objects with sub-page granularity (as well as on separate threads in parallel), and at that level there are infinitely many possible access patterns and reachability criteria that a GC might want to optimize for.
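
A small POSIX sketch (in C++, sizes arbitrary) of that lifecycle: reserve virtual pages, fault physical memory in on first touch, then tell the kernel the contents are disposable.

    #include <cstddef>
    #include <cstring>
    #include <sys/mman.h>

    int main() {
        const std::size_t len = 64 * 4096;   // 64 pages on a 4 KiB-page system

        // Reserve virtual address space; no physical pages are committed yet.
        void* p = mmap(nullptr, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) return 1;

        // First write to each page faults in a zeroed physical page on demand.
        std::memset(p, 0xAB, len);

        // Declare the contents disposable: the kernel may reclaim the physical
        // pages and will hand back fresh zeroed ones on the next touch.
        madvise(p, len, MADV_DONTNEED);

        munmap(p, len);
    }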


AFAIK, no OS uses a "GC-like mechanism" to handle page allocation.

When a process requests additional pages be added to its address space, they remain in that address space until the process explicitly releases them or the process exits. At that time they go back on the free list to be re-used.

GC implies "finding" unused stuff among something other than a free list.


I was mainly thinking of the zeroing strategy: when a page is freed from one process, it generally has to be zeroed before being handed to another process. It looks like Linux does this as lazily as possible, but some of the BSDs allegedly use idle cycles to zero pages. So I'd consider that a form of GC to reclaim dirty pages, though I'll concede that it isn't as common as I thought.


> An OS has the job of managing resources, such as CPU, disk and memory.

The job of the OS is to virtualize resources, which it does (including memory).


> Meanwhile an OS uses the filesystem for just about everything and it is also a garbage collected system ...

so many serious applications end up reimplementing their own custom user-space / process-level filesystem for specific tasks because of how SLOW OS filesystems can be, though


A new generation of system programmers is tired of solving the same old boring memory riddles over and over again and no borrow checker is going to help them because it only brings new riddles.


gc replaces riddles with punchlines


Those of us who have actually used systems programming languages with automatic resource management do think that way.

Unfortunately science only evolves one funeral at a time.


It's how some think. Graal is a full compiler written in Java. There's a long history of JVMs and databases being written in GCd Java. I think you could push it a lot further too. Modern JVM GCs are entirely pauseless for instance.


They are not. All Java GCs introduce pauses, only short ones. SGCL for C++ is a truly pauseless GC.


It depends how you define "pause" but no, modern GCs like ZGC and Shenandoah are universally agreed to be pauseless. The work done at the start/end of a GC is fixed time regardless of heap size and takes less than a millisecond. At that speed other latencies on your system are going to dominate GC unless you're on a hard RTOS like QNX.


No. Just no.

As painful as the debugging story was, I have spent vastly more time working around garbage collectors to ship performant code.


What, you don't like doing GC only every N requests (Ruby web servers), disabling GC completely during working hours (Java stock trading), or fake-allocating large buffers (Go's allocate-and-don't-use trick)?


The Java shops you're thinking of didn't disable GC during working hours, they just sized the generations to avoid a collection given their normal allocation rates.

But there were / are also plenty of trading shops that paid Azul for their pauseless C4 GC. Nowadays there's also ZGC and Shenandoah, so if you want to both allocate a lot and also not have pauses, that tech is no longer expensive.


> The Java shops you're thinking of didn't disable GC during working hours, they just sized the generations to avoid a collection given their normal allocation rates.

Well, I just trivialized it. However, in one case in the mid-00s, I saw it disabled completely to avoid any pauses during trading hours.


Ain't nothin' wrong with configuring V8 to have unbounded heap growth, disabling the memory reducer, and then killing the process after a while.


Used to do this in C decades ago. Worked on Unix but I doubt it works on Linux today, unless you disable memory overcommit completely.


I need to find a pithy way to express "we use a garbage collector to avoid doing manual memory management because that'd require too much effort; but since the GC causes performance problems in production, we have spent more effort and energy working around those issues and creating bespoke tooling to mitigate them than the manual memory management we were trying to avoid in the first place would've required."


RAII <-- best of both worlds.
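
For readers who haven't seen the acronym unpacked, a minimal RAII sketch in C++; the File wrapper and path are hypothetical.

    #include <cstdio>
    #include <stdexcept>

    // RAII: acquire the resource in the constructor, release it in the
    // destructor, so cleanup runs on every exit path, including exceptions.
    class File {
    public:
        explicit File(const char* path) : f_(std::fopen(path, "r")) {
            if (!f_) throw std::runtime_error("open failed");
        }
        ~File() { std::fclose(f_); }

        File(const File&) = delete;             // exactly one owner
        File& operator=(const File&) = delete;

        std::FILE* get() const { return f_; }

    private:
        std::FILE* f_;
    };

    void read_header() {
        File f("/etc/hostname");                // hypothetical input path
        char buf[64];
        std::fgets(buf, sizeof buf, f.get());
        // f's destructor closes the file here, whether we return normally
        // or something above threw.
    }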


If you are talking about C++, it's nice when RAII works. But if it does work, then in some sense your problem was easy. Async code and concurrent code require different solutions.


I'd wager it was an issue with the language of choice (or its GC) being rather poorly made performance-wise or a design that does not respect how GC works in the first place :)


I’m sure you would! GC is like communism. Always some excuse as to why GC isn’t to blame.

> or a design that does not respect how GC works in the first place

It’s called shipping a 90 Hz VR game without dropping frames.


Aside from finding the analogy strange and unfortunate, I assume you're talking about Unity, is that correct?

(if that is the case, I understand where the GC PTSD comes from)


> I’m sure you would! GC is like communism. Always some excuse as to why GC isn’t to blame.

To be fair, there are about 4 completely independent bad decisions that tend to be made together in a given language. GC is just one of them, and not necessarily the worst (possibly the least bad, even).

The decisions, in rough order of importance according to some guy on the Internet:

1. The static-vs-dynamic axis. This is not a binary decision: things like "functions tend to accept interfaces rather than concrete types" and "all methods are virtual unless marked final" still penalize you even if you appear to have static types. C++'s "static duck typing" in templates theoretically counts here, but damages programmer sanity rather than runtime performance. Expressivity of the type system (higher-kinded types, generics) also matters. Thus Java-like languages don't actually do particularly great here.

2. The AOT-vs-JIT axis. Again, this is not a binary decision, nor is it fully independent of other axes - in particular, dynamic languages with optimistic tracing JITs are worse than Java-style JITs. A notable compromise is "cached JIT" aka "AOT at startup" (in particular, this deals with -march=native), though this can fail badly in "rebuild the container every startup" workflows. Theoretically some degree of runtime JIT can help too since PGO is hard, but it's usually lost in the noise. Note that if your language understands what "relocations" are you can win a lot. Java-like languages can lose badly for some workflows (e.g. tools intended to be launched from bash interactively) here, but other workflows can ignore this.

3. The inline-vs-indirect-object axis - that is, are all objects (effectively) forced to be separate allocations, or can they be subobjects (value types)? If local variables can avoid allocation that only counts for a little bit. Java loses very badly here outside of purely numerical code (Project Valhalla has been promising a solution for a decade now, and given their unwieldy proposals it's not clear they actually understand the problem), but C# is tolerable, though still far behind C++ (note the "fat shared" implications with #4). In other words - yes, usually the problem isn't the GC, it's the fact that the language forces you to generate garbage in the first place.

4. The intended-vs-uncontrollable-memory-ownership axis. GC-by-default is an automatic loss here; the bare minimum is to support the widely-intended (unique, shared, weak, borrowed) quartet without much programmer overhead (barring the bug below, you can write unique-like logic by hand, and implement the others in terms of it; particularly, many languages have poor support for weak), but I have a much bigger list [1] and some require language support to implement. try-with-resources (= Python-style with) is worth a little here but nowhere near enough to count as a win; try-finally is assumed even in the worst case but worth nothing due to being very ugly. Note that many languages are unavoidably buggy if they allow an exception to occur between the construction of an object and its assignment to a variable; the only way to avoid this is to write extensions in native code. (A short C++ sketch of the quartet follows the footnote below.)

[1] https://gist.github.com/o11c/dee52f11428b3d70914c4ed5652d43f... - a list of intended memory ownership policies. Generalized GC has never found a theoretical use; it only shows up as a workaround.
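
A minimal C++ rendering of the (unique, shared, weak, borrowed) quartet from point 4, using the standard smart pointers; the Widget type is made up for illustration.

    #include <iostream>
    #include <memory>

    struct Widget { int value = 42; };

    // "Borrowed": a non-owning view; the caller guarantees the object
    // outlives the call.
    void inspect(const Widget& w) { std::cout << w.value << '\n'; }

    int main() {
        // Unique: exactly one owner, destroyed deterministically at scope exit.
        auto owner = std::make_unique<Widget>();
        inspect(*owner);

        // Shared: reference-counted; destroyed when the last owner lets go.
        auto shared = std::make_shared<Widget>();

        // Weak: observes a shared object without keeping it alive.
        std::weak_ptr<Widget> weak = shared;
        shared.reset();                        // last strong owner gone
        std::cout << weak.expired() << '\n';   // prints 1: Widget already destroyed
    }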


re 1. C#'s dispatch strategy is not Java-like: all methods are non-virtual by default unless specified otherwise. In addition, dispatch-by-generic-constraint for structs is zero-cost, much like Rust generics or C++ templates. As of now, neither OpenJDK nor .NET suffers from virtual and interface calls to the same extent that C suffers from manually rolled vtables or C++ suffers from virtual calls, because both OpenJDK/GraalVM and .NET have compilers that are intimately aware of the exact type system they are targeting, which enables advanced devirtualization patterns. Notably, this also works as whole-program optimization for native binaries produced by .NET's nativeaot.

re 4. there is a gap in the programming community's understanding of the kind of constraints that lifetime analysis imposes on the dynamism allowed by JIT compilation, which comes with the trade-off of being able to invalidate previous assertions about when an object or struct is truly no longer referenced, or whether it escapes: you may no longer be able to re-JIT the method, attach a debugger, or introduce some other change. There is also still a lack of understanding of where the cost of GC comes from, how it compares to other memory management techniques, and how it interacts with escape analysis (which in many ways resembles static lifetime analysis for linear and affine types), particularly when it is inter-procedural. I am saying this as a response to "GC-by-default is an automatic loss", which sounds like the overly generalized "GC bad" you get used to hearing from an audience that has never looked at it with a profiler.

And lastly: latency-sensitive gamedev, with its predictability requirements, comes with a completely different set of constraints than regular application code, and tends to require comparable techniques regardless of the language of choice, provided it has capable compiler and GC implementations. It greatly favours GCs with low or schedulable STW pauses, ideally with some or most collection phases running concurrently, which perform best at moderate allocation rates (pauseless-like and especially non-moving designs tend to come with very ARC-like synchronization costs and low throughput (Go), or with significantly higher heap sizes over the actively used set (JVM pauseless GC impls. like Azul's, maybe ZGC?)). In the Unity case, there are quite a few poor-quality libraries, as well as constraints of Unity itself with regard to its rudimentary non-moving GC, which did receive upgrades for incremental per-frame collection but would still cause issues in scenarios where it cannot keep up. This is likely why the author of the parent comment is so up in arms about GC.

However, for complex, frequently allocated and deallocated object graphs that do not have an immediately observable lifetime confined to a single thread, a good GC is vastly superior to RC+malloc/free, and can be matched by manually managing various arenas, at a much greater complexity cost, which is still an option in a GC-based language like C# (and is a popular technique in this domain).


> I assume you're talking about Unity, is that correct

That particular project was Unity. Which, as you know, has a notoriously poor GC implementation.

It sure seems like there are a whole lot more bad GC implementations than good. And good ones are seemingly never available in my domains! Which makes their supposed existence irrelevant to my decision tree.

> good GC is vastly superior to RC+malloc/free

Ehhh. To be honest memory management is kind of easy. Memory leaks are easy to track. Double frees are kind of a non-issue. Use after free requires a modicum of care and planning.

> and can be matched by manually managing various arenas at much greater complexity cost, which is still an option in a GC-based language like C# (and is a popular technique in this domain).

Not 100% sure what you mean here.

I really really hate having to fight the GC and go out of my way to pool objects in C#. Sure it works. But it defeats the whole purpose of having a GC and is much much more painful than if it were just C.



