Yeah, I'm being unfair in naming Go & Java specifically. But these stories of 'fixing' garbage collection come up all too often.

I wonder when we'll see a further GC update that trades latency for throughput...

The problem seems to be that no matter how you tweak GC, you will always have a class of program that it performs terribly for (and it seems to impact a large group of programs, never just some obscure corner case). So I suspect that this latest GC tweak will have unexpected results on some other class of program, leading to another tweak, and so on...




> The problem seems to be that no matter how you tweak GC, you will always have a class of program that it performs terribly for

For casual use, most programs can treat GC like magic, but if you are doing serious work in a language with GC, then you should learn about the GC's characteristics. That bit of due diligence and up front design effort is still often going to be tons cheaper than doing the manual memory management.

Reducing latency in exchange for throughput is the right decision for the vast majority of programs that will be written in Go. It was already a very attractive language for writing a multiplayer game server, so long as I didn't have very large heaps. (Even so, I can still support 150-250 players and tens of thousands of entities.) With the "tweak," that limitation is much relaxed.


> often going to be tons cheaper than doing the manual memory management.

And on top of that, manual memory management is not free. I maintain a simple but high-throughput C++ server at Google, and tcmalloc is never less than 10-15% of our profiles.

Don't get me wrong, I'm not saying that Go is faster than C++ or ever will be. I'm just trying to counter the notion that "GC is expensive, manual memory management is near zero runtime cost."


I bet that if someone who knew what they were doing decided to optimize that, you'd get the cost WAY down, possibly almost to zero. (If you are using std::string, that is your problem right there).

But the very important difference here is that in your case you have a choice: it is possible to optimize the cost away and to otherwise control when and how you pay it. In GC systems it is never possible to do this completely; you can only sort of kind of try to prevent GC. It's not just a difference in magnitude, it's a categorical difference.


Perhaps. The team is a group of seasoned veterans of high performance server engineering. But perhaps there are others who could improve on our efforts by a significant margin.

Of course we do not use std::string.


If you really really want to, you can allocate a buffer for all your data.


This solves little. What do you think the system allocator is doing under the covers?


It's doing a lot less, if you're allocating one buffer for your data instead of many.
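
For illustration, a minimal C++ sketch of the idea (the Entity type and count are invented): one contiguous buffer holds everything, so the allocator is hit once instead of thousands of times.

    #include <vector>

    struct Entity { float x, y; int hp; };   // hypothetical game object

    int main() {
        std::vector<Entity> entities;
        entities.reserve(10000);             // one allocation up front
        for (int i = 0; i < 10000; ++i)
            entities.push_back({0.0f, 0.0f, 100});
        // ... use entities; one free when the vector is destroyed.
    }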


Just curious: Have you tried jemalloc, and what numbers did you get?


We haven't. Google infrastructure uses tcmalloc. Is there a reason to believe it offers a significant win?


I'd expect similar performance, but perhaps less fragmentation, and less memory used by the process if you aren't regularly calling MallocExtension::instance()->ReleaseFreeMemory() as a tcmalloc user.
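
For concreteness, a minimal sketch of the tcmalloc call mentioned above, which comes from gperftools' MallocExtension interface (the wrapper function is just for illustration):

    #include <gperftools/malloc_extension.h>

    // tcmalloc keeps freed memory in its own caches; this asks it to
    // return unused pages to the OS. (Link with -ltcmalloc.)
    void trim_allocator_caches() {
        MallocExtension::instance()->ReleaseFreeMemory();
    }

    int main() {
        trim_allocator_caches();
    }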

The first answer at https://www.quora.com/Is-tcmalloc-stable-enough-for-producti... (by Keith Adams) is completely consistent with what I've seen. Rust went with jemalloc for some reason too.


IIRC jemalloc is somewhat better about releasing memory in a timely fashion, at least by default.


"That bit of due diligence and up front design effort is still often going to be tons cheaper than doing the manual memory management"

That's just a pipe dream. I say this having spent inordinate amounts of time trying to tune myriad parameters in JVM GC for large heap systems without ultimate success. What it always comes down to is, how much extra physical RAM you're willing to burn to get some sort of predictable and acceptable pauses for GC. It's usually an unacceptable amount.


> That's just a pipe dream. I say this having spent inordinate amounts of time trying to tune myriad parameters in JVM GC for large heap systems without ultimate success.

Patient: Doctor, it hurts when I do this!

Doctor: Don't do that!

Possibly, divide your heap into smaller pieces with their own GC? Restructure your system, such that most of your heap is persistent and exempt from GC? I don't know the details of the system you're trying to build, of course. It sounds interesting and challenging.


"Possibly, divide your heap into smaller pieces with their own GC? Restructure your system"

That's the common recommendation (resisting the urge to call it a "pat answer"). Suffice it to say, this is not always possible. Apart from all the business-related issues with rewriting a complex system from scratch, breaking up a large shared-memory system into smaller, communicating processes multiplies both the software complexity (roughly O(N^2), where N is the number of new components created, since components can interact pairwise) and the hardware requirements in its own right: think of all the overhead of marshalling/demarshalling, communication latencies, thread management, and the extra cache misses from fragmenting that nice giant cache you were hosting in that big JVM heap.


I'm curious how much physical ram is an unacceptable expense to you, given how cheap it is.


Even the amount of RAM parceled out for virtual servers is an embarrassment of riches, provided you pay for something other than the bottom tier!

In the context of games, and other domains as well, I think too much attention is paid to pushing the envelope and not enough to how much awesome can be had with what is readily available.


> That bit of due diligence and up front design effort is still often going to be tons cheaper than doing the manual memory management.

Calling shenanigans. No it's not, unless the person doing the manual solution is a novice.


Despite the drastic page limit in the category I was submitting to, I made sure to include a paragraph in http://frama-c.com/u3cat/download/CuoqICFP09.pdf about how GC enables sharing, and how the only reasonable alternative when implementing a similar system in a non-GC language is a lot of gratuitous copying to solve ownership issues.

(The page limit was 4. Organizers only raised it to 6 after seeing submitted papers.)

I can also confirm the “bit of due diligence” part, and the fact that it's cheaper than the aggravation of not having automatic memory management at all. In the example that I can contribute to the discussion, the due diligence amounted to two more short articles: http://cristal.inria.fr/~doligez/publications/cuoq-doligez-m... and http://blog.frama-c.com/public/unmarshal.pdf


> GC enable sharing and how the only reasonable alternative when implementing a similar system in a non-GC language is a lot of gratuitous copying to solve ownership issues

The solution to unclear or shared ownership is generally reference counting. There's a reason why shared_ptr is called that.


Along with the usual set of locks, cache contention, and pauses on cascading deletions of deep data structures that it brings.


You don't need locks to RC immutable structures, just atomics (and not even that if the system is single-threaded)
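
A minimal C++ sketch of that point (the Config type is invented): std::shared_ptr's reference count is maintained with atomic operations, so immutable data can be shared across threads with no lock.

    #include <memory>
    #include <string>
    #include <thread>
    #include <vector>

    // Immutable payload: built once, then only read.
    struct Config {
        std::string name;
        std::vector<int> values;
    };

    int main() {
        // One heap allocation; the control block's refcount is atomic,
        // so copying the pointer across threads needs no lock.
        auto cfg = std::make_shared<const Config>(Config{"prod", {1, 2, 3}});

        std::vector<std::thread> workers;
        for (int i = 0; i < 4; ++i)
            workers.emplace_back([cfg] {       // copy bumps the count
                auto n = cfg->values.size();   // read-only, lock-free
                (void)n;
            });
        for (auto& t : workers) t.join();
        // The last owner going away frees the Config deterministically;
        // that is also where a deep structure pays its teardown cost.
    }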


Reference counting is a garbage-collection system like the others (and if you are going to use a garbage-collection system, you can for many usecases do better than reference counting).


> Reference counting is a garbage-collection system like the others

Reference counting is a form of automated memory management which can easily be integrated into a manually-managed system, and can be applied to just the subset of in-memory structures that needs it (again, see shared_ptr). Not so for more complex garbage-collection systems, which tend to interact badly with manual or ownership-based memory management. That puts the lie to your assertion that the only way to implement sharing in a non-GC language is "gratuitous copying".


Yes, it's a shame that you were not a reviewer, mid-2009, of my article published in September 2009.


It's not the writing of manual memory management in the usual case/happy path that's the problem. It's the very occasional mistake and the debugging time involved. (Though to be fair, automated static analysis tools have taken great strides, and this is not as big a problem as it used to be.)

What GC often gets you is a program that doesn't crash but instead has performance problems; those are usually easier to profile and track down, and less severe, than a crash. (Manual memory management isn't immune to the same performance problems in any case.)

In other words, GC gets you to "Step 1 -- Get it Correct" faster so you can play with running code faster. The cost/benefit may not fit your situation. In that case, use a different tool.


> I wonder when we'll see a further GC update that trades latency for throughput...

This GC update in Go already trades latency for throughput, because of the added write barrier.

There is no free lunch in GC. Most features that reduce latency reduce throughput. For example, Azul C4 has lower throughput than HotSpot's GC does.
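
For readers unfamiliar with the term: a write barrier is extra bookkeeping on every pointer store so the collector can run concurrently with the mutator. A conceptual sketch in C++ (not Go's actual implementation):

    #include <unordered_set>

    struct Object { Object* field = nullptr; };

    std::unordered_set<Object*> marked;    // toy mark set
    bool collector_running = true;

    // Dijkstra-style insertion barrier, conceptually: while the
    // collector runs concurrently, every stored pointer is "shaded"
    // so the mark phase cannot miss it. This per-store cost is the
    // throughput that was traded for lower pause times.
    void write_pointer(Object** slot, Object* value) {
        if (collector_running && value) marked.insert(value);
        *slot = value;
    }

    int main() {
        Object a, b;
        write_pointer(&a.field, &b);       // the store pays the barrier
    }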


I hope you realize that malloc is far from free in a non-GC world right? (In a GC world allocating is just moving a pointer forward.) You pay the cost somewhere.

The CLR has also done a lot of GC work to enable concurrent GC, thread-local heaps, and "zero pause" (in reality extremely low constant time pauses).

The only way to avoid paying the cost for managing memory is to allocate everything you need once and never release it.


I hope you realize that stack allocation can replace a lot of allocation that would be done by a GC? And that having control over memory layout can lend itself to better performance? And that naively mallocing everywhere is not the only or fastest way to manually manage memory, and sometimes isn't even the easiest.
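
A small C++ sketch of the first point (buffer size and transform are invented): the scratch space never touches the allocator or any GC, it's a single contiguous block, and it's reclaimed for free on return.

    #include <array>
    #include <cstddef>
    #include <cstdio>

    // The scratch buffer lives on the stack: no malloc, no GC,
    // contiguous layout, reclaimed when the function returns.
    void process_packet(const unsigned char* data, std::size_t len) {
        std::array<unsigned char, 1500> scratch;
        std::size_t n = len < scratch.size() ? len : scratch.size();
        for (std::size_t i = 0; i < n; ++i)
            scratch[i] = data[i] ^ 0xFF;   // some transform
        std::printf("processed %zu bytes\n", n);
    }

    int main() {
        unsigned char pkt[4] = {1, 2, 3, 4};
        process_packet(pkt, sizeof pkt);
    }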


Go also has stack allocation for objects, based on escape analysis; basically, if the compiler can prove that a variable doesn't escape, it is allocated on the stack, otherwise on the heap. Improvements to escape analysis in the compiler thus also reduce heap size by allocating more things on the stack.


Many GC'd languages also have stack allocation (see dynamic-extent in Common Lisp, for example); when talking about GC vs malloc (already an over-simplified dichotomy), we should be talking about heap allocations of indefinite extent.


Many GC languages have stack and global static memory allocation as well.

Go being one of them.

Others, Oberon family of languages, Modula-3, D, Eiffel and even .NET to a certain extent.

Having managed heap doesn't mean other allocation types aren't available.


Too true! I've written a couple of different mallocs before, and I'd recommend it as a project to anyone who thinks malloc() is just a simple, lightweight operation.

It's not an either/or choice though, picking malloc or GC. There is a whole spectrum of allocation styles you can do that might be better for a particular application. For example, a server could use per-request memory pools, which effectively can turn related mallocs into a 'move the pointer forward' operation and the whole lot can be free()d together.
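
A toy version of such a pool, for illustration only (capacity and alignment policy are arbitrary):

    #include <cstddef>
    #include <vector>

    // One allocation up front; each "allocation" just bumps an offset.
    // reset() frees everything from the request at once.
    class RequestPool {
    public:
        explicit RequestPool(std::size_t capacity) : buf_(capacity) {}

        void* alloc(std::size_t n,
                    std::size_t align = alignof(std::max_align_t)) {
            std::size_t p = (used_ + align - 1) & ~(align - 1); // align must be a power of two
            if (p + n > buf_.size()) return nullptr;            // pool exhausted
            used_ = p + n;
            return buf_.data() + p;
        }

        void reset() { used_ = 0; }   // 'move the pointer' back to the start

    private:
        std::vector<unsigned char> buf_;
        std::size_t used_ = 0;
    };

    int main() {
        RequestPool pool(1 << 16);
        void* counters = pool.alloc(64 * sizeof(int));
        (void)counters;
        pool.reset();                 // whole request freed together
    }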

I'm not saying GC is worthless. I just have a distaste for GC because it doesn't truly deliver on the promise of removing worries about memory management. You still pay the cost and can be tripped up by nasty GC performance. Even worse, the garbage collector behaviour can change between language versions and a well-tested application can suddenly hit dire performance problems. Once you have to consider GC problems, IMO you might well be better off doing old fashioned app-controlled memory allocation.


"malloc is far from free". Hehe. Pun intended?

I always thought that malloc was further from free with GC :P


> In a GC world allocating is just moving a pointer forward.

which is usually for short lived objects only, and in a non-GC world these get put on the stack anyway.



