
Try doing C with a garbage collector ... it's very liberating.

Do `#include <gc.h>`, then just use `GC_malloc()` instead of `malloc()` and never free. And add `-lgc` when linking. It's already there on most systems these days; lots of things use it.

You can gain some efficiency by calling `GC_free()` in cases where you're really, really sure, but it's entirely optional and adds a lot of danger. Using `GC_malloc_atomic()` also adds efficiency, especially for large objects, if you know for sure there will be no pointers in that object (e.g. a string, buffer, image etc).
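For instance, a minimal sketch of the idea (the `make_node` helper is just for illustration; compile with something like `cc demo.c -lgc`):

    #include <gc.h>
    #include <stdio.h>
    #include <string.h>

    struct node { struct node *next; char *text; };

    static struct node *make_node(struct node *next, const char *s)
    {
        struct node *n = GC_malloc(sizeof *n);        /* may contain pointers */
        n->text = GC_malloc_atomic(strlen(s) + 1);    /* pointer-free payload */
        strcpy(n->text, s);
        n->next = next;
        return n;                                     /* nobody ever frees this */
    }

    int main(void)
    {
        GC_INIT();                                    /* recommended before first allocation */
        struct node *list = NULL;
        for (int i = 0; i < 100000; i++)
            list = make_node(list, "hello");
        printf("%s\n", list->text);
        return 0;
    }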

There are weak pointers if you need them. And you can add finalizers for those rare cases where you need to close a file or network connection or something when an object is GCd, rather than knowing programmatically when to do it.
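A hedged sketch of the finalizer case, closing a `FILE *` when its wrapper object is collected (the `logfile` wrapper and `open_logfile` name are just for illustration):

    #include <gc.h>
    #include <stdio.h>

    struct logfile { FILE *fp; };

    /* matches GC_finalization_proc: called once the object becomes unreachable */
    static void close_on_collect(void *obj, void *client_data)
    {
        (void)client_data;
        struct logfile *lf = obj;
        if (lf->fp) fclose(lf->fp);
    }

    struct logfile *open_logfile(const char *path)
    {
        struct logfile *lf = GC_malloc(sizeof *lf);
        lf->fp = fopen(path, "a");
        GC_register_finalizer(lf, close_on_collect, NULL, NULL, NULL);
        return lf;   /* no explicit close needed; the GC runs the finalizer */
    }

Note that finalizers run at some unspecified point after the object becomes unreachable, so they're a safety net rather than a substitute for an explicit close when timing matters.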

But simply using `GC_malloc()` instead of `malloc()` gets you a long long way.

You can also build Boehm GC as a fully transparent `malloc()` replacement, replacing `operator new()` in C++ too.




> Try doing C with a garbage collector ... it's very liberating.

> Do `#include <gc.h>` then just use `GC_malloc()` instead of `malloc()` and never free.

Even more liberating (and dangerous!): don't even malloc, just use variable-length arrays:

    void f(float *y, float *x, int n)
    {
            float t[n];  // temporary array, destroyed at the end of scope
            ...
    }
This style forces you to allocate the memory at the outermost scope where it is visible, which is a nice thing in itself (even if you use malloc).
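A complete little example of the same pattern (the `prefix_sums` function is hypothetical, just to show the scratch array living and dying with the call):

    #include <stdio.h>

    /* y[i] = x[0] + ... + x[i]; the temporary VLA lets y and x alias safely */
    void prefix_sums(float *y, const float *x, int n)
    {
        float t[n];                      /* sized at run time, gone at '}' */
        float acc = 0.0f;
        for (int i = 0; i < n; i++) {
            acc += x[i];
            t[i] = acc;
        }
        for (int i = 0; i < n; i++)
            y[i] = t[i];
    }

    int main(void)
    {
        float x[] = {1, 2, 3, 4};
        float y[4];
        prefix_sums(y, x, 4);
        printf("%g\n", y[3]);            /* prints 10 */
        return 0;
    }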


At first I really liked this idea, but then I realised the size of stack frames is quite limited, isn't it? So this would work for small data but perhaps not big data.


In theory, this is a compiler implementation detail. The compiler may choose to put large stacks on the heap, or to not use a stack/heap system at all. The semantics of the language are independent of that.

In practice, stack sizes used to be quite limited and system-dependent. A modern Linux system will give you several megabytes of stack by default (128MB in my case, just checked on my Linux Mint 22 "Wilma"). You can check it using "ulimit -a", and you can change it for your child processes using "ulimit -s SIZE_IN_KB". This is useful for your personal usage, but may pose problems when distributing your program, as you'll need to set up the environment where your program runs, which may be difficult or impossible. There's no ergonomic way to do that from inside your C program, that I know of.
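For reference, the same limit that "ulimit -s" reports can at least be queried (and, up to the hard limit, raised) from C with POSIX `getrlimit`/`setrlimit`; a minimal sketch:

    #include <stdio.h>
    #include <sys/resource.h>

    int main(void)
    {
        struct rlimit rl;
        if (getrlimit(RLIMIT_STACK, &rl) != 0) {
            perror("getrlimit");
            return 1;
        }
        if (rl.rlim_cur == RLIM_INFINITY)
            printf("stack soft limit: unlimited\n");
        else
            printf("stack soft limit: %llu KB\n",
                   (unsigned long long)(rl.rlim_cur / 1024));
        /* setrlimit(RLIMIT_STACK, &rl) with a larger rlim_cur (up to rlim_max)
           raises the soft limit for this process and its children; many
           programs re-exec themselves afterwards to be sure it takes effect. */
        return 0;
    }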


It's a giant peeve of mine that automatic memory management, in the C-language sense of the resource being freed at the end of its lexical scope, is tied to the allocation being on the machine stack, which in practice may have incredibly limited size. Gar! Why!?


Ackshually, it has nothing to do with the C language. It's an implementation choice by some compilers. A conforming implementation could give your stack the whole of RAM and swap.


Yes, but does any implementation actually do that?

AFAIK Ada is typically more flexible, but that has to do with the language actually giving you enough facilities to avoid heap allocations in more cases - e.g. you can not only pass VLAs into a function in Ada, but also return one from a function. So it becomes idiomatic, and compilers then have to support this (usually by maintaining a second "large" stack).


Yeah, usually the stack ulimit is only a few MiB for non-root processes by default on Linux.

It is easy enough to increase, but it does add friction to using the software when it exceeds the default stack size limit on most Linux installs. Not even sure why the stack ulimit is still a thing; who cares if the data is on the stack vs the heap?


As a firmware engineer, I do.


It isn't a practical pattern for anything beyond the most trivial applications. Consider what this would look like if you tried to write a text editor, for instance - if a user types a new line of text, where is the memory for that allocated?


Those would be the difficult questions one would be forced to confront ahead of time with this technique. That's not a bug; it's a feature!

Similar to what Ada does with access types, which are lexically scoped.


The problem is that, regardless of the amount of confrontation, it does not have an answer for any event-loop based program with an unbounded run time, other than "allocate all of memory into a buffer at startup and implement your own memory manager inside that".

Which just punts the problem from a mature and tested runtime library to some code you just make up on the spot.


Heap was invented for a reason, and some tasks are naturally easier to model with it.

The problem is that once it's there, people start using it as the proverbial hammer, and everything looks like a nail even if it isn't.

Note though that "allocate all of memory into a buffer at startup" is a lot more viable if you scope it not to the start of the app, but to the entry point of some code that needs to make a complicated calculation. It's actually not all that uncommon to need something heap-like to store temporary data as you compute - e.g. a list or map to cache intermediate results - but which shouldn't outlive the computation. Ada access types give you exactly that: declare them inside the top-level function that's your entrypoint, allocate as needed in nested functions as they get called, and know that it'll all be cleaned up once the top-level function returns.
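A rough C analogue of that idea, sketched with hypothetical names: a throwaway arena created at the entry point of the computation, handed to nested helpers, and dropped in one go at the end.

    #include <stddef.h>
    #include <stdlib.h>

    struct arena { char *base; size_t used, cap; };

    static void *arena_alloc(struct arena *a, size_t n)
    {
        size_t align = _Alignof(max_align_t);
        size_t start = (a->used + align - 1) & ~(align - 1);
        if (start + n > a->cap) return NULL;     /* sketch: no growth */
        a->used = start + n;
        return a->base + start;                  /* no per-object free */
    }

    /* nested helpers allocate scratch data from the caller's arena */
    static int *cache_intermediate(struct arena *a, int value)
    {
        int *p = arena_alloc(a, sizeof *p);
        if (p) *p = value;
        return p;
    }

    int expensive_computation(void)
    {
        struct arena a = { malloc(1 << 20), 0, 1 << 20 };   /* local "heap" */
        if (!a.base) return 0;
        int *tmp = cache_intermediate(&a, 42);
        int result = tmp ? *tmp : 0;
        free(a.base);                            /* everything dies together */
        return result;
    }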


That works for something where the events being handled are like "serve a web page" or "compile a C function". It doesn't work for a spreadsheet or word processor or a web browser.


It would be more accurate to say that it doesn't work for some of the allocations in a spreadsheet or word processor app. Which is why you still have the global heap, but the point is that not everything needs to be on the same heap that has the same overall lifetime. That spreadsheet might still be running some algorithm that can do what it needs to do with a local heap.

And that aside, there are still many apps that are more like "serve a web page". Most console apps are like that. Many daemons are, too.


I'm not convinced it even works very well for either of those cases. It's common in many applications to return the result of a computation as an object in memory, like an array or string of arbitrary length or a treelike structure. Without the ability to allocate memory which exists after a function exits, I'm not sure how you'd do that (short of solutions which create arbitrary limits, like writing to a fixed-size buffer).


Well, yes, but I'm trying to be generous to the PoV.

My preferred solution is definitely to use the GC. With some help if you want. You can GC the nursery each time around the event loop. You can create and destroy arenas.


C with dynamic arrays and classes? Object Pascal says hello…


I think one of the nice things about C is that, since the language was not designed to abstract things like the heap, it is really easy to replace manual memory management with a GC or any other approach to managing memory, because most APIs expect heap allocation to go through `malloc()`.

I think the only other language that has a similar property is Zig.


Odin has this too:

> Odin is a manual memory management based language. This means that Odin programmers must manage their own memory, allocations, and tracking. To aid with memory management, Odin has huge support for custom allocators, especially through the implicit context system.

https://odin-lang.org/docs/overview/#implicit-context-system


Interesting - I was thinking of a language that combined Zig and Scala to allocate memory using implicits, and this looks exactly like what I had in mind.

Not that I actually think this is a good idea (I think the explicit style of Zig is better), but it is an idea nonetheless.


A lot of our Odin procs take an allocator as a required argument specifically so they force you to choose one at the call site, because yes, often you want it to be explicit.


I am now thinking of a language where the default allocator is a GC, but you can change it to something else using implicits/context, and this is enforced everywhere (e.g. there is no way to allocate memory "from magic"; you need to call the allocator, even if it comes from the context).

At least thinking in Scala as a base syntax here, it would mean:

    def foo(name: String)(implicit allocator: MemoryAllocator): StringBuf = {
      val buf = allocator.new(100, String)
      buf.append("Hello ")
      buf.append(name)
      buf
    }

    println(foo("John")) // implicit allocator from context, default to GC, no need to free

    implicit val allocator: MemoryAllocator = allocators.NewAllocator()
    println(foo("Alice")) // another implicit allocator, but now using the manual allocator instead of GC
    defer allocator.free()

    val explicitAllocator = allocators.NewArenaAllocator()
    foo("Marie")(explicitAllocator) // explicit allocator only for this call
    defer explicitAllocator.free()

I don't think there is any language right now with this focus, since Odin seems to be focused on manual allocation.


> Try doing C with a garbage collector ... it's very liberating.

Doing that means I lose some speed and have to wait for GC collection pauses.

Then why shouldn't I use C#, which is more productive and has batteries-included libraries and frameworks that help me build functionality fast?

I thought that one of the main points of using C is speed.


Every time I've tried it, adding GC to a C program makes it faster. It will increase the RAM usage by a bit, exactly when pauses happen is less predictable (unless you explicitly trigger a GC in e.g. your main loop), and individual pauses are possibly larger, but overall throughput improves: total execution time goes down.

`malloc()` and `free()` aren't free, and in particular `free()` does a lot more bookkeeping and takes more time than people realise.


C# has a disgusting code style: IWillNotReadCode where variable or (and that's even worse) function/method names LookLikeThis.


Which GC are you using in these examples?


I'm not OP but the most popular C GC is Boehm's: https://www.hboehm.info/gc/



