In C, pointers require you to think deeply about the ownership and lifetime of a...

qsort · on April 23, 2021

I don't disagree with that, but most cases fall within a pretty clear pattern:

- typedef struct { ... } foo

- foo *foo_create()

- void foo_destroy(foo *)

- a bunch of functions that take foo* as their first arg

which is kind of the same as a class and only more error-prone.

I say this as someone who actually _likes_ C, but the manual memory management model is very often unnecessary, confusing, repetitive. There was an idea some time ago of a language extension that would extend the concept of automatic storage duration to allow an explicit destructor to be called when the variable goes out of scope, like <close> variables in some languages. I genuinely think things like that would make the language a bit more ergonomic without fundamentally changing its nature.

bluetomcat · on April 23, 2021

That's why I tend to always prefer automatic and static storage to dynamic allocation wherever possible, especially in cases where you don't have "N" items or "N" cannot possibly exceed a certain small value. Also, allocation/deallocation of the certain object need not be defined within its module. It should be up to the caller to decide whether to allocate the object on the stack, statically or dynamically depending on the caller's situation:

    foo f;
    foo_init(&f);
    foo_destroy(&f);
    ...
    foo *g = malloc(sizeof(foo));
    foo_init(g);
    foo_destroy(g);
    free(g);

unwind · on April 23, 2021

That is of course fine, with the drawback that it requires a public definition of the foo type.

I would write the allocation as

    foo * const g = malloc(sizeof *g);

to avoid repeating the type name and "lock" the allocation to the variable. If the type on the left hand side ever would change, this still does the right thing.

quelsolaar · on April 23, 2021

Interesting, I tend to prefer a create and destroy function that allocates and frees the structure. That way you can have foo without it being initialized, and you cant have foo been freed without being de-initialized. Where do see the value in being able to move it between different memory types?

dahart · on April 23, 2021

In games and embedded systems, a very common pattern is batched heap allocation: a single allocation of an array of foo, and/or an array of foo mixed with other types, or sometimes a memory managed data structure like a memory pool. This is one big reason that the C++ Standard Template Library was shunned in game studios for a long time; it automatically did it’s own heap allocations, sometimes at inopportune times IIRC like in constructors, and didn’t let the caller choose. EA wrote their own version (EASTL) that was game friendly because it allowed control of the allocator.

In GPU programming, there are actual different memory sub-systems, with different sizes and different performance implications, so it’s critical that the caller is able to do the allocation & deallocation any way they want. This is why most well designed GPU APIs rarely allocate GPU memory inside the API, but instead are designed to work with caller-provided pointers to buffers.

TorKlingberg · on April 23, 2021

Coming from the embedded world, I am far more used to the style bluetomcat showed where the caller handles allocation. It takes more work to use, and you have to make sure to not use it before initializing or after destroying, as you said. The advantage is that the caller has full control of memory allocation. The object can be a local variable, a static, malloc:ed, come from a custom allocator (say a slab allocator), or be a part of a larger struct.

quelsolaar · on April 23, 2021

That makes a lot of sense. I do some embedded development, but when I dont, I tend to prefer heap just because my memory debuggers are much more capable working with heap memory.

In the case you show, foo would have to be a struct that doesn't contain pointers to additional allocated memory, but its a entirely valid use case and pattern.

Calling malloc and free, has a cost associated with it, that the stack doesn't. But stack can in some cases, like with recursion be scary to use, because you dont know where it ends. If malloc returns NULL you know you have found the end and can do something reasonable.

Thanks for you insight!

krychu · on April 23, 2021

I always lacked a good discussion on the subject of allocation/deallocation strategies. Is there some recommended good read?

alexhutcheson · on April 23, 2021

It’s not standardized, but if you’re using gcc, you can use the cleanup attribute for this: https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attribute...

IIRC clang implements it as well, but I wasn’t able to find a reference to it in their docs.

AlbertoGP · on April 23, 2021

I’m actually implementing that right now for my own use, as a pre-processor.

There is a much more advanced design and implementation at “A defer mechanism for C” (December 2020): https://gustedt.wordpress.com/2020/12/14/a-defer-mechanism-f...

For my own purposes, I think I can live without handling stack unwinding so I continue working on my pre-processor.

Since the pre-processor is not yet finished, there I use a vector¹ of {.pointer, .destructor} where I put objects after initialization, with one macro at the end of each managed scope that calls the destructors for that scope in reverse order, then another macro meant for the function exit points that calls all the destructors in that vector. This has been built many times before by other people of course, it’s just an exercise to see which difficulties arise.

¹ Vector, growable array: I did my own trivial type-generic growable array, with the classic {.capacity, .len, .items}, but again there are many previous implementations. The two I’ve found more interesting are:

- “C Template Library (CTL)”: https://github.com/glouw/ctl

- “Klib: a Generic Library in C”: https://github.com/attractivechaos/klib/

tirrex · on April 23, 2021

For a vector alternative, I find this one easy to read and use :

https://github.com/tezc/sc/tree/master/array

It is just an array of your type, e.g int *numbers, so you have type info in debugger as well.

Arch-TK · on April 23, 2021

No type safety and broken alignment. No thanks.

tirrex · on April 23, 2021

Isn’t it type safe? e.g you can’t add char* to a double array.

What do you mean by broken alignment?

Arch-TK · on April 24, 2021

It lacks type safety because it's not possible to distinguish between an sc_array and a pointer so there's no way of detecting that someone passed a char * that nobody had ever called sc_array_create on to sc_array_add for example.

The alignment is broken because nothing in the C standard guarantees that the elems member of sc_array will be aligned correctly for any possible element type.

I also spotted another problem, in sc_array_init the code `void *p = a` is also not guaranteed to work. In an example snippet such as `int iv; sc_array_create(iv, 0);` expands to `sc_array_init(&iv, sizeof iv, 0)` so the type of the expression `&iv` is `int *` which is then being converted to `void *` in the function which is actually not allowed by the standard. This is also the reason why if you were writing a wrapper around realloc which exited if the allocation failed you would still have to pass in the current pointer with void * and return the new pointer with void *. This could be applied here actually as an easy fix but it indicates even further to me that the author of the library is taking a very leisurely approach to writing conforming C. This pattern also appears in the other two functions though and I'm not sure if in those cases it's something which can be easily fixed.

glouwbug · on April 23, 2021

With a bit of trickery CTL works much like the STL, in that containers can be built of containers:

    #define P
    #define T int
    #include <vec.h>

    #define T vec_int
    #include <deq.h>

A deq_vec_int - analogous to std::deque<std::vector<int>> - is a neat example.

cygx · on April 23, 2021

There's also a CTL fork that might be of interest: https://github.com/rurban/ctl

oscargrouch · on April 23, 2021

I also like the Ken Thompson extensions from which part of Go was inspired by

    struct X { int a; }
    struct Y { *X; int b; int c;}

    void add(Y* self, int number) {
      self->a += number;
    }

    Y y;
    y.a = 10; // composition
    y.add(1) // y.a = 11 now

This alone would simplify C coding so much without taking any power out of it.

The other extension i would add is some sort of interface or protocol.

As soon as you can do something like "y.add(1)", having a generic contract to refer to things without having to know its concrete type is some of the good things from the OOP world.

With this you would also be able to call some cleanup code and even a initializer.

This is still C and its still much simpler than C++, and yet almost as powerful.

C should propose these kind of things even if it was not that conservative and it would retain a lot of coders that migrate instead giving C barely evolved from its 70's roots.

Joker_vD · on April 23, 2021

Yeah, except when it doesn't follow the pattern: for example, there is no foo_destroy(), you're supposed to "just" call free() when you're done with it. Used to be very common (not sure how it is now), and very frustrating when you link against non-standard allocators.

jhgb · on April 23, 2021

> In C, pointers require you to think deeply about the ownership and lifetime of any "allocated object" at runtime.

Unless you decide you use libgc, presumably?