The pervasive effects of C's malloc() and free() on C APIs (utcc.utoronto.ca)
154 points by todsacerdoti on Aug 7, 2022 | 104 comments


I'm convinced there was another style of C API where the callee would malloc a struct, populate it, and free it immediately before returning a pointer to it. Of course, the only way the caller can use the result is after it has been freed.

Naturally there was a dependency on the exact behavior of the allocator, specifically, that it had to leave a freed block of memory untouched sufficiently long for the caller to be able to use the results. I seem to recall the stipulation was that freed memory was left untouched until the next memory allocation operation. The caller also had to be careful about using or copying the results immediately, before doing too much work.

I have dim memories of people talking about this sort of thing at university in the early 1980s; we were a 4.2 BSD shop. I also recall debugging some old C source code (srogue, which also has BSD heritage) decades later, and encountering use-after-free crashes. There were several instances of this. There were too many to be accidental; it seemed deliberate to me.

I suspect the reason for this "technique" was to relieve the caller of the burden of freeing the memory. It allowed the callee to return variable-length data easily, which couldn't be done if the pointer was to a static data area. And finally it relieved the callee of defining an explicit "free" API.

Frankly I think this is a terrible API style. However, code that used it properly and that was sufficiently careful would actually function properly. But it seems like an incredibly fragile and sloppy way to design a system.


Sounds like just a common mistake where people used the "stack" accidentally.

Ex:

    struct someStruct* badfunc(){
        struct someStruct toReturn;
        toReturn.a = foo();
        toReturn.b = bar(); 
        return &toReturn; 
    }
In most people's code, this would probably work...

    struct someStruct a = *badfunc();
    func2(); // This will overwrite the "toReturn" 
             // from the last call, but as long as the 
             // struct was copied before any other function 
             // call, you're probably fine though in
             // undefined-behavior land
-----------------

Either that, or you're talking about strtok (and other non-reentrant functions).


I’ve used this before in embedded code to save rom space by not having to include malloc. The stack _is_ the heap! Just have to be very careful about when you can call another function.

The best version of this is where you allocate a block on your stack then pass that as a pointer up to the next function to use. The one who owns the memory is the one who allocates it (Rust style?). Or having the linker allocate global blocks works too.


Just bump the stack pointer after call to function and use the allocated memory without risk of overwrite, like alloca() does.


Question, why would this "probably work"? I would have agreed with your assessment if `a` was used for its one and only purpose before `func2` was called ... but as it stands, as soon as `func2` is called, the content of `a` will most likely be replaced by garbage (and therefore definitely will not "work" in any reasonable sense of the word), no?

Or do you mean something else by "probably work"? (like, in the sense that it will output "something").


Since func2 doesn't have any parameters, the only thing that calling it will put on the stack is the return address. In fact, if it's the last function call in the code block, even that might get optimized out. The struct should be intact and available via a local variable in func2.


It's because `a` is not a pointer to the result of badfunc; it is a copy. So the return value of `badfunc` might be overwritten if func2 is non-trivial, but the copy will not.


It was definitely malloc'd memory, as I remember removing free() calls from the callee and adding them to the caller.


That's the old "source/sink" pattern.

"Source" functions return a malloc'd pointer. You have to manually call free on it, or pass it to a "sink" function (which internally calls free). In C, this is seen in strdup.
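A minimal sketch of that source/sink pairing in plain C. The names `make_greeting` and `consume_greeting` are hypothetical and only illustrate who owns the pointer at each step:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* "Source": returns a malloc'd string that the caller now owns.
 * Returns NULL if allocation fails. */
char *make_greeting(const char *name)
{
    assert(name != NULL);
    const char *prefix = "hello, ";
    size_t len = strlen(prefix) + strlen(name) + 1;
    char *s = malloc(len);
    if (s == NULL)
        return NULL;
    snprintf(s, len, "%s%s", prefix, name);
    return s;
}

/* "Sink": takes ownership of s, uses it, and frees it.
 * Returns the length of the string it consumed. */
size_t consume_greeting(char *s)
{
    size_t n = (s != NULL) ? strlen(s) : 0;
    free(s); /* free(NULL) is a no-op */
    return n;
}
```

After `consume_greeting(make_greeting("bsd"))` the caller holds nothing and must not touch the pointer again; the convention lives only in the documentation, which is the weakness the rest of the thread keeps coming back to.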

C++ took this pattern and formalized it into auto_ptr<>, and later unique_ptr<> when RValue references became a thing in C++11.


Man this sounds like something some students came up with after partaking in the ganja. And it coming out of Berkeley in the 80s certainly tracks with that...

Not a good way of doing things. I mean, have fun using Valgrind. Or switching out libc, etc. And what about key material? There you would still have to do a second step of zeroing or junking the memory when done with it anyway.


Yes, grad students at Berkeley in the early 1980s. For some reason I associate this technique with Bill Joy (who obviously was a major influence on a lot of what went into the 4bsd releases). However I have no evidence of this, nor whether any or what kinds of substances might have been involved.


It also fails on multithdreaded programs, unless you add even more assumptions on the allocator.


"Multidreaded programs" :-)

This was at least a decade before multithreading in C and Unix. But yes this "technique" would have failed miserably in a multithreaded environment.


I mean, it wouldn't even work with interrupts, technically (though an interrupt that plays "nice" shouldn't allocate).


In GC languages a common approach is to have "finalizers" to make something like this a possible and sometimes convenient way of dealing with a foreign API, I wonder if what you saw was something similar? The idea is to allocate the foreign memory, then make a finalizer which is just a hook that will (eventually) call the foreign free only when some object (like a wrapper for the foreign memory) is collected. Something similar could be hidden behind some preprocessor macros, with guarantees only until the next OUR_MALLOC...

The problems for the GC languages tend to be fewer but if the wrapper is out of scope but someone grabbed and maintains a hold on the foreign memory directly, they're playing with fire as for when the GC will execute the finalizer hook and make that memory invalid. It's also a frustrating technique when foreign APIs -- particularly in certain graphics contexts -- require allocation threads to be the same as freeing threads, and of course depending on the implementation of free and the GC it might be an expensive operation to have a bunch of them suddenly happen at once when all you were expecting was a new native object and not a bunch of GC work behind the scenes.


That is why finalizers are yesterday's solution; most modern GC-based languages have eventually caught up with Common Lisp and offer region-based resource management (try-with-resources, use, using, defer, with,...), and in some cases trailing lambdas, which completely hide the resource management from the consumer.

For scenarios like you're describing, .NET has SafeHandles for example.


Definitely not GC. This was K&R C on BSD Unix around 1984.


Sounds like it was just using the heap to badly imitate returning variable-length data on the stack under a callee-preserved calling convention. (Callee writes the variable-length data inside their own stack frame, pops the stack frame in the function epilogue, and “leaks” the pointer and size of the data in caller-expected return registers. Caller uses the dangling data — carefully not pushing to the stack until it has finished. Everything works out.)


> I'm convinced there was another style of C API where the callee would malloc a struct, populate it, and free it immediately before returning a pointer to it. Of course, the only way the caller can use the result is after it has been freed.

That's more than a bad API design, that's undefined behavior -- squarely in nasal-demons territory. Depending on the compiler, the callee, the caller, or the entire observable universe can be optimized away into a no-op.


Oh yes, totally undefined.

But consider the time frame, early 1980s K&R C on 4bsd Unix on a VAX. This predates ANSI/ISO C and Posix. It even predates “nasal demons.” There was no specification; or perhaps the implementation was the specification. The fact was that at some point the bsd allocator did leave freed memory untouched until the next memory allocation operation, and so people wrote programs that relied on this.

Again, I’m not defending this, but this seemed to be the way that some people thought about things. I even remember questioning some code that used memory after having freed it. It was explained to me that this was “safe” because the memory wouldn’t be modified until the next malloc!

Also, remember that BSD was the system where if you did

    printf("%s", NULL);
it would print “(null)” instead of getting SIGSEGV. And in general, dereferencing a null pointer would return zero. The rationale for this was that it “made programs more robust.” (Again, I disagree, don’t argue with me about this!)

One more common technique from the BSD era (srogue again, but other programs did this too). To save the state of a program, write to a file everything between the base of the data segment to the “break” at the top of the data segment. To restore, just sbrk() to the right size and read it all back in, overwriting everything starting at the base of the data segment. I always found it surprising that this worked, but it worked often enough that people did sh!t like this.


Well, this sounds like it was before ANSI C, so there was no defined notion of undefined behaviour - I think that term of art came with the later standardisations. And if it was written to run on a specific OS or compiler (4BSD), one can argue that it was a really bad design, but it worked reliably on what was essentially the implementation-defined platform it was targeting.


In practice this is mostly avoidable. There are many libraries that do not allocate at all by forcing the caller to provide the memory. The library may tell in advance how much memory is necessary, or report in hindsight whether it was enough for the operation. This leads to a style of allocating ahead of time and also considering worst case memory size.

Otherwise I prefer libraries that allow setting the allocation function at least.
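As a rough sketch of what "setting the allocation function" can look like, assuming a hypothetical library with a `lib_` prefix (none of these names come from a real library). The counting allocator is only there to show that the library really routes through the injected functions:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef void *(*lib_malloc_fn)(size_t);
typedef void  (*lib_free_fn)(void *);

/* Hypothetical library state: defaults to the C runtime allocator. */
static lib_malloc_fn lib_malloc = malloc;
static lib_free_fn   lib_free   = free;

/* Call once, before any other library function allocates. */
void lib_set_allocator(lib_malloc_fn m, lib_free_fn f)
{
    assert(m != NULL && f != NULL);
    lib_malloc = m;
    lib_free = f;
}

/* A library function that allocates through the injected allocator. */
char *lib_strdup(const char *s)
{
    assert(s != NULL);
    size_t n = strlen(s) + 1;
    char *copy = lib_malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Matching release function, so callers never guess which free() to use. */
void lib_string_free(char *s)
{
    lib_free(s);
}

/* Example caller-supplied allocator that counts live blocks. */
static int live_blocks;

void *counting_malloc(size_t n)
{
    void *p = malloc(n);
    if (p != NULL)
        live_blocks++;
    return p;
}

void counting_free(void *p)
{
    if (p != NULL)
        live_blocks--;
    free(p);
}

int counting_live(void)
{
    return live_blocks;
}
```

A global hook like this is the simplest form; it is not thread-safe to change mid-run, and a per-object allocator parameter avoids that at the cost of a wider API.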


The especially great thing about this approach is it lets you put things on the stack when appropriate too. It's really annoying when some library forces you to spam the heap with short lived, easily scoped objects.


True. Also great for thread safety. And avoiding shared access performance pitfalls


This is the correct approach. Let the caller allocate the memory, and if necessary provide a function to calculate the required memory beforehand, and for the caller to specify the size of the memory passed to the primary function. For dynamically-sized results, another option is to provide iteration functions (like when walking a directory tree) in order to simplify memory sizing for each individual function calls.
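A small sketch of that shape, using a hypothetical hex encoder: one function reports the required size up front, and the primary function takes the buffer together with its capacity and never allocates. The names `hex_encoded_size` and `hex_encode` are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed to hex-encode n input bytes, including the trailing NUL. */
size_t hex_encoded_size(size_t n)
{
    return 2 * n + 1;
}

/* Writes the encoding into dst, which has room for cap bytes.
 * Returns 0 on success, -1 if dst is too small. Never allocates. */
int hex_encode(char *dst, size_t cap, const unsigned char *src, size_t n)
{
    static const char digits[] = "0123456789abcdef";
    assert(dst != NULL && (src != NULL || n == 0));
    if (cap < hex_encoded_size(n))
        return -1;
    for (size_t i = 0; i < n; i++) {
        dst[2 * i]     = digits[src[i] >> 4];
        dst[2 * i + 1] = digits[src[i] & 0x0f];
    }
    dst[2 * n] = '\0';
    return 0;
}
```

Because the size is known before the call, the caller is free to use a stack array, a static buffer, or its own heap, and the explicit `cap` parameter is what keeps this from turning into a `gets`-style overflow.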


Yes, that is how it is taught. However, for cases that require some scratch space, dynamic allocation can be easier to manage, less error-prone, and use less memory overall (you don't need to preallocate the worst-case amount).

That's why, even though people know this, many C APIs still return dynamically allocated objects or simply let you inject malloc / free if you want more control.

This is a roundabout way to say: if you aspire to provide APIs with zero dynamic allocation, go ahead. But if you find yourself struggling with more complicated code as a result, consider that allowing a little dynamic allocation may help.


There is a funny version of this:

Caller provides memory and function fails if there is too little memory, but writes the required (variable) amount of memory to an `out length` pointer.

Effectively leading to a common pattern of:

1. call function with empty buffer (or buffer of arbitrary size)

2. allocate buffer

3. call function again with properly sized buffer

4. add a loop if between 1 & 3 the required buffer size can change

It's fascinating how well it works for some of its common use cases and how subtly but badly broken it can be for other use cases :=)
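The four steps can be sketched roughly as follows. `query_value` stands in for the kind of API being described and is entirely hypothetical, as is the `current_value` data source; the loop in `fetch_value` is the caller-side boilerplate every user ends up writing:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical data source; in a real system it could change between calls. */
static const char *current_value = "example-host.invalid";

/* Hypothetical API: copies the value into buf if *len is large enough.
 * On return *len always holds the required size, including the NUL.
 * Returns 0 on success, -1 if the buffer was missing or too small. */
int query_value(char *buf, size_t *len)
{
    assert(len != NULL);
    size_t need = strlen(current_value) + 1;
    size_t have = *len;
    *len = need;
    if (buf == NULL || have < need)
        return -1;
    memcpy(buf, current_value, need);
    return 0;
}

/* Caller side: returns a malloc'd copy of the value, or NULL on failure. */
char *fetch_value(void)
{
    char *buf = NULL;
    size_t len = 0;
    /* Steps 1 and 3: call; step 4: keep retrying while the size is wrong. */
    while (query_value(buf, &len) != 0) {
        char *bigger = realloc(buf, len); /* step 2: (re)allocate */
        if (bigger == NULL) {
            free(buf);
            return NULL;
        }
        buf = bigger;
    }
    return buf;
}
```

The subtle breakage mentioned above tends to hide in the cases this sketch glosses over: callers that skip the loop, or APIs whose reported size and capacity use different units.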


That's classic "Microsoft style" and I've always hated APIs like that because they add unnecessary complexity to the code of all their callers.


I once had the pleasure of using the Microsoft C API for creating an HMAC:

https://docs.microsoft.com/en-us/windows/win32/seccrypto/exa...

They managed to mix both patterns into a ghastly sequence of 9 function calls!

The example above speaks for itself but I kind of see the "we provide you with memory you need to call 'FreeX' on" and the "call the function twice, once to find out how much memory you need to allocate" patterns to be equally annoying to use.


It’s nice if you would prefer to use a custom allocator, or put small buffers on the stack. But yeah, it is more annoying to use.


You can’t just dump ‘small’ buffers on the stack in security sensitive code; you never know how big ‘small’ can get and you have to define what happens if it’s too big anyway.


Yeah, but that is what allows Windows to keep its ABI, while extending some struct across Windows versions.

Callers allocate the right size no matter what OS version, and can only see the fields exposed on the version they were built for.


Of course this very paradigm, combined with the fact that C doesn’t have a proper pointer-and-length type, led to `gets` and around n million other security disasters when bad APIs would just take a pointer without a length and assume there’s "enough" space…

Anyway, the whole "caller allocates" concept doesn’t work too well in the particular case of `gethostbyname`, which as the author mentions, is a complex struct containing several pointers and double pointers, with a potentially unlimited number of different-length allocations one would have to make!


Caller allocates works fine. gethostbyname would effectively serialize all of the result data into the supplied buffer. All of the pointers in the struct would be internal to the buffer (they point to the serialized data) so they don't need to be freed.


Yeah, but it would be nigh impossible to guess in advance how much space would be needed. The "solution" where the caller tries again with exponentially increasing buffer sizes really doesn't fit my concept of a good API. And if the callee calculates and returns how much memory it would need, then the caller needs to heap allocate to make use of that information (or resort to nonstandard functions like `alloca`). Honestly best just to return a pointer to a dealloc function (which could simply be `free` if the struct and the data it points to are indeed allocated as a contiguous block of memory).


For things that create complex in-memory data structures, an alternative is to use callbacks instead. Users can still create such data structures if they want to but at least that way heap allocations are optional.


Please excuse my little experience with this topic, but when the article says this:

> If this structure is dynamically allocated by gethostbyname() and returned to the caller, either you need an additional API function to free it or you have to commit to what fields in the structure have to be freed separately, and how

With your approach, wouldn't you need to free all the fields in the structure separately as well? Because, what's the difference between the library allocating the memory -with the problems the article points out- and allocating it yourself?


Just an interesting aside: returning structs directly (e.g. `struct foo bar();`) was added in V7 UNIX (1979). That said, the convention for it was pretty archaic: behind the scenes PCC used a static return buffer and the caller knew to copy the struct from the returned pointer afterwards. So what looked like thread-safe code was actually totally broken by today's standards.

GCC still supports this with -fpcc-struct-return[1] (though, the modern man page doesn't seem to mention the static return buffer).

Also just because there were no threads back in the day doesn't mean static return buffers were okay. In some cases, invoked signal handlers could still call something and corrupt your statically allocated return buffer. So making any system call after receiving your static return pointer was a footgun to watch out for:

    struct foo *bar = some_lib_func();
    time(0); /* potential breakage */
[1] https://gcc.gnu.org/onlinedocs/gcc-3.2/gcc/Incompatibilities...


I think the GCC switch only changes the threshold when registers aren't used. The address of the temporary struct that the called function writes to is always passed as a hidden argument. The compiler will allocate the memory on the stack, so there's no impact on thread safety.


That’s really surprising to me to read; surely figuring out a better convention for returning a struct would never have been a challenge, right?


One slight variation on the getaddrinfo()/freeaddrinfo() approach is what (among many others) GMP [1] and its derivatives do: For every struct or custom type, you systematically get

    void type_init(type *t, [...]);
    void type_clear(type *t, [...]);
This is essentially explicit constructors and destructors in C, and one can legitimately argue that it is clunky, verbose and error-prone.

However, if we are constrained to a C API, it does have one important practical quality in my experience: Because it is always the same, it eases the mental load on both the API's user and the API's implementer, especially if there are many such types involved.

[1] See e.g. https://gmplib.org/manual/Initializing-Integers
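For readers who have not used GMP, here is a sketch of the same init/clear convention on a made-up growable string type. `strbuf` and its functions are hypothetical and not part of GMP:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type following the init/clear convention. */
typedef struct {
    char  *data;
    size_t len;
    size_t cap;
} strbuf;

/* "Constructor": the caller provides storage for the struct itself. */
void strbuf_init(strbuf *b)
{
    assert(b != NULL);
    b->data = NULL;
    b->len = 0;
    b->cap = 0;
}

/* Appends s, growing the internal buffer. Returns 0 on success, -1 on OOM. */
int strbuf_append(strbuf *b, const char *s)
{
    assert(b != NULL && s != NULL);
    size_t n = strlen(s);
    if (b->len + n + 1 > b->cap) {
        size_t cap = (b->len + n + 1) * 2;
        char *p = realloc(b->data, cap);
        if (p == NULL)
            return -1;
        b->data = p;
        b->cap = cap;
    }
    memcpy(b->data + b->len, s, n + 1); /* copies the NUL too */
    b->len += n;
    return 0;
}

/* "Destructor": releases internal memory and leaves b reusable. */
void strbuf_clear(strbuf *b)
{
    assert(b != NULL);
    free(b->data);
    strbuf_init(b);
}
```

The struct itself can live on the stack or inside another struct; only the internals are heap-managed, and the caller never needs to know which fields those are.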


> and one can legitimately argue that it is clunky, verbose and error-prone.

Clunky and verbose, yes, error-prone no. The APIs that do that are generally much more clear about ownership, and thus much easier, IMO, to write correct code for. Much worse is the API that returns you a pointer with no obvious mechanism to free it. Is it tied to the lifetime of an input to the function that returned it? Is it a global and this API is completely thread-unsafe? Am I leaking memory?


> explicit constructors and destructors in C

Exactly, and that’s the right way to do it. In a language without implicit destructors/finalizers, you need a way for callers to say “okay, I’m done with this thing.” And even with GC, you need finalizers to take care of non-memory resources. This may be clunky in C, but that’s what you get in a language that makes you be explicit.


CoreFoundation at Apple had a similar convention. Any memory obtained by an API named Create or Copy would have a similar Delete method that always had to be called.


Especially for certain generics (e.g. hashmaps) you might want to have different allocators without creating a whole separate type: for example a global one, a threadlocal arena, etc.


This is unavoidable in any language that (supports dynamic memory allocation and) moves dynamic memory allocation into a library.

And it goes even further than the article claims: even functions that allocate a flat structure on behalf of the caller and return it should provide a companion function to free it. Reason is that the caller and the called function might have a different idea about what the memory allocator is. That’s rare on unixes, but was reasonably common on Windows with cross-DLL calls (https://codereview.stackexchange.com/questions/153559/safely...)

Also, say a DLL function returns a char pointer containing a string. How would you know whether to call free or delete on it? Or, maybe, the equivalent of free in Frob, the language that DLL happens to be written in?


> This is unavoidable in any language that (supports dynamic memory allocation and) moves dynamic memory allocation into a library.

Not necessarily. The Zig[1] standard library forces callers to provide at runtime an allocator to each data structure or function that allocates. Freeing is then handled either by calling .deinit()—a member function of the datastructure returned (a standard convention)—or, if the function returns a pointer, using the same allocator you passed in to free the returned buffer. C's problem here is it doesn't have namespaces or member functions, so there's a mix of conventions for what the freeing function should be called.

C++ allows this as well for standard library containers, although I've rarely seen it used.

> Also, say a DLL function returns a char pointer containing a string. How would you know whether to call free or delete on it? Or, maybe, the equivalent of free in Frob, the language that DLL happens to be written in?

I have to concede this one. I can't see a way out of this other than documentation.

[1]: https://ziglang.org/


If `push_back()` & friends took an (optional) extra allocator parameter, it'd be pretty ideal. It'd be nicer if the implementations were forced to be single-word containers like Stepanov wanted...


std::vector can take an allocator as a template parameter though? For a list, sure I can see that having a separate allocator per element could work, but for a vector, surely you'd want the same allocator for the entire range?

EDIT: assuming we're talking about C++, if not please don't hesitate to correct me.


Stateful allocators require a word be dedicated to the allocator in the header. In my use cases, I always have access to the allocator, but I need to create a lot of containers — most staying empty, to boot! Paying the extra overhead is morally objectionable.


C++ might have the advantage here then: it's a template parameter, in other words, the allocator is decided at compile-time, meaning that no additional storage space would be needed if you know in advance which allocator you intend on using.


You provide a memory allocation interface either way: it may be a special function per type, it may be a generic allocator.


> How would you know whether to call free or delete on it?

By making ownership part of the API (and ABI).

Sadly C is unable to express this, and thus so are FFI layers.


C is unable to express this as a machine-readable attribute, but you can certainly document it as part of the API contract and teach the FFI layer about it. This doesn’t scale, but an FFI layer rarely translates directly into the language’s idiom without some manual effort.


[flagged]


I'm unsure why you're saying there's rust spam here?

"lifetime" is not a rust only concept, syntactic lifetimes might be, but the idea of C APIs specifying the lifetime of a returned value is not new, novel, or rust specific.

Many C APIs have SomeLibraryCreateObject(...), SomeLibraryRetainObject(...) and SomeLibraryReleaseObject(...) - or a more basic but less flexible SomeLibraryCreateFoo() SomeLibraryFreeFoo(...).

The important thing is that the API specifies the lifetime of the returned value, idiomatic APIs do stuff like "SomeLibraryGet..." does not transfer ownership, "SomeLibraryCreate..." "SomeLibraryCopy..." etc do. Generally this works more robustly with some variation of refcounting, but you can be copy centric and say "if you want to keep this data call SomeLibraryCopy(..)".
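A bare-bones sketch of the Create/Retain/Release/Get naming convention. The `lib_object` type and functions are hypothetical, and the counter is deliberately not atomic, so this version is single-threaded only:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted object. */
typedef struct lib_object {
    unsigned refcount;
    int value;
} lib_object;

/* "Create": the caller receives one owned reference. */
lib_object *lib_object_create(int value)
{
    lib_object *o = malloc(sizeof *o);
    if (o == NULL)
        return NULL;
    o->refcount = 1;
    o->value = value;
    return o;
}

/* "Retain": the caller takes an additional owned reference. */
lib_object *lib_object_retain(lib_object *o)
{
    assert(o != NULL && o->refcount > 0);
    o->refcount++;
    return o;
}

/* "Release": gives up one reference. Returns 1 if the object was destroyed. */
int lib_object_release(lib_object *o)
{
    assert(o != NULL && o->refcount > 0);
    if (--o->refcount > 0)
        return 0;
    free(o);
    return 1;
}

/* "Get": reads a value; no ownership is transferred. */
int lib_object_get_value(const lib_object *o)
{
    assert(o != NULL);
    return o->value;
}
```

The verbs carry the contract: anything obtained from a Create or Retain call must be balanced by exactly one Release, while Get results are only borrowed.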


While it's not common to run into bugs because of it on Unix, semantically it's a problem all the time.

While multiple statically linked C libs normally use the same allocator, the moment you link in any other language in any way (static,.so) the guarantee is gone.

So your `dart:ffi.allocate`, C `malloc` and Rust `std::alloc::alloc` might in the end all use different allocators or might happen to use the same allocator, but unless you carefully control all parts involved, all bets are off.

And it can make a lot of sense to use different allocators in FFI-libraries in some use cases (mainly as a form of optimization).


On Windows it is clear, memory allocated by DLLs belongs to them, and should be deallocated by APIs exposed by them, and to play safe you should use Win32 APIs for memory management and not rely on the free()/malloc() provided by the compiler.


> and to play safe you should use Win32 APIs for memory management

Which ones? HeapAlloc/HeapFree? LocalAlloc/LocalFree? GlobalAlloc/GlobalFree? CoTaskMemAlloc/CoTaskMemFree? VirtualAlloc/VirtualFree? Something else? If the answer is HeapAlloc/HeapFree, which heap? Should you enable the low-fragmentation heap or not?


Doesn't matter to the caller, because they aren't supposed to be clever and call any of them instead of the APIs exposed by the respective DLL for resource management.

The DLL authors should better know what APIs to call internally on their own code.


> the free()/malloc() provided by the compiler

Nitpick: they're provided by the C runtime library.


Which on non-UNIX platforms means the C compiler that one bought, not necessarily from the OS vendor, as libc isn't traditionally part of the OS APIs.


I don't think there's a perfect 1:1 correspondence between compilers and libc versions on Windows today, but even if there were, it's still a distinction worth making. For example, if two libraries both statically link the C runtime, that counts as separate ones (and so a malloc in one paired with a free in the other will wreak havoc), even if they're the exact same version.


One solution I've sometimes seen in the wild is to mandate that the library allocates just one big malloc chunk and arranges pointers inside that chunk. So the caller has to free just one thing. It's more inconvenient for the library, though.
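Here is a sketch of how such a single-chunk result might be laid out, using a hypothetical `host_info` struct loosely modeled on the shape of `struct hostent` (a name plus an array of alias strings). It is not how any real resolver does it, and size overflow checks are omitted for brevity:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type: every pointer points into the same malloc'd block. */
struct host_info {
    char  *name;
    size_t n_aliases;
    char **aliases; /* n_aliases entries */
};

/* Builds the struct, the pointer array and all strings in ONE allocation.
 * The caller releases everything with a single free(). */
struct host_info *host_info_new(const char *name,
                                const char *const *aliases, size_t n)
{
    assert(name != NULL && (aliases != NULL || n == 0));
    size_t bytes = sizeof(struct host_info) + n * sizeof(char *)
                 + strlen(name) + 1;
    for (size_t i = 0; i < n; i++)
        bytes += strlen(aliases[i]) + 1;

    struct host_info *h = malloc(bytes);
    if (h == NULL)
        return NULL;

    /* Layout: [struct][alias pointer array][string bytes]. */
    h->aliases = (char **)(h + 1);
    h->n_aliases = n;
    char *p = (char *)(h->aliases + n);

    h->name = p;
    strcpy(p, name);
    p += strlen(name) + 1;
    for (size_t i = 0; i < n; i++) {
        h->aliases[i] = p;
        strcpy(p, aliases[i]);
        p += strlen(aliases[i]) + 1;
    }
    return h;
}
```

The inconvenience for the library is visible here: it has to compute the total size in a first pass and keep the pointer array ahead of the strings so that alignment works out.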


This problem is pretty widespread and not limited to structures; C-style strings have it too.

For example:

    const char* getenv(const char* name);
Do we need to free the string? And if "yes" then how? It is not realistic to provide free** for each such API function...

In Sciter API (https://sciter.com) I am solving this by callback functions:

    typedef void string_receiver(const char* s, size_t slen, void* tag);

    const char* getenv(const char* name, string_receiver* r, void* tag);
So getenv calls string_receiver and frees (if needed) stuff after the call.
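To make the receiver idea concrete, here is a sketch that reuses the `string_receiver` typedef above on a made-up function, `join_path`, which builds a variable-length string in a temporary buffer, hands it to the receiver, and frees it itself. `join_path`, `fixed_buf` and `copy_to_fixed_buf` are hypothetical and not part of the Sciter API:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef void string_receiver(const char *s, size_t slen, void *tag);

/* Hypothetical producer: builds "dir/file" in a temporary heap buffer,
 * passes it to the receiver, then frees it. The string is only valid
 * for the duration of the callback. Returns 0 on success, -1 on OOM. */
int join_path(const char *dir, const char *file,
              string_receiver *r, void *tag)
{
    assert(dir != NULL && file != NULL && r != NULL);
    size_t len = strlen(dir) + 1 + strlen(file);
    char *tmp = malloc(len + 1);
    if (tmp == NULL)
        return -1;
    snprintf(tmp, len + 1, "%s/%s", dir, file);
    r(tmp, len, tag);
    free(tmp); /* the callee cleans up; the caller never sees this pointer again */
    return 0;
}

/* Example receiver: copies into a fixed-size caller buffer passed via tag. */
struct fixed_buf {
    char   data[64];
    size_t len;
};

void copy_to_fixed_buf(const char *s, size_t slen, void *tag)
{
    struct fixed_buf *b = tag;
    if (slen >= sizeof b->data)
        slen = sizeof b->data - 1; /* truncate to fit */
    memcpy(b->data, s, slen);
    b->data[slen] = '\0';
    b->len = slen;
}
```

The receiver learns the length at the moment of the call, so the caller decides where the copy goes (stack, heap, a `std::string`) without a separate size query.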

This is a bit ugly on pure C side but it plays quite well with C++ where you can define receiver for std::string for example and define pure C++ version:

    std::string getenv(const char* name);
It would be nice for C to have code blocks a la Objective-C ( https://www.tutorialspoint.com/objective_c/objective_c_block... ); with them, the solution for returning data is trivial.


You don't free the pointer returned by getenv(), because the environment variables are already in memory and getenv() is just giving you a pointer to one of their values.

The most comfortable way to do a C API is to make the caller allocate space for return values (that aren't something simply copiable like int), and take a pointer to it as a parameter. The few standard library functions that malloc things are annoying, because you might not want to do that.


getenv() is just a sample. But even with it... it puts some limitation on potential API implementation and overall system performance. E.g. const char* user_password() cannot store the data anywhere, right? All that...

> The most comfortable way to do a C API is to make the caller allocate space for return values

That's even worse. How will caller know size of the buffer upfront?

With the callback approach that is trivial - you get the size on call - no need to call the API function twice - for size of the buffer and then for real copy.


Yeah, it’s easy until someone calls setenv ;)


In C the way to solve this is to look at the man page for the function and see what they say about memory allocation. There is no magic involved.


Documentation solves just one problem: to free or not to free.

But there are performance, security and other issues.

What if it is significantly more performant for getenv() (or whatever) to fetch needed data using alloca (on stack, with fallback to heap/malloc)?

Returning naked pointer is far from being flexible really.


> This became a serious issue when Unix added threads (this static area isn't thread safe)

I'm not convinced it's "serious" --- thread-local-storage easily solves that.

> Since this structure contains embedded pointers (including two that point to arrays of pointers), there could be quite a lot of things for the caller to call free() on (and in the right order).

Again the solution is simple: Allocate everything at once, so that free() need be called only once on the returned block.

In some ways I think the relative difficulty of using dynamic allocation in C compared to other languages is a good thing --- it forces you to think whether it's really necessary before doing so, and in many cases, it turns out not to be. That way encourages simpler, more efficient code. In contrast, other languages which make it very easy to dynamically allocate (or even do it by default) tend to cause the default efficiency of code written in them to be lower, because it's full of unnecessary dynamic allocations.


> I'm not convinced it's "serious" --- thread-local-storage easily solves that.

What about other cases of exceptional control flow? What happens if a signal arrives while that static area is being used, and the signal handler also needs to use the static area?


The set of functions that can be called from a signal handler is very small (most of them corresponding to system calls or otherwise non-stateful functions like strchr()); gethostbyname() is not one of them, and neither are malloc() nor free().


This. Once you understand that dynamic allocation == bad and start grokking C, you realize how little you actually need dynamic allocation.


Well, yes. It's 1970s C technology.

There are several options. They all suck.

- Pass in a buffer to be filled by the API. The API can't check the buffer size you gave it. Be mentioned in a CERT security advisory for creating a buffer overflow vulnerability.

- Have the API give you a buffer. Reboot your system regularly to recover the memory leaks.

- Free the buffer before returning it, so the caller is using the buffer after free. Debug memory corruption bugs when someone uses an allocator which overwrites freed buffers.

This is what move semantics are for. You call something, it gives you a thing, and now it's yours to use and release. Needs language support to work well, but is the right answer.


Just one minor quibble: I'd say it's C++'s classes (not move semantics) that made doing things like this much cleaner - mostly by having a destructor that lets data structures automatically clean up after themselves (release memory) when they go out of scope. Now the developer doesn't need to know or care about whether the structure they were given has internal dynamically allocated components or not.

Move semantics is "just" an optimization that makes passing data-owning classes around more efficient. Pre C++11 you'd just return the class by reference to avoid the inefficiency of return by value, but with move semantics you can treat complex types the same as simple ones, and not worry about the efficiency of how you pass them around.


> This is what move semantics are for. You call something, it gives you a thing, and now it's yours to use and release. Needs language support to work well, but is the right answer.

Isn't this literally Rust?


I don't believe this should be downvoted, since it's an earnest question.

The answer is that Rust has affine types, which is a fancy way of saying that variables are owned exactly once. The borrow checker enforces this property, essentially by making temporary reference copies of the owned object and proving that they never outlive their backing value.

Move semantics are an implementation detail of affine typing, but can also be used (and are successfully applied) outside of purely affine languages like Rust. C++ is probably the most famous example of that, where `std::move` essentially means "extend the lifetime of this value by moving its contents to the lvalue."


There’s another way: pass a pointer to the cleanup function as an out parameter – if it’s a struct you’re returning, just return the pointer as one of the struct fields. This is, of course, how "OO" as in methods and polymorphism is sometimes simulated in C. This way you don’t need to pollute your API with a zillion different `free_foo` functions.
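
A minimal sketch of that idea, assuming a made-up `lookup_name` API: the returned struct carries a `release` function pointer, so the caller never needs to know which `free_foo` matches it. The names `name_result`, `lookup_name` and `release` are hypothetical, not taken from any real library.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type: the library fills in both the data and the
   function that knows how to dispose of it. */
typedef struct name_result {
    char *name;                                 /* heap-allocated by the library */
    void (*release)(struct name_result *self);  /* cleanup for this result */
} name_result;

static void name_result_release(name_result *self)
{
    free(self->name);
    self->name = NULL;
    self->release = NULL;
}

/* Copies `input` into a freshly allocated buffer owned by `out`.
   Returns 0 on success, -1 on allocation failure. */
int lookup_name(const char *input, name_result *out)
{
    assert(input != NULL && out != NULL);
    size_t len = strlen(input) + 1;
    out->name = malloc(len);
    if (out->name == NULL) {
        out->release = NULL;
        return -1;
    }
    memcpy(out->name, input, len);
    out->release = name_result_release;
    return 0;
}
```

The caller's side is then just `name_result r; if (lookup_name("example", &r) == 0) { /* use r.name */ r.release(&r); }`, whatever the library allocated internally.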


I’m not sure how common it is for libraries to return heap-allocated memory like this, vs. taking a pointer to an uninitialized value.


Languages with GC generally have much cleaner and simpler APIs.


Usually. Even in languages with a GC, you get into a Bring Your Own Buffer (BYOB) situation when trying to eke out performance.

As in real life, one of the best ways to cut down on waste (gc load) is to recycle.


I'm super-reluctant to go this way because it's often a bug factory. Once you are expecting the programmer to do manual GC and detect when a buffer can be reused, you are losing some of the benefit. A lot of the time the answer should be a better GC and a smarter compiler. I realize that in practice that's not always available.


I want to say that it should only be used after profiling and determining that it's a major cause of GC pressure, however I think there are other times when it's obvious that a private scratch buffer would be appropriate.

For example, when marshaling an object before writing it to a file, it makes sense to write it to a scratch buffer first. It's generally on the order of trivial to keep said buffer encapsulated to prevent any caller from dealing with potential pitfalls.

Having clear ownership of the buffer is a big benefit to help reduce any potential issues. You're correct that as a GC approaches perfect, there ceases to be a need for it.


Ruby 3.1 provides a built-in middle ground for this use case: https://docs.ruby-lang.org/en/master/IO/Buffer.html


Would a PointerScope pattern work well here?

Functions that need to return dynamically sized things take a *PointerScope parameter, which is a struct with two function pointers: malloc() and close().

The api function uses the provided malloc, the caller calls close when they’re done with the returned data.

This can then be used with a bunch of calls (and by the caller itself), and then at the end all the allocations get freed with one simple call.

Obviously the PointerScope would internally need to keep track of what was allocated, so it’s a far from simple struct&code combo, but should provide decent user ergonomics?

Add a create() function too, to create a new PointerScope and you have an entire malloc/free replacing resource management paradigm! ;)
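
A rough sketch of what such a PointerScope could look like, under a few assumptions of mine: the allocation hook is called `alloc` rather than `malloc` (to stay clear of the standard library name), allocations are tracked in a singly linked list, and `pointer_scope_create` and `join_words` are made-up names for the create() function and an example API call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Header placed in front of every allocation. The union with max_align_t
   (C11) keeps the payload that follows it suitably aligned. */
typedef union scope_node {
    union scope_node *next;
    max_align_t align;
} scope_node;

typedef struct PointerScope PointerScope;
struct PointerScope {
    void *(*alloc)(PointerScope *scope, size_t size);
    void (*close)(PointerScope *scope);
    scope_node *head;   /* private: every allocation made through this scope */
};

static void *scope_alloc(PointerScope *scope, size_t size)
{
    assert(scope != NULL);
    if (size > SIZE_MAX - sizeof(scope_node))
        return NULL;
    scope_node *node = malloc(sizeof *node + size);
    if (node == NULL)
        return NULL;
    node->next = scope->head;   /* remember it so close() can free it */
    scope->head = node;
    return node + 1;            /* payload starts right after the header */
}

static void scope_close(PointerScope *scope)
{
    scope_node *node = scope->head;
    while (node != NULL) {
        scope_node *next = node->next;
        free(node);
        node = next;
    }
    scope->head = NULL;
}

PointerScope pointer_scope_create(void)
{
    PointerScope scope = { scope_alloc, scope_close, NULL };
    return scope;
}

/* Example API function: returns "a b", allocated from the caller's scope. */
char *join_words(PointerScope *scope, const char *a, const char *b)
{
    size_t la = strlen(a), lb = strlen(b);
    char *out = scope->alloc(scope, la + lb + 2);
    if (out == NULL)
        return NULL;
    memcpy(out, a, la);
    out[la] = ' ';
    memcpy(out + la + 1, b, lb + 1);   /* copies b's terminating NUL too */
    return out;
}
```

The caller creates one scope, passes `&scope` to any number of API calls, and ends with a single `scope.close(&scope)`. The price is a per-allocation header and no way to free one result early.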


This author must have buried their head in the sand before C++ destructors.

The need to have an opposite function to constructing a new object isn't coming from malloc; it's a resource management problem. fclose(f) doesn't just free FILE *f; it releases some operating system file handle/descriptor, and also removes the object from some global list. (There is one because fflush(NULL) or process termination is somehow able to flush all streams.)


It gets even nastier with systems that care about where memory is allocated. Sometimes you really want to use a particular slab of memory for everything and not just whatever malloc happens to return. You could pass in function pointers or something for malloc/free, but that makes your API even messier.


The flip side of this is that as long as your code is valgrind clean, you can get very easy-to-debug code, because C is easy to reason about. While you can certainly do better if very careful, I would prefer to work with C and valgrind than, e.g. most C++ code.


I don’t see how memory allocation is relevant for that. I have seen spaghetti C code (everything is a function pointer) and beautifully readable C++, and vice versa.

After a point, it is just personal preference, but in my opinion good C++ code is much much better than good C code. The former actually has tools to express many things about the code.


[flagged]


> Sure, it is not for script kiddies, and for the usual new 'web' developers that are now living in easy dev sandboxes. But, now, most applications are really mediocre. Everything uses hundreds of MB or GB of memory just for simple programs...

Do you realize how snobbish this sounds?


Sometimes the truth hurts, but no need to shoot the messenger...

If you don't believe me, look around you:

- Gmail? Goes easily to >300 MB of RAM usage when doing nothing

- Android apps? You can barely have 2 calculator apps active at the same time on an 8-core, 8 GB RAM device. Google's own "SMS app" takes "180 MB" of storage...

- Let's talk about WordPress or Electron apps for PC that easily consume 50% of a quad-core high-end laptop?

Just as a point of comparison, in the '80s, when gethostbyname was created, supercomputers had CPUs clocked at 200 MHz at most. Fast forward 40 years, and you now have in your pocket a computer that is easily a hundred times faster, even without taking into account the processing power of GPUs, network cards, ... But still, software quality did not follow the same trend.


You're comparing apples and oranges.

None of the examples you listed (Gmail, Wordpress, Electron, Android) have draconian resource constraints, and developers put a heavy emphasis on shipping new features and ease-of-use. The cost of customers lost due to resource bloat is far lower than engineer hours put into optimization - the reality is that customers simply do not care.

For niches that are still constrained, the "quality" you are referring to is called "efficiency", and it didn't go anywhere since the 80s (in fact it improved due to superior tooling).

I've worked in embedded where I had to come up with ways to shave bytes and kilobytes, now I work in web development and I absolutely don't care about performance. My server costs me 15 bucks a month, I can double it 5 more times before optimization becomes remotely profitable for me.


Like you, no one cares anymore, and this is why everything is getting shitty.

Devices work less well and are treated as disposable, and users have miserable experiences. Huge amounts of electricity and resources are wasted.


It sounds based.


Well, the example of the Linux kernel is an extremely bad one because it is absolutely stuffed full of memory management bugs on error paths. It will leak from a `goto` that jumps beyond the deallocation, or it will free memory that was never allocated if it branched over the allocation. When pressed to its limits, for example by running a container out of memory, the Linux kernel's memory management falls to pieces.


And yet, I can have years of server uptime, with high usage, but with constant kernel memory usage...


Sure, if you don't hit any of its error paths it is fairly happy. A really easy way to watch Linux go down in flames is to use kmem accounting (which is enabled by default) and a cgroup with a low memory limit (say, 48MB) and then inside that cgroup do something that exercises the kernel a lot, like walk a filesystem.


I've always wondered why there isn't a `free_all()` function (that I'm aware of) to handle exactly this. Or why you can't probe memory, with something like `is_alloc()` (is allocated).

I usually define my own version of `free()` to check whether the pointer is NULL, free the memory if not, and then set the pointer to NULL. That way if your pointer isn't NULL, it should be pointing somewhere. I believe there are some caveats though, specifically around OOM allocations as memory isn't truly allocated until you go to access it.
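
For reference, a wrapper along those lines could be as small as the sketch below. `FREE_AND_NULL` and `example` are made-up names. Since the C standard defines `free(NULL)` as a no-op, the explicit NULL check can be dropped; what the wrapper really adds is resetting the pointer.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: free the pointer and reset it to NULL.
   It is a macro so it works for any pointer type; note that it
   evaluates its argument twice, so avoid arguments with side effects. */
#define FREE_AND_NULL(p) do { free(p); (p) = NULL; } while (0)

void example(void)
{
    char *buf = malloc(64);
    if (buf == NULL)
        return;
    buf[0] = 'x';
    FREE_AND_NULL(buf);
    assert(buf == NULL);
    FREE_AND_NULL(buf);   /* second call is harmless: free(NULL) does nothing */
}
```

This only protects the one variable passed in. Any other copy of the pointer still dangles after the call.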

C memory is generally quite cool to work with, but those tripping points really will trip you up. It's exceptionally easy to have stuff lingering around indefinitely, even worse when it happens in a loop.


lolwat? How would you ever use that in a sane way?

is_alloc: If you have to call is_alloc because you don't know and you get `true` back, you still don't know anything, it could be your memory is live or it could be something else was allocated over your stale pointer.

free_all: This is just ridiculous. Now the heap is completely unusable because your memory could be freed from under you at any time by a concurrent free_all.


> is_alloc: If you have to call is_alloc because you don't know and you get `true` back, you still don't know anything, it could be your memory is live or it could be something else was allocated over your stale pointer.

The answer I want is a simple one - is _this_ pointer currently pointing to somewhere with allocated memory? The goal would be to avoid double free, or double allocating (and therefore memory leaks).

> free_all: This is just ridiculous. Now the heap is completely unusable because your memory could be freed from under you at any time by a concurrent free_all.

I think you misunderstand. You could have a (contrived) structure like:

    typedef struct { char *a; char **b; } s;
You then have the following allocations (not tested):

    s* data = (s*)malloc(sizeof(s));
    data->a = (char*)malloc(n * sizeof(char));
    data->b = (char**)malloc(l * sizeof(char*));
    for(int x = 0; x < l; x++) data->b[x] = (char*)malloc(n * sizeof(char));
Rather than free each of these allocations individually, you would instead have `free_all(data)` (might not be a great name). At compile time the compiler would look at the structure and expand out to be all the free operations as required. Of course you would need the `is_alloc` to test if somebody actually did allocate memory.

I would then imagine a common pattern would be:

    s* data = NULL; // Indicate we point to nothing
    /* Allocate */
    free_all(data); // Free associated memory
    data = NULL;    // Indicate no memory allocated
Of course one trap would be:

    s* data = (s*)malloc(sizeof(s));
    data->a = y;
    data->b = z;
    free_all(data);
Then one would have to consider whether y and z are freed or not.


> is _this_ pointer currently pointing to somewhere with allocated memory? The goal would be to avoid double free, or double allocating

This doesn't work because you don't know who allocated that memory.

Thread A: free(p)

Thread B: malloc(...) -> p (happens to get the same address)

Thread A: is_alloc(p) -> true

Thread A: free(p) thinking it's not a double free because is_alloc returned true

This scenario is possible because if it wasn't, you wouldn't need is_alloc in the first place. You only "need" it if you've lost track of your memory and have no clue what is allocated and what isn't and you're trying to solve the problem (the wrong way) with these runtime checks.

> I think you misunderstand. You could have a (contrived) structure like: [...]

> Rather than free each of these allocations individually, you would instead have `free_all(data)` (might not be a great name). At compile time the compiler would look at the structure and expand out to be all the free operations as required. Of course you would need the `is_alloc` to test if somebody actually did allocate memory.

If you're going to change the language anyway (changing what the compiler does), just use C++ with unique_ptrs and they do precisely that without is_alloc.

If pointer cycles are a concern (since we're only doing this to avoid proper pointer hygiene in the first place), you could allocate everything within `s` from a private arena and just free the arena.
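
To make the arena suggestion concrete, here is a minimal bump-allocator sketch, assuming a fixed capacity chosen up front. `arena`, `arena_init`, `arena_alloc`, `arena_free` and `make_s` are illustrative names, and `s` is the contrived struct from earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One malloc'd block, handed out piece by piece. */
typedef struct {
    unsigned char *base;
    size_t used;
    size_t cap;
} arena;

int arena_init(arena *a, size_t cap)
{
    a->base = malloc(cap);
    a->used = 0;
    a->cap = a->base ? cap : 0;
    return a->base ? 0 : -1;
}

void *arena_alloc(arena *a, size_t size)
{
    assert(a != NULL);
    if (a->base == NULL)
        return NULL;
    /* Round the current offset up so every piece is maximally aligned. */
    const size_t align = _Alignof(max_align_t);
    size_t start = (a->used + align - 1) / align * align;
    if (start > a->cap || size > a->cap - start)
        return NULL;   /* out of space; a real arena might grow instead */
    a->used = start + size;
    return a->base + start;
}

/* Releases every allocation made from the arena in one call. */
void arena_free(arena *a)
{
    free(a->base);
    a->base = NULL;
    a->used = a->cap = 0;
}

typedef struct { char *a; char **b; } s;

/* Builds the whole structure out of the arena: no per-field free needed.
   On failure the partial pieces simply stay in the arena until arena_free. */
s *make_s(arena *ar, size_t n, size_t l)
{
    s *data = arena_alloc(ar, sizeof *data);
    if (data == NULL)
        return NULL;
    data->a = arena_alloc(ar, n);
    data->b = arena_alloc(ar, l * sizeof *data->b);
    if (data->a == NULL || data->b == NULL)
        return NULL;
    for (size_t x = 0; x < l; x++) {
        data->b[x] = arena_alloc(ar, n);
        if (data->b[x] == NULL)
            return NULL;
    }
    return data;
}
```

With this, tearing down `s` and everything it points to is a single `arena_free(&ar)`, with no runtime check of what was or wasn't allocated.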


> This doesn't work because you don't know who allocated that memory.

I obviously haven't fully fleshed it out (I'm not writing an RFC here), but address re-use would be a consideration as you mention.

> You only "need" it if you've lost track of your memory and have no clue what is allocated and what isn't and you're trying to solve the problem (the wrong way) with these runtime checks.

No, you can do:

    char* a = (char*)malloc(/**/);
    if(a == NULL) printf("error"); // Never runs
    /* Any other checks you want to perform */
    /* Some processing later */
    a[0] = 'a'; // Crash here
In this case you have done nothing wrong. At the time of requesting memory there was enough space and the kernel said that you could have it. It's only when you actually come to access it that you find it wasn't really allocated yet and there is no longer enough memory for it.

> If you're going to change the language anyway (changing what the compiler does), just use C++ with unique_ptrs and they do precisely that without is_alloc.

You don't have to change the way in which the compiler works. You can likely do this with some macros. You would need some way to probe memory allocations though.


> I obviously haven't fully fleshed it out (I'm not writing an RFC here)

Whatever you don't state explicitly is assumed to be like the status quo. You've completely moved the goalposts with each reply.

> No, you can do [...]

Which a simple is_alloc doesn't help with because the allocator doesn't know if the kernel has actually mapped the memory to a physical page.

This requires help from the kernel, which wasn't stated anywhere, nor was this goal stated anywhere.

> You don't have to change the way in which the compiler works. You can likely do this with some macros.

You can't reflect over a struct in C with macros. You can maybe build something that works with some macro hacks that require you to declare each auto-cleaned field with a special macro.

This again is completely different from your previously stated idea that you just declare a struct with raw pointers and the compiler generates the appropriate free() calls.
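
For what it's worth, the "declare each auto-cleaned field with a special macro" variant could be sketched with an X-macro, as below. All names (`RECORD_OWNED_FIELDS`, `record`, `record_free_fields`) are invented for illustration, and it only frees one level of pointers, so nested allocations like the `char **b` example would still need their own loop.

```c
#include <assert.h>
#include <stdlib.h>

/* Every owned pointer field is listed exactly once as X(type, name). */
#define RECORD_OWNED_FIELDS(X) \
    X(char *, a)               \
    X(int *, counts)

/* Expands one list entry into a struct member declaration. */
#define DECLARE_FIELD(type, name) type name;
/* Expands one list entry into a free-and-reset statement.
   Relies on a variable called `obj` being in scope. */
#define FREE_FIELD(type, name) free(obj->name); obj->name = NULL;

typedef struct {
    RECORD_OWNED_FIELDS(DECLARE_FIELD)
    size_t len;   /* plain data: not in the list, so never freed */
} record;

/* Frees every field named in RECORD_OWNED_FIELDS and nothing else. */
void record_free_fields(record *obj)
{
    assert(obj != NULL);
    RECORD_OWNED_FIELDS(FREE_FIELD)
}
```

Adding a field to the list updates both the struct and the cleanup function, which is about as close to reflection as the preprocessor gets.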


On modern OSes, of course, there is a free_all: it's called _Exit, and sensibly prevents your program from doing any further work.



