Visual overview of a custom malloc() implementation (silent-tower.net)
92 points by signa11 8 months ago | 30 comments



This was very good!

I was really happy when the first code on the page, for once, did not look like it was written by someone writing C when they would much rather write something else.

This line:

    struct article *my_article = malloc(sizeof *my_article);
Is almost exactly how I would have written it, and (to me) does three things very right:

- Does not use a pointless, wordy, bloated cast to somehow "convert" the pointer to the proper type. The point of malloc() returning a void pointer is that no such cast is needed.

- Does not hardcode a size, but uses the sizeof operator to let the compiler compute it.

- Does not repeat the type name, which is brittle and wordy, but instead (again) lets the compiler compute it from the target pointer.

I was a bit more hesitant about this line, later in the article:

    return (void *)b + sizeof(block_t) + b->size;
Not sure what language standard the code in the article targets, but pointer arithmetic on void pointers is a GCC extension, which should at least be mentioned.


For context, the reason why you often see explicit casts from void* to other pointer types is that C++ forbids implicit casts from void*, which is one of the few compatibility breaks that C++ did with C.

(For better or for worse, people often compile C-style code with a C++ compiler)


> which is one of the few compatibility breaks that C++ did with C

It's not "few" though, it's "quite a lot". C and C++ semantics are so different that you'll have to write in a specific "common C/C++ subset" for code that should compile both in C and C++ mode.

For all the flak that Objective-C usually gets by both C and C++ people it did one thing right: it respects C as a proper subset instead of creating its own bastard-fork like C++ did and then freezing it in the mid-90s.


I know, that's why I wrote that the code looked like it was written by someone who wanted to write in C (as opposed to someone who wanted to write in C++, a different language).


Does anyone know why C++ forbids implicit casts from void*?


C++ tries to have a slightly stricter type system than C. The compiler can't check anything when you cast from void*, and since C++ has other mechanisms which usually make casts from void* unnecessary (such as using 'new' instead of 'malloc' for heap allocation), I think it makes sense to require casts from void* to be explicit.


Is segmentation still viewed as terrible? There seems to be a trend towards hardware mechanisms for more fine-grained intra-process memory protection than is practical with paging. Nice intro though.


I'm not an expert in memory allocation strategies, but AFAIK fragmentation can still be a problem with a naive allocator that doesn't group small allocations (less than a page size) into buckets. You won't run out of virtual address space (as was the case with 32-bit processes on computers with much more than 4 GB physical memory), but you may run out of physical memory if the allocator is "blocking" a whole memory page from being recycled because there's a tiny live allocation in it.


Segmentation isn't fragmentation. "Segmentation" is when you have a machine with 16 bit machine words, which would usually mean you can only address up to 2^16 = 65536 bytes of memory, but you want to support up to 1 MiB of RAM so you add a segment register and read all memory reads as relative to (segment register << 4) so that you support up to 20 bit memory addresses. (The details differ of course, but this is how Intel did it.)

This results in a split between "near pointers" and "far pointers". "Near pointers" are pointers which are 16 bit and assumed to be relative to the "current segment", and can thus be dereferenced directly with any memory load instruction. "Far pointers" are 32 bit and carry a segment part and a relative part, and to dereference it, you must first write the segment part to the segment register, then do the memory load instruction with the relative part, and then you probably wanna clean up after yourself by restoring the segment register to its old state.

There were multiple segment registers too, one used for data ("heap") memory operations, one for stack memory operations, one for reading machine code, and some extra segments for convenience. You had to remember which of the segment registers your near pointer was supposed to be relative to.

If this all sounds very tedious... well, it was, and that's why, yes, segmentation is still viewed as terrible :)

(I wasn't around back when this was common, but I did write some 16-bit real mode code for a bootloader for an OS course once, and I have designed and implemented some toy 8 bit CPUs which used segmentation to have more than 256 bytes of RAM, so I do have some experience with it)


Ah right, I totally misread segmentation as fragmentation, sorry.

IMHO a more flexible segmentation system (exposed as base-pointer plus offset) wouldn't actually be bad if properly supported by high level languages. You could treat the base pointer as 'private knowledge' inside a system and only hand out the offset as a 'public handle'. To access a specific address you need both the private base pointer and public offset handle, and the base pointer could also move around without invalidating the offset handles in the wild.

Of course all this can be done purely in software too already.


Segmentation is a horrible hack to make it possible to have a larger address space than what the word size of the machine would otherwise suggest.

Partitioning the address space into regions with more fine-grained permission controls than what traditional MMUs do is fine and all, but it's orthogonal to segmented memory. Having large enough machine words to not have to think about near pointers and far pointers is great.


Unfortunately the author is violating POSIX here by defining his own _t types, which is a suffix reserved for the system. Too bad since everyone likes it.


If you're writing your own malloc(), you could argue you are legitimately part of the system implementation. Those type names are reserved for you.


POSIX is just a bunch of rules some committee made up. People aren't obligated to abide by them. They should get rid of their "reservation" rule, it's pointless since people learn by copying examples and there's tons of _t type examples out there. It's not like the POSIX police will arrest them for it.


C and POSIX are different things though. The only reason why POSIX reserves _t is to avoid potential name collisions, but POSIX moves so slowly that such collisions are rarely a problem (and on any non-UNIX system it's a non-issue anyway).

I *would* have used a prefix for namespacing though, e.g. not `block_t` but `prefix_block_t`.


Essentially everyone does this though...

As long as you prefix your types, it's unlikely to be a problem


Do you know if any of the popular static analysis tools detect mistakes like this?


Clang has a -Wreserved-identifier warning, but that's only for reserved identifiers in the C standard, not for POSIX (e.g. names starting with double underscore or a single underscore followed by a capital letter).

The C standard has many more such adhoc reserved identifiers, for instance anything starting with `is`, `str` or `wcs` followed by a lowercase letter is technically a reserved identifier, hell, even any preprocessor macro starting with an `E` followed by a digit or uppercase letter is reserved.

Theoretically those reserved identifiers in the C standard matter more than _t because they affect all C code, not just C code targeting POSIX, yet nobody ever brings those up (because these are really only theoretical problems, and should they turn into actual problems one day they are trivial to fix).

It's important to differentiate between the C standard (e.g. what C compilers care about) and the POSIX standard (which C compilers do not care about), since not all C code runs on POSIX systems. It's only on some UNIXes where those two worlds overlap.

In the real-world, using _t typenames really is a complete non-issue.


Also unfortunate are the use of a heap segment representation, and sbrk.


I understand why the use of sbrk is unfortunate but what's wrong with the representation? Are there better ones?


The heap growing towards the stack is a property of the heap segment (used by brk/sbrk). mmap is not bound to this model, and unless used with MAP_FIXED can associate the allocation with virtual addresses anywhere in the address space.


> This means that the largest alignment requirement of any basic type (max_align_t in C++, usually 4 bytes for 32-bit architectures and 8 bytes for 64-bit architectures) is suitable for every single type in the language.

Usually alignment is a bit larger, to accommodate SSE etc. which requires larger alignment.


Could have used an article like this when I was implementing my own allocator! This is great. The Wilson1995 paper referenced is also quite useful.

The algorithm I implemented is very similar to the one implemented in this article: start with a huge block of free memory; split it into multiple blocks on allocation; merge with free neighbors on deallocation. Merging blocks is the reason why doubly linked lists are needed. It worked well enough so I didn't make it a priority to organize the free lists by size. I really should implement that already but there are so many other things to do...

It's also important to discuss the fact that implementing a memory allocator in C essentially requires undefined behavior. Arithmetic will be done with arbitrary pointers and the result will be cast to structures. Compiling with strict aliasing turned off is essentially a must. Others have described to me "pointer laundering" techniques where they pass the pointer through some inline assembly just to prevent the optimizer from making incorrect assumptions. Just turning off strict aliasing seems like a better idea. It's a stupid feature anyway. If you're writing C, you're probably aliasing pointers and reinterpreting memory and the last thing you need is the compiler getting clever about it.

Alignment was pretty difficult to wrap my head around at first, especially the universal alignment concept. It also illustrates one of the limitations of C's malloc interface:

> malloc() can simply return maximally-aligned pointers in order to accommodate any data type

This wastes quite a lot of memory when the maximum alignment is not necessary... It should also be possible to segregate free lists by alignment, right? Seems like alignment should be a parameter of the memory allocation function.

This article is also really great:

https://nullprogram.com/blog/2023/12/17/

https://news.ycombinator.com/item?id=38675379

Explores the idea of breaking away from C's legacy memory allocation interfaces. If you're like me and enjoy the idea of reinventing things from scratch, it's a great read.

Also linked from its HN discussion:

https://gist.github.com/o11c/6b08643335388bbab0228db763f9921...


Excellent, very well written.


Why are the letters 'p', 'q' and 't' missing from the article? Makes it a bit annoying to read.


I think this might be a client-side issue, since I can see those letters normally.


I also had that issue when reading the article on Safari/iOS. Switching to reader mode fixed it though.

No issue on Firefox/Linux.

Looks like a platform-specific font rendering issue?


No 'r's or '3's on Safari/Mac. It's a custom font 'Cantarell', evidently a bit broken.


> evidently a bit broken

And here I thought the author was making a fragmentation joke.


I don't see `r`s. Firefox/Mac



