Am I misreading the bitmask code? It looks like (in addition to a few other ideas) it's using the old "stick a few extra bits in an aligned pointer", but it seems to be only manipulating high bits, whereas aligned pointer zeroes are low-order bits.
I'd suggest a heavier investment in testing infrastructure.
64-bit architectures don't actually have 64-bit address spaces: both AMD64 and ARM64 have 48-bit address spaces by default (some CPUs have extensions you can enable to request larger address spaces, e.g. LVA on ARM64 and 5LP / LA57 on AMD64, but those are opt-in on a per-process basis).
So while you have 3 bits available at the bottom of the pointer, there are 16 at the top. That's a lot more payload you can smuggle. There are even CPU extensions which tell the processor to ignore some of that (Linear Address Masking for Intel, Upper Address Ignore for AMD, and Top Byte Ignore for ARM).
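Here's a rough sketch of the idea in plain C (illustrative helper names, assuming a 48-bit address space with none of those ignore-the-top-bits extensions enabled):

    /* Smuggle a 16-bit tag in the otherwise-unused top bits of a
       64-bit user-space pointer. */
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    static void *pack_tag(void *p, uint16_t tag)
    {
        return (void *)(((uintptr_t)p & 0x0000FFFFFFFFFFFFull) |
                        ((uintptr_t)tag << 48));
    }

    static uint16_t get_tag(void *p)
    {
        return (uint16_t)((uintptr_t)p >> 48);
    }

    static void *strip_tag(void *p)
    {
        /* Re-canonicalize by sign-extending from bit 47; for a
           user-space pointer this just clears the top 16 bits. */
        return (void *)(((intptr_t)((uintptr_t)p << 16)) >> 16);
    }

    int main(void)
    {
        int *x = malloc(sizeof *x);
        *x = 42;
        void *t = pack_tag(x, 0xBEEF);
        printf("tag=%#x value=%d\n", (unsigned)get_tag(t), *(int *)strip_tag(t));
        free(x);
        return 0;
    }

Without one of those extensions you have to strip before every dereference; with one of them enabled, some or all of the tag bits can be left in place for ordinary loads and stores (how many depends on the extension).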
Small correction. That's true for 4-level paging where virtual memory is capped at 256 TiB (LAM48). There's a 5-level extension that reduces the wasted space to 7 bits (LAM57) to allow for 128 PiB of virtual space.
I have no idea what the purpose of the extension is, given that I don't believe you can get that much secondary storage attached to a CPU unless you go to tape, which is pointless for virtual mapping.
I literally mentioned five-level paging in my comment.
But it's an opt-in extension: you need the kernel to support it, and then you need to ask the kernel to enable it on your behalf. So it's not much of an issue, you just have to avoid this sort of extension (it's not the only one; ARM64 also has a similar one, though it only goes up to 52-bit address spaces) if you're relying on 16~19 bits of tagging.
You do need to opt-in when compiling the kernel, but on Linux it doesn't take anything particularly special on the program's side to enable it. The rule is that non-fixed mmap() will only ever yield 48-bit addresses, until the program specifies a 57-bit address as a hint, after which it can yield 57-bit addresses at will. (This rule has lots of curious consequences for the location of the vDSO, when 5-level paging is enabled and an ELF program uses big addresses for its segments.)
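If anyone wants to poke at it, here's a tiny Linux-only experiment along those lines (it only produces a high address on a kernel built with 5-level paging on LA57-capable hardware; otherwise both mappings come back low):

    #include <stdio.h>
    #include <sys/mman.h>

    int main(void)
    {
        /* No hint: the kernel keeps this below the 48-bit boundary. */
        void *low = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

        /* Hint above the boundary: asks for an address from the
           larger space, if the kernel and hardware support it. */
        void *high = mmap((void *)(1ull << 48), 4096,
                          PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

        if (low == MAP_FAILED || high == MAP_FAILED)
            return 1;
        printf("no hint:   %p\n", low);
        printf("high hint: %p\n", high);
        return 0;
    }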
Any idea what the use case is for such large addresses? Is it RDMA or something else? Even with RDMA I find it hard to believe that companies have > 256TiB of RAM installed behind a single switch.
I recall hearing one use case where you have a whole lot of threads, and most of them will only use a little memory, but a few will need a whole lot of memory. So you assign each thread a very large segment of the address space upfront, so that they never have to coordinate with other threads at runtime. At 1 GiB of space allocated to each thread, it takes only 262k threads to use up the whole 256 TiB address space.
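Something like this sketch of that scheme (sizes and helper names made up for illustration): each worker grabs a big PROT_NONE slice of address space up front, which costs no physical memory, and later commits pages out of its own slice without coordinating with any other thread.

    #include <stdio.h>
    #include <stddef.h>
    #include <sys/mman.h>

    #define SLICE_SIZE (1ull << 30)   /* 1 GiB of address space per worker */

    /* Reserve address space only; nothing is backed by RAM yet. */
    static void *reserve_slice(void)
    {
        return mmap(NULL, SLICE_SIZE, PROT_NONE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    }

    /* Commit the first `bytes` of the slice once the worker needs them. */
    static int commit(void *slice, size_t bytes)
    {
        return mprotect(slice, bytes, PROT_READ | PROT_WRITE);
    }

    int main(void)
    {
        void *slice = reserve_slice();
        if (slice == MAP_FAILED)
            return 1;
        if (commit(slice, 4096) == 0)
            ((char *)slice)[0] = 1;   /* most workers touch only a little */
        printf("slice at %p\n", slice);
        return 0;
    }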
Good question. I may also be missing something, since as of today, and to my knowledge, top-of-the-line Intel Xeons support up to 4 TB per socket. That means the largest amount of physical RAM in an 8-socket system would be 32 TB ... which is not even close to the addressable (virtual) memory even on 48-bit systems (256 TB).
Why? Alignment causes some low-order bits to be zero. Shifting the pointer to the right drops those zeros on the floor, leaving high-order bits zero (or sign extended or whatever). Then you can put your tag up there instead. Shifting the value left by the same amount drops the tag and recovers the same pointer for use.
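A tiny sketch of that round trip, assuming user-space pointers with 8-byte alignment and keeping the tag to exactly those 3 spare bits, so a single left shift really does drop it:

    #include <assert.h>
    #include <stdint.h>
    #include <stdlib.h>

    #define ALIGN_BITS 3   /* log2 of the 8-byte alignment */

    /* Shift the aligned pointer right (its low 3 bits are zero anyway)
       and park a 3-bit tag in the now-vacant top bits. */
    static uint64_t encode(void *p, uint64_t tag)
    {
        return ((uintptr_t)p >> ALIGN_BITS) | (tag << (64 - ALIGN_BITS));
    }

    static uint64_t tag_of(uint64_t v)
    {
        return v >> (64 - ALIGN_BITS);
    }

    /* Shifting left by the same amount pushes the tag off the top and
       restores the original pointer. */
    static void *decode(uint64_t v)
    {
        return (void *)(uintptr_t)(v << ALIGN_BITS);
    }

    int main(void)
    {
        void *p = aligned_alloc(8, 64);
        uint64_t v = encode(p, 5);
        assert(tag_of(v) == 5);
        assert(decode(v) == p);
        free(p);
        return 0;
    }

If you want the bigger payload from upthread (the 16 free top bits plus these 3), the decode needs a mask or sign-extension step on top of the shift, since shifting left by 3 only pushes out 3 bits of tag.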