And it's not like Apple isn't aware of the dangers, either; they used top-byte tagging on the 68000, and it took them a lot of time and effort to migrate away from that to support 16+ MB systems.
IIRC the issue was that the 68000 would transparently ignore the top address bits, so code could get away with skipping the explicit masking, which of course breaks when you move to a larger address space.
More modern CPUs enforce specific bit patterns for the MSBs, so broken code would be caught immediately.
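For context, the old trick looked roughly like this (a simplified sketch with hypothetical names, not the actual Mac Toolbox API): flag bits lived in the top byte of a 32-bit pointer, which was harmless only as long as the CPU decoded just 24 address lines.

    #include <stdint.h>

    /* 24-bit-era pointer tagging, simplified.  The 68000 drove only 24
     * address lines, so the top byte of a 32-bit pointer was "free" for
     * flags -- until later CPUs started decoding all 32 bits. */
    #define ADDR_MASK   0x00FFFFFFu
    #define FLAG_LOCKED 0x80000000u

    /* Explicit masking: keeps working under 32-bit addressing. */
    static uint32_t addr_of(uint32_t tagged)   { return tagged & ADDR_MASK; }
    static int      is_locked(uint32_t tagged) { return (tagged & FLAG_LOCKED) != 0; }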
It's common for libraries to use the 2 lowest bits of pointers in tree structures, e.g. an RB tree. It's not a problem at all since the data structure is controlled by the library. Even if tree nodes are allocated/provided by the user, having a "4-byte aligned" requirement in the API contract isn't a problem at all -- in fact you need to work quite hard to allocate a structure with a pointer that isn't at least 4-byte aligned (i386) or 8-byte aligned (x64).
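For illustration, a minimal sketch of the trick (hypothetical names, loosely in the style of intrusive red-black tree implementations): the color bit rides in the low bit of the parent pointer, which alignment guarantees is zero.

    #include <assert.h>
    #include <stdint.h>

    /* The color bit is stored in the lowest bit of the parent pointer,
     * which is always zero as long as nodes are at least 2-byte aligned
     * (in practice they are pointer-aligned). */
    struct rb_node {
        uintptr_t parent_color;          /* parent pointer | color bit */
        struct rb_node *left, *right;
    };

    #define RB_RED   0u
    #define RB_BLACK 1u

    static struct rb_node *rb_parent(const struct rb_node *n)
    {
        return (struct rb_node *)(n->parent_color & ~(uintptr_t)1);
    }

    static unsigned rb_color(const struct rb_node *n)
    {
        return (unsigned)(n->parent_color & 1);
    }

    static void rb_set_parent_color(struct rb_node *n,
                                    struct rb_node *parent, unsigned color)
    {
        assert(((uintptr_t)parent & 1) == 0);   /* the alignment contract */
        n->parent_color = (uintptr_t)parent | color;
    }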
Well, no, it isn't, since malloc returns a void*, and incrementing a void* is impossible: the compiler cannot figure out the offset by which to advance the address. For that operation to succeed/compile you first need to cast the result of malloc to a typed pointer, which means the resulting alignment will still be correct.
The actual danger is in reinterpreting a pointer, or some arbitrary chunk of memory, as a type whose alignment requirements that memory address doesn't necessarily satisfy.
gcc will compile this code and advance the pointer address by 1 byte. Yes, it's not standard-conformant.
My point though is that this is unlikely to happen by accident. The code above is clearly contrived: it's more work to type than the correct version, it makes no sense at all, one byte is wasted, and it's in general a bad idea to produce unaligned pointers.
And when you for example provide your own allocator for nodes, you'll probably do it like this:
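(a sketch, using a hypothetical Node type)

    #include <stdlib.h>

    struct Node { struct Node *left, *right; int key; };

    static struct Node *alloc_node(void)
    {
        /* malloc's result is suitably aligned for any fundamental type,
         * so the low bits of the returned pointer are guaranteed zero. */
        return (struct Node *)malloc(sizeof(struct Node));
    }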
And the memory will be aligned properly. Even when you use a plain malloc (no type or alignment information given to the allocator) you will get memory that is at least pointer-aligned.
<source>: In function 'void foo(int)':
<source>:4:28: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
4 | int * p = malloc(size) + 1;
| ~~~~~~~~~~~~~^~~
<source>:4:28: error: invalid conversion from 'void*' to 'int*' [-fpermissive]
4 | int * p = malloc(size) + 1;
| ~~~~~~~~~~~~~^~~
| |
| void*
Compiler returned: 1
First, yes, it did accept the pointer arithmetic, even though it issued a warning.
Second, the error about implicit conversion from void pointer to typed pointer is C++ only -- not C, where you won't even get a warning. The error happens because you didn't cast the void pointer to the target type and has nothing to do with the fact that gcc _does_ let you (as a GCC extension) do arithmetic on void pointers.
If you remove the "+ 1" you'll still see the same error in C++. To make it compile in C++, change the target pointer to a void pointer:
void *ptr = malloc(42) + 1;
or, alternatively, cast the RHS expression properly (it doesn't matter whether the "+ 1" is there or not).
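For example, one way to do that (reusing the hypothetical foo/size from the snippet above):

    #include <stdlib.h>

    void foo(int size)
    {
        /* Compiles as both C and C++: the cast binds tighter than "+", so
         * the pointer advances by sizeof(int) and therefore stays aligned. */
        int *p = (int *)malloc(size) + 1;
        (void)p;
    }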
I don't think there's a need to explain to me how pointers work, or what to do to make this error/warning go away, since it should be pretty obvious from my response that this is something I know very well.
I was only disputing the claim that ending up with an unaligned ptr is trivial for the given line of code, by providing an explanation, and a counter-example, of why this is not possible. At least not in C++, because that's the language I had in mind.
What you're saying is that it's possible to do void pointer arithmetic in C by using the GNU extension - fine, I can't disagree with that.
You corrected me and I corrected you. No need to take personal offense. I don't care what you know -- I obviously took the compiler output that you presented above as a rebuttal of what I said earlier, and I pointed out why it's not a valid rebuttal.
But those pointers still have bits in them that indicate the type of the object that they point to. It is merely the case that the tag bits are chosen at run time so that they correspond to allocated memory regions, rather than being chosen ahead of time.
None of the bits in the pointer have been "stolen" though, which is the point of the referenced article, they are all still available to access the full address space of the CPU.
That doesn't mean that they aren’t still there, and don't tell you the type. It’s still a tagged pointer, and because all objects of the same type must be allocated out of the same page (or collection of pages), it still limits how much memory can be dedicated to objects of that type.
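A minimal sketch of that scheme (a hypothetical "big bag of pages" layout with made-up names, not any particular runtime): the type is recovered by masking the pointer down to its block and reading a block header, so no pointer bits are consumed, but every block is dedicated to a single type.

    #include <stdint.h>

    /* Every object of a given type is allocated out of BLOCK_SIZE-aligned
     * blocks, and each block starts with a header naming the type of
     * everything inside it. */
    #define BLOCK_SIZE 4096u

    enum obj_type { T_CONS, T_SYMBOL, T_STRING, T_VECTOR };

    struct block_header {
        enum obj_type type;
        /* ... free list, GC bookkeeping, etc. ... */
    };

    static enum obj_type type_of(const void *obj)
    {
        /* No bits are stolen from the pointer itself: the "tag" is the
         * block it lives in, found by masking off the low bits. */
        uintptr_t base = (uintptr_t)obj & ~(uintptr_t)(BLOCK_SIZE - 1);
        return ((const struct block_header *)base)->type;
    }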
No. Their reputation for poor performance mostly arises because the default data structure is a singly-linked list. Although the language makes these lists extremely convenient for the programmer to use, linked lists invariably result in terrible cache locality. High-performance systems written in Lisp (and they do exist) achieve that performance by avoiding lists and using vectors and arenas instead.
Of course, it must be said that this performance penalty rarely matters in practice. For most programs, the limiting factor is not performance but the cost of implementation and maintenance. Lisp has many powerful and convenient features that make implementation faster and surer than in any other language, and which usually greatly lower the cost of maintenance as well. Thus by writing a new program in Lisp you can get it off the ground and earning you money faster than you could in almost any other language. You can also add features more cheaply as time goes on, allowing you to keep up with your competitors. It is only when your system is nearing completion, and all necessary features have been invented and implemented, that you should think about rewriting select parts of it in a systems language where more efficiency is possible. Not only will you fully understand the system only once it is written, but you can also measure its actual performance to figure out which parts you need to rewrite, and which parts already have acceptable performance and thus do not need to be rewritten.
Cache locality is improved if the Lisp has a compacting garbage collector, but you still need extra memory for the cdr links. Lisp Machines had cdr-coding that allowed those to be avoided but I don't think that's practical on stock architectures.
True, but don’t forget that one of the _other_ causes of performance problems in lisp programs is the creation and subsequent collection of garbage. If you know ahead of time which of your data needs to be accessed or created by the most performance–sensitive parts of your program, then you can put that data into vectors to start with and avoid all of the extra work.
Long-lived things end up in older generations and aren't traversed much during GC, which is the moral equivalent of being statically allocated.
There's overhead if you mutate old objects in a GCed system due to card marking.
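To make that overhead concrete, here is a minimal sketch of a card-marking write barrier (hypothetical names and layout, not any specific collector's): every pointer store into the heap also dirties the card containing the written slot, so the GC can later rescan only dirty cards for old-to-young pointers.

    #include <stddef.h>
    #include <stdint.h>

    #define CARD_SHIFT 9                 /* 512-byte cards */

    extern uint8_t   card_table[];       /* one byte per card, set up elsewhere */
    extern uintptr_t heap_base;

    static void write_ref(void **slot, void *value)
    {
        *slot = value;                                        /* the actual store */
        size_t card = ((uintptr_t)slot - heap_base) >> CARD_SHIFT;
        card_table[card] = 1;                                 /* the extra work */
    }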
Lisp vectors are vectors of pointers, so there's still the overhead of dereferencing through them. Presumably the objects being pointed to end up compacted together, eventually.
I should have been more clear; I meant using vectors as arena allocators, so that all of your data ends up stored contiguously with no extra indirection.
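Something along these lines (a sketch with hypothetical names): elements are appended into one flat, growable buffer instead of being allocated one by one, so the hot loop walks contiguous memory with no per-object indirection.

    #include <stddef.h>
    #include <stdlib.h>

    struct point { double x, y, z; };

    struct arena {
        struct point *data;
        size_t len, cap;
    };

    static struct point *arena_push(struct arena *a, struct point p)
    {
        if (a->len == a->cap) {
            /* Grow geometrically; error handling omitted for brevity. */
            a->cap  = a->cap ? a->cap * 2 : 64;
            a->data = realloc(a->data, a->cap * sizeof *a->data);
        }
        a->data[a->len] = p;
        return &a->data[a->len++];
    }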
> linked lists invariably result in terrible cache locality
With a compacting GC (i.e. most of the relevant ones) lists often provide better memory locality than arrays. They only lose at large scale due to the extra 8 bytes per element for the cdr link, which makes arrays occasionally cheaper for large sequences.
Even with perfect cache locality, a linked list will be significantly slower to traverse than an array because of worse pipelining and parallelization. Partially unrolled lists can help, but at the cost of additional complexity.
Of course not all container accesses require traversal.
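To illustrate the pipelining point: in a list traversal each load's address comes from the previous load, so the loads form a serial dependency chain, while the array loop's loads are independent and prefetch-friendly. A schematic comparison:

    #include <stddef.h>

    struct node { struct node *next; long v; };

    static long sum_list(const struct node *n)
    {
        long s = 0;
        for (; n; n = n->next)            /* next address depends on this load */
            s += n->v;
        return s;
    }

    static long sum_array(const long *a, size_t len)
    {
        long s = 0;
        for (size_t i = 0; i < len; i++)  /* independent, predictable loads */
            s += a[i];
        return s;
    }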
Why do lists provide better memory locality than arrays? The array already stores its elements next to each other in memory. The compacting GC helps arrange the elements of the list close to each other too, right? Then wouldn’t the compacting GC at best even out the performance difference for sequential traversal?
Success, yes. It need not be direct commercial success; lots of programs are written purely for internal use within some larger organization. And even when there are known performance requirements, Lisp is often still a good choice because the performance is handled by specialized hardware. Consider games, for example. Time to market and rapid iteration are both paramount, while performance is taken care of not by writing a fast renderer but by sending everything to the GPU to be rendered on specialized hardware.
It's not. Two counterexamples: LuaJIT's interpreter/VM, and Wren's, both use NaN boxing, and are among the fastest VMs out there (we're ignoring JITs as out of scope).
It isn't tagging pointers that makes things (some Lisps are 'things', some are not) slow: it's pervasive abstraction and indirection. Doing some masking on a pointer is pipelined, out-of-order, one to three cycles, and is absolutely dwarfed by cache misses and conditional mispredictions, let alone the sort of pointer chasing which is common in idiomatic Lisp (or indeed Python) code.
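For reference, NaN boxing in a nutshell (a minimal sketch with made-up tag choices, not LuaJIT's or Wren's exact layout): non-double values are smuggled into the payload bits of a quiet NaN, so every value fits in 64 bits and a plain double needs no unboxing at all.

    #include <stdint.h>
    #include <string.h>

    typedef uint64_t Value;

    /* Exponent all ones plus the two highest mantissa bits: a quiet NaN
     * pattern that ordinary arithmetic will not produce. */
    #define QNAN    UINT64_C(0x7ffc000000000000)
    /* Sign bit marks boxed pointers. */
    #define TAG_PTR UINT64_C(0x8000000000000000)

    static Value  box_double(double d)   { Value v; memcpy(&v, &d, sizeof v); return v; }
    static double unbox_double(Value v)  { double d; memcpy(&d, &v, sizeof d); return d; }

    static int is_double(Value v)
    {
        /* Anything that is not one of "our" quiet NaNs is a plain double. */
        return (v & QNAN) != QNAN;
    }

    static Value box_ptr(void *p)
    {
        /* Assumes the usual 48-bit user-space addresses. */
        return TAG_PTR | QNAN | (uint64_t)(uintptr_t)p;
    }

    static void *unbox_ptr(Value v)
    {
        return (void *)(uintptr_t)(v & ~(TAG_PTR | QNAN));
    }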
I don't know what you mean. Maybe in the very early 80s that was the case, but PCs were still 16-bit back then, and would've been a poor fit for Lisp anyway.
One of the reasons why the Lisp machines died out is that round about the mid-80s or so, compiler technology improved for generic 32-bit processors, and it became possible to run Lisp software on a VAX, 68k, or RISC CPU faster than it ran on the Lisp machines' bespoke architecture. Back during the first AI hype cycle, the makers of Golden Common Lisp introduced the Hummingboard to the market, billed as an inexpensive solution to have a "Lisp machine in a PC". It was just a 386 on an ISA card (predating Compaq's 386 PC by about a year) with gobs of memory on board; a special build of Golden Common Lisp allowed code to be run on that CPU rather than the main one.
Symbolics had its own PC implementation of Lisp: CLOE.
From the Lisp FAQ: "CLOE (Common Lisp Operating Environment) is a cross-development environment for IBM PCs (MSDOS) and Symbolics Genera. It includes CLOS, condition error system, generational garbage collection, incremental compilation, code time/space profiling, and a stack-frame debugger. It costs from $625 to $4000 and requires 4-8mn RAM and a 386 processor."
Later they also ported Genera to DEC Alpha. Currently I have access to an implementation which runs on ARM64 and Intel x64.
>One of the reasons why the Lisp machines died out is that round about the mid-80s or so, compiler technology improved for generic 32-bit processors, and it became possible to run Lisp software on a VAX, 68k, or RISC CPU faster than it ran on the Lisp machines' bespoke architecture.
I'd say Lisp machines died out because Lisp died out (commercially). Other languages got popular, the second AI winter didn't help at all either.
If Lisp itself had fared better (even if it was on generic hardware), Lisp machines could have been saved too: they could still have used a VAX, 68k, or RISC CPU underneath and optimized for the developer experience. Or they'd have turned Lisp machines from hardware into a "Lisp machine" OS or IDE/REPL/etc environment for other OSes succeeding. But none of that took off.
> Or they'd have turned Lisp machines from hardware into a "Lisp machine" OS or IDE/REPL/etc environment for other OSes succeeding.
You mean like Allegro CL? LispWorks? Genera on DEC Ultrix?
With the exception of Genera, these solutions are available, maintained, and supported today. Even Genera hung on for a good few years. Lisp machines started to wane a few years before the AI winter hit in full force, because the idea of dedicated hardware to run Lisp programs made sense when your alternative was Maclisp struggling on a PDP-10, but became expensive, slow, and fiddly compared to the generic boxes running fast 32-bit processors with, again, much improved compiler tech. Genera on DEC Alpha, even as an interpreted VM, was so much faster than any of Symbolics's bespoke CPU architectures that Symbolics just quit making hardware and declared the Alpha version of Genera the official upgrade path.
One road not taken would have been to port the MIT/LMI/TI Lisp Machine software to stock hardware; it was 32-bit until the end, so it didn't need an Alpha as the host. A fast 68020 with a custom MMU would have been fine. There was only around 12k of "assembler" to rewrite to provide the VM and a simple RTOS.
>You mean like Allegro CL? LispWorks? Genera on DEC Ultrix?
None of these cover my whole description, namely the last part:
"Or they'd have turned Lisp machines from hardware into a "Lisp machine" OS or IDE/REPL/etc environment for other OSes SUCCEEDING".
I wasn't talking about the mere existence of those.
My argument was: "if the issue was not Lisp falling in adoption in general, but merely Lisp Machines being dedicated hardware (as the parent claimed), then Lisp OSes for generic hardware and Lisp IDEs/envs for popular OSes would have succeeded."
> Or they'd have turned Lisp machines from hardware into a "Lisp machine" OS or IDE/REPL/etc environment for other OSes succeeding.
That was a very different business model, a very different offering into a different market. An existing company could not easily move its business model to a different market: there were offices, factories, contracts, ... -> costs to get rid of. The thing collapsed before anything useful could be scaled down.
For example, the Symbolics graphics products were very expensive, even just the software. A new owner ported it to SGI and Windows NT machines; to be a technically viable product it used a commercial Lisp vendor's implementation. It survived for a while in that market (modelling/animation/game tools for game studios, animation studios, ...), changed owners again, and then died.
Lisp (the forbidden word during/after the AI Winter) was a part of the problem, but generally a new business model and new customers weren't found, or even looked for. For example, TI just closed its AI business and never cared about it again.
Something like Genera was a huge pile of Lisp code written during 1.5 decades. During its best times the OS and maintenance upgrades were already more expensive than a PC.
Applications from the Lisp Machine were ported away. One no longer needed the OS and no longer had the OS. Some applications died, some survived, some died later.
Some applications (or the development environment) survived for some years on emulators (-> Interlisp-D was ported to SUNs and PCs, Genera was ported to DEC Alpha).
> Something like Genera was a huge pile of Lisp code written during 1.5 decades. During its best times the OS and maintenance upgrades were already more expensive than a PC.
Genera's biggest contemporary problem is that John C. Mallery seems to want to just sit on it rather than make it available to people.
Likely future revenues must be close to nil, so why not open source it? Apparently he talked about doing so years ago, but it never actually happened.
Yes, a lot of the code is really dated, but if it were open source, maybe some people might freshen some of it up.
Franz and LispWorks are still in business through several boom-bust cycles. You never specified: what are your criteria for success? Some market share percentage? My point was, the second AI winter happened, but interest in Lisp machines waned first, because Lisp on generic processors reached a point where it could run the same code faster for less money.
> Maybe in the early early 80s that was the case, but PCs were still 16-bit back then, and would've been a poor fit for Lisp anyway.
There were LISPs for 16-bit MS-DOS. The most famous was arguably XLISP, which was used as the basis for AutoCAD's AutoLISP extension language and the XLISPSTAT statistics package. Another was muLISP, which was resold by Microsoft as Microsoft LISP and also used as the basis of the muMATH computer algebra system and its successor, Derive.
I'm familiar with Xlisp, Microsoft Lisp, and PC-Scheme. What I'm not sure about is how performant they were, or whether large systems could be built with them.
No, the reason for the less-than-stellar performance of Lisp on 32-bit x86 CPUs was things like having too few registers.
On 64-bit Intel CPUs this is much less of a problem, and Lisp runs roughly twice as fast in 64-bit mode. I would think that this holds even without taking advantage of the wider data words.
Other architectures had less of a problem: 32-bit RISC CPUs, and some other CISC CPUs, had more registers.
If you build your whole app with it, yeah. If you use it tactically for a certain data type it can be very nice. Just don't try to dissimulate its specialness.