Well yeah, RetDec, Hopper, Binary Ninja HLIL, Snowman, Ghidra... none of these “direct” decompilers are going to match the input token for token or AST node for node, even ignoring symbol names. To do so you’d need a lot of heuristics about the specific compiler, many of which are probably accidental parameters (branch layout, register picking, order of operations, vectorization, inlining). That’s the sort of task that’s perfect for a neural network (lots of hidden parameters to learn about each compiler version) and hard for a hand-written program.

Now that I’ve said it out loud, I bet accuracy goes way down if you train or evaluate against a wide spectrum of compiler versions, as these hidden parameters will change from one version to the next.

...but those parameters also don’t entirely matter to whether the decompiled output is correct. So I think they probably need a better evaluation method than exact match.
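
To make that concrete, here’s a rough sketch of the gap between the two kinds of scoring. Python stands in for decompiled C purely for illustration, and everything in it (`token_match`, `behaves_same`, the `clamp` example, the test inputs) is made up by me, not taken from any real benchmark. Running a handful of inputs only shows agreement on those inputs; it doesn’t prove equivalence.

```python
import io
import tokenize

# Token types that only describe layout, not program content.
LAYOUT = {
    tokenize.NL, tokenize.NEWLINE, tokenize.INDENT,
    tokenize.DEDENT, tokenize.COMMENT, tokenize.ENDMARKER,
}


def tokens(src):
    """Return the token strings of a snippet, ignoring layout-only tokens."""
    stream = tokenize.generate_tokens(io.StringIO(src).readline)
    return [t.string for t in stream if t.type not in LAYOUT]


def token_match(a, b):
    """Strict metric: the two sources must be identical token for token."""
    return tokens(a) == tokens(b)


def behaves_same(a, b, name, inputs):
    """Looser metric: function `name` returns the same value in both
    sources for every argument tuple in `inputs` (sampled, not exhaustive)."""
    env_a, env_b = {}, {}
    exec(a, env_a)
    exec(b, env_b)
    return all(env_a[name](*args) == env_b[name](*args) for args in inputs)


# Hypothetical "original source".
ORIGINAL = """
def clamp(x, lo, hi):
    if x < lo:
        return lo
    if x > hi:
        return hi
    return x
"""

# Hypothetical "decompiler output": different names and shape,
# same result whenever lo <= hi.
RECOVERED = """
def clamp(a0, a1, a2):
    return max(a1, min(a0, a2))
"""

# Sample inputs, all with lo <= hi.
INPUTS = [(-5, 0, 10), (5, 0, 10), (15, 0, 10), (0, 0, 0)]

if __name__ == "__main__":
    print("token match:   ", token_match(ORIGINAL, RECOVERED))
    print("same behaviour:", behaves_same(ORIGINAL, RECOVERED, "clamp", INPUTS))
```

The strict metric scores the recovered version as a miss even though it agrees with the original on every sampled input, which is the sort of thing an exact-match number can’t see.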