Yes, internally fxch is a register rename—_and_ fxch can go in the V-pipe and ta...

ack_complete · 2024-12-29T21:08:10 1735506490

FMUL could only be issued every other cycle, which made scheduling even more annoying. Doing something like a matrix-vector multiplication was a messy game of FADD/FMUL/FXCH hot potato since for every operation one of the arguments had to be the top of the stack, so the TOS was constantly being replaced.

Compilers got pretty good at optimizing straight-line math but were not as good at cases where variables needed to be kept on the stack across a loop, like a running sum. You had to get the order of exchanges just right to preserve stack order across loop iterations. The compilers at the time often had to spill to memory or use multiple FXCHs at the end of the loop.
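
To make the stack-order problem concrete, here is a minimal Python sketch of the x87 register stack with two running sums kept on it across loop iterations. The class `X87Stack`, the helper `two_running_sums`, and the exact instruction sequence are illustrative assumptions, not output from any real compiler. It only models which value sits in which stack slot, not timing.

```python
class X87Stack:
    """Toy model of the x87 register stack; st[0] plays the role of ST(0)."""

    def __init__(self):
        self.st = []

    def fld(self, value):
        # FLD m: push a value, it becomes the new top of stack
        self.st.insert(0, value)

    def fmul_mem(self, value):
        # FMUL m: ST(0) = ST(0) * memory operand
        self.st[0] = self.st[0] * value

    def fxch(self, i=1):
        # FXCH ST(i): swap ST(0) with ST(i)
        self.st[0], self.st[i] = self.st[i], self.st[0]

    def faddp(self, i):
        # FADDP ST(i), ST: ST(i) = ST(i) + ST(0), then pop
        self.st[i] = self.st[i] + self.st[0]
        self.st.pop(0)


def two_running_sums(rows):
    """Accumulate sum0 += a*b and sum1 += c*d for each (a, b, c, d) row."""
    s = X87Stack()
    s.fld(0.0)  # sum1
    s.fld(0.0)  # sum0 is now ST(0), sum1 is ST(1)
    for a, b, c, d in rows:
        s.fld(a)
        s.fmul_mem(b)   # stack: a*b, sum0, sum1
        s.fld(c)
        s.fmul_mem(d)   # stack: c*d, a*b, sum0, sum1
        s.fxch(1)       # bring the older product a*b back to the top
        s.faddp(2)      # sum0 += a*b; stack: c*d, sum0, sum1
        s.faddp(2)      # sum1 += c*d; stack: sum0, sum1
    return s.st
```

The point of the sketch is the loop invariant: the stack must be `sum0, sum1` again at the bottom of every iteration, so every exchange and pop has to be counted against that. For example, `two_running_sums([(1, 2, 3, 4), (5, 6, 7, 8)])` leaves `[32.0, 68.0]` on the stack.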

Sesse__ · 2024-12-29T22:30:56 1735511456

> FMUL could only be issued every other cycle, which made scheduling even more annoying.

Huh, are you sure? Do you have any documentation that clarifies the rules for this? I was under the impression that something like `FMUL st, st(2) ; FXCH st(1) ; FMUL st, st(2)` would kick off two muls in two cycles, with no stall.

Tuna-Fish · 2024-12-29T22:52:59 1735512779

Agner Fog's manuals are clear on this. Only the last of FMUL's 3 cycles can overlap with another FMUL.

You can immediately overlap with a FADD.
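
As a rough illustration of that rule, here is a toy Python model of issue timing. It assumes only what is stated above: an FMUL cannot issue in the cycle right after another FMUL, while an FADD can. The function name `issue_cycles` is made up for this sketch, and it ignores data dependencies, FXCH pairing, and everything else a real P5 does, so treat it as a sketch rather than a cycle-accurate simulator.

```python
def issue_cycles(ops):
    """Return the issue cycle of each op in a list like ["FMUL", "FADD"]."""
    cycles = []
    next_free = 0     # earliest cycle the next instruction may issue
    last_fmul = None  # issue cycle of the most recent FMUL, if any
    for op in ops:
        cycle = next_free
        if op == "FMUL" and last_fmul is not None:
            # Only the last of FMUL's 3 cycles may overlap the next FMUL,
            # so consecutive FMULs are at least 2 cycles apart.
            cycle = max(cycle, last_fmul + 2)
        if op == "FMUL":
            last_fmul = cycle
        cycles.append(cycle)
        next_free = cycle + 1  # at most one FP op issues per cycle
    return cycles
```

Under these assumptions, `issue_cycles(["FMUL", "FMUL", "FMUL"])` gives `[0, 2, 4]`, while interleaving as `["FMUL", "FADD", "FMUL", "FADD"]` gives `[0, 1, 2, 3]`, which is why mixing adds between the multiplies was worth the FXCH juggling.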