Which, with store forwarding, can be shockingly cheap. You may not actually be h...

loeg · 2025-10-27T19:59:12 1761595152

Are you talking about context switching every handful of cycles? This is going to be extremely inefficient even with store forwarding.

ori_b · 2025-10-28T09:48:16 1761644896

Sure, and so is calling a function every handful of cycles. That's a big part of why compilers inline.

Either you're context switching often enough that store forwarding helps, or you're not spending a lot of time context switching. Either way, I would expect that you aren't waiting on L1: you put the write into a queue (the store buffer) and move on.
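
To make the "queue and move on" point concrete, here is a rough toy model, not a description of any real microarchitecture. The names `ToyCore` and `save_restore`, the buffer capacity, and the addresses are all made up for illustration. Stores land in a small queue, and a load checks that queue (youngest entry first) before falling back to the dict standing in for L1.

```python
from collections import deque


class ToyCore:
    """Toy model: stores enter a queue; loads check the queue before 'L1'."""

    def __init__(self, capacity=4):
        self.store_buffer = deque()  # pending (addr, value) writes, oldest first
        self.capacity = capacity     # hypothetical buffer size, not a real figure
        self.l1 = {}                 # stands in for the L1 data cache
        self.forwarded = 0           # loads served straight from the store buffer
        self.l1_loads = 0            # loads that had to read "L1"

    def store(self, addr, value):
        # The store does not wait for L1; it only drains an entry if the queue is full.
        if len(self.store_buffer) == self.capacity:
            self.drain_one()
        self.store_buffer.append((addr, value))

    def drain_one(self):
        # Retire the oldest pending store into "L1".
        addr, value = self.store_buffer.popleft()
        self.l1[addr] = value

    def load(self, addr):
        # Youngest matching pending store wins (store-to-load forwarding).
        for a, v in reversed(self.store_buffer):
            if a == addr:
                self.forwarded += 1
                return v
        self.l1_loads += 1
        return self.l1.get(addr, 0)


def save_restore(core, regs, base=0x1000):
    """Spill 'registers' to a save area, then immediately reload them."""
    for i, r in enumerate(regs):
        core.store(base + 8 * i, r)
    return [core.load(base + 8 * i) for i in range(len(regs))]
```

In this model, a save followed right away by a restore is served entirely from the queue as long as the spilled state fits in it. Once the state is larger than the queue, the oldest values have already drained and the reloads go to "L1" instead. It says nothing about real latencies, only about where the data comes from.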