1. "Explicitly signaling your intent" takes space and time. Decoding isn't free,...

1. "Explicitly signaling your intent" takes space and time. Decoding isn't free, and even if there were no other downsides you'd want to be careful with that change.

2. It's a bit historical. Early CPUs didn't have a RAM/CPU discrepancy and didn't need pipelining. Code wasn't written to account for pipelining. As CPUs got a little faster relative to RAM, you added a few prediction mechanisms so that most consumers and workloads could actually use your brand new 2x faster gigaterrawatthournewton CPUs without having to rewrite all their code. Iterate 10-20 generations, where most software has never been written to care about branch prediction and where modern languages don't even expose the concept, and you have the current status quo.