1. "Explicitly signaling your intent" takes space and time. Decoding those hints isn't free, so even if there were no other downsides you'd want to be careful about adding them.
2. It's a bit historical. Early CPUs didn't have a RAM/CPU speed gap and didn't need pipelining, so code wasn't written to account for pipelining. As CPUs got a little faster relative to RAM, vendors added a few prediction mechanisms so that most consumers and workloads could actually use their brand new 2x faster gigaterrawatthournewton CPUs without having to rewrite all their code. Iterate 10-20 generations, during which most software was never written to care about branch prediction and most modern languages barely expose the concept, and you have the current status quo.