BTW, Bellman actually coined the term "curse of dimensionality" [1]; I got that confused with "combinatorial explosion", since the two are synonyms in the contexts where I typically encounter them [2].
OpenAI has a pretty good introduction to the Bellman equations in their Spinning Up in RL lessons [3]. Sutton's writing on reinforcement learning also covers Bellman's work quite a bit. Though Bellman was actually studying what he called dynamic programming problems, his work is now considered foundational in reinforcement learning.
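For anyone who wants a feel for what a Bellman equation looks like in practice, here is a minimal sketch of value iteration, which just applies the optimality backup V(s) = max_a [r + gamma * V(s')] until the values stop changing. The 4-state chain, the reward of 1 for reaching the last state, and the names `step` and `value_iteration` are all made up for illustration; none of it comes from the references.

```python
# Hypothetical toy problem: states 0..3 in a row, state 3 is terminal.
# Actions: 0 = move left, 1 = move right. Entering state 3 pays reward 1.
GAMMA = 0.9      # discount factor (assumed value)
N_STATES = 4
TERMINAL = 3


def step(state, action):
    """Deterministic toy dynamics: returns (next_state, reward)."""
    if action == 0:
        nxt = max(0, state - 1)
    else:
        nxt = min(N_STATES - 1, state + 1)
    reward = 1.0 if nxt == TERMINAL else 0.0
    return nxt, reward


def value_iteration(tol=1e-10):
    """Repeat the Bellman optimality backup until the largest change is below tol."""
    values = [0.0] * N_STATES
    while True:
        delta = 0.0
        for s in range(N_STATES):
            if s == TERMINAL:
                continue  # terminal state keeps value 0
            # Bellman backup: best one-step reward plus discounted next-state value.
            best = max(
                reward + GAMMA * values[nxt]
                for nxt, reward in (step(s, a) for a in (0, 1))
            )
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values


if __name__ == "__main__":
    print(value_iteration())
```

The values come out at roughly 0.81, 0.9, 1.0 and 0.0: each step further from the goal is discounted by another factor of gamma, which is the recursive structure Bellman's dynamic programming exploits.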
Uh, and for the dual-mode observations, the person who brought that to my attention was Noam Brown, not Bellman or Norvig. If you haven't already checked out his work, I recommend it above both Norvig's and Bellman's. He has some great talks on YouTube, and I consider it a shame they aren't more widely viewed [4].