Thanks! Coding would go a bit beyond the target audience here, but I do have some examples from experience and the internet. Whenever I start on a new problem, I've found there are two steps to repeat (neither is really coding time, more so training time). The first is to run some short training to see if any hyperparameters of the RL agent need significant adjustment (discount factor, learning rate, etc.), and the second is to actually train the best combination found. For kicking off the training itself, there's very little new coding to do.
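If it helps to see it concretely, here's a rough sketch of what that two-step loop can look like. The hyperparameter names and the `train_agent` helper are just illustrative, not from any particular library:

```python
from itertools import product

# Illustrative only: typical hyperparameters you might sweep when starting
# on a new RL problem.
hyperparameter_grid = {
    "discount_factor": [0.95, 0.99],   # how much the agent values future reward
    "learning_rate": [1e-3, 1e-4],     # optimizer step size
    "epsilon_decay": [0.995, 0.999],   # how quickly exploration is reduced
}

# Step 1: short training runs to see which combination is workable.
for discount, lr, decay in product(*hyperparameter_grid.values()):
    print(f"short run: gamma={discount}, lr={lr}, decay={decay}")
    # train_agent(discount, lr, decay, steps=SHORT)  # hypothetical helper

# Step 2: one full-length run with the best combination found above.
```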
Now, if you also had to code a simulation environment for the agent to interact with, then that could be significant coding as you move to a new problem. Updating the state features/action space is minimal code, though, as sketched below. Hopefully that helps clarify!
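To make that concrete, here's a minimal environment sketch using the Gymnasium interface (the class name, spaces, and dynamics are all placeholders for your actual problem). Notice that redefining the state features and action space is just a couple of lines:

```python
import numpy as np
import gymnasium as gym
from gymnasium import spaces

class MyTaskEnv(gym.Env):
    """Placeholder environment; the task-specific work lives in step()."""

    def __init__(self):
        # Four continuous state features; swap these for your problem.
        self.observation_space = spaces.Box(
            low=-1.0, high=1.0, shape=(4,), dtype=np.float32
        )
        # Two discrete actions; swap these for your problem.
        self.action_space = spaces.Discrete(2)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.state = self.observation_space.sample()
        return self.state, {}

    def step(self, action):
        # Hypothetical dynamics and reward; a real simulation encodes the
        # actual task here, and this is where the significant coding goes.
        self.state = self.observation_space.sample()
        reward = 1.0 if action == 1 else 0.0
        terminated = False
        return self.state, reward, terminated, False, {}
```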
Here's a great code example of a deep Q-learning agent playing Atari Breakout: https://keras.io/examples/rl/deep_q_network_breakout/