Neural networks can approximate the function fine. The issue here is reinforceme...

Neural networks can approximate the function fine. The issue here is reinforcement learning works by initially random or largely random action followed by feedback to get better. This means you fail over and over, possibly millions, possibly billions of time before you ever get good. If you're learning to play chess or go or Starcraft, that's fine. If you're learning to perform high-precision industrial process control, I'm not sure you can afford that much initial waste, and it may take centuries to get through enough iterations since the process does not take place entirely inside of a computer.