What I suggest is to use this function: if v(s') == 1, set v(s) to 1; otherwise apply the usual rule.
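To make that concrete, here is a rough sketch of the suggestion next to the usual update. It assumes "the usual rule" means the temporal-difference update V(s) <- V(s) + alpha * (V(s') - V(s)) and that "then 1" means V(s) is set to 1. The function names and the default alpha = 0.1 are made up for illustration.

```python
def td_update(v_s, v_next, alpha=0.1):
    """Usual rule (assumed): move V(s) a fraction alpha toward V(s')."""
    return v_s + alpha * (v_next - v_s)


def suggested_update(v_s, v_next, alpha=0.1):
    """Suggested variant: jump straight to 1 when the next state is a win."""
    if v_next == 1:
        return 1.0
    # Any other next-state value falls back to the usual rule.
    return td_update(v_s, v_next, alpha)


if __name__ == "__main__":
    print(td_update(0.5, 1.0))         # small step toward the win
    print(suggested_update(0.5, 1.0))  # full jump to 1
    print(suggested_update(0.5, 0.0))  # loss: same as the usual rule
```

Written this way, the suggestion amounts to using a step size of 1 whenever the next state is a win and the normal step size everywhere else.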

Eridrus · on Sept 24, 2016

Let's pretend alpha = 1 on a win and alpha = 0.1 on a loss.

Imagine a scenario where you play a game, the opponent plays poorly, and you win. You then try to repeat the same thing, but this time the opponent has learnt from their mistakes and beats you. You'll keep playing the same losing move significantly more times because it worked that one time.
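A toy calculation of that scenario, under simplifying assumptions that are not from the thread: a single move whose value starts at 0.5, a greedy player who keeps choosing it while its value is at least the 0.5 baseline of the untried alternatives, one win followed by nothing but losses, and a target of 1 for a win and 0 for a loss. The function name is hypothetical.

```python
def losses_until_abandoned(alpha_win, alpha_loss, baseline=0.5):
    """Count losses needed before the move's value drops below the baseline."""
    v = baseline
    v += alpha_win * (1.0 - v)  # the one lucky win against a weak opponent
    losses = 0
    while v >= baseline:
        v += alpha_loss * (0.0 - v)  # every later game is a loss
        losses += 1
    return losses


if __name__ == "__main__":
    print(losses_until_abandoned(alpha_win=1.0, alpha_loss=0.1))  # asymmetric
    print(losses_until_abandoned(alpha_win=0.1, alpha_loss=0.1))  # symmetric
```

In this toy setup the asymmetric step sizes take 7 losses to give up on the move, while the symmetric ones take 1.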

I don't know why everyone wants to second-guess the first chapter of the standard textbook in this space with what seems like no experience even thinking about this topic...

piedradura · on Sept 24, 2016

When you lose, the value of v' changes, and so the value of v changes too.