Let's pretend alpha = 1 on a win and alpha = 0.1 on a loss.
Imagine you play a game, the opponent plays poorly, and you win. You then try to repeat the same thing, but this time the opponent has learnt from their mistakes and beats you. You'll keep playing the same losing move many more times, because the single win pushed its value estimate all the way up and each loss only pulls it back down by 10%.
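To put rough numbers on that, here's a minimal sketch. It assumes the usual incremental update V <- V + alpha * (target - V), with a target of 1 for a win and 0 for a loss. The starting estimate of 0.5 and the "give up once the value drops below 0.5" threshold are my own assumptions for illustration, as are the function names.

```python
def update(value, target, alpha):
    """One step of V <- V + alpha * (target - V)."""
    return value + alpha * (target - value)


def losses_until_abandoned(alpha_win, alpha_loss, initial=0.5, threshold=0.5):
    """Count consecutive losses needed, after a single win, before the
    move's value falls below `threshold` (a hypothetical stand-in for
    the value of some other, untried move)."""
    value = update(initial, 1.0, alpha_win)  # the one lucky win
    losses = 0
    while value >= threshold:
        value = update(value, 0.0, alpha_loss)  # opponent has adapted
        losses += 1
    return losses


# alpha = 1 on a win, alpha = 0.1 on a loss
asymmetric = losses_until_abandoned(alpha_win=1.0, alpha_loss=0.1)
# same alpha = 0.1 for both outcomes
symmetric = losses_until_abandoned(alpha_win=0.1, alpha_loss=0.1)

print(asymmetric, symmetric)  # prints "7 1"
```

Under those assumptions the asymmetric version eats seven losses in a row before it lets go of the move, versus one loss when the same step size is used for both outcomes.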
I don't know why everyone wants to second-guess the first chapter of the standard textbook in this space, seemingly without having spent any time thinking about the topic...