
I completely agree with you. Let me just add two remarks. First, although picking a 9x9 board does make Connect Four intractable for brute-force search, I would be surprised if it made the game much more difficult for AlphaZero, which relies on the generalization capabilities of the network anyway. Second, using a solved game for the tutorial is a feature, not a bug: because the ground truth is known, the resulting agent can be benchmarked precisely.



That's really cool and I didn't think of that. I just wanted clarification: that means you train the agent without the deterministic solution and your "validation/test" (I'm not sure what those phases are called in unsupervised learning) sets are done without the deterministic solution.


Yes, the agent is trained without access to the deterministic solution.


I did not see an evaluation of how close to perfection the agent becomes. Did you compute any sort of error rate (by finding moves that turn a won position into a non-won one, or a drawn position into a lost one)? And how does this error rate drop over time as learning advances? That would be very interesting to see.
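To make the error rate described above concrete, here is a minimal sketch. It uses tic-tac-toe as a stand-in for Connect Four, only because tic-tac-toe can be solved exactly by exhaustive search in a few lines. All names (`value`, `is_mistake`, `error_rate`, the agents) are hypothetical and not taken from AlphaZero.jl or any other project mentioned here. A move counts as a mistake when it lowers the game-theoretic value for the player making it, which covers both cases: won to non-won, and drawn to lost.

```python
from functools import lru_cache
import random

# Boards are 9-character strings of "X", "O" and "." (empty); X moves first.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(board):
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None

def legal_moves(board):
    return [i for i, cell in enumerate(board) if cell == "."]

def play(board, move):
    player = "X" if board.count("X") == board.count("O") else "O"
    return board[:move] + player + board[move + 1:]

@lru_cache(maxsize=None)
def value(board):
    """Exact value for the player to move: +1 win, 0 draw, -1 loss."""
    if winner(board) is not None:
        return -1  # the previous player just completed a line
    moves = legal_moves(board)
    if not moves:
        return 0   # full board, nobody won
    return max(-value(play(board, m)) for m in moves)

def is_mistake(board, move):
    # The move is a mistake if it achieves less than the best possible value.
    return -value(play(board, move)) < value(board)

def error_rate(agent, positions):
    mistakes = sum(is_mistake(b, agent(b)) for b in positions)
    return mistakes / len(positions)

def sample_positions(n_games, seed=0):
    # Collect distinct non-terminal positions from random playouts.
    rng = random.Random(seed)
    seen = set()
    for _ in range(n_games):
        board = "." * 9
        while winner(board) is None and legal_moves(board):
            seen.add(board)
            board = play(board, rng.choice(legal_moves(board)))
    return sorted(seen)

def perfect_agent(board):
    return max(legal_moves(board), key=lambda m: -value(play(board, m)))

def first_move_agent(board):
    return legal_moves(board)[0]  # deliberately naive baseline

if __name__ == "__main__":
    positions = sample_positions(200)
    print("perfect agent:", error_rate(perfect_agent, positions))
    print("naive agent:  ", error_rate(first_move_agent, positions))
```

For Connect Four, the exhaustive `value` function would presumably be replaced by queries to a perfect solver, and `error_rate` would be recomputed on a fixed set of positions after each training iteration to see how it drops over time.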


My team did an implementation of AlphaZero for Connect Four a couple of years ago. Our findings are in a series of blog posts starting at https://medium.com/oracledevs/lessons-from-implementing-alph.... We didn't manage to get to perfection on the policy either, but we got pretty close. You can play against some versions of the network here: https://azfour.com


Your series of blog articles has been an important source of inspiration in writing AlphaZero.jl and I cite it frequently in the documentation. Thanks to you and your team!


Such an evaluation is available in the tutorial: https://jonathan-laurent.github.io/AlphaZero.jl/dev/tutorial...

Admittedly, the Connect Four agent is still far from perfect, but there is a lot of room for improvement, as I have done very little hyperparameter tuning so far.



