
I'm having a hard time parsing what the action space is here.

The paper claims: "AlphaStar's action space is defined as a set of functions with typed arguments".

Looking at citation 7, it seems like they are structuring the action space as (first pick a high-level action) -> (pick argument 1 for the action) -> ... -> (pick argument n for the action). If this is the case, it seems like "cheating" to call this AI, since humans have completely picked out the actions. That is, the achievement here is this: given what humans consider useful actions, AlphaStar can play at a grandmaster level.
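
For concreteness, a minimal sketch of that kind of factored action space might look like the following. The function names, argument types, and sizes are made up for illustration and are not taken from the paper or from citation 7, and a uniform random choice stands in for the learned policy at each step.

```python
import math
import random

# Hypothetical argument types: name -> number of discrete values.
ARG_TYPES = {"unit_id": 8, "screen_xy": 64 * 64, "queued": 2}

# Hypothetical human-designed action functions and their typed arguments.
ACTION_FUNCTIONS = {
    "no_op": [],
    "select_unit": ["unit_id"],
    "move": ["queued", "screen_xy"],
    "attack": ["queued", "unit_id"],
}


def sample_action(rng):
    """Pick the function first, then each of its arguments in order."""
    name = rng.choice(sorted(ACTION_FUNCTIONS))
    args = []
    for arg_type in ACTION_FUNCTIONS[name]:
        # A real agent would condition this choice on the function and on
        # the arguments already chosen; here it is uniform random.
        args.append((arg_type, rng.randrange(ARG_TYPES[arg_type])))
    return name, args


def flat_size():
    """Number of distinct (function, arguments) combinations."""
    return sum(
        math.prod(ARG_TYPES[t] for t in arg_types)
        for arg_types in ACTION_FUNCTIONS.values()
    )


if __name__ == "__main__":
    print(sample_action(random.Random(0)))
    print(flat_size())
```

The point of the sketch is that both the list of functions and the argument types are fixed by hand up front; the agent only learns which entries to pick.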

The achievement here is mostly engineering, in my opinion, and one that extends far beyond the 40ish people listed on the paper; it was probably an effort of over 1,000 people. From casually looking over the paper, there is nothing significantly different from AlphaZero or prior art. Again, the achievement here is what's listed under the infrastructure section of the paper.

In summary, this is a great step forward but now we need to start developing techniques to learn these action space hierarchies instead of throwing more power at increasingly difficult games.




It is extremely different from AlphaZero... In fact, they rely heavily on human knowledge, which is pretty much the opposite of AlphaZero. To quote the paper, "We found our use of human data to be critical in achieving good performance with reinforcement learning".


OK, you're right. I should've said AlphaGo. But that in itself shows what I mean: this is almost a step backwards.


AlphaZero was miraculously good, almost to the point of straining credibility. AlphaGo and AlphaStar are more like normal advances. They are mostly engineering, although the theoretical contributions are not trivial. (Using reinforcement learning for the value network in the case of AlphaGo, and a multi-agent self-play setup in the case of AlphaStar, since straight self-play doesn't work.)



