> It is possible the model calculates an approximate board state
Yes - this is exactly what the probes show.
One interesting aspect is that it still learns to play when trained on blocks of move sequences starting from the MIDDLE of the game, so it seems it must be incrementally inferring the board state from what's being played rather than just tracking the moves from a known starting position.
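
For anyone who hasn't seen probing before, here is a toy sketch of the idea. This is not the actual probe code: the "activations" are synthetic random vectors standing in for the model's hidden states, the occupancy of a single square is planted along one hidden direction, and every function name is made up for illustration. The point is only that if a property is linearly encoded in the hidden state, a small classifier trained on (hidden state, label) pairs can read it back out on examples it hasn't seen.

```python
import math
import random


def make_synthetic_activations(n, dim, rng):
    """Hypothetical stand-in for hidden states: one secret direction
    linearly encodes whether a given square is occupied (1) or empty (0)."""
    direction = [rng.gauss(0, 1) for _ in range(dim)]
    data = []
    for _ in range(n):
        h = [rng.gauss(0, 1) for _ in range(dim)]
        occupied = 1 if sum(a * b for a, b in zip(h, direction)) > 0 else 0
        data.append((h, occupied))
    return data


def train_linear_probe(data, dim, epochs=20, lr=0.1):
    """Fit a logistic-regression probe with plain SGD."""
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        for h, y in data:
            z = sum(wi * hi for wi, hi in zip(w, h)) + b
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - y  # gradient of the log loss with respect to z
            w = [wi - lr * g * hi for wi, hi in zip(w, h)]
            b -= lr * g
    return w, b


def probe_accuracy(w, b, data):
    """Fraction of examples where the probe recovers the planted label."""
    correct = 0
    for h, y in data:
        z = sum(wi * hi for wi, hi in zip(w, h)) + b
        correct += int((1 if z > 0 else 0) == y)
    return correct / len(data)


if __name__ == "__main__":
    rng = random.Random(0)
    data = make_synthetic_activations(600, 8, rng)
    train, held_out = data[:400], data[400:]
    w, b = train_linear_probe(train, dim=8)
    print("held-out probe accuracy:", probe_accuracy(w, b, held_out))
```

With a real model you would replace `make_synthetic_activations` with hidden states captured at some layer, train one probe per square, and compare against a probe trained on a randomly initialized network to check that the signal comes from training and not from the probe itself.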