> It is possible the model calculates an approximate board state

Yes - this is exactly what the probes show.

One interesting aspect is that it still learns to play when trained on blocks of move sequences starting from the MIDDLE of the game. With no opening position to replay from, it can't just be tracking the moves forward from the start, so it seems it must be incrementally inferring the board state from what's being played.
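
For anyone unfamiliar with what a "probe" means here, below is a minimal sketch of a linear probe. It is not the setup from the work being discussed: the activations are synthetic, and every name in it (`fit_linear_probe`, `probe_predict`, `synthetic_demo`, the 3-class "square state" labels, the ridge penalty) is an assumption made up for illustration. The idea is just that if a simple linear map can read a square's state out of the hidden activations on held-out data, the state is likely represented there.

```python
import numpy as np

def fit_linear_probe(acts, labels, n_classes, ridge=1e-3):
    """Fit a ridge-regularised linear map from activations to one-hot labels."""
    X = np.hstack([acts, np.ones((acts.shape[0], 1))])  # append a bias column
    Y = np.eye(n_classes)[labels]                       # one-hot targets
    # Closed-form ridge regression: (X^T X + ridge * I) W = X^T Y
    return np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ Y)

def probe_predict(W, acts):
    """Predict the class with the highest linear score."""
    X = np.hstack([acts, np.ones((acts.shape[0], 1))])
    return np.argmax(X @ W, axis=1)

def synthetic_demo(seed=0, n=2000, d=64, n_classes=3):
    """Stand-in for real model activations (an assumption, not real data).

    Each example is a fake hidden vector that encodes one square's state
    (say empty / mine / theirs) along a random direction, plus noise.
    """
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, n_classes, size=n)
    directions = rng.normal(size=(n_classes, d))
    acts = directions[labels] + 0.5 * rng.normal(size=(n, d))

    split = n // 2  # train the probe on the first half, test on the second
    W = fit_linear_probe(acts[:split], labels[:split], n_classes)
    preds = probe_predict(W, acts[split:])
    return float(np.mean(preds == labels[split:]))

if __name__ == "__main__":
    print("held-out probe accuracy:", synthetic_demo())
```

With a real model you would swap the synthetic `acts` for activations captured at some layer and the `labels` for the true state of one square at that point in the game, then fit one probe per square.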