Great question! I would say the main reason this work is significant is that the StarCraft agent was bootstrapped from human replays, and the Dota agent did not come with game-theoretic guarantees on minimizing exploitability (i.e. how far the strategy is from a Nash equilibrium). The R-NaD algorithm with neural nets (behind Stratego) starts from scratch and has game-theoretic guarantees.
In principle, AlphaStar's league approach (from StarCraft) could also be applied to Stratego, and it would be very interesting to compare the two approaches. Note that AlphaStar is more expensive: it requires training N competing agents, with pairwise evaluation costing O(N^2), while Stratego's NeuRD trains a single agent.
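Just to make the cost comparison concrete, here is a rough back-of-envelope sketch (my own illustration, not from either paper) of why league-style training scales quadratically in the number of agents while a single self-play agent does not:

```python
# Purely illustrative: counts the evaluation matchups needed,
# not the actual training setups used by AlphaStar or DeepNash.

def league_eval_matchups(n_agents: int) -> int:
    """Distinct pairwise evaluations for an AlphaStar-style league of N agents."""
    return n_agents * (n_agents - 1) // 2  # grows as O(N^2)

def single_agent_matchups() -> int:
    """A single self-play agent (as in R-NaD/NeuRD) only plays copies of itself."""
    return 1

for n in (8, 32, 128):
    print(n, league_eval_matchups(n))  # 28, 496, 8128 -> quadratic blow-up
```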