It's also possible that the positional evaluation is strong enough that AlphaGo can see the value in a position before a human can, given how complex it is to determine the "value" of any given position.
My background is in chess and chess AI, and in my experience, the more positional knowledge is built into the evaluation function, the better the search performs, even if you have to sacrifice some speed for the more thorough evaluation. A significant positional weakness may never be discovered within the search horizon of a chess engine, because it may take 50 moves for the weakness to turn into a material loss. So while it's certainly possible that a deep but carefully pruned search is being used, I suspect that the Value Network's evaluation is responsible for some of these seemingly odd moves.
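To make that trade-off concrete, here's a toy sketch (my own illustration, not AlphaGo's or any real engine's code; the board encoding and the 25-centipawn penalty are made up for the example) of an evaluation function that pairs a cheap material count with a slower positional term:

```python
PIECE_VALUES = {'P': 100, 'N': 320, 'B': 330, 'R': 500, 'Q': 900, 'K': 0}

def material(board):
    """Cheap term: sum piece values (uppercase = our pieces)."""
    score = 0
    for piece in board.values():
        value = PIECE_VALUES[piece.upper()]
        score += value if piece.isupper() else -value
    return score

def doubled_pawn_penalty(board):
    """Slower positional term: penalize our doubled pawns.
    A doubled pawn may not cost material for dozens of moves, so a
    purely material evaluation never 'sees' it within the horizon."""
    files = [sq[0] for sq, piece in board.items() if piece == 'P']
    extras = sum(files.count(f) - 1 for f in set(files))
    return -25 * extras

def evaluate(board):
    # Every positional term lowers nodes-per-second, but folding long-term
    # knowledge into the leaf evaluation often beats a deeper, dumber search.
    return material(board) + doubled_pawn_penalty(board)

# White's doubled e-pawns drag an otherwise equal material count below zero.
print(evaluate({'e2': 'P', 'e3': 'P', 'd7': 'p', 'a7': 'p'}))  # -25
```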
For AlphaGo to recognize a position that doesn't pay off for 20 moves, it would often have to search much deeper than those 20 moves. (I'm not sure whether you're using "moves" to mean ply or a move by each player, but if it takes 20 AlphaGo moves for the advantage to materialize, that's a minimum 40-ply search.) The search would also need to be quiesced to the point that material exchanges have stopped (again, this is how chess engines typically do it; I don't know about Go), so the evaluation at the end of the 20-move sequence is arguably more important than a deep search. The sooner you can recognize that a position is good or bad for you, the more time you have to improve it.
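Here's a minimal, self-contained sketch of the quiescence idea as chess engines typically implement it (chess practice; whether AlphaGo needs anything like it is exactly the question above). To keep it runnable, a "position" is just a hypothetical (static_eval, capture_children) pair rather than a real board, and the piece values are illustrative:

```python
def quiesce(pos, alpha, beta):
    """Search past the nominal horizon, but only along capture sequences,
    so the leaf evaluation is never taken in the middle of an exchange."""
    stand_pat, captures = pos          # score from the side-to-move's view
    if stand_pat >= beta:
        return beta                    # already good enough: fail-high
    alpha = max(alpha, stand_pat)
    for child in captures:             # extend only material exchanges
        score = -quiesce(child, -beta, -alpha)   # negamax sign/window flip
        if score >= beta:
            return beta
        alpha = max(alpha, score)
    return alpha

# Horizon effect in miniature: we just captured the opponent's queen, so a
# static evaluation here (opponent to move) reads -900 for them, but letting
# them recapture shows the exchange was actually even.
after_recapture = (0, [])              # exchange over, no captures remain
node = (-900, [after_recapture])
print(node[0], quiesce(node, -10**9, 10**9))   # static -900 vs. quiesced 0
```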