I don't understand why there is a difference between following an algorithm for picking drawers and picking them at random. Isn't the chaining algorithm they describe just another form of random order (with the warden being the random number generator)? Is there a simple explanation?
If you randomize a list of numbers there will always be cycles like the ones described (similar to how there will always be cycles if you draw lots for Secret Santa and then reveal who drew whom), and these cycles have a certain length. It might be one cycle of length 100, but more commonly there will be several shorter cycles. If you start with your own number, you are guaranteed to be in a cycle that leads back to you; you just don't know how long that cycle will be (could be 1, could be 100, more likely something in between). And you can calculate that in ~31% of cases all cycles will have a length of 50 or less. The strategy is just a neat way to exploit a kind of structure that will always exist within the randomness.
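For anyone who wants to check the ~31% figure, here is a minimal sketch. It assumes the warden's arrangement is a uniformly random permutation and uses the standard counting fact that such a permutation of n items contains a cycle of length exactly k (for k > n/2) with probability 1/k. The function name is made up for illustration.

```python
from fractions import Fraction

def prob_all_cycles_short(n=100, limit=50):
    """Probability that a uniformly random permutation of n items
    has no cycle longer than `limit`.

    Only valid when limit >= n/2: then at most one cycle can be longer
    than `limit`, so the probabilities of the long-cycle lengths add up.
    """
    if 2 * limit < n:
        raise ValueError("formula only holds for limit >= n/2")
    # P(some cycle of length k) = 1/k for each k > n/2 (assumed fact, see above)
    p_long = sum(Fraction(1, k) for k in range(limit + 1, n + 1))
    return 1 - p_long

print(float(prob_all_cycles_short()))  # roughly 0.31
```

The exact fraction is 1 - (1/51 + 1/52 + ... + 1/100), which is where the ~31% comes from.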
The key insight is that this strategy causes there to be a strong correlation between the success of different prisoners.
As an example, let's say prisoner #10 opens his box and finds #21, then finds #5 in that box, then #84, then #51, then finally succeeds and finds his number 10 in box #51. These boxes form a cycle 10-21-5-84-51 (and then back to 10). Anyone who starts at any of these boxes will eventually see the same set of numbers that #10 did, so we know prisoners #5, #10, #21, #51 and #84 will all succeed in finding their number by starting at their own box.
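To make that concrete, here is a small sketch of the chain-following rule applied to this exact cycle. The dict representation and the function name are just illustrative choices, and every box outside the cycle is filled with its own number purely to keep the example small.

```python
def follow_chain(boxes, prisoner, max_opens=50):
    """Return the boxes opened by `prisoner` in order, or None if the
    prisoner's number is not found within `max_opens` boxes.

    boxes[b] is the number on the slip inside box b.
    """
    opened = []
    current = prisoner              # start at the box with your own number
    for _ in range(max_opens):
        opened.append(current)
        found = boxes[current]
        if found == prisoner:
            return opened
        current = found             # go to the box named by the slip you found
    return None

# Box 10 holds 21, box 21 holds 5, box 5 holds 84, box 84 holds 51, box 51 holds 10.
boxes = {b: b for b in range(1, 101)}
boxes.update({10: 21, 21: 5, 5: 84, 84: 51, 51: 10})

for p in (5, 10, 21, 51, 84):
    print(p, follow_chain(boxes, p))
```

Each of the five prisoners opens the same five boxes, just starting at a different point in the loop, and each one finishes on the box holding their own number.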
Compare that to the situation where they just look in boxes at random and each independently has only a 50% chance to succeed: then the odds that those 5 prisoners all succeed would be (1/2)^5 = 1/32, or only about 3%. With the chain strategy, the n prisoners in a cycle no longer have a (1/2)^n chance of all succeeding; they all succeed together or all fail together.
Now that the success of every prisoner in the same cycle is perfectly correlated with each other, the prisoners' overall chance of success depends only on whether the random permutation created by the warden has any cycles >50 in length. If so, then everyone in that cycle will fail. If not, then all the cycles across the 100 boxes are short enough that all 100 prisoners will succeed.
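If you'd rather see this than take it on faith, here is a rough simulation sketch. It assumes the warden shuffles uniformly at random (here via Python's random.shuffle), numbers boxes from 0 for convenience, and checks on every trial that "everyone succeeds" coincides with "no cycle is longer than 50". All function names are made up for the example.

```python
import random

def cycle_lengths(perm):
    """Lengths of all cycles of a permutation, where perm[i] is the number in box i."""
    seen = [False] * len(perm)
    lengths = []
    for start in range(len(perm)):
        if seen[start]:
            continue
        length = 0
        i = start
        while not seen[i]:
            seen[i] = True
            i = perm[i]
            length += 1
        lengths.append(length)
    return lengths

def everyone_succeeds(perm, max_opens):
    """Simulate every prisoner following the chain from their own box."""
    for prisoner in range(len(perm)):
        box = prisoner
        for _ in range(max_opens):
            if perm[box] == prisoner:
                break               # found own number
            box = perm[box]         # follow the slip to the next box
        else:
            return False            # this prisoner ran out of opens
    return True

def estimate_win_rate(n=100, trials=1000, seed=0):
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        perm = list(range(n))
        rng.shuffle(perm)
        won = everyone_succeeds(perm, n // 2)
        # Group success should coincide exactly with "no cycle longer than n/2".
        assert won == (max(cycle_lengths(perm)) <= n // 2)
        wins += won
    return wins / trials

print(estimate_win_rate())  # should land somewhere near 0.31
```

The estimate wobbles from run to run with the seed and trial count, but the internal assertion never fires: the only thing that matters is the longest cycle.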
It doesn't affect the chance of any one prisoner finding their number - that's still 50%. But the joint strategy removes most of the independence between the individual prisoners' successes, and hence changes the distribution of the number of successful prisoners.
Rather than a roughly normal distribution (sum of independent outcomes) with a big peak around 50% of the prisoners finding their number and a vanishingly small chance of them all (or none) finding the right number, you end up with a very different pattern. If there is a cycle longer than 50, everyone in it fails and everyone else succeeds, so the number of successes lands somewhere in the 0-49 range; each of those values has a probability between 1% and 2%, and together they make up ~69% of the outcomes. The remaining ~31% is a huge peak at exactly 100 successes. Nothing can land between 50 and 99.
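Working the distribution out exactly is short enough to sketch. This assumes a uniformly random permutation and the fact that a cycle of length exactly k > n/2 appears with probability 1/k; a cycle of length k means k failures, so n - k successes. The function name is hypothetical.

```python
from fractions import Fraction

def success_distribution(n=100):
    """Exact distribution of the number of successful prisoners under the
    chain strategy, for n prisoners who may each open n // 2 boxes (n even)."""
    half = n // 2
    dist = {}
    for k in range(half + 1, n + 1):
        # A long cycle of length k makes exactly k prisoners fail.
        dist[n - k] = Fraction(1, k)
    # Otherwise there is no long cycle and everybody succeeds.
    dist[n] = 1 - sum(dist.values())
    return dist

dist = success_distribution()
for s in (0, 25, 49, 100):
    print(s, float(dist[s]))
```

So the failing outcomes are not perfectly flat (0 successes has probability 1/100, 49 successes has 1/51), but they are all small compared with the single spike at 100.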