tl;dr: he used a counter provided by Intel that reports the total number of microcode instructions translated. He tried thousands of possible opcodes in "speculation mode" (this is the mode CPUs use to calculate both forks of a branch while waiting for the branch to be decided) and checked which ones caused an anomalous number of microcode instructions to be translated.
He found 13 likely candidates for unpublished ops, including the 2 that were recently found. Also a few unpublished quirks of some known instructions.
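To make the "checked for an anomalous count" step concrete, here is a minimal sketch of the filtering idea only. Everything in it is an assumption for illustration: the candidate labels, the counter values, and the `find_anomalies` helper are made up, and the median-as-baseline rule is just one plausible way to define "anomalous", not necessarily what the author did.

```python
from statistics import median


def find_anomalies(counts, tolerance=0):
    """Return the candidates whose counter delta differs from the baseline.

    `counts` maps a candidate label to the counter delta observed while that
    candidate was only speculatively executed. The baseline is taken to be the
    median delta, assuming most candidates decode to nothing interesting.
    """
    baseline = median(counts.values())
    return {op: n for op, n in counts.items() if abs(n - baseline) > tolerance}


# Hypothetical measurements (labels and numbers are invented, not real data).
measurements = {
    "cand_00": 4,
    "cand_01": 4,
    "cand_02": 9,  # made-up outlier standing in for an interesting opcode
    "cand_03": 4,
    "cand_04": 4,
}

suspects = find_anomalies(measurements)
print(suspects)
```

In practice the counter would be noisy, so a nonzero `tolerance` and repeated measurements per candidate would probably be needed.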
A technical nitpick: speculation doesn't check both forks of a branch, it has to pick a side!
The CPU tries very hard to guess which way branches go, and that allows it to speculate much further than if it tried every possible combination.
What happens in this post is that the author writes a CALL instruction, but then manipulates the stack so that it doesn't actually return where the CPU expects it to.
So the CPU will speculatively execute the instructions that follow the CALL linearly, even though they are never actually reached!
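A toy model of that CALL trick, in case it helps. This is not the author's code and it does not touch real hardware: `simulate`, the `window` size, and the instruction names are all invented for illustration. It only shows the bookkeeping: the predictor assumes the return lands right after the CALL, the real return address has been rewritten, so whatever sits after the CALL runs transiently and is then thrown away.

```python
def simulate(program, call_index, real_return, window=2):
    """Toy model of a mispredicted return.

    Returns (speculated, retired):
      speculated - instructions run transiently and then discarded
      retired    - instructions that actually execute architecturally
    """
    # A return predictor would guess that the callee returns to the
    # instruction right after the CALL.
    predicted_return = call_index + 1

    speculated = []
    if predicted_return != real_return:
        # Misprediction: a few instructions after the CALL execute
        # speculatively before the CPU notices and squashes them.
        speculated = program[predicted_return:predicted_return + window]

    # Architecturally, execution resumes at the rewritten return address.
    retired = program[real_return:]
    return speculated, retired


# Hypothetical layout: the candidate opcode sits right after the CALL,
# and the callee rewrites its return address to point at the landing pad.
program = ["call f", "candidate opcode", "filler", "landing pad", "exit"]
speculated, retired = simulate(program, call_index=0, real_return=3)

print("speculated:", speculated)
print("retired:   ", retired)
```

The point is that the candidate opcode shows up only in the speculated list, so even an instruction that would fault never does so architecturally, yet it can still leave a trace in a performance counter.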