1) Any comparison-based sorting algorithm that isn't branchless will mis-predict often.
2) The article completely ignores the possibility of conditional-move instructions, which would cut the mis-predicted branches in this case to zero (or one if we add the "are we done?" check and always predict "no").
I'm not sure you're quite right about 1. Depending on what you mean exactly. For example, insertion sort is O(n^2) but mispredicts O(n) times. That doesn't seem often to me.
There are (artificial) mergesorts that execute O(nlogn) branches but also mispredict O(n) times.
See for example
"branch mispredictions don't affect mergesort:"
http://www.cphstl.dk/Paper/Quicksort/sea12.pdf
In the case of 2. I agree, although compiler support still isn't great -- I think. For almost all cases (integers, floats, strings) radix sorting would be even better -- fewer instructions, no comparison branches.
Hmm, you're right. My thinking was, "if we know enough about the structure of the problem that we can make sure we're guessing right much more than half the time, we can probably turn that into a better O()", which insertion sort as an example doesn't really violate per se - it is possible to trade off that knowledge for better asymptotic complexity, but clearly the claim wasn't quite right as broadly as I stated it.
2) The article completely ignores the possibility of conditional-move instructions, which would cut the mis-predicted branches in this case to zero (or one if we add the "are we done?" check and always predict "no").