Hoi Peter, >I have a similar conjecture for glidesort but I'll have to do some t...

orlp · on Dec 2, 2022

> Hoi Peter,

A nitpick, but my name is Orson Peters, Peters is my last name :)

> I assume that's one of the advantages of building upon powersort? The blocky nature of quadsort is both a blessing and a curse in that regard.

I don't know, I only use a binary search when splitting up merges, and almost no time is spent in this routine. I use this for the low-memory case, as well as to create more parallelism to use for instruction-level parallelism.

> As for instruction-level parallelism, I do think it's actually almost entirely memory-level parallelism. I could be wrong though.

I didn't do specific research into which effect it is, when I say ILP I also mean the memory effects of that.

scandum · on Dec 2, 2022

>A nitpick, but my name is Orson Peters

I actually knew that, not sure what went wrong in my brain.

>I don't know, I only use a binary search when splitting up merges, and almost no time is spent in this routine.

You'll still get a definite speed-up, worth benching the difference just in case. I guess there's no easy way to avoid a binary search.

>I didn't do specific research into which effect it is, when I say ILP I also mean the memory effects of that.

They're indeed related. I did some empirical testing and I'm quite sure it's cache related. Thinking about it, one issue I may have had was not benching sorting 100M elements, which might be where a bidirectional partition might benefit on my system.