When it’s all in memory you get to amortize the cost of the initial load. Or just pay it when it’s not part of the hot path. When it’s segmented, you’re doing that because memory is full and you need to read in all the segments you don’t have. That’ll completely overwhelm the log n of the search you still get
I was trying to make the point that the dominant factor becomes linear instead of logarithmic, but more accurately it's O(S log N) = O(N log N) because S (number of segments) is proportional to N (number of vectors).
Ah yeah that’s what I wanted to write but I guess I didn’t want to put words in your mouth, and stuck to what I could be certain about happening. We do all this work to throw away the unneeded bits in one situation and when comparing it to a slightly different situation go “huh some of that garbage would be kinda nice here”