It has nothing to do with the model and is just a question of whether b is a var...

DanWaterworth · on Aug 29, 2016

> It has nothing to do with the model and is just a question of whether b is a variable in the big O notation.

There are models, like the cache oblivious model, where b is assumed to be a variable, so the model matters.

> a binary tree and b-tree stay within a constant factor even as you vary b.

it's not a constant factor if b is a variable. It's a factor of log b.

SamReidHughes · on Aug 29, 2016

> it's not a constant factor if b is a variable. It's a factor of log b.

It's between 1/(sqrt(2)-1) and sqrt(2)/(sqrt(2)-1), depending on how full the b-tree nodes are.

DanWaterworth · on Aug 29, 2016

I don't know where you got those numbers from.

SamReidHughes · on Aug 29, 2016

> The reason is, the binary tree accesses that a single b-sized access replaces get exponentially more "local."

sqrt(n)+sqrt(n/b)+sqrt(n/b^2)+... versus sqrt(n)+sqrt(n/2)+sqrt(n/4)+...

And since b-tree nodes can be half empty, there's a sqrt(2) uncertainty. (And of course there is no other memory overhead at all, none whatsoever.)

DanWaterworth · on Aug 29, 2016

Thanks, I see. Are you assuming ephemeral usage of nodes to make that claim?

SamReidHughes · on Aug 29, 2016

I don't know what that means. I'm assuming a random element of the tree is picked, that parent nodes are in a smaller or equal cache level than children, that all but one cache levels are used completely, that each cache level has O(sqrt n) access time, and that there is an upper bound on the ratio between successive cache sizes.

Or less generally: it takes sqrt(j) nanoseconds to dereference the pointer with value j, and parent nodes are at smaller addresses than their children.

DanWaterworth · on Aug 30, 2016

Right, you're assuming there's one tree being maintained.

SamReidHughes · on Aug 30, 2016

Or any fixed number of them, or any curve where the number is polynomially smaller than the total size...