> Reservoir sampling is still O(N), and it requires generating a random number for every element.
I agree with your comment in general. Only: you can do better than generate a random number for each element. (Generate a random number to tell you how many elements in the steam to skip.)
> Can you show that [generating a number of records to skip instead of a yes/no for each record] will still maintain a uniform probability distribution?
That should work if you pull the skip count from the appropriate distribution (negative binomial, I think).
I agree with your comment in general. Only: you can do better than generate a random number for each element. (Generate a random number to tell you how many elements in the steam to skip.)