That's the physical and link layers' fault for not carrying alignment information. You shouldn't have a bit stream; you should already have a byte stream. Otherwise you're doing data forensics.
The physical layer of disks and networks usually has things like preambles, correlators, phase-locked loops, Gray codes, and self-clocking signals that can be used to deduce bit and byte boundaries, which is why, in software, we almost always deal with data we already know is correctly aligned.
There's no way to align a bit stream that uses exactly 8 bits per byte. (All 256 byte values are valid data, so whatever pattern you pick as an alignment mark can also occur as ordinary data, possibly straddling a byte boundary, and seeing it tells you nothing about where bytes actually start.)
You need to add extra bits so you can find the alignment, and once you do that you have the answer to your question.
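To make that concrete with one hypothetical, UART-flavoured scheme (an illustration, not what the parent comment specifically has in mind): wrap every byte in a start bit of 0 and a stop bit of 1, and a receiver can recover byte alignment from the raw bit stream just by finding the offset at which every frame's start and stop bits check out. A rough Python sketch, function names mine:

    from typing import List, Optional

    def frame_bytes(data: bytes) -> List[int]:
        """Wrap each byte in a start bit (0) and a stop bit (1), LSB first,
        roughly the way an async serial (UART) line frames its characters."""
        bits: List[int] = []
        for b in data:
            bits.append(0)                               # start bit
            bits.extend((b >> i) & 1 for i in range(8))  # 8 data bits, LSB first
            bits.append(1)                               # stop bit
        return bits

    def find_frame_offset(bits: List[int]) -> Optional[int]:
        """Return the offset (0-9) at which every complete 10-bit frame has a
        valid start bit and stop bit, i.e. the recovered byte alignment."""
        for offset in range(10):
            frames = [bits[i:i + 10] for i in range(offset, len(bits) - 9, 10)]
            if frames and all(f[0] == 0 and f[-1] == 1 for f in frames):
                return offset
        return None  # not enough data, or the stream isn't framed this way

With varied real data a wrong offset gets falsified quickly, though a sufficiently repetitive payload can fool this check; real async links also lean on idle periods and preambles to break ties.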
While you may be correct, it's interesting to note the existence of Consistent Overhead Byte Stuffing (http://en.wikipedia.org/wiki/Consistent_Overhead_Byte_Stuffi...) which, sort of, kind of, if you squint, manages to add a 257th symbol to the data stream at a small, bounded worst-case overhead (one extra byte per 254 bytes of data).
The original paper also discusses how to use the "Eliminate all zero bytes" behavior in order to set up an unambiguous sync pattern to indicate the beginning of transmissions in a bit-level wire protocol.
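For anyone who hasn't read it, here's a minimal, unoptimized Python sketch of the idea (my own paraphrase of the algorithm described on that page, not the reference code): the encoder's output never contains 0x00, so a literal zero byte becomes the unambiguous frame delimiter, and the worst case adds one code byte per 254 bytes of payload.

    def cobs_encode(data: bytes) -> bytes:
        """Encode data so the result contains no 0x00 bytes (may emit one
        spare code byte after a maximal 254-byte run; still decodes fine)."""
        out = bytearray()
        # Split at the zero bytes we want to eliminate; each chunk becomes a
        # code byte (distance to the next zero) followed by the chunk itself.
        for chunk in data.split(b"\x00"):
            while len(chunk) >= 254:       # long runs get a 0xFF code,
                out.append(0xFF)           # which implies no zero afterwards
                out += chunk[:254]
                chunk = chunk[254:]
            out.append(len(chunk) + 1)
            out += chunk
        return bytes(out)

    def cobs_decode(encoded: bytes) -> bytes:
        """Invert cobs_encode."""
        out = bytearray()
        i = 0
        while i < len(encoded):
            code = encoded[i]
            if code == 0:
                raise ValueError("0x00 is reserved as the frame delimiter")
            out += encoded[i + 1:i + code]
            i += code
            # Every code below 0xFF stands in for a zero byte, except at the
            # very end of the frame.
            if code < 0xFF and i < len(encoded):
                out.append(0)
        return bytes(out)

    msg = b"\x11\x22\x00\x33"
    enc = cobs_encode(msg)      # b'\x03\x11\x22\x02\x33' -- no zeros anywhere
    assert cobs_decode(enc) == msg

So if you frame transmissions as encoded-data followed by 0x00, a receiver that joins mid-stream just waits for the next zero byte and is byte-synchronized from there.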
You remind me of a question my cryptography professor in college asked (and failed to answer...) -- what are the information-theoretical properties of a channel which sometimes deletes a character?
He attempted to answer it, but his answer assumed the recipient could tell which characters had been deleted, which doesn't make any sense to me (it's rather like assuming that if someone sends you a postcard and it never arrives, you nevertheless sense, at the moment it would have arrived, that you should have gotten it).
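That assumption is basically the difference between a deletion channel and an erasure channel: with erasures the receiver sees a placeholder wherever a symbol was lost, while with deletions the string just silently gets shorter. (The binary erasure channel has the tidy capacity 1 - p; as far as I know, the capacity of the binary deletion channel is still an open problem.) A toy sketch of the two, function names mine:

    import random

    def deletion_channel(msg: str, p: float, rng: random.Random) -> str:
        """Drop each symbol with probability p; the output carries no trace
        of where (or whether) anything went missing."""
        return "".join(c for c in msg if rng.random() >= p)

    def erasure_channel(msg: str, p: float, rng: random.Random) -> str:
        """Replace each lost symbol with a visible erasure mark, so the
        receiver knows exactly which positions were hit."""
        return "".join("?" if rng.random() < p else c for c in msg)

    rng = random.Random(0)
    print(deletion_channel("POSTCARD", 0.25, rng))  # shorter string, gaps invisible
    print(erasure_channel("POSTCARD", 0.25, rng))   # same length, losses marked '?'

Your postcard analogy is exactly the erasure assumption: knowing a postcard should have arrived is extra side information the deletion model doesn't give you.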