Here is a fun challenge for language snobs: implement bit-streams for your favorite language. Here is a simple signature:
bitvec read_nbits(unsigned int count, bitstream input);
bool write_nbits(unsigned int count, bitvec bits, bitstream output);
Then add multi-record I/O. That is, read a record of bitvecs, each N bits wide, where the length is not given explicitly but is encoded in the bitstream in this manner: read a bitvec record of N bits; if the high bit is set (take it in whatever endianness you like), read another record and repeat; if the high bit is not set, return whatever you have read so far.
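A minimal sketch of that loop in C, against the signatures above (bitvec is modeled here as an unsigned int holding the N bits right-aligned, bitstream as an opaque handle, and read_multirecord is just an illustrative name):

#include <stddef.h>

typedef unsigned int bitvec;               /* N bits, right-aligned */
typedef struct bitstream_impl *bitstream;  /* opaque handle */
bitvec read_nbits(unsigned int count, bitstream input);

size_t read_multirecord(unsigned int n, bitstream in, bitvec out[], size_t max)
{
    size_t count = 0;
    while (count < max) {
        bitvec rec = read_nbits(n, in);
        out[count++] = rec;
        if ((rec & (1u << (n - 1))) == 0)  /* high bit clear: last record */
            break;
    }
    return count;
}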
Then add the ability to return the bitvec records as single integers (bignums, even), both signed and unsigned, assembled in this manner: if the bitvecs are unsigned, each record contributes its 7 least significant bits, and the high bit is dropped. Taking the bit on either end of the first record as the most significant, iterate over the rest, shifting and ORing according to your endian needs, to construct an integer from all the bits.
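That 7-bits-per-group scheme is essentially the classic LEB128/varint layout; Protocol Buffers and DWARF encode integers this way. A minimal sketch of the unsigned decode in C, assuming N = 8 and least-significant group first (the signed variant would sign-extend the result at the end):

#include <stdint.h>
#include <stddef.h>

uint64_t decode_unsigned(const uint8_t *buf, size_t len, size_t *used)
{
    uint64_t result = 0;
    unsigned shift = 0;
    size_t i = 0;
    while (i < len) {
        uint8_t b = buf[i++];
        result |= (uint64_t)(b & 0x7F) << shift;  /* keep the low 7 bits */
        shift += 7;
        if ((b & 0x80) == 0)                      /* high bit clear: done */
            break;
    }
    *used = i;
    return result;
}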
When I first did this exercise I was very close to gouging my own eyeballs out, but after I did it, I started to think of bits as something very natural and not to be feared.
Are you sure you're decoding bits from an octet stream (normal 8-bit bytes) and not using logical booleans? I ask because I don't see any bit-manipulation operators, except maybe for "1 :> 'T", which looks like SML module type casting, but I could be wrong.
Yes, very sure.
nthbit extracts the nth bit from an integer type. The bitwise logic is the &&& and <<<; in F# the bitwise operators are tripled characters. But there is a bug in there: I forgot to multiply sz by 8.
http://msdn.microsoft.com/en-us/library/dd469495.aspx
bitstream turns a sequence of integer types into a sequence of bits.
So if you wanted to turn a stream into bits, you'd do something like the following.
new System.IO.FileStream("foo.txt", System.IO.FileMode.Open)
|> Seq.unfold (fun (s : System.IO.FileStream) ->
    match s.ReadByte() with
    | -1 -> None               // end of stream
    | x  -> Some (byte x, s))  // yield one byte, reuse the stream as the state
|> bitstream
I just realized that the bitstream function is unnecessary; you could instead have the unfold yield the bits directly.
Just for fun, here's the reverse: turn a sequence of booleans into a sequence of integer types (byte, int, long, etc.). Note that the length of the sequence s must be a multiple of sizeof<'T> times 8.
// Specialized to int for simplicity; a fully generic 'T needs inline helpers.
let bitsToBitContainer (s : seq<bool>) : seq<int> =
    let sz = sizeof<int> * 8
    s
    |> Seq.scan (fun (i, n) x ->
        let bit = if x then 1 else 0
        if i = sz then (1, bit)                  // start a new container
        else (i + 1, (n <<< 1) ||| bit)) (0, 0)
    |> Seq.filter (fun (i, _) -> i = sz)         // keep only full containers
    |> Seq.map snd
You wouldn't happen to still have your code, would you? This sounds like an interesting exercise. Maybe you could blog about your implementation and we can compare notes.
As much as I love bit manipulation, I have to say that most (not all) of these tricks don't provide any real performance gain in C/C++ code (only some geeky satisfaction at writing difficult-to-interpret code!).
e.g., with any decent compiler:
* if ((x & 1) == 0) performs the same as if ((x % 2) == 0)
* if (x & (1 << n)) performs the same as if (x % (pow(2, n)))
and so on.
On the other hand, I find bit operations really handy when the variables in question are to be treated as separate bits rather than as ordinary base-10 numbers.
FWIW, you were actually looking for (x & ((1 << n) - 1)) = (x % pow(2, n)).
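That identity does hold for unsigned values; a quick self-check sketch:

#include <assert.h>

int main(void)
{
    unsigned x = 1234567u;
    /* masking the low n bits == taking x modulo 2^n */
    for (unsigned n = 1; n < 32; n++)
        assert((x & ((1u << n) - 1)) == (x % (1u << n)));
    return 0;
}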
That aside, you're absolutely right. Using bit ops is almost always a bad idea. It turns a simple store (to a byte-addressed memory location) into a read, manipulate, and store (to a bit within a byte-addressed memory location), which can have pretty bad side effects in any concurrent environment.
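To make the read-modify-write concrete, here's a small sketch (the function names are illustrative; atomic_fetch_or is standard C11):

#include <stdatomic.h>

unsigned char flags;                /* plain flags byte */
_Atomic unsigned char atomic_flags; /* atomic counterpart */

/* "Set bit 3" is a load, an OR, and a store, not a single store; the
   non-atomic form can silently lose another thread's update. */
void set_bit3_racy(void)   { flags |= 1u << 3; }
void set_bit3_atomic(void) { atomic_fetch_or(&atomic_flags, 1u << 3); }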
Granted, yes. There was discussion about whether the patent is at all valid, given abundant prior art. However, it depresses me too, because you probably need lots of money, and possibly a lawsuit, to settle the validity issue.
Yeah, that's a better one, with more actual hacks. Most of the "hacks" in the original post weren't really hacks, but the way embedded programmers set and toggle bits every day.
Is it pretty common? As someone who does a fair number of software engineering interviews, I'd say that's a trick question. The real answer to "swap two vars with no temps" is:
1. Don't be clever in our code base. Use a temp variable.
2. There are various dumb tricks with XOR, and possibly add/subtract if overflow doesn't break things (see the sketch after this list).
3. A sequence of several instructions where each of them requires the result of the previous one may not execute particularly fast on modern processors. Instruction/cycle counts -- like 3 -- are great when there's no pipeline and no cache, but otherwise pretty much useless.
4. The things you're swapping might be local variables, and when the compiler has -O <anything> specified, local variables start getting weird, and "swap" can sometimes be done in zero instructions, namely by the compiler noting that they have now been swapped and using the other one for the rest of the basic block. (or further dominated basic blocks for that matter)
5. If the things you're swapping are in main memory, or even just not in L1, you're going to incur a cost much greater than the temporary use of a register. (And if you don't know where they are and they might be in main memory, that can dominate the average runtime.)
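For the record, here's the XOR trick from point 2 as a C sketch, complete with the aliasing guard it needs (and, per point 1, you still shouldn't use it):

void xor_swap(unsigned *a, unsigned *b)
{
    if (a != b) {      /* without this, swapping x with itself zeroes it */
        *a ^= *b;
        *b ^= *a;
        *a ^= *b;
    }
}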
Embedded and firmware engineering interviews expect this as an answer. Anyone following your advice will not be taken seriously there, regardless of whether you're factually correct. Readers of this thread deserve to know that.
Point 3 is actually quite valid for embedded. Any swap will actually be expensive when it comes to keeping cache lines clean. The correct answer in that case is not to swap the variables at all, but instead to swap their uses later on:
int x, y;
...
SWAP(x, y);
foo(x, y);
becomes
int x, y;
...
foo(y, x);
(naturally, this is why I still eagerly await the arrival of a C compiler that has macros with LISP power)
In that case, you can get the right behavior by just swapping the variables using a temporary variable. If the compiler is decent, it'll automatically swap their uses later on.
I don't know how every compiler works, but if you use Clang (or anything LLVM-based), it converts everything to Static Single Assignment (SSA) form:
In SSA form, the value of a variable does not change, so it ends up creating a bunch of "imaginary" variables to hold intermediate values. From there, it does optimizations, then figures out how best to allocate registers, and what needs to be stack-allocated.
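A rough illustration of that renaming (the .0/.1 versions are invented notation, not actual LLVM IR):

static int use(int a, int b) { return a + b; }

int example(int p, int q)
{
    int x = p;        /* x.0 = p           */
    int y = q;        /* y.0 = q           */
    int t = x;        /* t.0 = x.0         */
    x = y;            /* x.1 = y.0         */
    y = t;            /* y.1 = t.0 (= x.0) */
    /* Later uses of x are really uses of y.0 and vice versa, so the
       "swap" can cost zero instructions. */
    return use(x, y); /* becomes use(y.0, x.0) */
}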
Most of the hacks in that guide assume that you don't care about readability, or even portability to a certain extent. It certainly isn't every day that you need to optimize your code at that level, but in some instances it could be useful (for example, trying to reduce delay in a real-time program).
As an interviewer, the real value of this question is the attitude it reveals. The best candidates know the answer because they've read widely and know the tricks. They also then add, "But I wouldn't use it."
The very best candidates add: Because it's tricky, hard to read, limited in scope, and usually you can avoid swapping variables by changing their usage downstream. Besides, the best compilers will sort it out for you if you write it clearly and cleanly.
When I interview, it's not the answers I listen to; it's the knowledge they expose, not of programming per se, but of good practices in programming.
If you're interested in reading about more of these and how they work, I'd highly suggest the book Hacker's Delight by Henry S. Warren [1]. I know I've used more than a few of its tricks in my day-to-day work.
Let's say you use a bitmask for keeping track of free registers in a compiler. To "allocate" a register, you'll want to clear a bit, to indicate that the register is no longer free.
That's the context I learned this trick in. It's used in the Embarcadero (ex-Borland) compilers (Delphi, C++, etc.).
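A small sketch of that bookkeeping (alloc_reg is an illustrative name, not the actual Embarcadero code). Bit i set means register i is free; x & -x isolates the lowest set bit, and x & (x - 1) clears it:

#include <stdint.h>

int alloc_reg(uint32_t *free_regs)
{
    uint32_t r = *free_regs;
    if (r == 0)
        return -1;                  /* nothing free */
    uint32_t lowest = r & (0u - r); /* isolate the lowest set bit */
    *free_regs = r & (r - 1);       /* clear it: register allocated */
    int idx = 0;                    /* turn the isolated bit into an index */
    while (lowest >>= 1)
        idx++;
    return idx;
}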
This is mainly useful for testing whether a given number is an integral power of 2 (you get zero in that case). Otherwise it might be useful for some special-purpose counters/timers.
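The power-of-two test in full, as a sketch:

/* x & (x - 1) clears the lowest set bit; a power of two has exactly one
   bit set, so the result is zero. Guard against x == 0, which would
   also give zero. */
int is_pow2(unsigned x)
{
    return x != 0 && (x & (x - 1)) == 0;
}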
These aren't really hacks: they're just basic operations in binary arithmetic that anyone who has seriously coded C or assembly must have bumped into, probably quite early.
It was 10-20 years ago. The funnest project I ever worked on was implementing a 17-function motor controller with 40-bit precision on an MCU with 2K of ROM and 64 BYTES of RAM. That was in '93.
Today embedded systems generally run Linux. You get to write a bit of assembly language code in your bootloader, and then it's just bog standard Unix programming. It's even likely that you'll be doing most of your coding in a scripting language, although it's more likely to be Lua than Ruby...
There are still a lot of 8051s and MSP430s out there.
The fun thing about embedded is that, since everything is done as a result of interrupts or clock signals, you get all the joys of multitasking without a threading library.
I just got handed a project where the 'app' was just main() { while(1) {} }! Everything happens as the result of functions that get magically called when certain bit patterns appear.
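A sketch of that style, with a hypothetical timer interrupt (how the handler gets wired into the vector table is toolchain-specific):

#include <stdint.h>

volatile uint32_t ticks;   /* shared with the ISR, hence volatile */

void timer_isr(void)       /* invoked by hardware, never called from main */
{
    ticks++;
}

int main(void)
{
    for (;;) {
        /* all real work happens in interrupt context */
    }
}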
Great tutorial, but I wish he'd given some context for these. The even/odd check is a well-known one, but propagation of the rightmost 1-bit? I have no idea what that, or most of these, is used for.