Show HN: How to store a set of four 5-bit values in one 16-bit value (github.com/isometric)
188 points by andromeduck on Jan 28, 2018 | 140 comments



Ah, without preserving order. That's a set of four 5-bit values, not a sequence of four 5-bit values.


To answer the obvious next question:

Four 5-bit values in order have 20 bits of entropy, so cannot be stored in 16 bits.

Four 5-bit values without order have 20 - log2(4!) =~ 20 - 4.59 = 15.41 bits of entropy (corresponding to 2^20/4! possible configurations), and thus can fit in 16 bits of data if you're clever about it.


Your math is wrong.

If two of the numbers happen to be the same, order no longer matters for those numbers, so your log2(4!) needs to be larger...


Right. So the true answer is

    log2((32*31*30*29)/(4*3*2) + (32*31*30)/2 + (32*31) + (32*31)/2) = 15.675

which is still smaller than 16.


Are you sure you don't mean

   log2((32*31*30*29)/(4*3*2*1) + (32*31*30)/(3*2*1) + (32*31)/(2*1) + 32/1) = 15.34
using the number of sets of values from {0, ..., 2^5-1} with at most 4 elements?

If on the other hand you want to store exactly 4 values, possibly with duplicates (while still ignoring order), you need to count multisets ( https://en.wikipedia.org/wiki/Bag_(mathematics)#Counting_mul... )

  log2( (35*34*33*32)/(4*3*2*1) ) = 15.676


The second answer, 15.676, is the correct one. It can also be arrived at by:

log2(32C4 + 32C3 * 3 + 32C2 * 3 + 32C1)


What the parent message did is: log2(32+4-1 C 4), because the number of k-sized combinations of n objects with repetitions is (n+k-1)Ck. One way to prove it is that you sort the objects (so you can establish an order among the objects in the combination) and add k-1 dummy objects for "the second is the same as the first, the third is the same as the second,..., the last is the same as the penultimate".
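
If you want to sanity-check these counts, here's a small C++ sketch (mine, not from the linked repo) that computes the multiset formula and the breakdown by number of distinct values:

  #include <cstdint>
  #include <cmath>
  #include <cstdio>

  // n choose k; the running product stays exact because each
  // partial product is itself a binomial coefficient.
  uint64_t choose(uint64_t n, uint64_t k) {
      if (k > n) return 0;
      uint64_t r = 1;
      for (uint64_t i = 1; i <= k; i++) r = r * (n - k + i) / i;
      return r;
  }

  int main() {
      uint64_t multisets = choose(35, 4);          // C(32+4-1, 4) = 52360
      uint64_t breakdown = choose(32, 4)           // 4 distinct values
                         + 3 * choose(32, 3)       // 3 distinct (one doubled)
                         + 3 * choose(32, 2)       // 2 distinct (3+1 or 2+2)
                         + choose(32, 1);          // all four equal
      printf("%llu %llu %.3f\n",
             (unsigned long long)multisets,
             (unsigned long long)breakdown,
             std::log2((double)multisets));        // 52360 52360 15.676
  }

Both counts agree, and log2(52360) = 15.676 is the figure above.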


I followed the original link as well as this calculation. Why not use this to map them directly (instead of breaking it into 12 bits and 4 bits)? We'll just have some unused headroom in 16 bits. This would seem more natural than thinking of the 12 + 4 solution.


It uses more memory.


Does this implementation allow duplicates? Typically a set is defined as an unordered collection with no duplicate items.


The maximum entropy is what's important to the encoding challenge, as I understood it.


Actually the log2(4!) needs to be smaller, because as yorwba below says, duplicates don't have 24 permutations.


His math is right. The factorial already took care of the fact that order does not matter.

If you want to optimize storage of duplicates, you still have to store the number of duplicate numbers, and you are back where you started.


The factorial is taking too much care of the fact that order does not matter. If I wanted to store four 1-bit numbers, ajkjk's math would suggest that I need log2(2^4/4!) = -0.58, i.e. a negative number of bits. You can't just divide by 4! because not all combinations of 4 numbers have 4! unique permutations.

> If you want to optimize storage of duplicates, you still have to store the number of duplicate numbers, and you are back where you started.

No, because you can shave off a few bits by ignoring the ordering of the non-duplicates.


You’re right, correction appreciated.


Is there a specific topic of study that taught you those formulas?


As cameldrv said, information theory. The late David MacKay has a good free book:

http://www.inference.org.uk/itprnn/book.html

The intro chapters cover the topic of measuring informational entropy.

Aside: One interesting application is to the problem of “use a balance scale only three times to determine which of twelve balls has the wrong weight (too heavy or light)”. You choose the weighings so as to maximize the entropy of the outcome, i.e. so there are many possible outcomes of equal probability.


Not the person you asked but

4! is the number of permutations of a list - it's commonly seen throughout CS theory and other forms of discrete math.

log_2(n), rounded up, is the number of bits needed to store an integer from 0 to n-1 - it's reasonably commonly used in CS theory education.


As others have mentioned, it comes up in information theory, or maybe in some subfields of combinatorics and probability. I learned it tangentially in various parts of physics.

But I might just recommend going to the source! which is Claude Shannon's seminal 1948 paper on coding: http://affect-reason-utility.com/1301/4/shannon1948.pdf . It's surprisingly readable and worth at least skimming if you like this stuff.


Try the combinatorics branch of mathematics.



Probability theory and information theory.


[deleted]


They're not ordered: that's the whole point.


My bad, I misread!


Well, the title literally says "a set of", not "a sequence of".


Yup so no ARGB5555 16 bit type :-)


But a 4-fragment MSAA type with 5 bits per fragment does fit in a 16-bit type.

Or MSAA4x ARGB5555 in 64-bit (instead of 80 bit)


Hmm, it's interesting to me that order would be expected. I nearly always think of storage and ordering as entirely separate problems.


Usually, when you store a value, you want to be able to get exactly that value back. If you store multiple values without keeping the ordering, you lose that capability. I.e. you can't replace all possible uses of four 5-bit variables with one 16-bit variable, only those where the variables are interchangeable.


Many common data structures fail to preserve order. Maps (aka dictionaries) based on hash tables come to mind.

Slightly adrift of the topic...

I have a HAMT (hash array mapped trie) implementation that uses a 32-bit unsigned int as a bitfield to indicate which of the 32 possible child nodes are populated. With this trick I could encode any node with 5 or fewer bits flagged with a 16-bit unsigned int instead.

I just checked and 1 million keys created roughly 1.3 million nodes (1 million keys, 300k internal nodes). Nearly 90% of the 300k internal nodes have 5 or fewer bits set. This trick would make the HAMT overhead nearly half the size.

Very useful!

Now, how about 6-bit values in one 32-bit value? I'd like to try this with a 64-bit HAMT. Can this be easily generalized?

Edit: Oops, that should be nodes with 4 or fewer bits set, since there are only four 5-bit values. Interestingly, that apparently doesn't make much of a difference for HAMTs: only about 2k internal nodes out of 300k have exactly 5 children. So the results remain the same.


The number of bags of k N-bit numbers is (2^N + k - 1) choose k:

https://en.wikipedia.org/wiki/Multiset#Counting_multisets

The number of bits required to uniquely represent such a bag is thus ⌈log2((2^N+k-1) choose k)⌉. For (N, k) we have:

(6, 6) -> ⌈log2(69 choose 6)⌉ = ⌈log2(119877472)⌉ = 27 bits

(6, 7) -> ⌈log2(70 choose 7)⌉ = ⌈log2(1198774720)⌉ = 31 bits

So you can store seven 6-bit integers without order in one int32. I don't know if it'll be slow, though. It should be possible to infer an encoding algorithm from the proof of the counting formula but it might be slow.
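
To make the generalization concrete, here's a quick sketch (my code, nothing official) that computes the bit cost in log space, so you can scan (N, k) pairs without overflowing:

  #include <cmath>
  #include <cstdio>

  // log2(C(2^N + k - 1, k)) via the product form prod_i (2^N + i) / (i + 1).
  double bits_for_bag(int N, int k) {
      double bits = 0.0;
      for (int i = 0; i < k; i++)
          bits += std::log2((std::pow(2.0, N) + i) / (i + 1));
      return bits;
  }

  int main() {
      for (int k = 1; k <= 8; k++)
          printf("N=6, k=%d -> %.2f bits\n", k, bits_for_bag(6, k));
      // k=6 -> 26.84, k=7 -> 30.16, k=8 -> 33.31:
      // seven unordered 6-bit values fit in 32 bits, eight don't.
  }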


My understanding is you can't tell order using this trick. I guess the bitmap is just an optimization to skip the hash lookup? Order doesn't matter for you?

Something I came up with is packing the pointers/values into a contiguous array and using the bitfield to tell not only if a value is present, but what offset it's at: (mask + popcnt) * sizeof(value). This eliminates wasted space due to unused buckets in the hash table. I'm sure it's been done before, but I haven't seen it anywhere.

How's your Go? Want a remote contract doing Go microservices that pays very well? Email is in my profile.


That's pretty much exactly how hash array mapped tries work.

Take the first 5 bits of a 32-bit hash and use them to set a flag in the bitfield; a populated flag indicates a populated child array element, and you find out which one using popcnt. When you find a conflict on insert, take a step down the tree and use the next 5 bits of the 32-bit hash; rinse and repeat.

A HAMT is the basis of a fully concurrent data structure for maps which can also do fast atomic snapshots called a ctrie.
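
In code, the child lookup is just a mask and a popcount. A minimal sketch (field and function names are mine, not from any particular HAMT library):

  #include <cstdint>
  #include <bit>   // std::popcount, C++20

  struct Node {
      uint32_t bitmap;  // bit i set => a child exists for 5-bit index i
      Node**   slots;   // compact array with one entry per set bit
  };

  Node* child(const Node& n, uint32_t hash, int depth) {
      uint32_t idx = (hash >> (5 * depth)) & 31;  // next 5 bits of the hash
      uint32_t bit = 1u << idx;
      if (!(n.bitmap & bit)) return nullptr;      // no child for this index
      // Number of set bits below ours = offset into the compact array.
      return n.slots[std::popcount(n.bitmap & (bit - 1))];
  }

The 16-bit trick discussed here would replace `bitmap` for the (very common) nodes with four or fewer children.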


I'll add HAMT and tries to my reading list, thanks.


I believe, based on the math above, that six 6-bit values fit easily (27 bits); it's seven that only barely fit in 32 bits.


Taking this to the extreme, you could "store" 65535 1-bit values in one 16-bit value. You just have to keep track of how many of them are equal to 1.
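
In code the entire "storage" is one counter; this sketch just makes the point explicit:

  #include <cstdint>

  struct BitBag {
      uint16_t ones = 0;              // the whole 16-bit value
      void add(bool b) { ones += b; } // up to 65535 ones
      // Reading the values back: 'ones' ones followed by zeros,
      // in whatever order you like, since order was never stored.
  };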


Thank you - this is a much better explanation of how this 'storage' works.


I don't think this is true for single bits this way.


For completeness: The number of possible sets that include 4 numbers from the range [0..15] is (16+4-1) choose 4, or 3876.

Let's assume that our set is (a, b, c, d). It is ordered, therefore a <= b <= c <= d. For example, a valid set is (2, 4, 4, 10).

Consider all possible numbers that may be in our set, laid out in order.

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

For each element in the set, we mark its position in this order by placing a marker to the left of the corresponding number.

For our example, we first mark element 2:

0 1 m 2 3 4 5 6 7 8 9 10 11 12 13 14 15

Then 4:

0 1 m 2 3 m 4 5 6 7 8 9 10 11 12 13 14 15

Then, 4 again:

0 1 m 2 3 m m 4 5 6 7 8 9 10 11 12 13 14 15

And finally, 10:

0 1 m 2 3 m m 4 5 6 7 8 9 m 10 11 12 13 14 15

Now, out of these 16+4 places, only 19 of them could be a valid placement for a marker, as a marker should always be placed to the left of a number (or said differently, "15" will always be at the end).

Therefore, there are 19 choose 4 ways to pick the locations of the markers.
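
The same marker construction can be turned directly into an encoder, with no lookup table. A sketch in C++ (mine, not the repo's code):

  #include <cstdint>
  #include <algorithm>

  uint64_t choose(uint64_t n, uint64_t k) {
      if (k > n) return 0;
      uint64_t r = 1;
      for (uint64_t i = 1; i <= k; i++) r = r * (n - k + i) / i;
      return r;
  }

  // Maps a multiset of four values from [0,15] to a rank in [0, 3875].
  uint16_t rank4(uint8_t v[4]) {
      std::sort(v, v + 4);
      uint64_t r = 0;
      for (int i = 0; i < 4; i++)
          // v[i] + i is strictly increasing in [0,18] (the marker shift);
          // ranking 4-subsets of a 19-element universe is the
          // combinatorial number system, with C(19,4) = 3876 codes.
          r += choose(v[i] + i, i + 1);
      return (uint16_t)r;
  }

For the example above, (2, 4, 4, 10) becomes the strictly increasing (2, 5, 6, 13) and gets rank C(2,1) + C(5,2) + C(6,3) + C(13,4) = 2 + 10 + 20 + 715 = 747.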


Thanks. That was really helpful.


That is one big-ass makefile though. I realize it has some extra niceties but I have to ask, did you ever try

    make main
without any makefile at all?

If you haven't done so, delete the makefile now (you have it in version control anyway) and give it a try.


Heh, that big ass makefile is the generic makefile [1] I came up with a few years ago. Definitely a bit of overkill for a single file project, but some of us are lazy when it comes to such things.

[1]: https://github.com/mbcrawfo/GenericMakefile


I can see why you gave up on handling spaces :-) I made one over a year ago that handles spaces correctly so that I could use it on Windows (MSYS2), and it was quite the 'fun' project... =P


GNU Make has a number of implicit rules. A blank target rule (or no target rule) in the Makefile (or no Makefile at all!) will cause `make` to try to create the specified target `foo` from `foo.c` or `foo.cpp` (or others) if they exist. More complete explanation here:

https://www.gnu.org/software/make/manual/html_node/Catalogue...

Note: I was initially annoyed with parent's "do this and see what happens" post and its lack of substantive communication, and responded poorly. This comment is substantially edited.


Your response of linking the GNU Make manual comes off as even more self-impressed and coy than the parent.

The parent suggests removing the makefile entirely and running `make main`, and observing the result. That would be a learning experience (and much better than your "RTFM")! But, if you don't want to humor that, the thing to say is: The result is that it works, and correctly builds the program, without a makefile at all.

The explanation is that GNU Make has a catalog of built-in rules (which you linked to); but it's quite a leap from knowing that fact, and even making use of them, to realizing that it means in some cases you don't need a makefile at all.


> The parent suggests removing the makefile entirely and running `make main`, and observing the result. That would be a learning experience! But, if you don't want to humor that, the thing to say is: The result is that it works, and correctly builds the program, without a makefile at all.

Exactly my intention. Thank you, I was worried after the first response from the other person that I'd worded myself too poorly and that more people would think I was being "self-impressed and coy" when that was not the intention at all.


> self-impressed and coy

That's an odd thing to read into my comment. I am merely passing on information that was at one point given to me when I myself had a Makefile that was actually not needed because the program it was for was a single source file project like OP.

Also, in what way is it easier to read a long document without a reference to a specific section than it is to just delete the makefile and run make with the stem of the source file's name?

Edit: My second paragraph of this comment was written prior to parent commenter editing their comment.


I appreciate that you edited your comment to make it more informative. Now if you could also remove the snark, it would be a great comment.


There is no snark. I think the GGP's post adds nothing of value, and is in fact rude. It's not snark to point that out, any more than your critique of my post is snark.

You may feel differently. That's fine. But differing opinions being downvoted to oblivion is ... problematic, at best.

EDIT: After reading your response below, you were right and I was wrong. Editing.


From the HN Guidelines:

When disagreeing, please reply to the argument instead of calling names. "That is idiotic; 1 + 1 is 2, not 3" can be shortened to "1 + 1 is 2, not 3."

You may think that encouraging experimentation without giving an explanation is rude, but to me it's just a different way of sharing knowledge.


Is there a way to make it work with just make? I'm way too used to just typing make at this point and it's not too much work to just copy the same makefile everywhere.


Use a phony target that does whatever you want. Mine typically invoke cc to build the program ("cc -O3 -Wall -o main main.c", something like that?), then run the program ("./main"), which prints its stuff to stdout and stderr. Makes it very easy to build+run from within Emacs, where the default compile command is "make".


  echo "all: main" > Makefile
  make


No need to create a Makefile.

  make main
is enough.


Indeed, but it doesn't quite answer how to make it work with _just_ `make`


It is possible to do this in O(1) memory, for arbitrarily sized collections, efficiently.

The first trick is to make a function that can calculate the nth set with k elements from some universe U (in this case U = {0..2^5-1}), without order (no duplicates), directly. This is done using the https://en.wikipedia.org/wiki/Combinatorial_number_system. This is very efficient.

Then, to encode duplicates you use the stars and bars trick.
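
A sketch of the decode direction (calculating the nth set directly), using the greedy unranking that the combinatorial number system admits; variable names are mine:

  #include <cstdint>
  #include <vector>

  uint64_t choose(uint64_t n, uint64_t k) {
      if (k > n) return 0;
      uint64_t r = 1;
      for (uint64_t i = 1; i <= k; i++) r = r * (n - k + i) / i;
      return r;
  }

  // Returns the r-th k-element subset of {0..n-1}: at each step,
  // greedily pick the largest e with C(e, i) <= the remaining rank.
  std::vector<uint32_t> unrank(uint64_t r, uint32_t k, uint32_t n) {
      std::vector<uint32_t> out(k);
      for (uint32_t i = k; i >= 1; i--) {
          uint32_t e = i - 1;
          while (e + 1 < n && choose(e + 1, i) <= r) e++;
          r -= choose(e, i);
          out[i - 1] = e;
          n = e;  // remaining elements must be strictly smaller
      }
      return out;
  }

To decode a multiset, unrank in the expanded stars-and-bars universe and then subtract i from the i-th smallest element.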


Can you elaborate on the second part? Are you saying you're going to expand the size of the universe by a factor of 2^(k-1) to use as locations of the "bars"?


Yes, the universe is expanded.


I really like this mechanism. It seems like it should be the optimal way to store a set of size k.

Out of curiosity, do you know of an efficient way to calculate this without sorting first? Or even better, without knowing k ahead of time? If you can sort your input first it's trivial, but even though it seems like a super efficient way to store a set of numbers I don't know how to use it to solve, for example, the other problem in these comments of sorting 1M 32-bit integers in 2MB of RAM (you can store them with this mechanism in about 1.7MB of RAM, but only if they were already sorted).


I don't know of a way of doing this without sorting or knowing k ahead of time, sorry.


I'm pretty sure that computing N choose k requires O(k log N) memory to store the result (for N >> k). Neat trick though.


I should've said O(small), I'm comparing to the solution in the OP, which computes all possible combinations in a lookup table.


This seems obviously wrong, though I admittedly don't know the stars and bars trick. Let's say you wanted to store a number a in [0,k), for arbitrarily large k. All you would need to do is encode your value as 0 repeated a times, and 1 repeated k-a times. Because you can store arbitrary integers below k with this method, you would need at least log(k) space. If k can be arbitrarily large, you would require unbounded space.


Arithmetic coding will work too, without any special trick to code duplicates.


Can you elaborate? I am not very familiar with arithmetic coding, but I thought it required arbitrary precision floating point numbers, so not O(1) memory.


You don't need arbitrary precision, just enough precision. Specifically, in this case you only need 16 bits. You can start with an upper and lower bound, and narrow it down for each number.

Here's an example with numbers dumped from a quick prototype encoder. The exact range numbers aren't that important, but note how the range gets smaller for every symbol we encode:

Say we want to encode the ordered numbers 10, 15, 31, 31.

We start with the full range [0, 65536)

There are C(32 - 10 + 2, 3) = 2024 combinations that start with 10, out of 52360 possible, so that narrows it down to around 4% of the full range; we'll use [49702, 52236)

For the next number we only allow numbers 10 and above; 15 accounts for 153 of the 2024 possibilities, giving the range [51022, 51214)

For the next number, 31 is only 1 of the 153 possibilities, so it gets a tiny range, [51212, 51214).

And now 31 is the next number in 1 out of 1 cases, so the range is again [51212, 51214)

We can code the sequence as either 51212 or 51213, since we used a range that is slightly larger than needed some combinations have multiple codes. We could have started with the smaller range [0, 52360) instead to get a bijection.
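
Here's a compact sketch of that encoder (my reconstruction of the prototype described above, not the poster's code); it reproduces the ranges in the example:

  #include <cstdint>
  #include <cstdio>

  uint64_t choose(uint64_t n, uint64_t k) {
      if (k > n) return 0;
      uint64_t r = 1;
      for (uint64_t i = 1; i <= k; i++) r = r * (n - k + i) / i;
      return r;
  }

  // Multisets of size r drawn from {m..31}: C((32 - m) + r - 1, r).
  uint64_t bags(uint64_t m, uint64_t r) { return choose(32 - m + r - 1, r); }

  int main() {
      uint8_t v[4] = {10, 15, 31, 31};   // must be sorted ascending
      uint64_t lo = 0, hi = 65536, m = 0;
      for (int i = 0; i < 4; i++) {
          uint64_t r = 4 - i;                            // values left to code
          uint64_t total = bags(m, r);                   // continuations from m
          uint64_t below = total - bags(v[i], r);        // those starting < v[i]
          uint64_t here  = bags(v[i], r) - bags(v[i] + 1, r); // exactly v[i]
          uint64_t width = hi - lo;
          hi = lo + width * (below + here) / total;
          lo = lo + width * below / total;
          m  = v[i];                                     // later values >= v[i]
          printf("[%llu, %llu)\n",
                 (unsigned long long)lo, (unsigned long long)hi);
      }
      // [49702, 52236) [51022, 51214) [51212, 51214) [51212, 51214)
  }

Any integer in the final range (51212 or 51213) decodes back by running the same narrowing and picking, at each step, the value whose sub-range contains the code.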


I don't believe that it does, at least not without some additional trick. How do you handle the case where you repeatedly encode the largest case? E.g. {31, 31, 31, 31}.


I wrote a blog post several years ago that's basically about this same effect, although from a different angle: https://hips.seas.harvard.edu/blog/2013/06/05/compressing-ge...


The fact that it's using lookup tables to store the mapping between the set of numbers and the encoded 16-bit number makes it less interesting; it basically just enumerates the set of all possible combinations of numbers, and uses the ordinal as the encoding. A directly calculated scheme without a lookup would be niftier, though I suspect that in practice, given the way it steals bits from the redundancy of duplicates, it wouldn't be pretty.


It does seem like it should be calculable without the lookup table. Something with modular math perhaps.


This reminds me of the following old puzzle.

Ten people are standing in a line. Each person is wearing a black or a white hat and can only see the colours of the hats of the people in front of them. The goal is to guess the colour of one’s own hat. First, the last person in the line shouts out their guess (‘black’ or ‘white’), then the person in front of them shouts out their guess, etc. No other form of communication is allowed (no clapping, no touching, no texting – and no tricks involving, e.g., encoding information in the delay between it being ‘your turn’ and actually shouting out your guess). The group ‘wins’ if at most one person guesses wrong.

The group is allowed to discuss a strategy before taking part in this puzzle (i.e., before being given their hats). Which strategy should they choose to maximise the chance of winning?

They’re only allowed one wrong guess, so this is basically trying to encode 9 bits of information in 1 bit. And here the ordering of the bits actually matters. Still, it’s possible to choose a strategy which gives the group 100% chance of winning! Good luck trying to solve this one. :)


Seems impossible to me.


The way huftis stated it, it is indeed impossible.

In the popular, solvable version of the puzzle, the group also gets the information if each guess was wrong or right. It's not at all about encoding "9 bits of information in 1 bit".


No, you don’t need information on whether each guess is right. (And, of course, if the group is playing it correctly, every guess except possibly the first one is correct.)


Damn, you are right! I wonder why the first two instances of this puzzle I googled included that info.

Still, it's not about encoding "9 bits of information in 1 bit".


You’re of course right about this not being encoding ‘9 bits of information in 1 bit’. That would be impossible; to encode 9 (arbitrary) bits of information, you need 9 bits.

On the other hand, you must encode information on the colour of 9 hats, and only the last person in the line is free to provide some information. He can’t possibly know the colour of his own hat, so he can only guess – or use his turn to provide 1 bit of information to the rest of the group (while the other people have to shout out their correct hat colour). The difficulty lies in figuring out why the puzzle as stated does not need you to encode 9 bits using 1 bit …


Try doing it with just two people. Then extend the concept somehow. All the details of the puzzle are important.


Well it’s easy to do with two people because the first person can just guess the color of the second, since we are allowed one wrong guess. I don’t see how this can be extended.


I assure you that it's possible. To give you a hint without giving everything away: the guess of the first person must be derived from the colours of all the other people's hats.

You could think of it as a puzzle in coding theory, and then it's not implausible: if you sum up the number of hats that each person sees and the number of guesses they hear, you'll note that everybody has 9 bits of information available. And they're supposed to make 9 correct guesses. It adds up.

Another hint would be to think of the case of 3 persons first. That can be done with patient case analysis, and it's likely to get you an idea for generalisation.


Think about it as if each guess preceding a person is a bit of information, the goal is to come up with a strategy for using that information (you control encoding as well as decoding) to correctly infer another bit.


You can extend to three people if the first "guess" indicates if nr 2 and 3 have the same color or not. But doesn't go further because after the first all guesses must be 100% correct and can't be used for communicating additional information.


It does extend. It's just an XOR/parity trick. The last person shouts out the XOR of all the previous colors (if you prefer: whether the number of visible white hats is even or odd). From there, every other subsequent person knows their color by XORing the hats they see together with the guesses made so far.

For example, the second to last person knows their hat is black (0) if the last person's guess matches the XOR of the hats they see - if it doesn't, their hat flipped it, and thus is white (1). And so on and so forth. Your hat color is everything you see and everything that was said XORed together.


That’s correct. Or, to explain in (slightly) more non-mathematical terms: The last person (‘person 10’) shouts out (e.g.) ‘white’ if the number of white hats in front of him is an odd number (1, 3, 5, 7, 9) and ‘black’ otherwise. Let’s say he shouted ‘white’. The person in front of him looks at the people in front him. If it’s still an odd number of white hats, his must obviously be black; otherwise it must be white.

If his hat is black, he shouts out ‘black’, and the person in front of him knows that the rest of the line (8 people) must still have an odd number of white hats, and he applies the exact same logic. But if the hat of person 9 was white, he would shout out ‘white’, and person 8 would know that the rest of the line (including himself) should now have an even number of white hats.

So, basically the rule is: Define ‘the rest of the line’ to mean the people who have not yet shouted out a colour. When the first person (person 10) shouts out ‘white’, this means that ‘the rest of the line’ has an odd number of white hats (parity: odd). Whenever someone shouts out ‘white’, the parity (even/odd) of ‘the rest of the line’ is flipped.

Using this, each person only has to count the number of white hats of the people in front of them and observe if it is even or odd. If it matches the parity of ‘the rest of the line’, the person’s hat is black, otherwise it’s white. (This also works for the person in front of the line (person 1). He sees no (or 0) white hats, i.e. he sees an even number of white hats.)
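
Since the strategy is easy to get subtly wrong, here's a small simulation sketch (my own C++, not anyone's posted code) that checks the "everything you see XOR everything you heard" rule over random trials:

  #include <cstdio>
  #include <cstdlib>

  int main() {
      const int n = 10;
      srand(42);
      for (int trial = 0; trial < 100000; trial++) {
          int hats[n], guesses[n];
          for (int i = 0; i < n; i++) hats[i] = rand() & 1;

          // Person n-1 guesses first: the XOR of all hats they see.
          int g = 0;
          for (int i = 0; i < n - 1; i++) g ^= hats[i];
          guesses[n - 1] = g;

          // Everyone else: XOR of hats in front and guesses heard so far.
          for (int i = n - 2; i >= 0; i--) {
              int x = 0;
              for (int j = 0; j < i; j++) x ^= hats[j];
              for (int j = i + 1; j < n; j++) x ^= guesses[j];
              guesses[i] = x;
          }

          int wrong = 0;
          for (int i = 0; i < n; i++) wrong += guesses[i] != hats[i];
          if (wrong > 1) { puts("strategy broken"); return 1; }
      }
      puts("at most one wrong guess in every trial");
  }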


The title is deceptive. It should state that the goal is to store a set of four 5-bit values.


This is quite related to the problem of sorting a million 32 bit integers using only 2M of RAM (and no disk). It can be done.


A list of deltas should do it. In the worst case, the deltas would use log2[2^32/1000000] * 1000000 bits, so about 1.5 MB. Plus some space because of base 128 encoding (it increases size up to 37/32, rounded up per byte).

I got a worst case of exactly 2 MB (1.907 MiB) (all deltas being 4294, so the list is 0, 4294, 8588...), but maybe it's possible to get better than that.

It would be uber slow though, probably n^2.


The idea is good, but base128 won't work. The worst case scenario is around 250k offsets of 2^14 (requiring 3 bytes each) and 750k offsets of 2^7 (requiring 2 bytes each). That's 2.25 MB


That's true. A 5-bit header before each number, specifying how many bits the number uses, would be enough.


How are you encoding the deltas exactly?


Say the first number is 100, then the following one is 300, then 1000. You'd encode [100, 200, 700].
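
A sketch of that scheme in C++ (sort, delta, then base-128 varint, as discussed upthread; this is my illustration, not the poster's code):

  #include <cstdint>
  #include <vector>
  #include <algorithm>

  std::vector<uint8_t> encode(std::vector<uint32_t> v) {
      std::sort(v.begin(), v.end());
      std::vector<uint8_t> out;
      uint32_t prev = 0;
      for (uint32_t x : v) {
          uint32_t d = x - prev;  // delta; the first delta is the value itself
          prev = x;
          // base-128: 7 payload bits per byte, high bit = "more bytes follow"
          while (d >= 128) { out.push_back(0x80 | (d & 0x7f)); d >>= 7; }
          out.push_back((uint8_t)d);
      }
      return out;
  }

With all deltas equal to 4294 (13 bits), each varint takes 2 bytes, giving the 2 MB worst case mentioned above; the mixed 2^14/2^7 distribution in the sibling reply pushes base-128 to 2.25 MB.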


So what is the trick?


It's the opening problem from Programming Pearls: http://www.fusu.us/2013/06/bitmap-sort.html

(edit: better link)


How is this relevant? A bitmap of size 2^32 is 512MB in size and you only have 2MB of memory.


The trick is that you only need to store the unordered list of seen integers as your state while you sort. This takes about 1.5MB or so. The reduction in space comes from the fact that you don't have to store the order (for example, you could store the numbers sorted and only record offsets. That saves space because the offsets will be smaller)


If I only have 8 bytes of ram and a program counter I can do it...

  for (uint64_t i = 0; i < (1ULL << 32); i++)
    for (uint32_t j = 0; j < 1000000; j++)
      if (input[j] == i) print(i);

Shoddy runtime, but hey...


Yes, this is a problem that requires careful specification. You only get streaming access to your input, not random.


Couldn't you just do that with radix sort or am I missing something?


Yes, you're missing something. Radix sort, like most sorting algorithms, requires storing the intermediate results in memory. This problem seems impossible at first blush because storing 1 million 4-byte integers would seem to require 4M of RAM, and only 2M is available.


But if I pick my buckets right, I can store only the lsb in each bucket.

Or, more efficiently, dynamically construct an implicit trie/heap type thing, where each leaf is the count of times that int appears (and the int itself is encoded in the location)

That feels really, really handwavy but promising?


You're sort of on the right track, but of course you have to encode those counts very carefully because 1M counts will exceed 2MB of memory unless you work hard.


Right, you can't store them as ints. 1 million bits is the max you need, but like this whole thing is just super tricky because you can't just address things normally anywhere, since that's too big.


So I'm allowed to read/write only once to the backing media?

You're right, that does sound impossible. :O


Only if all your ints are unique I guess?


No, that's not a requirement, as it turns out, though of course it would decrease the entropy even further.


Sure, just pick a sorting algorithm that uses constant memory and the optimal O(n lg n) time complexity:

https://en.wikipedia.org/wiki/Sorting_algorithm#Comparison_o...


These need random access to use constant memory; he was talking about streaming access


Yeah, I didn't pay attention that you can't really store 1 million 32 bit integers in 2MiB without encoding/compressing them somehow.


Conversely: the order for 5 4-bit values itself has 4-bits of information.


The order for five four bit values has about 6.91 bits of information. log_2(5!)=6.91

More relevant to the problem, the order for four five-bit values has about 4.59 bits of information (log_2(4!)=4.59). But not all of the sequences actually have four distinct numbers - many have duplicates, e.g. the list [1 1 2 3] contains the set {1 2 3} - and on average that gets the order information down to about 4.32 bits (20 - log2(52360)), which is small enough to make this example work.

At least that's how I understand this. I could be wrong.


Ahh, just yesterday in HN: Operation Gunman

http://www.cryptomuseum.com/covert/bugs/selectric/index.htm

Quotes:

"The data from the 6 magnetometers (i.e. 6 bits) was somehow digitally compressed into 4-bit words and then stored in a magnetic-core buffer that could hold up to 8 such 4-bit data words." "It is unknown why and how the data was compressed from 6 to 4 bits, and the NSA report is very vague on this point. It is possible that the Soviets used 4-bit logic and had to spread the 6-bit data over more than one 4-bit data word, but it is more likely that they used frequency analysis"

"According to the NSA report, the Russians compressed the 6-bit data into a 4-bit frequency select word. Although the report doesn't explain what they mean by this, we can make a few educated guesses. The reason for compressing it into 4-bits, was probably the fact that the Russians only had access to 4-bit digital technology at the time. The problem with 4 bits however, is that each data word has just 16 possible combinations"

...

Maybe a similar "hack"?


Which reminds me of this old programming joke:

While developing a CAD system we saved a lot of RAM compared to our competitor's product. They were storing each (X1,Y1)(X2,Y2) as four integers. We figured out how to do the same thing in just 16 bits.

Yes, you guessed it (wait for it....) we could fit an edge in word-wise!

(Disclaimer: I didn't say it was a GOOD joke!)


I had to add `#include <algorithm>` to get it to compile.


Why is this first line true: "This works because there are 3876 possible unique values for a set of 4 4 bit values"

Each 4 bit number has 16 possible values. And you can order them in the set in 4! = 24 ways. So I thought you would get 24 * 16 = 384. Can someone explain?


They're taking the multiset [0].

So it's (2^4 + 4 - 1) choose 4 = 3876

In other words, there are 3876 multisets of cardinality 4 with elements taken from the set containing all 4 bit values.

[0] https://en.wikipedia.org/wiki/Multiset


Can you explain the top number (2^4 + 4 - 1)? I understand the bottom number 4 is probably because you're choosing 4 values out of the set of numbers from the top, but not sure how you got 2^4 + 4 - 1.


I haven't learned about compression of data, but from what I know, one has to make some assumptions to store n bit data into m bit space, where m < n. Here, the assumption seems to be that numbers are necessarily 5 bit, and that there are only 4 numbers.

What I fail to understand is how this helps to compress the data in this case. The code isn't really that nice, and the comments seem to assume prior knowledge in this field, which I don't have.

So can someone give an ELI-Noob?


It’s not perfect compression; it’s lossy. Per the top comment, this method treats it as a set i.e. ignore the order. So you’re throwing out some information.

https://news.ycombinator.com/item?id=16249092


This is a great little lesson in information entropy. I work on similar problems quite often in my work, and understanding this was a pleasant epiphany.


Could one practical use of the general case be to store factorial-based numbers efficiently? https://en.wikipedia.org/wiki/Factorial_number_system


how are you getting 3876 unique values for a set of 4 4 bit values (16, 16, 16, 16) ?


It's not the number of sets with exactly 4 distinct elements (which would be 1820), it's the number of bags with exactly 4 elements (i.e. the multiplicity of duplicate elements matters).

https://en.wikipedia.org/wiki/Bag_(mathematics)#Counting_mul...


He doesn't care about the order so 1, 2, 3, 4 is stored the same as 4, 3, 2, 1. That's where the savings is coming from.


I really don’t understand why not caring about the order would save space. Could you explain in simple terms please?


If you care about the order then

1, 2, 3, 4 has to be stored differently than 4, 3, 2, 1, as well as 1, 3, 2, 4, and so on.

If you don't care about the order all of those can be stored in exactly the same way. Think of it like lossy compression: if you don't care about some of the detail, you can ignore it (the order of the numbers in this case) and save some space.


Consider the simplest case where your set members are just bits.

If you have three bits and you care about the order, that's a 3-bit number -- it can take 8 different values.

If you have those same bits but you don't care about the order, the operation is a "bit count" -- you can only have 4 different values.


I don't understand either...


(n + k - 1) choose k

n = number of unique 4-bit values

k = size of multiset

(2^4 + 4 - 1) choose 4 = 3876


Any reason to store the last 4 bits as-is, rather than just making a LUT out of the 4 sorted 5-bit values?


Can someone explain this in very simple terms without math? An ELI5 example. Thank you.


If you’re playing Yahtzee or cards it doesn’t matter what order you get the dice/cards. It just matters if you got them at all.

If you have two values, you can arrange them two ways. Three values you can arrange six ways. Four values you can arrange 24 ways. 0-23 takes a little over four bits to represent.

If you have a set, you don’t care about order. So if you save the data without an order to it, it takes a little less space to hold it. About four bits worth.

With five values you can save almost seven bits (120) and with six it’s over 9 bits (720).


What's the practical use for this?


Saving space in data structures that use buckets, such as hash tables. I actually discovered this in a paper on hash tables that kind of just mentioned it was a thing, but I had a harder time finding out exactly how it was done.


Perhaps you could mention the motivation for the trick in the readme? My initial reaction was "either this is an april-fools style joke, or it's a cute but pointless trick", and it wasn't til I got to the bottom of the HN comments that I found an explanation of why it would be worth knowing...


done :)


Any chance you can remember which paper?



De Bruijn sequences


It's pretty cool and all but what are the uses for this? What set of 4 unordered 5-bit numbers would I need to store that I couldn't just store as 3 bytes? I waste only 4 bits while preserving the order, if need be, of the values I'm storing. I can think of a very small few occasions where 4 bits would matter over order, but nothing realistic. Again, it's a cool trick and I'm not trying to be a dick about it. I like cool little tricks like this even if there's no purpose. I'm honestly curious about some realistic use cases for this or some variation of this with larger numbers.


It's 4 5-bit values, not 5 4-bit values. 5 bits is enough to index any single bit in a 32-bit bitfield. This means any application that sets 4 or fewer flags out of a possible 32 need only use a 16-bit encoding.

As I said in another comment I just tested this with a 32-bit hash array mapped trie, and it would result in a significant reduction in the storage overhead of the trie (which is already more efficient than typical hash tables for map like data structures).



