HLL eliminates read before writes in many cases and that's great. Would love to s...

antirez · on April 22, 2014

In eventually consistent systems like Cassandra, HLLs have the ideal merge semantics too (very similar to union of a Set).
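To make the merge semantics concrete, here is a minimal Python sketch. It assumes a toy HLL with 16 registers and a SHA-1 based hash; the names `new_sketch`, `add`, `merge` and `sketch_of` are made up for illustration and are not Cassandra's or Redis's API. Merging is just a register-wise max, so it is commutative and idempotent, like a set union.

```python
import hashlib

P = 4        # index bits (assumption: tiny sketch, for illustration only)
M = 1 << P   # number of registers


def new_sketch():
    return [0] * M


def add(sketch, item):
    # Take a 64-bit hash: the top P bits pick a register, and the rest
    # gives the rank (position of the first 1 bit, counted from the left).
    h = int.from_bytes(hashlib.sha1(str(item).encode()).digest()[:8], "big")
    idx = h >> (64 - P)
    rest = h & ((1 << (64 - P)) - 1)
    rank = (64 - P) - rest.bit_length() + 1
    sketch[idx] = max(sketch[idx], rank)


def merge(a, b):
    # Register-wise max: order and repetition of merges do not matter.
    return [max(x, y) for x, y in zip(a, b)]


def sketch_of(items):
    s = new_sketch()
    for item in items:
        add(s, item)
    return s
```

With this, two replicas that saw different writes can exchange sketches in any order and end up with the same registers as a single node that saw every write.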

ddorian43 · on April 22, 2014

If by pg you mean PostgreSQL, it is available as an extension.

aktau · on April 22, 2014

I have a vague notion of what you mean but could you annotate that with an example or something, please? I'd like to make sure :).

ddorian43 · on April 22, 2014

What he means is something like the non-reading increments in Hypertable.

How they function:

You write a+=1

  if it doesn't exist in memory:
    a=1
    append +1 to commit-log
  else:
    a+=1 (in memory)
    append +1 to commit-log

After some time, 'a' is written to disk and the commit-log is checkpointed (so if a server crashes it doesn't have to read a very large commit log), and 'a' becomes immutable.

But now you have to increment the 'a' key again, and the copy on disk is immutable. So you create a new 'a' in memory, the same way as before.

And repeat. After some time this one is also persisted to disk and the commit log is checkpointed.

Now you want to read the value of 'a':

If a merger has run, it has already read the different versions of the data on disk and merged them: the counters are merged and written as one key. So the read just fetches that single 'a'.

If the merger has not run, it reads both versions of 'a', merges them in memory, and returns the value.
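A rough Python sketch of the whole flow, in case it helps. This is a toy in-memory model and not Hypertable's actual code; `CounterStore`, `flush` and `compact` are names I made up, and the "disk" is just a list of frozen dicts.

```python
class CounterStore:
    """Toy model of non-reading increments (illustrative only)."""

    def __init__(self):
        self.memtable = {}     # mutable in-memory deltas
        self.commit_log = []   # one entry appended per write
        self.segments = []     # immutable "on-disk" versions, oldest first

    def increment(self, key, delta=1):
        # No read of the stored value: just bump the in-memory delta and log it.
        self.memtable[key] = self.memtable.get(key, 0) + delta
        self.commit_log.append((key, delta))

    def flush(self):
        # Persist the in-memory state as a new immutable version and
        # checkpoint the commit log so recovery does not replay old entries.
        if self.memtable:
            self.segments.append(dict(self.memtable))
        self.memtable = {}
        self.commit_log = []

    def compact(self):
        # The "merger": fold all on-disk versions into one key per counter.
        merged = {}
        for segment in self.segments:
            for key, delta in segment.items():
                merged[key] = merged.get(key, 0) + delta
        self.segments = [merged] if merged else []

    def read(self, key):
        # Merge whatever versions exist at read time (memory plus disk).
        total = self.memtable.get(key, 0)
        for segment in self.segments:
            total += segment.get(key, 0)
        return total
```

The read returns the same value whether or not `compact()` has run; compaction only changes how many versions have to be merged on the way.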

Now change '+1' to add_to_set(5). This is even better: the add updates the in-memory value, and if the HLL doesn't change because '5' was already added to the set, it doesn't even have to write to the commit log, because no change was made.
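Something like this, as a minimal Python sketch. It assumes a toy 16-register HLL hashed with SHA-1, and `HllKey` and `register_update` are hypothetical names, not a real database API. The point is only that an add which leaves the registers untouched appends nothing to the log.

```python
import hashlib

P = 4        # index bits (assumption: tiny sketch, for illustration only)
M = 1 << P   # number of registers


def register_update(item):
    # Map an item to (register index, rank) from a 64-bit hash.
    h = int.from_bytes(hashlib.sha1(str(item).encode()).digest()[:8], "big")
    idx = h >> (64 - P)
    rest = h & ((1 << (64 - P)) - 1)
    rank = (64 - P) - rest.bit_length() + 1
    return idx, rank


class HllKey:
    def __init__(self):
        self.registers = [0] * M
        self.commit_log = []

    def add(self, item):
        idx, rank = register_update(item)
        if rank <= self.registers[idx]:
            return False   # registers unchanged: skip the commit log
        self.registers[idx] = rank
        self.commit_log.append((idx, rank))
        return True
```

Adding the same element twice logs once: the second `add(5)` returns False and the commit log stays the same length.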

aktau · on April 22, 2014

Thanks!