Two different things. This reduces the size of the HyperLogLog; the other improves ...

jbapple · on June 26, 2017

HLL is a space/accuracy tradeoff. If you trim HLL to use the same space as if you were using HLL+TailCut but use the Ertl method for estimating cardinalities, which is more accurate: Ertl with a smaller `m` or HLL+TailCut with a larger `m`?

seiflotfy · on June 26, 2017

I think the latter. I will implement this next week. Nice call :D

jbapple · on June 26, 2017

I suspect you are right that HLL+TC will be more accurate. Based on Figure 5 vs. Figure 1 in the Ertl paper, I think Ertl's method improves HLL estimation for low cardinalities and for very high cardinalities, but not for the region where HLL is already at its most accurate.
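
To put rough numbers on the equal-space comparison, here is a minimal back-of-the-envelope sketch. It is not taken from either paper's code. It assumes 6 bits per register for plain HLL and 4 bits per register for HLL+TailCut, assumes `m` must be a power of two, and applies the textbook `1.04 / sqrt(m)` standard-error approximation to both variants, which ignores any extra error from tail-cut overflow and any gains from a better estimator. The function names and the 8 KiB budget are hypothetical, chosen only for illustration.

```python
import math

# Assumed register widths (not stated in this thread):
HLL_BITS_PER_REGISTER = 6      # common choice for plain HLL
TAILCUT_BITS_PER_REGISTER = 4  # assumed width for HLL+TailCut


def registers_for_budget(budget_bits: int, bits_per_register: int) -> int:
    """Largest power-of-two register count m that fits in budget_bits."""
    max_registers = budget_bits // bits_per_register
    if max_registers < 1:
        raise ValueError("budget too small for a single register")
    # Round down to a power of two, since m = 2**p in typical HLL layouts.
    return 1 << (max_registers.bit_length() - 1)


def hll_standard_error(m: int) -> float:
    """Textbook approximation of HLL relative standard error for m registers."""
    return 1.04 / math.sqrt(m)


if __name__ == "__main__":
    budget_bits = 8 * 1024 * 8  # hypothetical 8 KiB budget
    for name, bits in [("HLL", HLL_BITS_PER_REGISTER),
                       ("HLL+TailCut", TAILCUT_BITS_PER_REGISTER)]:
        m = registers_for_budget(budget_bits, bits)
        print(f"{name}: m={m}, approx. std. error={hll_standard_error(m):.4%}")
```

Under these assumptions the narrower registers buy a larger `m` for the same memory, which is where the mid-range accuracy advantage would come from. Note that the power-of-two rounding matters: for some budgets both variants end up with the same `m`, and the advantage disappears.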