
Out of curiosity, why weren't products like Druid (http://druid.io/), InfluxDB (https://influxdb.com/), or possibly OpenTSDB taken into consideration?



To be totally honest, there are so many technologies out there that claim to solve analytics that it's tough to seriously consider all of them.

That said, we have looked at Druid, which is also a good example of using lambda architecture in practice (http://druid.io/docs/0.8.0/design/design.html -- note the historical vs realtime distinction). They use many of the same design principles as us, and one of our sub-systems is very similar to it. We still believe the pre-aggregation approach is critical for performance in our use case, though. Lastly, when we started building the architecture (mid-2014), Druid was very new, and I'm generally wary of designing everything around a new and potentially unstable piece of software.


Druid does pre-aggregation (roll-up) of data at ingestion time and is also used at scale (30+ trillion events, ingesting 1M+ events/s) by numerous large technology companies: http://druid.io/druid-powered.html
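For anyone unfamiliar with the term, here is a minimal Python sketch of what roll-up at ingestion time means in general: raw events that share a truncated timestamp and the same dimension values are collapsed into one pre-aggregated row. This is only an illustration of the idea, not Druid's implementation or API; the function names, the hourly granularity, and the event fields (`country`, `latency_ms`) are all assumptions made up for the example.

```python
from collections import defaultdict
from datetime import datetime


def truncate_to_hour(ts: datetime) -> datetime:
    """Drop minutes, seconds, and microseconds (hypothetical hourly granularity)."""
    return ts.replace(minute=0, second=0, microsecond=0)


def roll_up(events, dimensions, metric):
    """Collapse raw events into one row per (hour, dimension values).

    Each output row keeps an event count and the sum of the metric,
    so the raw events no longer need to be stored or scanned.
    """
    totals = defaultdict(lambda: {"count": 0, "sum": 0})
    for event in events:
        key = (truncate_to_hour(event["timestamp"]),) + tuple(
            event[d] for d in dimensions
        )
        totals[key]["count"] += 1
        totals[key]["sum"] += event[metric]
    return dict(totals)


# Hypothetical raw events, invented for illustration only.
events = [
    {"timestamp": datetime(2015, 8, 1, 12, 1), "country": "US", "latency_ms": 10},
    {"timestamp": datetime(2015, 8, 1, 12, 30), "country": "US", "latency_ms": 20},
    {"timestamp": datetime(2015, 8, 1, 12, 45), "country": "DE", "latency_ms": 5},
    {"timestamp": datetime(2015, 8, 1, 13, 5), "country": "US", "latency_ms": 7},
]

rolled = roll_up(events, dimensions=["country"], metric="latency_ms")
for key, row in sorted(rolled.items()):
    print(key, row)
```

Here the four raw events collapse into three rows, because the two US events in the 12:00 hour share a key. The trade-off is that queries can only be as fine-grained as the chosen time bucket and dimensions.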


Note that the commenter mentioned mid-2014. That page first appeared on (or about) July 29th, 2014[1], and at that time only contained 4 names:

Metamarkets

Netflix

LiquidM

N3twork

So while Druid may be in use by "numerous large technology companies" today, at the time the commenter was researching it, the page wasn't showcasing as many large companies.

[1] https://web.archive.org/web/20140729014707/http://druid.io/d...


Hey Fan, I know you feel very strongly about Druid, but at the time it wasn't the way to go; I can see how they might have opted to steer clear.


Just to note, with Druid you can have pre-aggregated tables based on dimensions. Overall, good article, and thanks for sharing.



