
I too believe it is the best fit, but the "aggregate functions" problem trips up most people. Counter columns are very limiting, and many engineers don't want to struggle with storing state during streaming writes in order to precalculate the aggregates on write. Engineers also tend not to want to do large reads to rebuild large aggregate values after small data changes.
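To make the "precalculate on write" idea concrete, here is a minimal sketch in Python. It assumes hourly buckets and keeps state in a plain in-memory dict as a stand-in for whatever a streaming job would hold; the names `Rollup`, `bucket_start`, and `record` are made up for illustration, and nothing here touches Cassandra itself.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Rollup:
    """Running aggregate for one (metric, time bucket) pair."""
    count: int = 0
    total: float = 0.0
    minimum: float = float("inf")
    maximum: float = float("-inf")

    def add(self, value: float) -> None:
        # Update the aggregate incrementally instead of re-reading raw points.
        self.count += 1
        self.total += value
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    @property
    def mean(self):
        return self.total / self.count if self.count else None


def bucket_start(ts: int, width: int = 3600) -> int:
    """Round a Unix timestamp down to the start of its bucket (hourly by default)."""
    return ts - ts % width


# Hypothetical in-memory state; a real job would checkpoint or persist this.
rollups = defaultdict(Rollup)


def record(metric: str, ts: int, value: float):
    """Fold one data point into its bucket's rollup and return the bucket key."""
    key = (metric, bucket_start(ts))
    rollups[key].add(value)
    return key


record("cpu", 3605, 2.0)
record("cpu", 3700, 4.0)
record("cpu", 7200, 10.0)
```

The trade-off is the one described above: each write only touches its own bucket, but the job has to carry that state around, and a late correction to a raw point means the bucket's aggregate has to be rebuilt.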

It is what we use, and we use Spark Streaming for the rollups. We also evaluated Influx, OpenTSDB, and Druid. So long as you know the exact read patterns for your client, I think Cassandra is definitely the best fit for most things.




Did you try out the Cassandra-backed KairosDB? I'm very interested in the results if you did.


I did not. The Cassandra schema appears similar. We had enough custom needs for how we aggregated data (e.g. EWMA) that we probably needed to do this ourselves anyway.
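For anyone unfamiliar with EWMA (exponentially weighted moving average), a rough sketch of a streaming version in Python follows. The `Ewma` class and the `alpha=0.5` smoothing factor are assumptions picked for illustration, not the aggregation code described in this thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Ewma:
    """Streaming EWMA: only the previous average is kept as state."""
    alpha: float                   # smoothing factor in (0, 1]; higher weights recent points more
    value: Optional[float] = None  # current average, None until the first point arrives

    def update(self, x: float) -> float:
        if self.value is None:
            # Seed the average with the first observation.
            self.value = x
        else:
            self.value = self.alpha * x + (1 - self.alpha) * self.value
        return self.value


ewma = Ewma(alpha=0.5)
results = [ewma.update(x) for x in [10.0, 20.0, 30.0]]
print(results)  # [10.0, 15.0, 22.5]
```

Since each update depends on the previous average, the result is order-dependent, which is one reason an aggregate like this is awkward to express with generic built-in rollups.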



