
Sai from ClickHouse here. Adding to the above, we just released a blog post with JOIN benchmarks of ClickHouse against Snowflake and Databricks, following the recent enhancements to the ClickHouse core: https://clickhouse.com/blog/join-me-if-you-can-clickhouse-vs.... The benchmarks cover two dimensions: speed and cost.


This is really encouraging! I commented elsewhere in the thread, but this was one of the main rough edges I ran into when experimenting with ClickHouse, and the changes in the PR and in the recent video about join improvements (https://www.youtube.com/watch?v=gd3OyQzB_Fc&t=137s) seem to address some of those problems. I'm curious whether the "condition pushdown" mentioned in the video will make it so that "a.foo_id = 3 AND b.foo_id = a.foo_id" no longer needs "b.foo_id = 3" added manually for optimal speed.
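For concreteness, the pattern I mean looks roughly like this (table and column names are placeholders from my experiments, not from the PR):

    -- What I'd like to write:
    SELECT count()
    FROM a
    JOIN b ON b.foo_id = a.foo_id
    WHERE a.foo_id = 3;

    -- What I currently write for optimal speed: repeat the constant
    -- predicate on the other side of the join so the filter can be
    -- applied before the join rather than after it.
    SELECT count()
    FROM a
    JOIN b ON b.foo_id = a.foo_id
    WHERE a.foo_id = 3
      AND b.foo_id = 3;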

I also share nrjames's curiosity about whether the spill-to-disk situation has improved. Not having to even think about whether a join fits in memory would be a game changer.
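For reference, the workarounds I've reached for so far are switching the join algorithm and capping join memory, roughly like this (setting names from memory, so double-check the docs before relying on them):

    -- The default hash join is fully in-memory. These alternatives
    -- trade some speed for bounded memory and can use disk:
    SET join_algorithm = 'grace_hash';     -- partitions the right table into buckets, spilling to disk
    -- or: SET join_algorithm = 'partial_merge';  -- sort-merge variant with lower memory use

    -- Hard cap on join memory; 'throw' errors out instead of OOM-ing the server.
    SET max_bytes_in_join = 10000000000;
    SET join_overflow_mode = 'throw';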


Will ClickHouse spill to disk yet when joins are too large for memory?



