Hacker News

Thanks for your question!

The main value proposition is raw performance, which translates into hardware efficiency and, ultimately, lower costs at scale.

To get a rough idea of how we compare, you could try running `select cab_type, count() from trips` and `select passenger_count, avg(total_amount) from trips`. They are equivalent to queries (1) and (2) of this benchmark [1]. In that benchmark, BigQuery took 2 seconds for each query.
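For anyone who wants to poke at these locally, here's a minimal sketch of the two benchmark queries against an in-memory SQLite stand-in for the `trips` table (the real dataset is 1.6 billion rows; the three-column schema and sample rows here are made up, and the full benchmark queries include a GROUP BY that the shorthand above omits):

```python
# Toy stand-in for the 'trips' table: schema and rows are invented
# just to show the shape of benchmark queries (1) and (2).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trips (cab_type TEXT, passenger_count INT, total_amount REAL)"
)
conn.executemany(
    "INSERT INTO trips VALUES (?, ?, ?)",
    [("yellow", 1, 9.5), ("yellow", 2, 14.0), ("green", 1, 7.25)],
)

# Query (1): trip count per cab type
q1 = sorted(conn.execute(
    "SELECT cab_type, count(*) FROM trips GROUP BY cab_type"
).fetchall())
print(q1)  # [('green', 1), ('yellow', 2)]

# Query (2): average fare by passenger count
q2 = sorted(conn.execute(
    "SELECT passenger_count, avg(total_amount) FROM trips GROUP BY passenger_count"
).fetchall())
print(q2)  # [(1, 8.375), (2, 14.0)]
```

Both are simple full-scan aggregations, which is why they make a decent single-machine throughput test.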

Our server runs both in hundreds of milliseconds, and that's actually on the slow side, because it's the same server currently being hammered by HN users. We're also scanning more rows: our dataset contains 1.6 billion rows, while the linked benchmark runs on 1.1 billion. Lastly, we use only a single 24-core CPU in one server, while the top entries in that benchmark are clusters or GPU systems.
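As a rough back-of-envelope on what "hundreds of milliseconds" over 1.6 billion rows implies (the ~300 ms figure below is my assumption, not a measured number):

```python
# Back-of-envelope scan throughput, assuming a hypothetical ~300 ms
# query time over the full 1.6B-row dataset.
rows = 1_600_000_000
seconds = 0.3  # assumed query time
throughput = rows / seconds
print(f"{throughput / 1e9:.1f} billion rows/s")  # 5.3 billion rows/s
```

That order of magnitude (billions of rows per second on one box) is what a columnar scan on a modern multi-core CPU can plausibly sustain for simple aggregates.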

Of course, this is only an approximation, but I hope we get the chance to be featured in this benchmark when we are ready.

[1] https://tech.marksblogg.com/benchmarks.html




Is your data indexed or clustered? Redshift queries on sorted data are just as fast, if not faster.




