
The startup time of an HDInsight cluster is quite long; it's not really suited for ad hoc clusters (which are really easy to spin up quickly in Databricks).



It takes around 20 minutes to spin up an HDInsight cluster. So it is not instant, but I wonder if Databricks on Azure can do any better, given that they are running in customer environments and so won't have a big pool of preallocated multitenant machines available.


AFAIK it is not multitenant, you get your own machines.

They spin up the VMs with either open source Spark or the Databricks runtime - you get to choose the distribution and version before spinning up the cluster.

If you have enough workloads you can run your own pool of VMs to provide a 'serverless' experience to your Spark users: https://databricks.com/blog/2017/06/07/databricks-serverless...


Agree. If you have batch jobs for Spark, Azure Batch Service may not be a bad solution either. HDInsight includes too much other complexity if all you wanted was Spark. Databricks will be interesting, as they can take away even the mysticism of touching Azure beyond initially provisioning them some rights.



