
Here is the thing with the lakehouse, though: you get flexibility and don't need multiple engines to achieve the lakehouse vision. Databricks has all the security features Redshift/Snowflake do, so you can secure databases and tables rather than S3 buckets. It does get more complex if you introduce multiple engines, but at least you have the option to make that trade-off if you want to.
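
For what it's worth, a rough sketch of what that looks like in practice (table and principal names are made up, and this assumes table access control is enabled on the workspace):

    # `spark` is the SparkSession, predefined in a Databricks notebook.
    # Grants live on catalog objects, not on the underlying S3 paths.
    spark.sql("GRANT SELECT ON TABLE sales.orders TO `analysts@example.com`")
    spark.sql("REVOKE SELECT ON TABLE sales.orders FROM `interns@example.com`")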

If you want simplicity, you can limit your engine to Databricks. You can also connect other tools to Databricks over JDBC/ODBC if they don't support the Delta/Parquet formats, but piping data over JDBC/ODBC doesn't scale to large datasets with any tool. Databricks has all the capabilities of BigQuery/Snowflake/Redshift, but none of those tools supports Python/R/Scala; their engines would need to be rewritten from the ground up to do so.
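
Roughly, the difference looks like this (connection details are placeholders):

    # `spark` is the SparkSession, predefined in a Databricks notebook.

    # JDBC: every row streams through a single driver connection.
    df_jdbc = (spark.read.format("jdbc")
        .option("url", "jdbc:...")  # placeholder connection string
        .option("dbtable", "sales.orders")
        .load())

    # Delta: executors read the underlying Parquet files in parallel.
    df_delta = spark.read.format("delta").load("s3://my-bucket/sales/orders")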

But you do still have to secure the S3 buckets, right? And, I guess, also secure the infrastructure you have to deploy in order to run Databricks, plus configure cross-AZ failover, etc. So you get flexibility, but I would think at the cost of much more human labor to get it up and running.
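
E.g. even with table ACLs in the engine, you'd still want a bucket policy so nothing reaches the Delta files directly. A sketch only (bucket, account, and role names are made up):

    import json
    import boto3

    # Deny all S3 access to the lakehouse bucket except from the
    # (hypothetical) Databricks IAM roles.
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "OnlyDatabricksRoles",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:::my-lakehouse-bucket",
                "arn:aws:s3:::my-lakehouse-bucket/*",
            ],
            "Condition": {"StringNotLike": {
                "aws:PrincipalArn": "arn:aws:iam::123456789012:role/databricks-*"
            }},
        }],
    }

    boto3.client("s3").put_bucket_policy(
        Bucket="my-lakehouse-bucket", Policy=json.dumps(policy))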

Snowflake uses the Arrow data format in its drivers, so it's plenty fast when retrieving data in general. But it would be way less efficient if a data scientist just does a SELECT * to pull an entire table back into a notebook.
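
The Python connector can hand Arrow result sets straight to pandas, e.g. (credentials are placeholders):

    import snowflake.connector

    conn = snowflake.connector.connect(
        account="my_account", user="me", password="...", warehouse="my_wh")
    cur = conn.cursor()
    cur.execute("SELECT * FROM sales.orders LIMIT 100000")
    df = cur.fetch_pandas_all()  # Arrow-backed transfer into a DataFrame

Fast format or not, a SELECT * on a big table still ships every row over the network, which is the part that doesn't scale.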

Snowflake has had Scala support since earlier in the year, along with Java UDFs, and it also just announced Python support - not a Python connector, but executing Python code directly on the Snowflake platform. Not GA yet, though.
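
From the announcement, it looks roughly like registering a UDF that runs on Snowflake's compute instead of client-side. A sketch only, since it isn't GA and the API may change; this assumes an active Snowpark session:

    # Approximate Snowpark-style Python UDF; executes on the warehouse.
    from snowflake.snowpark.functions import udf
    from snowflake.snowpark.types import FloatType

    @udf(name="to_fahrenheit", return_type=FloatType(),
         input_types=[FloatType()])
    def to_fahrenheit(celsius: float) -> float:
        return celsius * 9 / 5 + 32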
