Snowflake Computing raises $263M for its data warehouse (venturebeat.com)
58 points by jonbaer on Jan 29, 2018 | 32 comments



> this latest round brings Snowflake’s total funding to $473 million

Absurd. When I see deals this large, especially in enterprise software, it raises alarm bells. It seems likely there are at least a few “quid pro quos” involved here. When your customers and investors overlap and run in the same circles, it’s hard to avoid the nepotism inherent in that world. The board members, investors, and customers are probably all executives at the same 50-100 companies. They have the power to purchase the product (demonstrating growth), invest in the product (demonstrating increasing valuation), and then acquire the product at a multiple of what they invested in it for.

I also wonder how much the founders are controlling, or if this is a case like Box where the founder owns <10% and the rest of the huge valuation is spread across dozens of different investors.


Actually, it is quite the opposite. Once an enterprise software company has crossed the $40m revenue threshold, has low churn, and growth is >2x, you got yourselves a winner.

Now Snowflake has grown 3x over consecutive years, and with all these big enterprises coming in, there has gotta be a true advantage.

Now the question is whether this advantage can survive an upgrade to, say, Amazon Redshift's capabilities that brings the latter close to feature parity.


While there certainly is some overlap, the investors in this round were venture capital most likely funded by institutional capital such as pension funds and endowments, rather than corporations who might be clients: see http://www.theequitykicker.com/2010/12/01/where-do-vcs-get-t... . And as of earlier this year they counted 450 unique customers: https://techcrunch.com/2017/04/05/snowflake-rakes-in-100-mil... So the extent of self-dealing is limited.

IMO, every company in every industry will need data warehousing at some point in their growth, if they are to survive in the modern world. It's difficult to get right, and Redshift is far from a white-glove experience for those attempting to navigate the space. Compared to B2C deals of this magnitude, this is significantly more sane.


That was my first thought on reading that article too. I'm suspicious of this submission to HN, actually; usually, tech news that's submitted here goes into detail on the algorithms and "how" things work. This article reads like a press release watered down for investors.


The question for me is, does Amazon buy Snowflake or beat Snowflake?

Snowflake is built on AWS, you can be sure Amazon has been watching them for years.

I’ll put my money on AWS having a strong Snowflake competitor by 2020. Honestly, Athena/Redshift Spectrum are getting somewhat close.


I'm astounded these days when I pull up an overview list of all of the AWS products. They currently list 142 products (several should be excluded). It seems like that has tripled in maybe four years. One of the nice things about AWS trying to do everything is that it's guaranteed they're going to do a lot of things at a mediocre level. As they aggressively expand that product list, the number of things they suck at will increase.


> it's guaranteed they're going to do a lot of things at a mediocre level

Why?

If it were a smaller company, then spreading resources out across more and more products could result in a drop in quality, but this is Amazon. They have practically unlimited resources. They can afford to put far more into products than other companies, and they've already got a team who can build some really good stuff. If they can capture what works and use it to understand how to build their products, that should result in better products rather than worse.


By that logic, all big companies must be the best because they have huge resources.

Scale means you can throw lots of people at something, it doesn't mean you can make it good.

Bureaucracy, internal politics, budget/cost cutting, unpleasant work environment, short-term stock price targets instead of long-term goal based targets... There are a million and one reasons why big companies start to suck.


I agree, so many of the AWS products feel almost abandoned. Like they did the easy 80% of the work, released, and then coasted.


I don't see why that's necessarily the case. One could argue that as they grow, their software development methodologies also advance as an organization and they'll create better products. Also, more competition can spur either copying or innovation.


Snowflake isn't just reselling storage and compute though, they're building some pretty impressive software.

One of the standout features when I used them a few years ago was their ability to do analytic queries on JSON columns in huge databases. Maybe Postgres is up to the task on its own now, but MongoDB at the time was not.
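To make "analytic queries on JSON columns" concrete, here is a minimal sketch in plain Python. It is not Snowflake's syntax or engine; the rows, field names, and the `sum_by` helper are all made up for illustration. The idea is only that each row carries a JSON document and the query groups and aggregates on fields inside it.

```python
import json
from collections import defaultdict

# Hypothetical rows: (id, JSON payload stored as text), standing in for a JSON column.
rows = [
    (1, '{"country": "US", "amount": 20}'),
    (2, '{"country": "DE", "amount": 5}'),
    (3, '{"country": "US", "amount": 7}'),
]

def sum_by(rows, group_key, value_key):
    """Parse each JSON payload and sum value_key per distinct group_key."""
    totals = defaultdict(float)
    for _row_id, payload in rows:
        doc = json.loads(payload)
        totals[doc[group_key]] += doc[value_key]
    return dict(totals)

# Roughly "SELECT country, SUM(amount) ... GROUP BY country" over the JSON field.
totals = sum_by(rows, "country", "amount")
print(totals)
```

A warehouse does the same thing at scale without you having to flatten the documents into columns first, which is presumably what made it stand out.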


Yeah, we moved from Mongo -> Postgres -> Snowflake (skipped redis due to connection limits). Really great performance and support from various ETL & Reporting tools. Glad to see them succeed.


I hadn't heard of this company so I looked on their website for more information about what they do. I came across this demo video that's a bit under 5 minutes that was pretty helpful:

https://www.youtube.com/watch?v=dUL8GO4ZK9s&feature=youtu.be


Here's a helpful link to understand their success: https://blog.fivetran.com/warehouse-benchmark-dce9f4c529c1

Essentially, their product is (at least) on par with major competitors like Google and AWS, no small feat for a young company.


Amazon Redshift has some architectural problems in an enterprise environment that Snowflake solves. Redshift does not scale up well - everything has to fit on one cluster and eventually you hit limits.

When a Redshift cluster gets too big:
1. It takes forever to take backups, create read-replicas, or modify anything.
2. Redshift clusters can still only handle X concurrent queries, no matter how big the cluster is.
3. Redshift is also dragging its feet on adding new features, e.g. Active Directory support.

Snowflake clusters allow a theoretically unlimited number of users to query the same tables in a database - so it scales up forever.

And each team can create their own "compute cluster", which allows teams to track and divide costs much more easily than trying to associate this-batch-user on Redshift with this-team/line-of-business.

Snowflake is well-designed for an enterprise environment where lines of business need to share/access each other's data but still have separate infrastructure. Redshift does not make it easy to go between Redshift clusters.

Also, Snowflake as a company is still hungry and is willing to add features if you pay enough.

EDIT: However Snowflake is more expensive than Redshift

EDIT: Splitting data between two Redshift clusters was not efficient for us. Trying to have the same table in multiple Redshift clusters requires you to create your own tooling, and the data might not stay in sync, so the only real option was keeping everything in one massive cluster.


Could you please not use uppercase for emphasis? The site guidelines ask you not to: https://news.ycombinator.com/newsguidelines.html.

I'll mark your comment editable for a while if you want to correct this, and can delete this comment if you do.


What makes you think Snowflake is more expensive? When I have tested it, it has given higher performance / $. Have you seen different results with your workload?


If I remember correctly, Redshift became more cost-effective over Snowflake the larger your compute got.

Snowflake would just run an EC2 instance in the background for compute, and the very largest EC2 instances were more expensive to run than a Redshift cluster of similar power.


Snowflake is most definitely running multiple instances for anything but the smallest clusters. It's plausible that one warehouse might scale better as you increase the number of nodes, but I would really be surprised if there was a big difference: the techniques for query execution are pretty well-known, so Redshift and Snowflake are probably doing the same things. I did a benchmark a while ago [1] and I'm definitely planning on adding multiple scales to evaluate how they compare as you increase data size.

[1] https://blog.fivetran.com/warehouse-benchmark-dce9f4c529c1


I'm going to get downvoted to all hell but come on? All caps? Scales up forever?


I say unlimited because concurrent reads were basically a non-issue for Snowflake. You could have as many users reading from the same table as you wanted.

Meanwhile "Redshift enforces a query concurrency limit of 15 on a cluster." http://blog.blazeclan.com/what-is-amazon-redshift-11-key-poi...
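For a rough feel of why a fixed concurrency cap matters, here is a toy back-of-the-envelope model. It assumes every query takes the same time and all of them arrive at once, which is not how a real workload (or Redshift's actual queueing) behaves; the function names and the 10-second figure are made up for illustration, and the limit of 15 is just the number from the quote above.

```python
import math

def waves_needed(num_queries: int, concurrency_limit: int) -> int:
    """Number of back-to-back batches needed if at most
    concurrency_limit queries can run at the same time."""
    return math.ceil(num_queries / concurrency_limit)

def time_to_drain(num_queries: int, concurrency_limit: int,
                  seconds_per_query: float) -> float:
    """Wall-clock time until the last query finishes, in this toy model."""
    return waves_needed(num_queries, concurrency_limit) * seconds_per_query

# Hypothetical workload: 10-second queries against a limit of 15.
for n in (15, 16, 100):
    print(n, "queries ->", time_to_drain(n, 15, 10.0), "seconds")
```

In this model the 16th simultaneous query doubles the wait for whoever ends up in the second batch, regardless of how large the cluster is.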


Why would they call this Snowflake? What an awful name.



Thank you for the answer. I'd never heard of it. It struck me as odd given that a snowflake is delicate and ephemeral which is exactly what you don't want for this kind of technology.


I agree with your previous post. Even if the product's name is possibly related to a concept in the same field (snowflake data models), it doesn't seem like a very good choice of branding. I work for one of their customers and only now do I see their website in Google's top results, partly because they also don't have a .com for the brand name.

Even though this is less relevant in a B2B context, because research is usually more thorough, it's still a warning sign in terms of branding when you have a different brand that could be confused with yours: https://snowflakesoftware.com/about/ (as posted by another user, strangely downvoted)

Also, snowflake data models are actually not very common (or appropriate/useful) in data warehouse/mart environments, as they tend to be harder to understand and use in a self-service approach, and require more joins between datasets/tables, unlike dimensional models.


Nothing to do with Snowflake schemas actually... The founders are avid skiers, and the name represents their passion for the snow.


Probably something to do with Snowflake data schemas in ETL?


A snowflake schema is a data modelling concept, not directly related to ETL (which is the process of moving and transforming data from/to a data model, which can be "dimensional" or, in less common approaches, a "snowflake", among others).
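For anyone unfamiliar with the modelling term, here is a small sketch using Python's built-in `sqlite3`. The table and column names are invented for illustration and have nothing to do with the Snowflake product. It shows the same report against a dimensional (star) layout, where the product dimension is denormalized, and against a snowflaked layout, where the category is split into its own table and costs one extra join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical tables: one fact table, plus the product dimension modelled both ways.
cur.executescript("""
CREATE TABLE fact_sales (product_id INTEGER, amount REAL);

-- Star / dimensional: category is stored directly on the product dimension.
CREATE TABLE dim_product_star (
    product_id INTEGER PRIMARY KEY,
    product_name TEXT,
    category_name TEXT
);

-- Snowflake: category is normalized out into its own table.
CREATE TABLE dim_category (
    category_id INTEGER PRIMARY KEY,
    category_name TEXT
);
CREATE TABLE dim_product_snow (
    product_id INTEGER PRIMARY KEY,
    product_name TEXT,
    category_id INTEGER REFERENCES dim_category(category_id)
);

INSERT INTO fact_sales VALUES (10, 100.0), (10, 50.0), (11, 30.0);
INSERT INTO dim_product_star VALUES (10, 'alpine ski', 'skis'), (11, 'ski boot', 'boots');
INSERT INTO dim_category VALUES (1, 'skis'), (2, 'boots');
INSERT INTO dim_product_snow VALUES (10, 'alpine ski', 1), (11, 'ski boot', 2);
""")

# Sales per category on the star layout: a single join.
star_sql = """
SELECT p.category_name, SUM(f.amount)
FROM fact_sales f
JOIN dim_product_star p ON p.product_id = f.product_id
GROUP BY p.category_name
ORDER BY p.category_name
"""

# Same report on the snowflaked layout: one more join to reach the category.
snow_sql = """
SELECT c.category_name, SUM(f.amount)
FROM fact_sales f
JOIN dim_product_snow p ON p.product_id = f.product_id
JOIN dim_category c ON c.category_id = p.category_id
GROUP BY c.category_name
ORDER BY c.category_name
"""

star_rows = cur.execute(star_sql).fetchall()
snow_rows = cur.execute(snow_sql).fetchall()
print(star_rows)
print(snow_rows)
```

Both queries return the same totals; the snowflaked version just makes whoever writes the report know about one more table.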


Are you triggered snowflake??


Please don't post uncivil or unsubstantive comments to HN. We eventually ban accounts that abuse the site like this.

https://news.ycombinator.com/newsguidelines.html


It was a joke. Thought that was obvious given the reference and the previous comment referring to the name as 'Awful'.


Not to be confused with https://snowflakesoftware.com



