Interesting to see the competitive analysis with Fivetran in the article but then see almost identical copies of infographics used between their site and Fivetran's.
- address the long tail of integrations (our goal is 200+ by end of 2021) - we're working on a low-code/no-code framework to make it easier to build and maintain connectors
- give you flexibility/customizability to adapt pre-built connectors to your needs
- debugging autonomy (we're standardizing how connectors are being built, so maintenance can be done by us and the community)
- No more security and privacy compliance headaches, since it is self-hosted and open source (MIT)
- No more super high prices (volume-based) that don't make sense for big data companies.
After just going through a 6-month, pain-filled fundraise for an open source database (big on integration), this is probably the most upsetting thing I have ever read in my life.
Far away from Silicon Valley with no flashy credentials, 13 days is an impossible dream.
That said, massive kudos to the team for such clear storytelling & delivery.
This could end up helping much more than hurting long-term. Engineering salaries in the Bay Area are insane. Salaries for engineers in the US overall are very high. If you’re in Europe, you can likely afford 2-3x more engineers than your competitors for the same amount of $ raised.
that's true, but they get more money at better valuations so it balances out -> and engineer headcount is a bad predictor of success! It is as much the experience of taking a database/dev tool and going big that's lacking outside SV -> marketing, operations etc. Plenty of talented engineers (and marketers) around the world, very few engineers who have built a MuleSoft or a Snowflake.
Thanks! My reply looks a little bleak, but it was a successful seed round raise (TerminusDB). We are doing fine and the future is bright, it just took months of blood, sweat, and tears to close.
Seriously, this deck would likely not have flown without the YC backing and implicit stamp of approval. Once you are in YC, you'd have to do pretty badly not to raise seed funding.
It's actually a good question. They say 1/3 of the batch should raise easily, 1/3 less easily, 1/3 will struggle.
It also depends on which fund you are raising with. There are many funds for sure, but raising with the top tier VCs is definitely not 1/3 or even 1/10 of the batch.
Anyways, hopefully, this article was useful to you :).
> It also depends on which fund you are raising with. There are many funds for sure, but raising with the top tier VCs is definitely not 1/3 or even 1/10 of the batch.
Do most top-tier VCs even do seed rounds? Surely some do, but there are many VCs (I think including some that are considered top-tier, at least in certain verticals) that focus on A and later.
Maybe it would help if they weren't all trying to raise at a 10-30M valuation when they're barely getting started. There's a lot of YC businesses I'd love to fund but where the economics are totally out of whack.
We (YC S20) had issues raising our round after demo day. A good number of the investors we spoke with were Series A focused while claiming to be "early stage" investors. Long story short, we ended up pivoting, treated the amount we raised (under $500k) as pre-seed, and are now actively moving toward profitability instead of taking outside capital.
Good for you. I am a bootstrapped founder and I still haven't seen the need for a seed round yet. Of course, we are growing slowly and organically but that's ok with me.
YC does play an important role in getting connections to VCs, but at the end of the day, "YC" is a signal for VCs, not a criterion.
VCs spend time looking at the team, the past achievements, the product and most importantly the existing users. They also try to invest in industries that they know about.
In our case the team experience was important. We had solved the problem internally at other companies (and the scars that come with it!).
In one of John's responses, he mentioned that we've been talking to many VCs. The reason was that we were looking to talk to the ones who deeply understand the problem and the market we're addressing. No matter how good your product or deck is, if you're pitching a calendar app to a VC who specializes in deep tech, you probably won't get them on your cap table.
To determine this, you'd have to track how many companies raise funds and how much they raise (and at what valuation), both inside and outside of YC. In my own view, which is supported by quite a bit of data, YC is as close as you're going to find to a stamp of approval in the start-up scene: it has a substantial effect on your chances of raising funds and on the amount you will raise when you do.
Actually, Airbyte is a pivot from a first product. And we struggled a lot to raise with the previous product, even though we were fresh out of YC. YC is an indicator but not enough by itself.
One of my best investments is a company I turned down at demo day, but who pivoted a year later after running out of funds and finding a completely new product. I gave them their first post pivot angel check and they're well on their way to unicorn status now.
Big fan of looking at what YC founders do 1 year out of the program. YC backs companies in large part on founder quality even if the original idea isn't a winner, and I've made quite a few strong bets on YC founders when they needed a reset a bit later. (The good ones keep the original company cap table intact to take care of their pre-reset investors. That's a huge positive signal for me coming in as fresh blood.)
Not only that, but it also materially improves your business for many B2B startups given the network of companies who will do you a favor and sign up for your service during the program. It helps kickstart early traction.
The YC stamp is a door opener, a necessary condition to get funded, but not a sufficient one.
The primary way YC is beneficial for raising VC money is in the way they force founders to focus on the metrics that matter, say user growth etc., which also happen to be crucial for VCs.
This deck without the YC stamp would probably not be sufficient for serious investors for a variety of reasons: it is missing market size, addressable market, and market growth, and it doesn't answer the question of "why now?" (whereas "why this" and "why you" are answered). It isn't a bad deck, but let's be honest here.
If you have to convince VCs about market size for early stage investment (seed, A), you likely are talking to the wrong people.
VCs who will fund you already agree there's a potential market and are willing to make a speculative bet based on patterns like data + open source that have worked in other investments. Market numbers tend to be more or less imaginary at this point.
The pitch deck first and foremost tells a story with the objective to get a VC interested - or not. This is a necessary but not a sufficient step in the process. Think of it as a marketing prospectus or a 'door opener' type of thing.
If a VC is indeed interested they will conduct intensive due diligence and leave no stone unturned.
In our case, we're disrupting a well-known industry. So only a non-educated investor would ask for the total addressable market, and the "why now". Both are pretty obvious if you know the industry.
Having investors ask us about those was a strong signal that those VCs didn't know enough about our market, so they wouldn't be great partners for us.
However, in a standard pitch, I would agree those slides need to be there, especially if you're creating a new category. On our side, we're disrupting a well-known one, so not the same.
That's completely my experience as well. At this stage you need investors who understand the market. (Congrats by the way on getting Accel to pony up!)
> Think of it as a marketing prospectus or a 'door opener' type of thing.
YC and other accelerators help to get in the door though. My first deck was worse than this but we ended up with USV at seed (thanks to the accelerator, in my opinion), which opened the doors through B/C and out.
Having worked with Fivetran, Segment and Singer in the past, I am really excited for an open-source solution like what you guys have developed. The long tail of connectors has been a real hassle when you work with mostly small companies who use very specific SaaS products.
Second, I would say though that having a team from the industry with previous exits, etc. is usually a winning formula, so YMMV if your team doesn't look like that, even if you have a great deck, etc.
Sure, it does help. I think timeboxing the fundraise gives some FOMO to investors, if they know that you're meeting with a lot of other ones. There are many things to it. But the team does help for sure!
I notice they use GitHub stars as a metric in the Appendix, supposedly to "insist that we hadn’t done a hard launch yet." I wonder how meaningful this metric is. Is 350 a lot of stars? A little? What am I supposed to take away from this?
Some of our sources are written from scratch, some are wrappers directly over Singer, and others are wrappers over forks of Singer taps that we've updated/improved. All of our destinations right now do not depend on Singer.
The high level strategy of orchestrating sources/destinations is similar although configuration, state management, source/destination installation/isolation, normalizing data, etc. are quite different.
Nice deck! And terrific market positioning; in doing my own research on getting all organizational data in a single place in an otherwise non-technical organization, I've done demos of Fivetran and a few of your other competitors, and your analysis of their weaknesses is spot on. I'll be giving your product a try.
For those of us not in the SV area... when I do stuff like this I have some slides (not nearly as pretty), but really they are a small part of the presentation. I would also have supplied a prospectus with a business plan etc. Is that the case with YC too, or is it really slide_deck>$$$$?
How sophisticated are investors in general with niche or arcane technology segments that might not be widely understood? Do they know enough to tell if your product solves specific problems, or are they trusting your market validation?
If it was an 'intense 2 weeks', what comprised the back-and-forth intensity? Negotiation, waiting/anxiety? Were there any big surprises during raising, or do they 'like it or not'?
So within 2 weeks, we met with 45 investors (76 calls in 7 days, that's our record ;) ). Our goal was to identify the 10 VCs that best understood our vision and industry and that could bring the most value.
Of those investors, you could see that 50% didn't know much about data infrastructure, or that it was a fresh topic for them. But the 10 funds we liked best knew A LOT, had invested in it, and brought a lot of insight and value, just by interacting with them.
So for the next round, we will mostly focus on those 10 funds, keep them posted on our progress, so that the next round is just a question of when and how.
In terms of negotiation, I would say we had a lot of interest, so we could have negotiated the valuation higher, but for us, it was more a question of who we wanted to work with.
But will try to write a blog post on the process for more details, if you think that could be useful.
I think that would be incredibly helpful to the community, but I wouldn't ever press an 'extremely busy person' to do such a thing. I honestly wish YC had an 'after action report' section for founders just to quickly write up some materially experiential things.
Congrats on your raise. Your pitch, to me, has basically all of the attributes. Some people see it as some kind of arcane magic, but for B2B generally I don't think it is, and it seems you've nailed the issues quite squarely. It's a good benchmark, well done.
1) We did have some intros thanks to being a YC company. That definitely helped.
2) For some funds for which we really wanted intros, we asked our investors.
3) We timed these 2 weeks of fundraising to happen 2 weeks after an important product release for us (0.2.0). And we did get some inbound from investors (them reaching out to us).
Also being at the crossroads of data infrastructure and open-source helped a lot, as both are important topics for investors right now.
We tried to keep the meetings with the funds that we liked most at the end. For instance, Accel was the 42nd investor we met with.
We used this before the pivot, when it was just a data unblocker tool. The product was flawless and high quality, although too expensive for our needs in the end. Good luck Airbyte!
Airbyte was a pivot from a 1st product. And we had a hard time raising with the 1st product.
So vision / product / market opportunity are also really important.
Also confused here. Slide 7 [0] says seed funding of $1.8M, but Airbyte is also calling the most recent round a $5M seed. Implies they will call the first round Pre-Seed retroactively, rather than calling the second round Seed+. It's all grey at this stage either way!
I was a bit baffled to hear valuations in the latest batches are reaching $15mm+. For seed. Right out of an accelerator.
If they were able to raise on a cap in that ballpark, the $ amount makes sense.
I have a feeling these high valuations and giant rounds will end up doing a disservice to founders of moderately (but not massively) successful startups who are left with $7mm of notes (or safes) to pay back on acquisition with 1x liquidation preference
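To put rough numbers on that concern, here is a minimal sketch assuming a plain 1x non-participating preference, where investors take the larger of their money back or their pro-rata share. The function name `exit_split`, the 25% investor stake, and the exit prices are made-up illustration values, not figures from this thread, and real notes and SAFEs have caps, discounts, and interest that are ignored here.

```python
def exit_split(exit_price, invested, investor_ownership):
    """Split acquisition proceeds under an assumed 1x non-participating preference.

    Investors receive the larger of (a) their money back, limited by the
    exit price, or (b) their pro-rata share of the exit. Everyone else
    (founders, employees) shares whatever remains.
    """
    preference = min(invested, exit_price)       # 1x money back, capped by proceeds
    pro_rata = exit_price * investor_ownership   # share if treated like common stock
    investor_payout = max(preference, pro_rata)
    return investor_payout, exit_price - investor_payout


# Hypothetical scenarios: $7M raised, investors assumed to hold 25%.
for exit_price in (5_000_000, 10_000_000, 100_000_000):
    investors, common = exit_split(exit_price, 7_000_000, 0.25)
    print(f"exit ${exit_price:,}: investors ${investors:,.0f}, common ${common:,.0f}")
```

Under these assumptions, a $10M acquisition sends $7M to investors and leaves $3M for everyone else, while at $100M the preference stops mattering and investors simply take their 25%. That gap is the "moderately successful" zone the comment above is worried about.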
Nice, thanks! Looks like you are working on growing mind share for the foreseeable future and layering in enterprise-specific features to close big deals. Do you foresee a sales team for this? Or do you think independent devs will self-serve their way into larger orgs?
Approaches one and two make sense to me. I'm a bit lost on approach three though.
So if I understand well, approach 1 is mindshare. Approach 2 is sales team, and approach 3 is bottom-up but self-serve.
So definitely approach 1. We will be focusing on the open-source edition for the next year or more. Doing that will help us get deployed in a lot of companies. And we hope this will help the sales team close the deals. So it would be a mix of 2 and 3. Makes sense?
Anyways, that's what we have in mind. And we'll learn by doing!
One thing that’s not clear to me is why there is so much competition and crowding in the “data massage” space. There is Snowflake, and there are all kinds of ETL tools. The customer lists in these startup posts overlap. Is it just Marketing departments inside these companies playing around with these tools, or the CIOs cycling through the hottest startups on the TechCrunch list?
I worked for a large organisation where management was far closer to 'technology leaders' and 'technology strategists' than engineering and data science principles and leads. They would endlessly swoop in to our division asking us to assess another product they have bought to fix the legacy problems of multiple data sources.
All of them were brittle af. They all anticipated a very idealistic data source and the absence of non-technical people curating data in excel ten different ways.
Even though we were the data science team, we usually ended up providing far more value to the organisation because we could do data engineering and cleaning and ended up being the source of truth for a lot of data required by the wider organisation.
We got pitched dozens of sexy solutions to fix all our ETL problems, but when we started asking questions it always seemed like a well designed custom pipeline couldn't be beaten for data quality assurance, reliability and speed.
That's exactly why we are approaching the problem with open source. It changes the dynamic of how it gets adopted. We've been in your shoes, where a tool is being pushed top-down and now you have to deal with a super complex, super expensive, rigid and half-working product.
Instead, Airbyte gets adopted by engineers, data scientists, etc. to solve one problem, and then the usage expands from there. We can improve the product based on the feedback we get from the real users.
And if a feature or a connector is not there, anyone can actually add it!
Coming from an ad agency background I’ve seen a lot of attempts at “unifying” various data sources from client’s analytics and sales data, agency tools, and third party data sets that are all in different formats, date ranges, and scopes.
Warehousing that data might also require firewalling clients or teams for privacy or “competitive/conflict” reasons.
These aren’t difficult problems to solve with a few knowledgeable devs but that is nothing but added cost and some agencies just aren’t good at hiring the right devs - especially if their previous exposure has been basic front end web developers from their clients.
“Data warehouse” has also become a selling term even if “really big database” is a more accurate term.
Hopefully more of these companies start to distinguish themselves in this space but their competition isn’t each other - it’s entry-level data people blasting through Excel.
There is value to having all your data in one place and managing data connectors is generally not very fun. They also tend to be annoyingly brittle and it's very visible when they go down. That all makes for a perfect recipe to cause burnout in a small data team. Competent data engineers are also fairly difficult to hire right now.
in my experience, most ai projects die before the ai part
People hate hiring data engineering (plumbing people feels like cost), and data eng like tools that work but most are too niche/happy-path-oriented, so even with Trifacta etc., there's a lot of open territory. Software can solve a lot of that, in theory, so everyone wins.
And I agree that until there is an OSS winner, the proprietary stuff will keep getting churned through. So ultimately it comes down to whatever your data platform does (AWS, Databricks, whatever) or whatever OSS you're bringing. There's a lot of room for vendors to carve out niches because of connectors x use cases, until platforms/OSS eat them all. VCs will see some ARR and name brands and thus be happy to fund: there are a lot of gaps any startup can fill. (I am impressed by Airbyte for a few non-technical reasons even without having used it, so this is not a knock on them, just some clues for the continuing froth in their market.)
They all promise to reduce your data engineering budget. The problem is that building a data connector is a one-time platform problem per data source. Once it’s solved, it’s solved. None of them solve the problem of ETL design and data warehousing design.
It sounds like you don't think solving for data connectors + necessary maintenance has a lot of value. I would agree, not FTE levels of value, but most companies I've seen in the SMB space would do well to pay $1-3k per month to have their data all housed in one spot. That lets their 1-2 DS/DE/SWE spend their time actually analyzing the data.
Maintaining connectors is also a good way to demotivate high achievers - better to have them further down the value funnel.
Lol! It seems HN rewrote the title.
The initial title was "How Airbyte raised $5M with Accel in 13 days (deck included)"
I didn't know they renamed titles.
Yup. Diversity is a big focus for us now.
Unfortunately, you tend to look for your first employees in your own network, so people you know. Most of our team is also composed of backend/data engineers, and diversity is not easy there.
But this is definitely top of mind for us now.
Airbyte: https://airbyte.io/wp-content/uploads/2021/03/Airbyte-Seed-D...
Fivetran: https://images.cms.fivetran.com/mgtdf72hs0mx/6qYtmEEotXqScar...