Infra has always been a tarpit idea. Google didn't start out as an "infra" company, and neither did Amazon or Facebook. In fact, the few "infra companies" that did start back then (companies like GoDaddy) are minuscule compared to the aforementioned.
VC pouring money in LLM infra is legitimately crazy to me. It's clear as day that there will be winners of this AI cycle, but, as always, they will be companies that provide actual, real, tangible value. Making shovels works for huge companies like Nvidia or Intel, but it won't work for you. It's sad to see so much capital funneled into frameworks upon frameworks upon frameworks instead of fresh new ideas that could revolutionize the way we interact with our devices. I know it's a bit of a meme, but I'd rather see more Rabbit R1s and fewer LangChains.
Even OpenAI doesn't really have a product. Just throwing data at a bunch of video cards isn't value-generating in itself. We need a Dropbox or a Slack or an Instagram: something people love that makes their life easier or better.
> VC pouring money in LLM infra is legitimately crazy to me.
The VC business model is throwing money at the wall and seeing what sticks. They love congratulating themselves on how smart they are, but at the end of the day their overall returns trail the S&P 500. They are salespeople, and their job is to sell themselves to private capital on how smart and connected they are.
Do their returns trail the S&P 500? I guess some of them are successful and some aren't, but to be honest I've only heard about VCs as people who on average have a very, very high return. Now, it is very possible that's just survivorship bias, but I would need to search for some data.
Most of the funds, even the big ones, trail the overall market, and even the successful ones have difficulty maintaining a long track record of success.
The reason for the big pension funds and endowments to invest in VC is that it’s a bit counter cyclical. Also it’s a tiny pimple on the side of the PE sector overall.
You can see how marginal it is in the financial sector by going to an LP meeting — the big institutionals send kids — first year analysts — to attend because it isn’t that important.
The S&P 500 is a list of the largest and most profitable public companies. It's hard to do better than the most successful businesses. Most other indices and hedge funds don't outperform the S&P 500. Most private equity shops don't. Most real estate investors don't. So it shouldn't come as a surprise that venture capital doesn't.
Most startups don't get big. How many startups founded in the past decade have become hugely profitable? It's not that many. A handful out of the 500k or so funded startups. Meanwhile the S&P keeps chugging along at 8% annually.
"It's hard to do better than the most successful businesses" is not a statement that makes sense from the investor's perspective.
The price of an investment is based on the expected profitability of the company. An investment in a barely profitable company, if priced correctly, should yield returns at least equal to those of good companies like Apple, Google, and Microsoft, because the price would be discounted to compensate for the poor expected future earnings of the company you are investing in.
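To make the pricing argument concrete, here is a minimal sketch with made-up numbers. The two companies, their payoffs, and the 8% required return are all hypothetical, and the model deliberately ignores risk premiums: it only shows that a discounted price gives a weak company the same expected return as a strong one.

```python
# Illustrative only: every figure below is hypothetical, not market data.

def fair_price(expected_payoff: float, required_return: float) -> float:
    """Price today such that the expected payoff in one year earns `required_return`."""
    return expected_payoff / (1.0 + required_return)


def expected_return(price: float, expected_payoff: float) -> float:
    """One-year expected return from buying at `price`."""
    return expected_payoff / price - 1.0


REQUIRED = 0.08  # assumed required return, same for both companies

# A highly profitable company and a barely profitable one (hypothetical payoffs).
strong_price = fair_price(expected_payoff=1000.0, required_return=REQUIRED)
weak_price = fair_price(expected_payoff=50.0, required_return=REQUIRED)

strong_ret = expected_return(strong_price, 1000.0)
weak_ret = expected_return(weak_price, 50.0)

print(f"strong: price={strong_price:.2f}, expected return={strong_ret:.2%}")
print(f"weak:   price={weak_price:.2f}, expected return={weak_ret:.2%}")
```

The weak company is simply much cheaper; the expected return is the same by construction. Whether real deals are ever priced this cleanly is a separate question.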
That's getting into perfectly-spherical-cow territory, though. Investors aren't logical and neither are founders, and there's a real chance that an investor-fair deal isn't going to get any bites. The BATNA for most founders who don't secure funding is "go get a job that pays a lot of money"; while most VC investment is lopsided in the investor's favor through other means (equity preferences etc.), not landing deals makes your fund's LPs ask why they're letting you hold their bag of cash.
Most of the smallish funds do not publish any data. They are salespeople who are selling the idea of "higher risk, higher return". The sales pitch is their alpha, not the investments.
People allocating capital in VC funds are trying to diversify a very large portfolio, not just picking a high IRR asset. VC is very uncorrelated to S&P 500 performance as most of the best VC vintages were during public market downturns.
That's only half of it. They also have to justify their jobs. Anyone can manage a 60/40 portfolio with index funds, but pensions are complicated, so they need complicated investments.
I think they don't even trail the S&P 500: with all the dirty tricks up their sleeves, they roll losses over and over. The bottom will fall out at some point when the hype train stops rolling, but so far these have been good years, as we've had hype after hype.
I am waiting to see what the next one after AI will be, because quantum computing feels too hard to become a hype. The same goes for space ventures: there is some upward trend there, but space is still too hard.
> AI is a long term trend, next is more products on top of AI. Just like the internet.
This is legitimately just the same damn hype train the tech sector is constantly attempting to create. Now AI is the next internet. Before that it was the metaverse. Before that it was NFTs. Before that it was cryptocurrency. Before that it was quantum. Before that it was VR. Before that it was AR.
None of those were the revolutions postured by techno-fetishistic CEOs. Most stick around in some capacity, like VR and AR, and the argument can be made that those have a future. Blockchains certainly have a future as it's a highly useful technology, even if the financial vehicle made by it is utterly useless. The metaverse is Dead on Arrival because nobody ever wanted it in the first damn place, apart from the speculators betting money on it.
> Blockchains certainly have a future as it's a highly useful technology
No trolling here; I promise. Are you saying that blockchain is already a "highly useful technology", or that it will be in the future? From my perspective, it is a very cool technology concept that has yet to demonstrate any major commercial value. I also seriously doubt it will ever be commercially valuable; it has already existed for more than 10 years without any killer app (ignoring shitcoins).
> This is legitimately just the same damn hype train the tech sector is constantly attempting to create. Now AI is the next internet. Before that it was the metaverse. Before that it was NFTs. Before that it was cryptocurrency. Before that it was quantum. Before that it was VR. Before that it was AR.
None of the things you listed were mainstream trends. If you can not distinguish between fads and major trends that is your problem.
Internet, Mobile, Cloud and now AI are technology trends with mainstream buy in from the biggest companies in the world.
> If you can not distinguish between fads and major trends that is your problem.
You're distinguishing based on hindsight. It's mainstream if it succeeds, and if it doesn't, it was a fad. How much did Zuckerberg put into Facebook's metaverse projects? I believe it was $46 billion. Then there was Decentraland too; those were as mainstream as it got. You had tons of internet-famous people cashing in for hundreds of thousands of dollars just for selling their original copies of reddit memes. How is that not at least somewhat mainstream?
I have zero doubt whatsoever that after the AI bubble lets go, I'll be having this same damn conversation with someone else and they'll be saying AI was a fad. That's my entire point.
Only Zuck was metaverse obsessed. I've never heard of Decentraland, so it was not mainstream at all. NFTs were never mainstream. All of your examples are bad.
AI has buy-in from the entire tech sector, just like the cloud, mobile and internet did.
This becomes very obvious when you use Claude Projects with Artifacts! ChatGPT depends heavily on one’s ability to copy and paste… and even though it is an improvement, Claude Projects still make managing a set of documents tedious compared to your standard code editor.
Third-party tools like Cursor are an improvement but will be prohibitively expensive compared to companies that create and manage their own LLMs.
I expect to see a native document editing/code editing software system directly from one of the LLMs-as-a-service companies at some point.
In my humble opinion, a chat interface (API or not) does not a product make. Not to mention that Llama is free and competitive with both (same with Mistral, heck the 7B model works great on my RTX 3080). If you started a company that blew up because you made a badass product (and let's say you used ChatGPT under the hood), you would just eventually train and deploy your own model because an LLM is not a product.
LLMs are clearly products! The fact that Meta rather inexplicably chooses to give away assets that cost millions or billions to create doesn't mean LLMs aren't a product, it just means that they're competing against a company funding open source stuff for (presumably) strategic reasons, like in many other markets. And like in many other markets over time it's possible the proprietary versions will establish permanent competitive advantage. Meta may not keep releasing Llamas forever.
> you would just eventually train and deploy your own model
Just train an LLM? It's really not that easy! Even if it was, it'd be like how people can "just" run their own email service. Hosted email API is not rocket science but in practice companies all choose to pay Microsoft or Google to do it. Doing these things isn't a core competitive advantage so it gets outsourced.
> The fact that Meta rather inexplicably chooses to give away assets that cost millions or billions to create doesn't mean LLMs aren't a product
Real question: Why are so many LLMs given away for free? Are they hoping to crush non-free alternatives?
EDIT
Your last paragraph makes an excellent point. In the near future, I could see big corps paying OpenAI (or a competitor) to train a private LLM on their squillion internal documents and build a very good helpdesk agent. (Legal and compliance would love it.)
Paying OpenAI for fine tunes on internal docs is already happening.
Giving expensive things away for free is a great marketing technique that has been used since time immemorial, so why startups like Stability do it is somewhat understandable. And OpenAI uses free API access as a loss leader for their API product so that's understandable too.
Why Meta/Google/others do open weight releases is a bit less clear. Recall though that the first Llama wasn't really an open source release. You had to sign a document saying you were a researcher to get the weights, and that document was an agreement to keep the weights secret. Two people signed the documents, anonymously compared their weights, discovered they weren't watermarked (i.e. Meta didn't take this seriously, it was a sop to their AI politics/safety people) and promptly leaked them.
Presumably this was useful for the more libertarian wing of Meta as they could then prove the sky wouldn't fall, and so the influence shifted towards those arguing for more openness in research in general. With that Rubicon crossed other companies didn't see competitive advantage in withholding their similar sized models anymore and followed the leader, so to speak.
Sometimes it also feels like Meta may have over-purchased GPUs and - lacking a public cloud - have just decided to let their researchers do what they wanted. Which is great for the public! But we mustn't be too overconfident. This is really only possible because of Zuckerberg's unique corporate structure that makes him unfirable, combined with Meta being a big data company. It's really benefiting all of humanity here because he's invulnerable to board action so doesn't have to worry about heat from shareholders over 'wasting' money like this.
There's a lot of R&D being done right now on shrinking models whilst preserving quality, so hopefully the Zuck's generosity is enough to carry the open AI research community through the hard times when you need billions to train LLMs.
That's one opinion. The number of people who get value out of ChatGPT is known only to them, but anecdotally it's pretty high. A lot of people I know are actively using it on a daily basis and paying them $20/month.
n=3 now so it's full on anecdata: I've got a $20 sub professionally (SWE). It has to save me so little time to be worth it that it's easily a great deal. Might add Claude, though at this point it's probably better to find a nice interface and use the APIs.
n=5, my wife and I split a subscription between the two of us. While our needs are usually pretty minor, we both work in the computer science space and GPT-4o's ability to e.g. generate good example sentences for my Finnish vocabulary learning is astonishingly good. I'm building a wrapper around it so I can generate them en masse at [1], but it's still quite early days and very obviously not software I intend to sell.
Uploading a photo/file to a server does not a product make, yet you referenced Dropbox and Instagram and that’s what they started out as.
The UX of the AI applications is the moat, and the infrastructure providers behind those applications are pretty much always OpenAI and Anthropic at the moment, because running your own open source LLMs (which are inferior out of the box) at scale is not cheap or easy to do right, the same reason most companies use cloud. Agree that once you hit super scale you can run your own infrastructure, but there are thousands of companies who won’t get that far and still need an LLM.
Agreed. The LLMs have effectively been democratized. As much as people deride "LLM wrappers", the quality of the wrapper and how creatively it uses the LLM API is the differentiator.
Llama and Mistral are not really competitive, but they can be used as a base to finetune for a very specific use case.
Then they don't suck as much.
With OpenAI and Claude, you throw in some text instructions and you get back answers which are surprisingly correct (minus a few exceptions). In order to replicate that with Llama you'd probably need hundreds of finetunes and a model to decide which finetune to use.
If I had to use Llama or Mistral for free (money) I'd have to buy that RTX 3080 card plus the computer to fit it into. It doesn't work with my laptop. So I use ChatGPT for free instead, on the OpenAI site with the don't archive option or whatever they call it. I think many people use the free (money) and hosted option.
I imagine you meant to say that LLMs are commoditized.
Getting the correct words here is important, as you can see by all the people disagreeing on the literal interpretation of your post.
And yeah. LLMs have made the fastest transition from highly innovative singular product to plain commodity I've ever seen or read about. BSD licensed software libraries do not move that quickly. They were mostly not even adopted yet, and have a huge barrier to entry, which makes it much more of a feat.
We’re already seeing a lot of competition between LLMs. They are quickly becoming commodities. Margins will approach zero and the real value proposition will be with consumer products that extend beyond an <input type="text" />.
I disagree that they will become commodities, because most of my use cases are more sensitive to accuracy than cost. We typically have usage volume that isn't absurd, and for large enterprise customers our LLM budget is a rounding error. Meanwhile our product saves them many hours of a data engineer. If we can pay double to get a 10% performance boost we will do so gladly. You can already see this in LLM pricing, where they have cheap models that deliver low performance. My bet is that 80% of profits will be made on the workloads that are sensitive to accuracy, and workloads where running an LLM at all gets you most of the benefit will become very commoditized.
I agree with you. I'm using a couple of different LLMs depending on what I'm doing and what happens to be easiest but the difference between them is marginal in my experience.
The only play for OpenAI et al., in my opinion, is to try to pull up the drawbridge behind them by getting legislation passed which makes compliance prohibitively difficult if that's not your core business.
I think OpenAI has a good future by having much better training data. That is a super hard problem that requires thousands of low cost workers in developing countries to tag and arrange training data.
There is a lot of consumer surplus, but it’s very unclear if any one of these companies will be able to capture it, especially if mostly good enough commodity LLMs like Llama 70b are in the mix.
Not to mention that fine tuned smaller parameter models are much cheaper to run. See: Google embedding a model into the latest version of Chrome, accessible through dev tools.
Yep. Obviously they have a product. Millions and millions of people that are not software engineers or any very tech-savvy persona use their product. And a lot of those people pay for it. It’s just silly to say they don’t really have a product.
Especially given the rest of the GP comment, OpenAI seems to be the Google or Facebook of the industry and will be its infrastructure company (it kind of already is).
OpenAI is in growth mode, not profitability mode. Their ARR is over 3B now, and doesn't show signs of stagnating. At some point they will need to build and offer more products, but they've got as good a shot at being the next big tech company as anyone else.
> Even OpenAI doesn't really have a product. Just throwing data at a bunch of video cards isn't value-generating in itself. We need (...) something people love that makes their life easier or better.
I agree. I'm of the hypothesis that, when it comes to AI, a lot of product teams are pursuing overly ambitious and sophisticated features instead of targeting easy wins that are in plain sight [1].
And despite all that, I am mysteriously being pushed by partner companies I work with toward this product named Box, with tactics similar to the push to use Teams. It's like being volunteered for karaoke night by your Japanese boss...
Author here. I think the tarpit extends to most ChatGPT wrappers as well, which is why I called out that pivoting prematurely to the application layer is a futile exercise.
Are there even any VCs pouring money in LLM infra? I would assume VCs aren't interested in projects which won't give them a tenfold return. And with infrastructure there are so many existing competitors (like AWS) that such returns are never expected.
I get the feeling that there are. MLOps became LLMOps before most people knew it as any more than a buzzword and well we can't just call it ops because there's clearly something different here.
AI is still too opaque to reliably know beforehand if an idea will pan out, so you just gotta try it.
Plus it's easy, once you start imagining how the magic of AI is gonna make you rich, to ignore the problems with your idea and assume that the AI will handle them too.
So you've got all these hyped up fools trying to make stuff that lacks merit. How do you capitalize on that? You don't invest in the stuff that's doomed to fail, you create a slot:
> Insert coin, insert half baked idea, receive AI app
That way you get to keep the coins even when apps don't turn out. Plus, you're collecting the institutional knowledge necessary to pounce on something that comes along which is actually worth investing in.
Or at least that's the vibe I get when our meetings feature an AI-excited VC (which isn't common, but it happens).
Great article. I am not going to name names, but over the last year, whenever a concept became popular in Gen AI, thousands of startups pivoted to doing that. Many come from a software background where the expectation was that if the code works on one dataset, it would work for everything. You can see this with 1/ Prompt engineering 2/ RAG 3/ and now, after Apple's WWDC, it's adapters.
Enterprises I have spoken to say they are getting pitched by 20 startups offering similar things on a weekly basis. They are confused about what to go with.
From my vantage point (and I may be wrong), the problem is that many startups ended up doing the easy things, things which could be done by an internal team too. While that's a good starting point for many businesses, the costs are hard to justify in the long term. At this point, two clear demarcations appear:
1/ You make an API call to OpenAI, Anthropic, Google, Together etc., where your contribution is the prompt/RAG support etc.
2/ You deploy a model on prem/private VPC where you make the same calls with RAG etc. (focused on data security and privacy)
The first one is very cheap, and you end up competing with OpenAI and a hundred different startups offering it, plus internal teams with confidence that they can do it themselves. The second one is interesting, but overhead costs are about $10,000 (for hosting) and any customer would expect more value than what a typical RAG provides. It is difficult to provide that kind of value when you do not have a deep understanding and are under pressure to generate revenue.
I don't fully believe infra startups are a tarpit idea. It's just that we haven't explored the layers where we can truly find a valuable thing that is hard for internal teams to build.
Pretty much this. 18 months ago my CEO told me we HAD to get into this space, and I told him that basically our money came from our private product and that the only way our big enterprise customers were going to play ball with us was either ironclad agreements that went all the way to OpenAI or, more likely, a completely single tenant system, which would cost far more than they were willing to pay.
Of course they went with both, and as far as I can tell both are a major disaster post layoffs :)
I fully expect in somewhere around 3-6 months the dam will burst and we're going to start hearing more and more about all the teams out there that are pouring tens of millions of dollars into AI and all they have to show for it is a worse version of whatever it is they were doing.
To placate the AI fans, that's not because AI isn't interesting, it's because that's how these hype cycles always go. I remember when everything had to be XML'd. XML has its uses, but a lot of money was wasted jamming it everywhere because XML Was Cool. AI has its uses, but it is still an engineering tool; it has a grain, it has things it is good at, it has things it can't just wave a magic wand and improve, the demarcation between those two things is very, very complicated, and people are being actively discouraged from thinking about those lines right now.
But there really isn't any skipping the Trough of Disillusionment on your way to the Plateau of Productivity.
That was probably before my time. Was it really "cool"? Like big data, cloud and agile cool? Or more like ... dunno ... some design pattern? So hard to think about XML as having been cool.
Can you go into details (as much as you are comfortable) on what happened with the single tenant system? I have seen a few things, but I find it hard to put a finger on what went wrong except that the ROI wasn't there. Would love to understand your experience.
Mostly capacity issues with OpenAI made the super early stuff infeasible for our customer base, and really it was because they would only give an extremely small amount of total resources per Azure account, each of which of course had to be vetted.
I pushed for local first models, but the cost tradeoff just did not make sense for anything but the biggest clients, and they were constantly swapping back and forth on whether OpenAI would be acceptable or not for "insert sensitive use case here".
Sorry for being vague. I meant LoRA, but used Apple as an example because their demo showed the potential. At a conceptual level, you can finetune a base model to be good at a specific task, e.g. summarization, proofreading, generation etc. These finetuned weights sit on top of the base model and can be replaced by other weights for a different task as needed. Apple demoed different tasks by showcasing how their model identifies the task and then chooses the right set of finetuned weights. Apple called it Adapters as it comes via LoRA (Low-Rank Adaptation). It has been around for some time, but only shot into prominence after people got some idea of how to use it.
> Our foundation models are fine-tuned for users’ everyday activities, and can dynamically specialize themselves on-the-fly for the task at hand. We utilize adapters, small neural network modules that can be plugged into various layers of the pre-trained model, to fine-tune our models for specific tasks. For our models we adapt the attention matrices, the attention projection matrix, and the fully connected layers in the point-wise feedforward networks for a suitable set of the decoding layers of the transformer architecture.
Not Apple adapters per se, but LoRA adapters. It’s a way of fine tuning a model such that you keep the base weights unchanged but then keep a smaller set of tuned weights to help you on specific tasks.
(Edit) Apple is using them in their Apple Intelligence, hence the association. But the technique was around before.
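For anyone who wants to see the idea in code, here is a toy sketch of the low-rank adapter trick in plain Python. It is not Apple's implementation or any library's API: the 4x4 "base" matrix, the task names, and the helper functions are all made up for illustration, and a real model would apply this to many weight matrices across layers rather than to a single one.

```python
# Toy LoRA-style adapters: effective weight = base + up @ down.
# The base matrix stays frozen; each task only stores two small matrices.

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_weight(base, down, up, scale=1.0):
    """Return base + scale * (up @ down) without modifying `base`."""
    delta = matmul(up, down)
    return [[w + scale * d for w, d in zip(wr, dr)] for wr, dr in zip(base, delta)]


def apply(weight, x):
    """Matrix-vector product."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]


# Frozen base weights (hypothetical 4x4 identity, 16 parameters).
base = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

# Rank-1 adapters: `down` is 1x4, `up` is 4x1, so 8 parameters per task.
adapters = {
    "summarize": ([[0.0, 1.0, 0.0, 0.0]], [[1.0], [0.0], [0.0], [0.0]]),
    "proofread": ([[1.0, 0.0, 0.0, 0.0]], [[0.0], [0.0], [0.0], [1.0]]),
}

x = [1.0, 2.0, 3.0, 4.0]
for task, (down, up) in adapters.items():
    # Swap adapters per task; the base weights are shared and untouched.
    print(task, apply(lora_weight(base, down, up), x))
```

The point of the low rank is the parameter count: each task here stores 8 numbers instead of a full 16-number copy of the matrix, and the gap grows quickly with real model dimensions.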
It's rent seeking and grifting. As technology has become easier to get into, a huge number of "startups" are low level people just looking to make noise, get their cut, and bail. It's a bad look, up there with fast fashion.
An acquisition here amounts to a team's luck surface.
I'm not sure I'm exactly at the edge of things, but I have 2 companies trying to set up regular meetings with me to be a beta customer. Both have promised I can help define a new product, but when I list my real problems... they aren't in the mission. Everyone wants to solve RAG (that's easy, don't need help), or they want to give me a GUI I don't need, or wrap open source software like vLLM. Or "solve privacy" (which usually comes in the form of masking... which, surprise, works for PII but not PHI... I need the protected information).
Want to solve a real problem, help me create custom benchmarks, clean my data, get my small parameter model to reason better etc.
We had the exact same problem before genAI became the next big thing. All the startups were selling generic fine tuning and labeling services both of which are super easy to build, and they didn't even work on our unique super high quality super high resolution 40TB dataset.
Our problem was we had a real world problem and real data. All the startups were solving for imaginary problems and had no data.
Maybe that's really where the business here is: working through a whole bunch of custom datasets and trying to generalise from there. It'll be hard to generalise all of it, but I'm sure there'll be pockets of functionality that can be shared across more than a single dataset.
And maybe that's at the core of the issue here, namely that this service in its current form doesn't scale like b2c internet tech
Most startups want to sell you the equivalent of those airline kiosks where you tag your own luggage, because they are cheap to deploy and don't have high labor costs. The problems you describe are labor-intensive and not easy to solve automatically.
I think we will go back to tools combined with humans to solve at least some of them. So it's services and software.
This rings so true! I think it's natural whenever there's a new technology that a lot of startups spring up with a vibe of "GenAI is cool, let's do something with that!", which is 100% the wrong way to go about building something.
Starting by investing yourself fully into a given problem, and fixing it with the most appropriate tool (might be GenAI, might not) is much more likely to end in something people actually want or need.
Doing the reverse, and trying to find an existing problem that matches a solution you've already picked is how you end up with hundreds of companies selling thin API wrappers for ChatGPT.
That is basically every SaaS solution's problem. You are not going to want to pay them for your custom development because you would like to pay the "generic solution price".
That is why you have to pay for your own dev team: the SaaS vendor is not going to be your custom development team, as your custom problem is not available for easy resale to other customers.
> Want to solve a real problem, help me create custom benchmarks, clean my data, get my small parameter model to reason better etc.
I recently started a company with a friend of mine to do exactly this.
I've worked at a few AI startups over the last 8 years, and the problem everyone tackles independently (and poorly) is the long tail of dealing with input data that isn't great. You build a demo with sample data that works well, then you move on to real world uses and the data is suddenly... blehg.
I work for a foundational AI company. I guess we're technically AI infra. We're inherently "narrowly" focused since our origin (which was well before the recent hype in past 2-3 years).
Our customers are really the type of AI infra companies being talked about in this article. And yea, the new ones I work with everyday are often a dime a dozen. A revolving door of small startups trying to make the same general purpose AI infra targeting other traditional "boring enterprise infra" companies.
The ones that I'm seeing get the most traction, have the best products, and best chances of success have zeroed in on specific niches and sub-industries. (Think AI infra that helps B2B2B companies where that last "B" is like Roofing companies and the value provided is helping Roofing companies easily and drastically scale their outbound and inbound marketing and sales.)
The startups I work with that make me scratch my head are the ones trying to build "disruptive" AI infra that does nothing different, provides nothing special, other than potentially nice UI/UX, and is liable to have their lunch eaten by either natural iterations and improvements of our own services they essentially just white label, or some other incumbent.
To me, it's like trying to create a new company to compete against Walmart and Target on groceries because they're too massive in scale to win against "a well-tailored customer experience", but then forgetting that Costco, Aldi, Trader Joe's, and Whole Foods exist. And why would any of those aforementioned companies feel the need to acquire you rather than casually crush you as they go about their business, either ignoring you as you wither or taking your good ideas and incorporating them into their own offering?
It's not impossible, just has to make sense and even then a certain degree of "the stars aligning" is required. Which is why there inevitably can only be a small group of winners out of this massive sea of hopefuls.
And I of course can only shrug my shoulders if asked if the AI infra startup I work at is differentiated, necessary, and lucky enough to be at the finish line with the survivors at the end. (We're finding our PMF and potential road to incumbency mainly with two-ish markets: old and new school enterprise infra and non-tech Fortune 500 type of companies.)
Author here. Thanks for the perspective. P.S. I do hope AI startups do not underestimate how hard it is to break into vertical markets, which have their own challenges.
Way too many founders don't understand the impact of competing with cloud vendors.
Almost all enterprises have pre-committed budgets for cloud which means unless your product is FOSS it's going to be hard to convince someone to bet their business on it. Especially given that in this fundraising environment there is a 95% chance they won't be around in a year or two anyway.
It's going to be a brutal few years especially if we are heading into a period of diminishing returns in terms of LLM accuracy.
I’m also sort of curious as to how much market research they’ve done if they’re trying to compete with Azure and AWS.
Even before the recent LLM rush took off, AI was a thing. In the city of Copenhagen there was a project to digitalise a few million case files (each of which is 10-100 documents), and it was basically done with an intermediary company who knew the training and a cooperation with Microsoft. Yes, I’m dumbing down the complexity of it all, but once the training period of half a year was over, Azure made a lot (and I mean a lot) of infrastructure available for not a lot of money and the process completed in a week or so. Since it had to happen and because it was a PoC, the same project was also done by real humans. This was the “actual” project, and every deadline and whatnot the AI project had came from how long it would take X humans to do it. I can’t recall what X was, but it was enough to meet the legal deadline for when these case files had to be digitised and sorted correctly.
The human project was the result, and then the AI PoC was later used as a lesson on whether it could be done this way or not. It can, it was more accurate and not more expensive.
Anyway… I’m not sure who would’ve been capable of competing with Azure. (Outside the usual suspects). Maybe a company of Hetzner could? But you would need someone who can offer you a massive amount of computing on demand, and the only companies which are going to have that are big vendors.
Maybe it’s different with LLMs because the requirement is a continuous thing rather than something you need for a short period of time?
Assuming 50 million files, 4x concurrency per CPU, and 1 second per file, that would take approximately 150 CPU-days. Using just 10 machines, it could be done in 15 days. This does not fundamentally seem like a massive project in terms of compute? If the time per file went up by a factor of 10 and one allowed 30 days of execution, it could be done by 50 machines. I think most compute providers can do that (given some months' notice)?
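For anyone who wants to check that back-of-envelope estimate, here is a rough sketch in Python. It assumes one CPU per machine (which is what the 150 CPU-days / 10 machines = 15 days figure implies); the function names are made up for illustration and the inputs are just the numbers from the comment, not measurements from the Copenhagen project.

```python
import math

SECONDS_PER_DAY = 86_400


def cpu_days(files: int, seconds_per_file: float, concurrency_per_cpu: int) -> float:
    """CPU-days of work when each CPU handles `concurrency_per_cpu` files at once."""
    cpu_seconds = files * seconds_per_file / concurrency_per_cpu
    return cpu_seconds / SECONDS_PER_DAY


def machines_needed(total_cpu_days: float, deadline_days: float,
                    cpus_per_machine: int = 1) -> int:
    """Smallest whole number of machines that finishes within the deadline.

    cpus_per_machine=1 is an assumption taken from the estimate itself.
    """
    return math.ceil(total_cpu_days / (deadline_days * cpus_per_machine))


# Baseline: 50 million files, 1 second each, 4x concurrency per CPU.
baseline = cpu_days(50_000_000, 1.0, 4)   # about 145 CPU-days
# Pessimistic case: every file takes 10x longer.
slow = cpu_days(50_000_000, 10.0, 4)      # about 1,447 CPU-days

print(f"baseline: {baseline:.0f} CPU-days, "
      f"{machines_needed(baseline, 15)} machines for a 15-day run")
print(f"10x slower: {slow:.0f} CPU-days, "
      f"{machines_needed(slow, 30)} machines for a 30-day run")
```

The rounded figures in the comment (150 CPU-days, 50 machines) hold up. The exact values come out slightly lower.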
That's the observation.
I work for a huge company, both in terms of number of employees and geographical reach.
I/we receive emails, offers, and declarations of interest each month/year from dozens of start-ups selling some AI infra system, labeling service, GenAI whatever.
We are not buying anything and we won't. We already have contracts in place with cloud vendors--the usual suspects: Google, Microsoft, Amazon--and the rest of the infra is developed in-house.
Why should we buy subscription-based and high-maintenance products from an a16z-funded AI infrastructure company that might shut down operations in two years and say "bye-bye, it's been great"?
The former Uber/Michelangelo team that built Tecton, Netflix's Metaflow that later became Outerbounds: I have no idea who they are selling their AI infra products to.
> Almost all enterprises have pre-committed budgets for cloud which means unless your product is FOSS it's going to be hard to convince someone to bet their business on it.
This isn't a death knell.
1. If you get into the marketplace, enterprises can spend their commit against you.
2. A few million in ARR is ~nothing to a hyperscale cloud, but meaningful to most startups. If you find the right positioning, you can get their sales team selling your solution on many deals.
Hyperscale cloud has stringent security and compliance requirements, and within hyperscale cloud, it's institutionally difficult to lobby for spending on a startup. I spent about six years in hyperscale cloud and never saw a case where we spent a few million on a startup.
You seem to have misunderstood, I'm not referring to the hyperscaler spending money themselves.
If you don't think any ISVs are making millions through the marketplace you're simply mistaken. I worked at Google Cloud and personally know at least one startup that made the majority of their revenue through cloud partnerships.
Author here. Yes, that said getting into the ISV / APN partner program can be a major pain in the butt and take quite long (from a startup’s perspective where time is precious)
If I'm an application developer or manager, at any >100 person company, it normally doesn't fall into my remit to go out and pick a new company to contract with to provide services. Typically, it gets harder and harder to do that. Even with LLM stuff, we're contracting that through our existing relationship with Microsoft. When evaluating infra options, it therefore is a huge barrier to entry for most developers if there's a 'good enough' option on one of the main cloud providers.
I've experienced this too. Any new service that requires more than a credit card number has to go through the legal department to review the contract and that means it goes to the bottom of the pile of similar requests and probably won't even get a look for months.
Great article, and pretty relevant to what I'm building (cloud developer tooling, including some genai, but also including non-AI tools + an application platform. Email me if interested.).
Obviously I'm not nearly as pessimistic about it. Zoom out for a sec and generalize to SaaS in general, not just AI infra (a subset of SaaS) - all the arguments listed apply there too, except the data moat (which honestly doesn't matter to tons and tons of AI infra companies. That's more of an AI application problem). Now of course most startups are doing AI at least a bit, but in the past decade we've seen plenty of SaaS vendors compete with incumbents either head on or by carving out their own niche. In fact, two of the companies the author considers "incumbents" are arguably still challengers, but definitely were in this exact situation just a few years ago: Vercel and Databricks.
Also, competition from incumbents is hardly a death knell. There's room for multiple products in some market segments - how many RDBMS companies are there? Competition from a huge incumbent in many ways comes with benefits, because it helps grow the overall market and awareness of the product space, including your own product.
I suppose according to this author I'm in the "application layer" even though really I'm in the AI-application-layer-now-but-not-later-layer, software-infrastructure-layer. And that's great because I actually do have experience in that specific application area. But honestly, saying "you ought to have expertise in your domain" is 1) duh 2) in the examples (llamaindex parsing/ocr, langchain llmops + agentic stuff), there is clearly a big enough twist on doing it "but with AI" that the application/vertical is close to novel. Successful challengers create valuable businesses without prior deep expertise in their domain all the time and I don't really see how this is any different.
Basically, you could repeat this for any SaaS business. Starting a company is hard, but I don't know if AI infra is uniquely hard in the ways laid out.
The title is true. But the arguments don't hold water for me. 12 years ago, I started a big data company. It looked similar for big data companies when Cloudera raised almost $1B in 2014. Too many people building data warehouses, especially in the cloud. I exited. Who knew that Snowflake and Databricks would emerge against the incumbents? Similarly, there will be winners in the AI infrastructure space. To win, you need to focus on your customers and delight them. Narrowing focus makes a lot of sense. Don't pay attention to the doom and gloom, or you'll never do a startup.
I've worked with Hortonworks, Cloudera, and Databricks. It's no surprise at all that Databricks is killing the competition. Those other companies' products were embarrassingly terrible. Not stable, slow, and worst of all, I had instances of wrong results. The software just wasn't good.
Databricks is different, it's fast, it's robust, I trust the results. They just built a good quality product.
They are bundling open source stuff (Spark and Delta and so on), sure, they have their fancy IDE and whatnot on top, but they can let the community maintain things, scale back R&D, focus on things that matter to existing clients.
They have 1.6B revenue, a 50% YoY growth. And still not profitable. Hm, okay, recent acquisition on ML stuff, and of course probably burning hundreds of millions on cloud-GPU-AI shit.
Well, I guess as long as they have such big growth it makes sense to invest and raise ... and yeah that probably completely obscures the actual profitability of their core business. (Not to mention that they are probably spending all that money to try to expand their core business. To upgrade their value prop from cloud version of less-dumb-data-pipelines to 1-800-data-4-AI.)
They are becoming like SAP though where initially a company buys one service and it soon finds itself buying every adjacent service from them. If they manage to do that successfully they will be quite profitable.
Good article, but what is the alternative? What can you build today as a software engineer that can have impact? Nothing seems to come close to AI / AI infra, even if it's hard / risky / a moving landscape.
I would almost invert that statement. Sorry if this comes off ranty, but what exactly are people doing in the "AI space" currently that isn't "undifferentiated spam/chatbot" being sold to non-techies who heard about AI on NPR? What are real people using "AI" for that is so insanely valuable today? How much "company Y: same product with a chat window, sparks emoji" do we all need before this thing levels out and we all take a breather on the hype?
- writing and refactoring code. probably 50 times a day now
- improving documentation across the company
- summarizing meetings automatically with follow ups
- drafting most legal work before a lawyer edits (saved 70% on legal bills)
- entity extraction and data cleanup for my users
Put a number on it. How much value of this will they capture from you personally (we'll assume, very very charitably by the sound of it, that you represent an "average" user of AI products) when this market matures? Exactly how much will your employer pay for a meeting summarizer? $10/mo a seat, $20/mo a seat, $50/mo a seat? Could the product sustain a 5x, 10x, 50x price hike that is going to have to happen to recoup the investment being made today?
Agreed. Even if right now this seems like stuff companies want to throw money at for novelty/FOMO related reasons, I think eventually reality ought to catch up.
Probably an unpopular opinion, but I think the most efficient companies of the future will tackle the ironies of automation effectively: Carefully designing semi automation that keeps humans in the loop in a way that maximises their value - as opposed to just being bored rubber stamping the automation without really paying attention.
I'd say if you're not using a meeting summarizer, you're wasting someone's time by having them write up notes. if you're not writing up notes, you're wasting someone else's time recapping the meeting for them. meeting notes are a 1 (meeting):many relationship for conveying information as to what was discussed. how else do you go back and see what the one person on the storage team talked to the person on your team who left last week about so you can go into the next meeting with them prepared?
If your meeting produces "notes", and those are relevant for people that were not in it, you are doing it wrong.
If your meeting is aimed at producing "general understanding", it's already a dangerous one, and the understanding should go to the correct documentation (which is best done during the meeting). Otherwise, it should produce "focused understanding" between a few people and with immediate application.
If all you take from it is notes, well, I'm really sure that your team won't go digging through meeting notes every time they need to learn about some new context. Meeting notes are useful for CYA only, and if people feel safe they'll be filed directly to /dev/null.
Going to be vague, but I'm using it to scale out human processes in ways I couldn't using humans (because they cost too much) or regular code (because it's unstructured). Early results are promising, we've found a bunch of stuff which has been buried... and is potentially worth millions. Not a chat wrapper, just breathing new life into our regular old business.
What do you consider "AI"? Because machine learning models have been deployed in enterprise systems for years. Video processing, security, data labeling, sentiment analysis. The sexiest one I can think of in recent memory is nVidia DLSS.
Broadly, what marketing is saying is “AI.” There is huge value being created with deep learning today on internal systems. Recommenders, machine translation, computational photography… it is huge, improves people's lives, drives revenue.
None of that is marketed as "AI." It's just a thing the computer does. The single most valuable application of deep learning so far (content recommenders) is a cultural phenomenon, but it’s not referred to as “AI” but rather “the algorithm.”
Not sure why this is downvoted, that is the key question. Impact means different things to people. Could be:
1. Building a sustainable business and making decent money
2. Building a market leader and making ludicrous amounts of money
3. Advancing the state of the art in technology
4. Helping people with their little daily struggles
5. Solving pressing problems humanity is facing
Or many other things I suppose. Now if you believe that AI is eventually going to make anything humans can build now redundant, that'd be a reason to believe nothing else matters in the end I suppose. But even if we get there, there's a lot of road leading to that destination. Any step provides value. Software built today can provide value even if nobody is going to need it ten years from now. And it's not like you could even predict that.
The motive is to get acquired in most cases. It’s obvious and starts to make sense when you see a startup that has no feasible monetisation strategy on the horizon, yet they exist and get funding. They’re betting on building infra to be hopefully used in large corp and this is their demo/PoC.
Anything SaaS that solves pain points for established industries: those that have had billions in turnover for decades already, are not good at building tech themselves, and buy solutions/services to run their business. Bonus for low barriers to entry. Agriculture, logistics, real estate, energy, etc.
I have a theory that the days of established businesses that don't know tech are dwindling. A lot of companies that have adopted tech have started building a small foundation of talent internally. I think you're seeing this trend accelerate with the large tech companies laying people off. I have heard about top grade data science talent landing at some small sized health plan.
My company's fastest growing competitor is "internally sourced departments" of the services we provide.
Yes, computer savviness is on the rise, and has been for decades, and this will continue. But there are many levels: the ability to be a competent user, a competent buyer, and the ability to build it themselves. Then there are big differences in buy vs build culture, and preferences for which types of solutions are default buy vs default build. And finally, smaller and medium sized organizations are less likely to have internal teams. All this should be analyzed for the specific market, product and customer segment one targets.
Slightly different take than some of the siblings: you can still just build this stuff. If your goal is impact, maybe the best place to do it will be at a cloud vendor or other big corp. If your goal is actually just a big VC exit, then maybe not.
If your product is something that can be ripped off in 3 months, then it probably wasn’t going to have a long term impact anyway.
Define ‘impact’. Does ‘impact’ here mean ‘tickles the fancy of a 2024-era VC’? If so, you may be right. If used in its common meaning, absolutely not; most of this stuff is ~useless.
All the same stuff, to be honest. If AI is set to replace human work, well we have had a cheap human labour market for decades and yet we still need software. An LLM can't replace a business itself, which is made up of niche processes, direction and purpose, which we sometimes codify into a SaaS. We'll still need to do all that even if AI replaces some of the human parts of the business.
> What can you build today as a software engineer that can have impact?
Quite a bit, if you don’t follow the standard tech hype. Find an industry that isn’t tech-first and you’ll notice that there’s a lot of room for improvement.
As someone who has semi-unwillingly worked in infrastructure and infrastructure consulting most of my career - we’ve never even really solved that problem, what on god’s green earth convinced you AI did?
It bugs me that all we are seeing in the vc-backed startup scene seems to be ai infrastructure startups. We got something close to ai and all people come up with is they want to be the next ai marketplace store or the millionth infrastructure startup that does exactly the same as their competitors. How boring.
It's as if all of the AI devex/infra companies are cargo culting the story of how the people that made the most money in the gold rush were the people who sold the tools.
The thing is that the tools were well understood and battle tested.
And this isn't just a matter of AI: You see all kinds of companies trying to provide value adds on top of cloud: "We will annotate DNA for you as a service!" When all they do is dockerize the same tools their customers use, and serve as small shims for the least sophisticated customers. The moment said customer grows, they understand they can replace the vendor with less than a week of work.
The people making shovels make the money by having strong profit margins, becoming a default vendor, and having a moat. Good luck doing that in AI!
And my favorite counter example of selling tools is precisely docker: They built tech used everywhere... yet how much value did they capture? It's the same story all over dev tool space.
GenAI applications are so finicky that it’s easier to build a company around tools-for-AI than a company fitting their archetypical user profile doing AI applications (one that’s actually profitable; most of their customers likely aren’t anywhere close). That’s inverted from the prior SaaS/cloud boom.
I too think there are too many shovel chasers but I think it’s also a consequence of what’s easier to ship.
While I agree that AI infra startups are hard to build, I strongly disagree with the idea that they are harder than foundational or application layer startups. I think it boils down to what you know and what resources you can muster.
For instance, foundational AI startups are also ridiculously hard to build. You need an insane amount of funding, spend it pretraining models to stay competitive only to find that gains in hardware and model architecture make them obsolete within months plus there's no real guarantee that scaling will keep working.
Application layer startups are hard in a very different way: there's an insane amount of competition and new capabilities are emerging every few weeks. I have worked with a few AI girlfriend startups and they are really struggling with keeping pace and warding off a ridiculous amount of competition.
I think it's really just YMMV. Of course, the deeper you get into the stack, the more monopolizing pressure there is. Is it hard to build AI infra startups? Yes 100%. Will there be very few winners? Yes. Is it harder than foundational or application layer startups? Depends on the founders' strengths. Is it a lost cause? I really don't think so.
Author here. Yes, I explicitly called out the danger of thinking application layer startups are easier, because it totally depends on the founding teams' backgrounds and interests.
This is a well-written blog post. Thank you for sharing.
This part:
> For AI infra startups to be “venture scale”, they will eventually need to win over enterprise customers. No question. That requires the startups to have some sustainable edge that separates their products from the incumbents’ (GCP, AWS, as well as the likes of Vercel, Databricks, Datadog, etc).
On the surface, I agree. But look at a parallel market segment: Cheap cloud hosting. Think: Linode (or any of its competitors). There are a bunch of cheap cloud providers who are more than 10 years old. They didn't all get bought out or bankrupted by upstarts. Why? They must add just enough value to stay in business. Could we see something similar in the AI infra space? In fact, it looks more logical for the cheap cloud providers to try to build some AI infra -- low hanging fruit, to help with LLM training. (I am sure they already sell GPU time.)
I've come across a number of these AI infra start-ups like Scale AI and Zerve and TBH I'm amazed they can do what they do with relatively small teams when you have Meta and Apple somewhat struggling in this area and buying rather than building themselves.
Did any of the startups in question actually ever want to build AI infrastructure or did they all pivot from Metaverse to Crypto to AI in the great pivoting of 2022?
Given VC's penchant for throwing cash at grifters in the latest hype space is it any surprise that some of the beneficiaries are looking for a quick exit before they have to do any actual work?
Yet there is little "AI" specific in this AI infrastructure startup challenge:
1) insane levels of competition towards any goal make minor, secondary traits that are not obvious beforehand relevant. Pure luck becomes more important.
2) excess market concentration (of which the tech sector is maybe the most egregious example) makes any new initiative harder. The more dominant and controlling the incumbents the harder to find a decent sized niche to grow.
3) selling to risk averse enterprises / organizations is always an uphill battle that requires climbing a mountain of bureaucracy and regulation, only to eventually face random internal politics.
In the end the current craze will certainly produce a modified tech landscape. These recurring hypes always overpromise and underdeliver, but a cumulative effect is slowly happening.
In such stormy seas it's hard to identify an optimal course and strategy. Riding every hype wave may sound silly but might work. On the other extreme, one may seek beacons indicating eventual stable land and try to navigate there.
Good tips, especially the point about narrowing the scope. At https://Lemonfox.ai we started with an LLM, image and speech-to-text API. Now we are only focusing on the speech-to-text API as the other areas are already very crowded and there's a lack of innovation in the speech-to-text space.
> Now we are only focusing on the speech-to-text API as the other areas are already very crowded and there's a lack of innovation in the speech-to-text space.
I'm legitimately wondering how your hosted Whisper API for $0.17/hr is supposed to compete with groq's exact same API that costs $0.03/hr.
You may be about to find out how crowded all of the AI infra spaces are.
I strongly recommend narrowing your scope far beyond modality. If you've been working with this tech and getting familiar with it then you already have valuable expertise. Pivot now or panic later. If you want to stay in the speech space find what markets are being underserved with speech AI related solutions. Are there pain points there that can be solved by a STT API? If so, build those solutions. You can't compete at the infra layer and I'm not sure why you would want to try if you don't already have something unique about your offering beyond hosting open source models. It's never good if your competition is potentially just a single developer in a company standing up your entire service internally in a week.
If you are determined to stay in the AI infra space then you'll need to be tackling a hard problem that companies want solved. Maybe take a look at fine-tuning models. Hard problem and maybe there's a hunger for it. (It's a risky one to tackle too though since it's very possible general/foundational models will maintain a grip on "good enough".)
Like how do you plan on competing against multimodals, which keep getting cheaper and clearly can do audio->text? Or existing incumbents like Deepgram? Or just the generic APIs provided by the big clouds?
Gen AI feels more and more like NFTs and blockchains, and overall, a lot like pre-2001-bubble (or more accurately post y2k).
A very exciting and expensive solution in search of an actual problem, that will ultimately find its way, commoditised, in a small niche, while adjacent technologies take the lead for productive use-cases.
Does this logic also apply to industry-specific "AI infra," where the APIs are wrapping a service that solves a domain-specific problem using AI, rather than general purpose infra technology, and provide those APIs to other businesses within that industry?
Yes, Brannan cornered the market before he sold the shovels. Selling shovels is not that profitable if you skip that "Step 1".
> he owned the only store between San Francisco and the gold fields — a fact he capitalized on by buying up all the picks, shovels and pans he could find, and then running up and down the streets of San Francisco, shouting 'Gold! Gold on the American River!' He paid 20 cents each for the pans, then sold them for $15 a piece. In nine weeks, he made $36,000."
Author here - the article isn't questioning the value of AI per se, but value capture and competitive dynamics. Internet routers remained valuable but got commoditized, sort of a thing.