float8 got a mention! 2x the FLOPs! Also xformers has 2:4 sparsity support now, so another 2x? Is Llama 3 gonna use float8 + 2:4 sparsity for the MLP, so 4x H100 float16 FLOPs? PyTorch has experimental fp8 support, whilst attention is still complex to do in float8 due to precision issues, so maybe attention is in float16, and RoPE / layernorms in float16 / float32, whilst everything else is float8?
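Roughly what I mean at the tensor level - a toy sketch of the 2:4 pattern (keep the two largest-magnitude values in every group of four) plus PyTorch's experimental float8 dtype, definitely not whatever Llama 3 actually does:

```python
import torch

def prune_2_to_4(w: torch.Tensor) -> torch.Tensor:
    """Zero the 2 smallest-magnitude values in every group of 4 along the last dim (2:4 sparsity)."""
    groups = w.reshape(-1, 4)
    idx = groups.abs().argsort(dim=1)[:, :2]   # indices of the two smallest |values| per group
    pruned = groups.clone()
    pruned.scatter_(1, idx, 0.0)
    return pruned.reshape(w.shape)

w = torch.randn(8, 8)
w_sparse = prune_2_to_4(w)
# experimental float8 storage (E4M3); real speedups need the fused sparse/fp8 GPU kernels
w_fp8 = w_sparse.to(torch.float8_e4m3fn)
```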
I was thinking why is this one guy on HN so deeply interested and discussing technical details from a minor remark. Then I clocked the name. Great work on Gemma bugs
A LUT could be a significant performance penalty, would it not? Instead of a float8 (potentially multiple, in the SIMD case) in a register, you're now having to head out to at least L1 cache to dereference the value in the LUT.
Plain uint8 wouldn't allow for the same dynamic range as float8, and it's that range, not the precision (which uint8 would win at the largest values it can represent), that counts most.
Oh, was just gonna comment as well, but saw this! I think x86 has pshufb for LUTs (used them ages ago, but have forgotten the details :() I think some game (was it Spider-Man?) also used loads of lookup tables.
The issue with LUTs is, don't you have to update the LUT itself? You can select which memory address to load up, but the LUT itself has to be differentiable, maybe? TBH I'm not an expert on LUTs.
On fixed point - similarly ye you have to fix the precision ranges as well, so again I'm unsure on how one changes the fixed point numbers over time. I'll have to read more on fixed point.
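FWIW, in the weight-quantization schemes I've seen, the LUT is just a small fixed codebook and the stored low-bit codes index into it; nothing about the table itself needs to be differentiable for inference. A rough sketch with a made-up 16-entry codebook (the names here are illustrative, not any particular library's API):

```python
import torch

codebook = torch.linspace(-1.0, 1.0, 16)   # hypothetical fixed LUT of 16 representable values

def dequantize(codes: torch.Tensor, scale: float) -> torch.Tensor:
    """codes are uint8 indices into the codebook; scale is a per-tensor (or per-block) scale."""
    return codebook[codes.long()] * scale

codes = torch.randint(0, 16, (4, 4), dtype=torch.uint8)
w = dequantize(codes, scale=0.05)
```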
Maybe 1.58-bit using (-1, 0, 1), which gets rid of multiplications and leaves just additions, might be more useful, although you'll only get a 2x FLOP boost since you still need fp8 or fp16 addition.
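A toy illustration of why ternary (-1, 0, 1) weights turn the matmul into additions and subtractions - the accumulation still happens in a higher-precision dtype, which is where the fp8/fp16 addition cost comes from. This is just the arithmetic idea, not how BitNet-style kernels are actually written:

```python
import torch

def ternary_matmul(x: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
    """x @ w with w in {-1, 0, +1}: add the inputs where w=+1, subtract where w=-1."""
    out = torch.zeros(x.shape[0], w.shape[1])
    for j in range(w.shape[1]):
        col = w[:, j]
        out[:, j] = x[:, col == 1].sum(dim=1) - x[:, col == -1].sum(dim=1)
    return out

x = torch.randn(2, 8)
w = torch.randint(-1, 2, (8, 4)).float()
assert torch.allclose(ternary_matmul(x, w), x @ w, atol=1e-5)
```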
Nope. Moreover, simulating it even with AVX-512 is quite an experience. Been postponing it for 2 years now... But first of all, you need to choose the version of float8 you want to implement, as the standards differ between GPU vendors.
Well, those smaller floats require less BW to transfer back and forth as well. Perhaps not a reduction linear in the size of the float, as maybe smaller floats require more iterations and/or more nodes in the model graph to get an equivalent result.
But rest assured there's an improvement, it's not like people would be doing it if there wasn't any benefit!
The impact on bandwidth is the main reason smaller is better, I believe, certainly when it's the bottleneck. I'm only really familiar with CPU, but with say FP16 you might convert back to FP32 when you're doing the actual multiplication (so conversion plus multiplication is actually slower), but because you're moving half the data on and off you still get a huge speedup.
I can't remember which research paper it was, but even if you do float32 multiplications, keeping the data in bfloat16 by simply truncating the lower mantissa bits and packing it still gets you speedups, since matrix multiplication is bound by both compute and cache access. If you can optimize the cache side of things, the speedups are definitely there.
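Something like this, roughly (a CPU-side sketch of the truncation trick; the actual packed-GEMM kernels are obviously more involved):

```python
import torch

def truncate_to_bf16_bits(x: torch.Tensor) -> torch.Tensor:
    """Zero the low 16 bits of each float32, leaving bfloat16-precision values in float32 storage."""
    bits = x.view(torch.int32)          # reinterpret the bytes, no copy
    return (bits & ~0xFFFF).view(torch.float32)

x = torch.randn(4, 4)
x_trunc = truncate_to_bf16_bits(x)
# differs from a bfloat16 round-trip only by rounding mode (truncation vs round-to-nearest)
print((x_trunc - x.to(torch.bfloat16).float()).abs().max())
```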
I'm not sure exactly how NVIDIA calculates FLOPs, but I do know Intel's are calculated from how many FMA units there are, how many loads can be done in tandem, and what the throughput is. And ye, fp8 requires 2x less space. The 2:4 sparsity gain might be less pronounced, since the dense matrix first needs to be reconstructed on the fly, and there's a small matrix of indicator values.
Oh, so float8's L2 norm error from float32 is around 1e-4 I think, whilst float16's is 1e-6. Sadly attention is quite sensitive. There are some hybrid methods which, just before the attention kernel (which is done in fp8), upcast the Q and K from the RoPE kernel to float16, whilst leaving V in float8. Everything is done in fp8 on the fly, and the output is fp8. This brings the error down to 1e-6.
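As a rough eager-mode sketch of that hybrid idea (float32 compute here just to keep it CPU-friendly; the real fused kernels do the matmuls in fp16/fp8 on the device):

```python
import math
import torch

def hybrid_fp8_attention(q8, k8, v8, compute_dtype=torch.float32):
    """Q/K/V arrive in float8 storage; Q and K are upcast for the score matmul,
    V is upcast only at the point of use, and the output is written back as fp8."""
    q, k = q8.to(compute_dtype), k8.to(compute_dtype)
    scores = torch.softmax(q @ k.transpose(-1, -2) / math.sqrt(q.shape[-1]), dim=-1)
    out = scores @ v8.to(compute_dtype)
    return out.to(torch.float8_e4m3fn)

x = torch.randn(1, 4, 64)
to8 = lambda t: t.to(torch.float8_e4m3fn)
out = hybrid_fp8_attention(to8(x), to8(x), to8(x))
```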
Yes, but it's a bit more complicated. There are 2 FP8 formats: E5M2 and E4M3.
E5M2 is basically an IEEE 754 format. To compensate for the smaller exponent, "E4M3's dynamic range is extended by not representing infinities and having only one mantissa bit-pattern for NaNs".
Some people reported E4M3 is better for the forward pass (small range, more precision) and E5M2 is better for the backward pass (bigger range, less precision). And most implementations have some sort of scaling or other math tricks to shrink the error.
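You can see the trade-off directly from the dtype metadata in PyTorch (assuming a recent build with the experimental float8 dtypes):

```python
import torch

for dt in (torch.float8_e4m3fn, torch.float8_e5m2, torch.float16):
    fi = torch.finfo(dt)
    print(f"{str(dt):22s} max={fi.max:>9.1f}  smallest normal={fi.tiny:.1e}  eps={fi.eps:.1e}")
# E4M3: smaller range (max 448) but finer steps; E5M2: fp16-like range (max 57344) but coarser.
```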
Fair points! Ye, PyTorch's fp8 experimental support does scaling of the gradients. Interesting point on more precision for the forward pass and a larger range for the gradients! I did not know that - so learnt something today!! Thanks! I'll definitely read that paper!
Having lived through the dot-com era, I find the AI era slightly dispiriting because of the sheer capital cost of training models. At the start of the dot-com era, anyone could spin up an e-commerce site with relatively little infrastructure cost. Now, it seems, only the hyper-scale companies can build these AI models. Meta, Google, Microsoft, OpenAI, etc.
I’m not sure we went through the same dot-com era, but in my experience, it was extremely expensive to spin up anything. You’d have to run your own servers, buy your own T1 lines, develop with rudimentary cgi… it was a very expensive mess - just like AI today
Which gives me hope that - like the web - hardware will catch up and stuff will become more and more accessible with time
> I’m not sure we went through the same dot-com era, but in my experience, it was extremely expensive to spin up anything. You’d have to run your own servers, buy your own T1 lines, develop with rudimentary cgi… it was a very expensive mess - just like AI today
To make your own competing LLM today you need hundreds of millions of dollars; that kind of "very expensive" is on a whole different level. You could afford the things you talked about on a software engineering salary. It would be a lot of money for that engineer, but at least he could do it; there's no way anyone but a billionaire could fund a new competing LLM today.
I think the foundation models are a commodity, anyway. The bulk of the economic value, as usual, will be realized at the application layer. Building apps that use LLMs, including fine-tuning them for particular purposes, is well within reach even of indie/solo devs.
That’s why Sam Altman makes so much noise about “safety” - OpenAI would really like a government-backed monopoly position so they can charge higher rents and capture more of that value for themselves. Fortunately, I think that llama has already left the barn.
I think openai/anthropic/etc are banking on foundation models being the equivalent of the "datacenters" or AWS-equivalents of AI - there'll be PaaSes (eg replicate), and most businesses will just pay the "rent"
Only if you're creating a foundation model. The equivalent would be competing with a well-funded Amazon, back in 1999. You can compete in building LLM-powered products with much, much less money - less than a regular web app in 99
If the government can stay back far enough that more than one AI company can train their models, it will end up working like steel mills - barely enough profit to pay the massive cost of capital due to competition. If the government regulates the industry into a monopoly, all bets are off. Their investors are going to push hard for shutting the door behind them so watch out.
The only question is - what tactic? I don't really know, but one trick I am aware of is "specifying to the vendor." In other words, the introduction of regulatory requirements that are at every step in the process a description of the most favored vendor's product. As the favored players add more features, potentially safety features, those features are required in new regulations, using very specific descriptions that more or less mandate that you reproduce the existing technology, to use a software engineer's term, bug-for-bug. If your product is better in some ways but worse in others, you might have a chance in the market - but to no avail, if the regulations demand exactly the advantages of the established suppliers.
They were on top for a while, but later fell behind because they didn't invest. There were heavy tariffs in place to "protect" the monopoly from foreign competition.
If privately arising monopolies could only be kept from buying out their regulators, they'd privately break down before they became too odious... for example Google, which for years was the only remotely good search engine, is now merely one of the better search engines. If there had been a "department of consumer information safety," staffed by the best industry professionals status can buy, that might have not happened.
This is typically called a high fixed cost business, like airlines, hotels/apartments, SpaceX, etc.
The dream may be barriers to entry that allow high margins (“rents” if you prefer the prejudicial), but all too often these huge capital costs bankrupt the company and lose money for investors (see: WeWork, Magic Leap). It is high risk, high return. Which seems fair.
Nothing in the WeWork business model is inherently capital intensive. Fundamentally they just take out long-term office leases at low rates, then sublease the space to short-term tenants at higher rates. They don't really own major assets and have no significant IP.
I understand the economics concept. I'm not sure WeWork was a great example; it had significant other challenges, such as a self-dealing founder and, frankly, a poor long-term model.
I would wager that the concept needs a bit of a refresh. Historically it has referred to high capital costs for the production of a hard good, but in this case there is more than just a good being produced: there's a fair bit of influence and power associated with the good, and a ton of downstream businesses that will be reliant upon it if it goes according to plan.
Agreed, and Magic Leap had its own problems. My point was just that “invest huge amounts of capital to create a moat and then monetize in the long run”
is an inherently risky strategy. Business would not work if society insisted that large, high-risk investments could not produce higher long-term margins than less risky investments.
It's more like "disrupting the market". The problem is that it's a whole market.
Uber just now turned its first profit since 2009, and I would wager that if not for the newly found appreciation of efficiency and austerity, it would still be burning through money like a drunken socialist sailor.
Classic approach required basic math. "Here is my investment, here is what I am going to charge for rent". You actually can figure out when your investment starts paying off.
This new "model" requires tall, loud, truth-massaging founders to "charm" VCs into giving away billions, with the promise of trillions, I guess. The founders do talk about conquering the world, like, a lot.
I do not know what the WeWork investors were thinking when they expected standard real estate to "10x" their money while the tenants were drinking free beer on tap. The whole thing screamed "scam" even to a lay-person.
So far it's been pretty "democratic" - I feel in no way disadvantaged because I can't train a foundation model myself. Actually the ecosystem is a lot better than 25 years ago - there are open source (or source available) versions of basically everything you'd want to participate in modern AI/ML.
I too went through the dot com era: as in when Sun Microsystems had the tag line "we are the dot in dot com".
I assure you that before Apache and Linux took over that "dot" in the .com was not cheap!
Fortunately it only really lasted maybe 1993-1997 (I think Oracle announced Linux support in 1997, and that allowed a bunch of companies to start moving off Solaris).
But it wasn't until after the 2001 crash that people started doing sharded MySQL and then NoSQL to scale databases (when you needed it back then!).
It's early. You can do LoRA training now on home systems, and for $500 you can rent enough compute to do even more meaningful fine-tuning. Let's see where we are in 5 and 10 years' time.
(Provided the doomers don't get LLMs banned of course!)
Another way to compete with the big tech incumbents is instead of hardware, try maths and software hacks to level the playing field! Training models is still black magic, so making it faster on the software side can solve the capital cost issue somewhat!
That's labour and human capital intensive, not capital intensive. And I don't mean this as a technically correct nitpick: in terms of economics it's more accurate to call it the exact opposite of capital intensive.
That’s a good point, I wanted to make the point that doing the research is also incredibly expensive because it requires some of the smartest people around, and the right background (and what even is that background?)
Ye not a bad point - also agree with djhn on stuff.
It's true it'll still be relatively expensive - but I would propose it's relatively inexpensive if people want to make it faster and have the drive to do it :) On the other hand, capital expenditure requires large amounts of money, which also works.
I guess some general CUDA, some maths, knowing how to code transformers from scratch, some operating systems and hardware knowledge, and the constant drive to read new research papers + wanting to make things better.
I just think that as humans, if we have drive, we can do it no matter the constraints!
Yes, I agree with the general idea that it's not easy. Yet at least to some extent it might allow people and/or nations with (some degree of, relative) lack of capital but high levels of education and innovation to benefit and catch up.
I find the market way more open and competitive than dot-com. Everyone is throwing up a chatbot or RAG solution. There are tradesmen and secretaries and infinite 19-year-olds who are now able to wire together a no-code app or low-code bot and add value to real businesses. The hyperscalers are making some money but absolutely don't have this locked up. Any Groq or Mistral could wander in and eat their lunch, and we haven't really started the race yet. The next decade will be ridiculous.
Could not have said it better. Nobody has won the race yet and things are getting better. Building a foundation model is not cheap but not out of reach still for a startup.
We will probably get there, it's just going to take time for hardware supply chains to catch up. I feel it's more comparable to mainframe eras - it took time for general purpose computing to become commoditised.
I know we won't get this from FB, but I'd be really interested to see how the relationship of compute power to engineering hours scales.
They mention custom building as much as they can. If FB magically has the option to 10x the compute power, would they need to re-engineer the whole stack? What about 100x? Is each of these re-writes just a re-write, or is it a whole order of magnitude more complex?
My technical understanding of what's under the hood of these clusters is pretty surface level- super curious if anyone with relevant experience has thoughts?
I'm not 100% sure, but I would make an educated guess that the cluster in the first image, for example, is one instance of a scalable design, so throwing more hardware at it could bring improvements, but sooner or later the cost-to-improvement ratio will call for an optimization or rewrite as you call it - so a bit of both, usually. It seems a bit of a balancing act really!
The cost of training quickly outpaces the cost of development as context length increases. So hardware is cheap until it isn't anymore, by orders of magnitude.
But there is still significant cost in the physical build-out of new pods/DCs, and in the human engineering hours to physically build them, even though it's a mix of resources across the vendors and FB. It would still be interesting to know the man-hours that went into the physical build of the HW.
So, I'd love to work on optimizing pipelines like this. How does one "get into" it? It seems a ML scientist with some C/C++ and infra knowledge just dips down into the system when required? Or is it CUDA/SIMD experts who move "up" into ML?
I know someone who works on this at Meta. His resume is computer science heavy, with a masters in Machine Learning. On the previous-experience side, before getting into Meta he had about a decade working as a Software Engineer on Machine Learning systems in multiple languages, such as Go, C++ and Python.
To get the job he applied for a Software Engineer (applied Machine Learning) spot and went through the multiple-step interview process; when he got the job he did a few weeks of training and of interviewing with teams. One of the teams in charge of optimizing ML code at Meta picked him up and now he works there.
Because of Meta's scale, optimizing code to save a few ms or watts has a huge impact on the bottom line.
In sum:
- Get a formal education in the area
- Get work experience somewhere
- Apply for a Software Engineer (applied ML) role at a big tech company
- Hope they hire you and have a spot in one of the teams in charge of optimizing stuff
This is helpful thank you. There's always some luck.
I have a PhD in CS, and lots of experience in optimization and some in throughput/speedups (in an amdahl sense) for planning problems. My biggest challenge is really getting something meaty with high constraints or large compute requirements. By the time I get a pipeline set up it's good enough and we move on. So it's tough to build up that skillset to get in the door where the big problems are.
A lot of the optimisation at this level is getting data into the right place at the right time, without killing the network.
It's also a group effort to provide simple-to-use primitives that "normal" ML people can use, even if they've never touched a hyperscale cluster before.
So you need a good scheduler that understands dependencies (no, the k8s scheduler(s) are shit for this, plus they won't scale past 1k nodes without eating all of your network bandwidth), then you need a dataloader that can provide the dataset access, then you need the IPC that allows sharing/joining of GPUs together.
All of that needs to be wrapped up in a Python interface that's fairly simple to use.
Oh, and it needs to be secure, pass an FTC audit (i.e. you need to prove that no user data is being used), and have high utilisation efficiency and uptime.
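For a sense of what that "simple Python interface" ends up looking like to the ML user, here's the generic open-source shape of it (torchrun + DDP + a distributed sampler); Meta's internal scheduler and dataloader stack is its own thing, so treat this purely as an illustration of the kind of primitives involved:

```python
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler, TensorDataset

# launched with one process per GPU, e.g. `torchrun --nproc_per_node=8 train.py`;
# the NCCL collectives ride on whatever fabric is underneath (RoCE or InfiniBand)
dist.init_process_group(backend="nccl")
local_rank = dist.get_rank() % torch.cuda.device_count()
torch.cuda.set_device(local_rank)

dataset = TensorDataset(torch.randn(1024, 128), torch.randn(1024, 1))
sampler = DistributedSampler(dataset)                     # each rank reads only its own shard
loader = DataLoader(dataset, sampler=sampler, batch_size=32)

model = DDP(torch.nn.Linear(128, 1).cuda(), device_ids=[local_rank])  # all-reduce hidden in the wrapper
opt = torch.optim.SGD(model.parameters(), lr=1e-3)

for x, y in loader:
    loss = torch.nn.functional.mse_loss(model(x.cuda()), y.cuda())
    opt.zero_grad(); loss.backward(); opt.step()
```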
can you say more about the network issues with thousands of k8s nodes? I'm regularly running 2-3000 nodes in a GKE cluster, majority have GPUs, is this something I need to be worrying about?
Only if you are paying for the network bandwidth. For example, if there are nodes spanning more than one zone and you pay for that traffic, you might want to think about moving stuff to a single zone.
For other settings, moving to something like opencue might be better (caveats apply)
At the end of the day you are still moving, storing and manipulating 1's and 0's, whether you are a front-end engineer or a back-end engineer or a systems engineer or an ML engineer or an infra engineer.
Well, at least I fit my resume to match the 'job description', because at the end of the day it's all hallucinations, and 'real' software engineers with core computer science skills can literally do anything.
Our group works on some of this stuff at Meta, and we have a pretty good diversity of backgrounds - high performance computing (the bulk), computer systems, compilers, ML engineers, etc. We are hiring.
Significantly more than that; MFN pricing for NVIDIA DGX H100 (which has been getting priority supply allocation, so many have been suckered into buying them in order to get fast delivery) is ~$309k, while a basically equivalent HGX H100 system is ~$250k, which works out to ~$31.5k per GPU at the full-server level. With Meta's custom OCP systems integrating the SXM baseboards from NVIDIA, my guess is that their cost per GPU would be in the ~$23-$25k range.
It is a real loophole in the economy. If you're a trillion dollar company the market will insist you set such sums on fire just to be in the race for $current-hype. If they do it drives their market cap higher still and if they don't they risk being considered un-innovative and therefore doomed to irrelevancy and the market cap will spiral downwards.
The thing is, this could be considered basic research, right? Basic research IS setting money on fire until (and if) that basic research turns into TCP/IP, Ethernet and the Internet.
Funnily enough Arpanet and all that Xerox stuff were like <$50 million (inflation adjusted!) total. Some real forward thinkers were able to work the system by breaking off a tiny pittance of a much larger budget.
Whereas I think this can more appropriately be considered the Meta PR budget. They simply can't not spend it; it would look bad to Wall Street. Have to keep up with the herd.
Funny you pick a company that has very little to answer to the markets; out of all the large tech companies, Meta is the rare one that does not need to answer, because Zuckerberg controls the company.
> Funnily enough Arpanet and all that Xerox stuff were like <$50 million (inflation adjusted!) total.
That doesn't say much. The industry was in utter infancy. How much do you think it cost to move Ethernet from 100 Mbit/s to 1 Gbit/s, then to 10, 100, 400, and 800 Gbit/s? At least one or two orders of magnitude more.
How about the cost to build a fab for the Intel 8088 versus a fab that produces 5nm chips running @ 5GHz. Again, at least one or two orders of magnitude.
This suffers from hindsight bias; at the time it was impossible to know whether Arpanet or flying cars was the path forward. A better comparison would be the total investment-to-payoff ratio, which is not something we can see from where we are now. Only in the future does it make sense to evaluate the success of something. Unfortunately, comparison between eras is difficult to do fairly because conditions are so different between now and Xerox's time.
> If you're a trillion dollar company the market will insist you set such sums on fire just to be in the race for $current-hype. If they do it drives their market cap higher still and if they don't they risk being considered un-innovative and therefore doomed to irrelevancy and the market cap will spiral downwards.
You don’t think earning increasing amounts of tens of billions of dollars in net income per year at some of the highest profit margins in the world at that size for 10+ years has anything to do with market cap?
$1T Market Cap lets it be known it will invest $10B a year into $current-hype that will change everything. P/E loosens speculatively on sudden new unbounded potential, Market Cap $1.1T. Hype funded. PR as innovator cemented.
Which is a fourth of what they spent on VR/AR in a year. And Gen AI is something they could easily get more revenue from, as it has now become proven technology, and Meta could possibly leapfrog others because of their data moat.
Proven technology, maybe, but proven product-market fit for the kinds of things Facebook is using it for? Their linked blog about AI features gives examples "AI stickers" and image editing... cool, but are these potential multi-billion dollar lifts to their existing business? I guess I'm skeptical it's worthwhile unless they're able to unseat ChatGPT with a market-leading general purpose assistant.
I have a few group chats that just devolve into hours of sending stickers or image generations back and forth; lately we've been "writing a book together" with @Meta AI as the ghost writer, and while it utterly sucks, it's been a hilarious shared experience.
I don't think anyone else has gotten that group chat with AI thing so nailed.
On the podcast TrashFuture, November Kelly recently described AI systems as “garbage dispensers” which is both a funny image (why would anyone make a garbage dispenser??) and an apt description. Certainly these tools have some utility, but there are a load of startups claiming to “democratize creativity” by allowing anyone to publish AI generated slop to major platforms. On the podcast this phrase was used during discussion of a website which lets you create AI generated music and push it to Spotify, a move which Spotify originally pushed back on but has now embraced. Garbage dispenser indeed.
> unseat ChatGPT with a market-leading general purpose assistant.
It's not impossible. The prediction from many (not that I believe it) is that over the long run modelling tricks will become common knowledge and the only things that matter are compute and data, both of which Meta has.
Also, there could be a trend of LLMs for ads or feed recommendation in the future, since they have a large, completely unstructured dataset per user across multiple sites.
Compute, data, and most importantly distribution/users.
IMO standalone AI companies like OpenAI might be successful by providing infrastructure to other companies, but I can’t imagine ChatGPT remaining #1 many years from now.
The web is still trending towards being a walled garden. Maybe not right now, but long term I think people will use whatever AI is most convenient which probably will be AI built into a giant company with established user base (FB, GOOG, MSFT, and Apple if they ever get around to launching - would love Siri 2.0 if it meant not needing to open the ChatGPT iOS app)
What moat exactly? Much of the user data they have access to is drying up due to new regulations, some of which prohibit IIRC direct use on models as well. I'm not even sure they can use historical data.
Meta certainly has an edge in engineer count, undoubtedly. But I'd say they really, really want the metaverse to succeed more in order to have their own walled garden (i.e. power equivalent to the Apple and Google stores, etc.). There's a reason they gave a hard pass to a Google partnership.
> There's a reason they gave a hard pass to a Google partnership.
AIUI, Google required Meta to basically cede control of a partnered OS to them:
"After years of not focusing on VR or doing anything to support our work in the space, Google has been pitching AndroidXR to partners and suggesting, incredibly, that WE are the ones threatening to fragment the ecosystem when they are the ones who plan to do exactly that.
"We would love to partner with them. They could bring their apps to Quest today! They could bring the Play store (with its current economics for 2d apps) and add value to all their developers immediately, which is exactly the kind of open app ecosystem we want to see. We would be thrilled to have them. It would be a win for their developers and all consumers and we’ll keep pushing for it.
"Instead, they want us to agree to restrictive terms that require us to give up our freedom to innovate and build better experiences for people and developers—we’ve seen this play out before and we think we can do better this time around."
I think the raw text inside Facebook groups is at least as valuable as Reddit data. Even if demographics data is restricted under European law, the raw text of people interacting is quite valuable.
That ignores all the user groups that are on Facebook. From apartment communities (think Nextdoor) to grief support counseling to mindfulness therapy groups, there's a wealth of user comments of a tad higher quality than Uncle John's racist rants.
Facebook's downfall will be their lock-in. Every other social media platform lets you view a public profile, discussion groups, etc. It's all locked inside Facebook.
> Much of the user data they have access to is drying up due to new regulations, some of which prohibit IIRC direct use on models as well.
Source would be appreciated, because this is the opposite of obvious. Regulations against using public first-party data would be big news, and I haven't heard of anything like that. They use my data for recommending my feed, so why not for answering my question?
First party data alone can't tell you whether an ad resulted in a sale, unless you own the entire process on your platform. Contrast this with what Apple has via its app store; the fees do more than generate money.
It's often forgotten now, but just a few years ago NVIDIA was cancelling production batches and writing down inventory when the GPU shortage cleared. No one needed more GPUs. That also happens to be when Meta first announced they were going to increase CapEx spending on compute.
I’m guessing that Meta got a sweetheart deal to help take a lot of inventory for NVidia and make commitments for future purchases.
I don’t think it was that nobody needed GPUs. It was that nvidia thought they could get scalper margins by restricting supply after the shortage showed people were willing to pay scalper prices.
SemiAnalysis posted recently noting that Meta locked in these purchases a while ago, something like a year or more, so they probably didn't pay today's spot rate.
I think it’s always useful to pay attention to the history on stuff like this and it’s a rare pleasure to be able to give some pointers in the literature along with some color to those interested from first-hand experience.
I’d point the interested at the DLRM paper [1]: that was just after I left and I’m sad I missed it. FB got into disagg racks and SDN and stuff fairly early, and we already had half-U dual-socket SKUs with the SSD and (increasingly) even DRAM elsewhere in the rack in 2018, but we were doing huge NNs for recommenders and rankers even for then. I don’t know if this is considered proprietary so I’ll play it safe and just say that a click-prediction model on IG Stories in 2018 was on the order of a modest but real LLM today (at FP32!).
The crazy part is they were HOGWILD trained on Intel AVX-2, which is just wild to think about. When I was screwing around with CUDA kernels we were time sharing NVIDIA dev boxes, typically 2-4 people doing CUDA were splitting up a single card as late as maybe 2016. I was managing what was called “IGML Infra” when I left and was on a first-name basis with the next-gen hardware people and any NVIDIA deal was still so closely guarded I didn’t hear more than rumors about GPUs for training let alone inference.
350k Hopper this year, Jesus. Say what you want about Meta but don’t say they can’t pour concrete and design SKUs on a dime: best damned infrastructure folks in the game pound-for-pound to this day.
The talk by Thomas “tnb” Bredillet in particular I’d recommend: one of the finest hackers, mathematicians, and humans I’ve ever had the pleasure to know.
FB does not have the flywheel of running data centres for others - all three of those mentioned run hyperscale datacentres that they can then juice by "investing" billions in AI companies, who then turn around and hand those billions straight back to the investors as revenue.
OpenAI takes money from MSFT and buys Azure services
Anthropic takes Amazon money and buys AWS services (as do many robotics companies, etc.)
I am fairly sure it’s not illegal but it’s definitely low quality revenue
How is it free equity? Spending money to invest it somewhere involves risks. You might recover some of it if the investment is valued by others, but there is no guarantee.
You do not need cash in hand to invest. Instead, you print your own money (AWS credits) and use that to drive up the valuation, because this money costs you nothing today.
It might cost tomorrow, though, when the company starts to use your services. However, depending on the deal structure, they might not use all the credit, might go belly-up before the credit is used, or might get bought up by someone with real cash.
Neither did AWS when they started. They were just building out data centers to run their little book website and decided to start selling the excess capacity. Meta could absolutely do the same, but in the short term, I think they find using that capacity more valuable than selling it.
> Neither did AWS when they started. They were just building out data centers to run their little book website and decided to start selling the excess capacity.
This is a myth. It simply isn't true. AWS was conceived as a greenfield business by its first CEO. Besides, S3 and SQS were the first AWS services; EC2 didn't appear till later. And it wasn't built from excess Amazon server capacity; it was totally separate.
Unless you've worked at Amazon, Microsoft, Google, and Facebook, or a whole bunch of datacenter providers, I'm not sure how you could make that claim. They don't really share that information freely, even in their stock reports.
Heck I worked at Amazon and even then I couldn't tell you the total datacenter space, they don't even share it internally.
This would be an interesting dataset to use for trading decisions (or sell to hedge funds).
But I wonder how much of their infrastructure is publicly mappable, compared to just the part of it that's exposed to the edge. (Can you map some internal instances in a VPC?)
That said, I'm sure there are a lot of side channels in the provisioning APIs, certificate logs, and other metadata that could paint a decently accurate picture of cloud sizes. It might not cover everything but it'd be good enough to track and measure a gradual expansion of capacity.
Then you should be aware that, for the longest time, Google was against multiple floors, until they suddenly switched to four floors in many locations:
A decade ago, there was a burst in construction and in some places the bottleneck was not getting the machines or electricity, but how fast they could deliver and pour cement, even working overnight.
To date, Facebook has built, or is building, 47,100,000 sq ft of space, totaling nearly $24bn in investment. Based on available/disclosed power numbers and extrapolating per sq ft, I get something like 4,770 MW.
Last I updated my spreadsheet in 2019, Google had $17bn in investments across their datacenters, totaling 13,260,000 sq ft of datacenter space. Additional buildings have been built since then, but not to the scale of an additional 30mil sq ft.
Amazon operates ~80 datacenter buildings in Northern Virginia, each ~200,000 sq ft -- about 16,000,000sq ft total in that region, the other regions are much much smaller, perhaps another 4 mil sq ft. When I'm bored I'll go update all my maps and spreadsheets.
Does the square footage take into account multiple floors? What's the source? It can be misleading, because you don't know the compute density of what's inside. Using just public data, power is a more accurate proxy. Until at least 5-6 years ago, Google was procuring more electricity than Amazon. Before that, it had a further advantage from lower PUE, but I bet the big names are all comparable on that front by now. Anyone that has worked at several of them can infer that FB is not the largest (but it's still huge).
As for the dollars, were they just in 2019 or cumulative? The Google ones seem low compared to numbers from earnings.
Google certainly has more compute density than Amazon, the numbers I was able to find from the local power company was 250MW at Council Bluffs back in 2015 or so.
Amazon builds out 32MW shells, and the most utilized as of 5 or 6 years ago was 24MW or so, with most being much less than that.
At this point power companies (a la PG&E, etc.) should be investing in AI companies in a big way. Then they make money off the AI companies to build out power infra - and vice versa.
I am surprised we haven't heard about private electrical grids built out by such companies.
Surely they all have some owned power generation, but in the local areas where they DO build out power plants, they should have to build capacity for the local area too, mayhaps in exchange for the normal tax subsidies they seek for all these large capital projects.
Can't wait until we have pods/clusters in orbit, with radioisotope batteries to power them along with the panels. (I wonder how close to a node an RI battery can be? Can each node have its own RI?) (Supposedly they can produce up to "several kW" -- but I can't find a reliable source for the max wattage of an RI...)
SpaceX should build an ISS module that's an AI DC cluster.
And have all the ISS technologies build their LLM there, based on all the data they create?
I updated my map for AWS in Northern Virginia -- came up with 74 buildings (another source says 76, so I'll call it directionally correct). If I scale my sq ft up by ~5% to account for missing buildings, we get 11,500,000 sq ft in the Northern Virginia area for AWS.
Yeah, Google buys servers in public datacenters like those from Equinix. One "region" needn't be one datacenter, and sometimes AWS and GCP will even have computers in the same facility. It's actually quite annoying that "region" is such an opaque construct and they don't have any clear way to identify what physical building is hosting the hardware you rent from them.
Those are almost lost in the noise, compared to the big datacenters. (I've been inside two Atlanta facilities, one leased and one built from scratch, and the old Savvis one in Sunnyvale).
Meta could build their own cloud offering. But it would take years to match the current existing offerings of AWS, Azure and GCP in terms of scale and wide range of cloud solutions.
And then there's sales. All of those three - and more you haven't considered, like the Chinese mega-IT companies - spend huge amounts on training, partnerships, consultancy, etc to get companies to use their services instead of their competitors. My current employer seems all-in on Azure, previous one was AWS.
There was one manager who worked at two large Dutch companies and sold AWS to them, as in, moving their entire IT, workloads and servers over to AWS. I wouldn't be surprised if there was a deal made there somewhere.
The real question is: why aren't they? They had the infrastructure needed to seed a cloud offering 10 years ago. Heck, if Oracle managed to be in 5th (6th? 7th?) place, Facebook for sure could have been a top 5 contender, at least.
Because they make more money using their servers for their own products than they would renting them to other people. Meta has an operating margin of 41% AFTER they burn a ton on Reality Labs, while AWS has a 21% margin with more disciplined spending. Social media is a more profitable business than infrastructure.
> Advertising (over 97.8% of revenues): the company generated over $131 billion in advertising, primarily consisting of displaying ad products on Facebook, Instagram, Messenger, and third-party.
TensorFlow and Keras have gotten better, but PyTorch historically had better flexibility than Keras and was much easier to debug/develop in than TensorFlow.
Aww, those existing offerings are overcomplicated as hell; a fresh look could yield a substantially simpler cloud developer experience, and that would compete well against those other cloud offerings on simplicity alone.
A company moving away from Nvidia/CUDA while the field is developing so rapidly would result in that company falling behind. When (if) the rate of progress in the AI space slows, then perhaps the big players will have the breathing room to consider rethinking foundational components of their infrastructure. But even at that point, their massive investment in Nvidia will likely render this impractical. Nvidia decisively won the AI hardware lottery, and that's why it's worth trillions.
I don't think they could; nvidia has tons of talent, Meta would have to steal that. Meta doesn't do anything in either consumer or datacenter hardware that isn't for themselves either.
Meta is a services company, their hardware is secondary and for their own usage.
I'm more concerned with avoiding Nvidia (et al.) market domination than with chasing the top edge of the genAI benefits sigmoid. That domination would prevent much broad-based innovation.
This space is so competitive that even if Nvidia is asleep at the wheel, a competitor will come and push them before too long. AMD has a history of noticing when their competitors are going soft and rapidly becoming competitive.
It's not the hardware keeping NVidia ahead, it's the software. Hardware-wise AMD is competitive with NVidia, but their lack of a competitive CUDA alternative is hurting adoption.
I still, for the life of me, can't understand why Google doesn't just start selling their TPUs to everyone. Nvidia wouldn't be anywhere near their size if they only made H100s available through their DGX cloud, which is what Google is doing only making TPUs available through Google Cloud.
Good hardware, good software support, and market is starving for performant competitors to the H100s (and soon B100s). Would sell like hotcakes.
It is an absolutely massive amount of work to turn something designed for your custom software stack and data centers (custom rack designs, water cooling, etc) into a COTS product that is plug-and-play; not just technically but also things like sales, support, etc. You are introducing a massive amount of new problems to solve and pay for. And the in-house designs like TPUs (or Meta's accelerators) are cost effective in part because they don't do that stuff at all. They would not be as cheap per unit of work if they had to also pay off all that other stuff. They also have had a very strong demand for TPUs internally which takes priority over GCP.
Do you mean, sell TPU hardware to other companies that would run it in their data centers? I can't imagine that would ever really work. The only reason TPUs work at Google is because they have huge teams across many different areas to keep them running (SRE, hardware repair, SWE, hardware infra) and it's coupled to the design of the data centers. To vend and externalize the software would require google to setup similar teams for external customers (well beyond what Google Cloud provides for TPUs today) just to eke out some margin of profit. Plus, there is a whole proprietary stack running under the hood that google wouldn't want to share with potential competitors.
Google used to sell a search appliance-in-a-box and eventually lost interest because hardware is so high-touch.
> Google used to sell a search appliance-in-a-box and eventually lost interest because hardware is so high-touch.
We had a GSA for intranet search and other than the paint this was a standard Dell server. I remember not being impressed by what the GSA could do.
We also had Google Urchin for web analytics, it wasn't a hardware appliance but the product wasn't very impressive either. They then killed that and tried to get you onto Google Analytics.
They just didn't commit to these on premise enterprise products.
And undercut what they'd like to use as a huge motivator in people moving to GCP? Not likely. Even if they wanted to they can't keep up with their own internal demand.
Beyond that they might not be as stable or resilient outside of the closely curated confines of their own data-centers. In that case selling them would be more of an embarrassment.
>Beyond that they might not be as stable or resilient outside of the closely curated confines of their own data-centers. In that case selling them would be more of an embarrassment.
Once you go out of your heavily curated hardware stack, the headaches multiply exponentially.
The impression I got from this thread yesterday is that Google's having difficulty keeping up with the heavy internal demand for TPUs: https://news.ycombinator.com/item?id=39670121
Well, it was, wasn't it? There was a massive boom where loads of companies over-promised what they would achieve, followed by a crash when everyone realised lots of them couldn't, followed by stability for the smaller number that could.
It was the very definition of a hype cycle as far as I can see. Hype cycle doesn’t mean “useless and will go away”, you have the second upward curve and then productivity.
I don’t disagree, but a lot of “analysis” was not that nuanced. At one time I worked for a company where 90% of revenue was from printed periodicals. Smart, capable executives assured the whole company that the internet was not a threat, just something college kids used for fun.
Colloquial, dismissive use of “hype cycle” does not usually mean “this will change the world but foolish things, soon forgotten, will also be done in the short term”. Though I agree a deeper understanding of the term can suggest that.
It will be ironic if Meta sinks all this money into the new trend and finds out later that it has been a huge boondoggle, just as publishers followed Facebook's "guidance" on video being the future, subsequently gutting the talent pool and investing into video production and staff - only to find out it was all a total waste.
It already paid off, when the world moved from deterministic to probabilistic ad modeling. That's why their numbers are so good right now compared to every other ad platform.
Apple turned off a lot of "signals" used by advertisers for precisely targeted ads via persistent user-beacons. Facebook ad placement quality (and revenue) cratered in the immediate aftermath.
Meta has since gotten better at it - likely with lots of AI assistance - and their revenue numbers reflect this. The targeting is now likely probabilistic, in that the advertiser now makes educated guesses on the best ads to serve based on limited or non-existent identity information.
So the AI efforts would have paid back by way of higher revenues.
There are reports [1] that a bunch of companies like "College Humor" were convinced to switch to producing native video for facebook (instead of directing users to their own sites) on the basis of bullshit metrics from facebook, and had an extremely bad time as a result, with some companies going bankrupt.
Something like counting an autoplaying video that ran for 3 seconds as a 'view' IIRC
Thankfully, Dropout (a spin-off of College Humor) is alive and well, and producing some of the best D&D Actual Play series as well as other non-D&D comedy shows. One of the entertainment services that I happily pay for because I want to support what they're doing.
Said the butter churner, cotton ginner, and petrol pumper.
I work in film. I've shot dozens of them the old fashioned way. I've always hated how labor, time, and cost intensive they are to make.
Despite instructions from the luminaries to "just pick up a camera", the entire process is stone age. The field is extremely inequitable, full of nepotism and "who you know". Almost every starry-eyed film student winds up doing drudge work for the rest of their lives. Most will never make a feature to match their ambition.
If the whole task was to simply convey my thoughts and dreams to others, why am I scrambling around to sign location rights, capture photons on expensive glass, and then smear and splice things together for months on end? This is ceremonial and soon to be anachronistic. I'm glad that whole mess is going to be replaced. It's a farce.
To phrase it another way - would you like to be hand-writing assembly on punch cards? To only gain entrance into the field with your mathematics PhD?
To speak of the liberty and the economics, why should I have to sell the rights to my idea to a studio so I can get it off the ground? Why should I have to obey the studio's rules and mind their interference?
This whole Gen AI thing is going to be the biggest liberating moment for filmmaking creatives. I know, because I am one.
And if you think any Jack or Jill can just come in and text prompt a whole movie, you're crazy. It's still hard work and a metric ton of good taste.
Art will never die. It's the human soul. It'll take more than some tech bros with GPUs to kill it.
AI is just another tool for the artist. A "bicycle for the mind" to quote Jobs, and a rocket ship for the imagination to convey my own direct experience.
> And if you think any Jack or Jill can just come in and text prompt a whole movie, you're crazy. It's still hard work and a metric ton of good taste.
If you want anything good, yes. If you just want something… I reckon it'd take a week to assemble an incomprehensible-nonsense-film pipeline, after which it's just a matter of feeding the computer electricity.
Short-term, this is going to funnel resources away from the people with good taste. Long-term, it might help collapse the entire "creative industry", after which we might get some of that artist liberation stuff you're talking about – but we might just end up with new gatekeeping strategies from the wealthy and connected, and business as usual.
The idea that AI isn't going to be used as a creative tool too and that it won't lead to more and better art is a defeatist, Luddite attitude.
Similarly shaped people thought that digital cameras would ruin cinema and photography.
> Short-term, this is going to funnel resources away from the people with good taste.
On the contrary - every budding film student will soon [1] be able to execute on their entire visions straight out of the gates. No decades of clawing their way to a very limited, almost impossible to reach peak.
> it might help collapse the entire "creative industry"
The studio system. Not the industry.
> new gatekeeping strategies from the wealthy and connected, and business as usual.
Creatives have more ways of building brands and followings for themselves than ever before. It's one of the largest growing sectors of the economy, and lots of people are earning livings off of it.
You'll be able to follow that steampunk vampire creator that's been missing from the world until now. Every long tail interest will be catered to. Even the most obscure and wild tastes, ideas, and designs. Stuff that would never get studio funding.
As a creative, I'm overjoyed by this. My friends and I are getting to create things we never could make before [2].
>You'll be able to follow that steampunk vampire creator that's been missing from the world until now. Every long tail interest will be catered to. Even the most obscure and wild tastes, ideas, and designs. Stuff that would never get studio funding.
Your optimism reminds me of the optimism I had around the early internet. Power to the people, long tail, rise of the creative class, the fall of gatekeeping corporations, etc.
It was like that for a couple of years in the late 90s before power and control got vastly more centralized than before. Maybe this time it’ll be different.
The big difference is that back then, anyone with a consumer-level computer in their bedroom could turn it into a server and be a first-class citizen on the Internet.
With generative AI, models will be controlled by a handful of giant corporations who have the enormous corpuses (of dubious provenance) and compute ability to train them.
You can run ComfyUI and AnimateDiff on your PC. If you haven't checked them out, please do.
And there are other angles to consider. Apple, for one, is expressly interested in not becoming a thin client to cloud AI. They're baking a lot of inference power into their chips. If the creative class don't need their devices, that doesn't bode well for them...
FakeYou, CivitAi, WeightsGg, Comflowy, ... -- there are tons of vibrant communities to teach you everything you need to know. The tools are open source, free to use, and accessible.
Many YouTube Poops are artistic expression (e.g. https://redirect.invidious.io/watch?v=dO4eIEvHjSw). Skibidi Toilet is definitely artistic expression: it's a full-on epic. (Reactions from one ≈50-year-old: “baffling” “how did they do that?” “why would anyone make this?”)
If you think the Luddites were defeatist, you don't know much about the Luddites.
> On the contrary - every budding film student will soon [1] be able to execute on their entire visions straight out of the gates. […] Creatives have more ways of building brands and followings for themselves than ever before.
Yet, we have no shortage of starving artists. Will AI provide them food and shelter?
This is unequivocally a win for creative expression for hobbyists, but it stands to harm professionals – at least in the short term, perhaps longer-term. It's not happening in a vacuum: the greedy are revoking livelihoods because they think AI can do it faster and cheaper (laundering appropriated hobbyist and increasingly-cheap professional labour).
> The studio system. Not the industry.
Huh, the word 'industry' has a specialised meaning in economics. Didn't know that.
> Similarly shaped people thought that digital cameras would ruin cinema and photography.
Obviously, but you seem to be arguing that AI is just another evolution of productivity tools. You still need to have a photographer's eye while using this technology.
If you couldn't make a good composition on film, a digicam will not save you, and it definitely did not replace photographers. Perhaps lowered the barrier of entry for prosumers.
Are you talking about some as yet unseen research/technology? The aesthetic sample looks like something we could have seen on the SD subreddit for the last year.
> Said the butter churner, cotton ginner, and petrol pumper.
Said the bank teller, record producer, etc.. Plenty of cases where we've been told technology and automation would democratise the field and remove the middleman, and actually it's the opposite.
Yes, it would be nice if AI made it easy for anyone who wanted to make a great movie. That doesn't mean it's going to happen.
> The field is extremely inequitable, full of nepotism and "who you know"
Maybe, but it's never been cheaper to make a movie.
I know someone with no connections and (almost) no money who in 4 years made multiple no. 1 box-office films (obviously not in the US, in a smaller country) and then got picked up by Netflix.
Yes of course - it depends on what lens though. If you mean "I'm learning to build better from this" then no, but it's very informative on Meta's own goals and mindset, as well as real numbers that allow comparison to investment in other areas, etc. Also the point was mostly that Meta does publish a lot in the open - including actual open source tech stacks etc. They're reasonably good actors in this specific domain.
Meta is still playing catch-up. Might be hard to believe but according to Reuters they've been trying to run AI workloads mostly on CPUs until 2022 and they had to pull the plug on the first iteration of their AI chip.
> we have successfully used both RoCE and InfiniBand clusters for large, GenAI workloads (including our ongoing training of Llama 3 on our RoCE cluster) without any network bottlenecks.
Interesting dig on IB. RoCE is the right solution since it is open standards and more importantly, available without a 52+ week lead time.
I don't know. I haven't actually worked with IB in this specific space (or since before Nvidia acquired MLNX). My experience with RoCE/IB was for storage cluster backend in the late 2010s.
This is great news for Nvidia and their stock, but are they sure the LLMs and image models will scale indefinitely? Nature and biology have a preference for sigmoids. What if we find out that AGI requires different kinds of compute capabilities?
If anything, NVIDIA H100 GPUs are too general purpose! The optimal compute for AI training would be more specialised, but then would be efficient at only one NN architecture. Until we know what the best architecture is, the general purpose clusters remain a good strategy.
Not really. Ranking and recommendation models require different infrastructure than LLMs. The models are generally smaller and require more data processing before training.
I also use uBlock, but my filters are the default ones and I saw it without any problem. TBH, this is the first time I've seen a post on the web have HN as a share option, or at least the first time I was surprised to see it. Maybe it has something to do with Google ranking "trusted human information and knowledge" higher than "non-human" information and knowledge [0], or maybe some Meta software engineer simply loves and uses HN and decided to include it, idk.
The link mentions "our internal job scheduler" and how they had to optimize it for this work -- does anyone know what this job scheduler is called, or how it works?
Sure. 100T/day * 1day/86400sec ~= 1B/sec. They're probably considering at least a few hundred candidates per impression, and every impression is going to go through _at least_ two models (relevance and pCTR/revenue), so you could get there just with online serving at 5Mqps, which is plausible. But they're also going to be doing a lot of stuff in batch - spam predictions, ad budget forecasts, etc - so that every candidate actually runs through four or five different models, and every actual impression could do more than that.
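Spelling out that back-of-envelope math:

```python
per_day = 100e12                       # "hundreds of trillions" -> call it 100T executions/day
per_second = per_day / 86_400          # ~1.16e9 model executions per second
online_only = 5e6 * 200 * 2            # 5M impressions/s * ~200 candidates * 2 models = 2e9/s
print(f"needed: {per_second:.2e}/s, online serving alone: {online_only:.2e}/s")
```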
How many ads does Meta serve a day, and how many AI model executions are done for each one? Repeat the same for stories, post and comment recommendations on Facebook and Instagram, and you have very big numbers. To that, add VR, internal modeling and other backoffice/offline analyses over billions of users, and you'll easily get into the trillions.
Perhaps there's some combinatorics where every time an ad or post is displayed to the user, it runs through some hundreds/thousands of candidates and computes their relevance.
it's really interesting just how similar these systems are to the designs adopted for HPC over the past few decades. I'm salty because it took a while for the ML community to converge on this (20+K GPUs connected by a real fabric with low latency and high bandwidth).
Meta's backing itself into a corner with its admirable commitment to open source. Unfortunately, at some point when they decide to monetize their billions spent and try to release a closed source model, the level of vitriol they will deal with will be an order of magnitude above what even OpenAI is experiencing. I don't think they realize that!
Meta's commitment to Open Source is very much calculated.
OCP is a way to rally lower-tier vendors to form a semi-alliance to keep up with super-gorillas like AWS & Google.
LLaMA has already gained much more than its cost (look at the stock price, the open source ecosystem built around LLaMA, and Google's open source Gemma models, which are proof of Meta's success).
IMHO, Meta's Open Source strategy is planned at least 5 years out. That's enough to finesse a 180-degree turnaround if necessary (i.e., from open source to closed source).
"By the end of 2024, we’re aiming to continue to grow our infrastructure build-out that will include 350,000 NVIDIA H100 GPUs as part of a portfolio that will feature compute power equivalent to nearly 600,000 H100s."
This AI game is getting into a GPU war. Heard that Meta is pushing a lot of CPU workloads to GPU to co-locate with model inference for infra simplicity.
Meta seems to actually be taking all the right steps in how they're contributing to open source AI research. Is this a "commoditize your complement" kind of situation?
I genuinely think one of the most plausible short-term dangers of AI is the creation of lifelike bots which will be absolutely indistinguishable from real humans in short-form online interaction.
Since people don't want to talk to algorithms, this would result in them shunning all social media, which is a huge danger to companies in the space.
In pretty much every interview, Yann has talked about how important that AI infrastructure is open and distributed for the good of humanity, and how he wouldn't work for a company that wasn't open. Since Mark doesn't have an AI product to cannibalize, it's in his interest to devalue the AI products of others ("salting the earth").
If they make AI models free to use, it makes OpenAI nearly valueless, which means OpenAI can't survive and then sell Meta's competitors a better GenAI product than Meta can make themselves.
So basically since they don't make money directly on GenAI, it makes sense for them to release it for free so no one else can have something better, so they don't have to compete on GenAI abilities with their competitors.
The angle is that by releasing cutting edge AI research to the public openly, the relative difference between open source models/tech and closed source tech shrinks.
Whether or not you think the "value" of AI products is proportional to their performance gap vs the next closest thing or not is up to you. Very interesting PG essay I read recently talks about the opposite of this (Superlinear returns) where if you're half as good as the next competitor, you don't get half the customers, you get 0.
Windows always has to provide enough additional value to make up for what they're asking in money, compared to the free option. That is the point. If you had no other viable options, then they could do whatever they like. Now they have a baseline to compete with, and it is very hard to compete with free.
Mistral makes comparable models to Facebook. Mistral charges money, Facebook does not. This negatively affect’s Mistral’s pricing power because a customer can get 70% of the performance they need for 0% of the cost.
The “0% of the cost” part is unique to software businesses because you can copy software so cheaply
The Llama models have played a large part in fostering the development of the open source LLM ecosystem, and I expect Llama 3 to deliver performance above Mistral Medium and Anthropic's Haiku while being fully open and able to run on consumer hardware.
is "salting the earth", in the biblical sense of destroying your enemy and their land to the point where not even plants grow again, a SV term used for companies that promote open source?
It's a term used for making a certain type of business unviable. In this case, high quality open models will make closed source models less viable, since the closed source model providers won't be able to charge monopoly prices for their models, but will have to approach the price of cloud GPU time or lose customers to equally capable open models.
That's a weird comparison. The GPU is only a part of the capex: there's the rest of the servers and racks, the networking, as well as the buildings/cooling systems to support that.