Hacker News
"Nvidia is so far ahead that all the 4090s are nerfed to half speed" (twitter.com/realgeorgehotz)
207 points by BIackSwan 57 days ago | 176 comments



NVIDIA is obviously not above market segmentation via dubious means (see: driver limitations for consumer GPUs), but I think binning due to silicon defects is a more likely explanation in this case.

Some 4090s have this extra fp16 -> fp32 ALU disabled because it's defective on that chip.

Other 4090s have it disabled because it failed as an Ada 6000 for some other reason, but NVIDIA didn't want to make a 4095 SKU to sell it under.

Or if you generalize this for every fusable part of the chip: NVIDIA didn't want to make a 4094.997 SKU that only one person gets to buy (at what price?)


Depending on who you ask, binning is segmentation. Generally demand isn't going to exactly match how the yields work out, so companies often take a bunch of perfectly good high-end chips, nerf them, and throw them in the cheapo version. You used to be able to (and still can, in some cases) take a low-end device and, if you'd won the chip lottery, "undo" the binning and have a perfectly functional high-end version. For some chips, almost all the nerfed ones had no defects. But manufacturers like nVidia hated it when customers pulled that trick, so they started making sure it was impossible.


> You used to be able to (and still can, in some cases) take a low-end device and, if you'd won the chip lottery, "undo" the binning and have a perfectly functional high-end version.

For the purposes you tested it, sure. Maybe some esoteric feature you don't use is broken. NVIDIA still can't sell it as the higher end SKU. The tests a chip maker runs to bin their chips are not the same tests you might run.

I'm sure chip makers make small adjustments to supply via binning to satisfy market demand, but if the "technical binning" is too far out of line from the "market binning", that's a lot of money left on the table that will get corrected sooner or later.

edit: And that correction might be in the form of removing redundancies from the chip design, rather than increasing the supply/lowering the price of higher end SKUs. The whole point here is, that's two sides of the same coin.


Disabling cores that have 100% passed QA is quite commonplace, especially for chips that have been on the market for over a year and thus are being built with yields as mature as they're going to get.

Artificially restricting supply of high-end chips and increasing supply of mid-range chips by disabling fully functional cores is how chip makers preserve their pricing structure. Without doing this, market pressures would force prices down on high-end chips and cause lower bins to mostly disappear from the market, leaving the product line with lower overall margins and a PR nightmare every time a new generation launches with pricing reset back to the initially high levels.

As a rule of thumb: if a chip product line goes a whole year without having new SKUs show up with a higher percentage of cores enabled or higher clock speeds for the same core count, then the manufacturer is artificially restricting supply to make more of the lower-bin parts than naturally occur in the fab output.

> And that correction might be in the form of removing redundancies from the chip design, rather than increasing the supply/lowering the price of higher end SKUs.

Those two courses of action take place on completely different timescales. Disabling cores and other binning tricks can be implemented in no more than a few months. Adding a new chip with a different number of copies of the same IP blocks takes well over a year. Removing redundancy within an IP block (eg. by having fewer spare SRAM blocks for a cache of a fixed capacity) isn't going to happen within a single chip generation.

In the semiconductor world, corrections of any kind tend toward "later" rather than "sooner".


Dumb question (maybe): I'm aware that "tester time" is very expensive for advanced integrated circuits. Could it be that disabled cores are actually "unknown" i.e. probably good, but money was saved by not even testing them?


It's more likely that any defect in a core causes the whole core to be disabled. Especially in this case, where I assume the FP16 x FP16 -> FP32 path uses the same hardware as the FP16 x FP16 -> FP16 path.
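
If anyone wants to poke at the claimed gap themselves, here's a rough PyTorch sketch (mine, not from the tweet) that times a big FP16 matmul with reduced-precision accumulation disallowed vs allowed. Caveat: the flag only permits cuBLAS to pick an FP16-accumulate kernel, it doesn't force it, so treat the numbers as indicative rather than a definitive test of the eFuse claim.

  import time
  import torch

  assert torch.cuda.is_available()
  n = 8192
  a = torch.randn(n, n, device="cuda", dtype=torch.float16)
  b = torch.randn(n, n, device="cuda", dtype=torch.float16)

  def tflops(allow_fp16_reduction, iters=50):
      # When True, cuBLAS may accumulate FP16 GEMMs in reduced precision;
      # when False, the reduction has to stay in FP32.
      torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction = allow_fp16_reduction
      for _ in range(5):  # warm-up
          torch.matmul(a, b)
      torch.cuda.synchronize()
      start = time.perf_counter()
      for _ in range(iters):
          torch.matmul(a, b)
      torch.cuda.synchronize()
      elapsed = time.perf_counter() - start
      return 2 * n ** 3 * iters / elapsed / 1e12

  print(f"FP32 accumulate: {tflops(False):.1f} TFLOP/s")
  print(f"FP16 accumulate: {tflops(True):.1f} TFLOP/s")

On an RTX 6000 Ada the two numbers should be close; on a 4090, if the tweet is right, the FP32-accumulate figure should come out well below the FP16-accumulate one.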


Exactly. They can easily sell more Ada 6000s, and I'm pretty sure they would do so rather than sell them for much less as 4090s.


I think this is just like what Intel does.

Runs fast? i9. Slower? i7. Missing cores? i5. Slowest? i3.

Perfect chips probably not only have all the cores working, they also run at low voltages so they don't get as hot.

I wonder if they can figure out what parts of the chip run at what speeds, and disable the ones that run slow/hot

I'm pretty sure gpus are overclocked by vendors, so there must be some sort of binning either by the vendors or they buy binned parts. I'll bet if parts could go faster, you would have an ASUS/MSI/etc 4090-2x-max-$$$$

https://www.tomshardware.com/reviews/glossary-binning-defini...


I recall reading that, as yields improved with process maturation, Intel ended up binning faster-passing chips as lower SKUs just to meet demand.


I'm not sure that it's implemented as a separate fp16 ALU. There are cute ways to share logic between a dual fp16 ALU and a single fp32 ALU such that it's really just one ALU with those being different ops.


As I understand it, that's how the original MMX got started. It was largely reusing the x87 ALU, but breaking the carry chains at the obvious points.
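
The packed-arithmetic trick is easy to sketch in software, too. A toy SWAR example (not Intel's actual circuit, just the same idea): do one wide add, but never let a carry from the low lane reach the high lane.

  HI = 0x80008000  # top bit of each 16-bit lane
  LO = 0x7FFF7FFF  # everything below each lane's top bit

  def packed_add16(x, y):
      # Two independent 16-bit adds done with a single 32-bit add.
      # Clearing each lane's top bit first means the wide adder can never
      # carry across the lane boundary; the top bits are then patched back
      # in with XOR, i.e. an add with its carry chain cut.
      partial = (x & LO) + (y & LO)
      return (partial ^ ((x ^ y) & HI)) & 0xFFFFFFFF

  # low lane wraps 0xFFFF + 0x0001 -> 0x0000 without touching the high lane
  assert packed_add16(0x0001FFFF, 0x00010001) == 0x00020000

Hardware gets the same effect more cheaply by just gating the carry signal at the lane boundary instead of masking and patching.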


There must be a whole layer of reroute-if-defective plumbing in the drivers.


I don't even understand why binning non-defect cards is dubious.

The logic people have on /r/pcmasterrace is that if they didn't bin, they would just release all 4090s at 4080 prices. No, there would just be fewer 4080s for people to buy. No chip maker is going to sell their chips at sub-market rates just because they engineered them (or have a fab that can produce them) at very low defect rates.

Now, Nvidia certainly has done dubious things. They've hurt their partners (EVGA, you're missed), and skyrocketing the baseline GPU prices is scummy as hell. But binning isn't anything I necessarily consider dubious.


> No, there would just be fewer 4080s for people to buy.

Sure, and more 4090s at a lower price.


Nvidia wouldn't leave money on the table dropping the price of the 4090. There would just be more supply of 4090s at the same price. A card manufacturer selling them below MSRP would get immediately sanctioned by Nvidia.


Even a monopolist is constrained by supply and demand. If they could sell everything they make as a 4090 without trashing their margins, they would. The fact that 4080, 4070, 4060 lines exist means they can't.


Right. And the fact that a BMW 2-series exists means that BMW can't sell all the 7-series cars that they want.

They're cutting down 4090s into 4080s to fulfill demand for a cheaper chip, while still supplying their premium option. Your fanciful concept of a world where things are sold for no/minimal margins is just that: fanciful.


This starts as binning and ends up at down binning :)


This is why Nvidia needs competition. I love the performance of their hardware and the quality of their drivers, but I don't love being their customer. They have a long, long history of price discrimination using techniques like this. Back in the day it was "workstation" graphics for CAD programs that they would nerf for consumer cards by downgrading various features of OpenGL.

Different markets, same techniques. It's in the company DNA. That and their aversion to open source drivers and various other bad decisions around open source support that make maintaining a working Linux GPU setup way harder than it should be even to this day.


Oh, right, the CAD thing was sketchier. IIRC there wasn't any feature of OpenGL/DirectX that was nerfed, it was just an agreement with CAD companies to reject the GPUs that didn't have the workstation bit fused.

For this performance nerf, IDK, seems fine to me. Software companies do this all the time. Same piece of software but you have to pay to unlock features. I don't see why hardware should be any different.

Granted, even for the CAD nerf, it's a gray area. You pay for features, not for silicon, and NVidia is clear about what you have to pay for what features, so. But I'm a bit more biased on that one because my 10-person HW company had to spring for several of those workstation cards.


That sounds like a classic Anti-trust suit waiting to happen. CAD company and NVIDIA colluding to drive sales exclusively to each other. That's illegal and exploitative.


Solution is for NVIDIA to buy the CAD company, then everything is legal.


Actually probably not, you can't use your position in one market to benefit your position in another. But antitrust law is hardly enforced so yeah probably.


>They have a long, long history of price discrimination using techniques like this.

Had AI and crypto not been a thing, this so-called price discrimination was what kept the company afloat and let it continue to spend stupid money (according to many) on CUDA rather than making gaming better.

And often when people say "X" needs competition, what they really want is just a cheaper price for the same thing. Would it be great if Nvidia had more competition? Absolutely. But is Nvidia just coasting and milking everything they have? Absolutely not. They invested even more in CUDA, large-die-size correction tooling, assisted EDA design and many more additions to their arsenal to build their moat.

I also often find that when a successful founder is still working at a company, that company is pushed far harder by its founder than by whatever market force is driving it. So we need competition for Intel 2009-2021, Microsoft in 2000, or any company that just sits there and no longer improves or fails to execute.

Nvidia? They are doing fine, if not better than I could imagine. (I just wish they would spend a little more money to compete in the mobile and desktop consumer SoC space. I guess that is coming soon.)


> What they really want is just a cheaper price for the same thing

They'd probably be happy with competitive market pricing, which is unlikely to happen without a competitive market.

Nvidia's gross margins are extraordinary vs. AMD or Intel margins (or even "overpriced" Apple).

There is competition of a sort - for example at the low end with intel ARC and at the high end with AMD Instinct based supercomputers. However, competitors don't seem to be able to match the CUDA software platform, particularly for AI/ML. The deepening CUDA moat is hard for competitors to cross and for customers to escape.

In the cloud space it seems that Nvidia may face more competition with platforms like GCP/Cloud TPU and AWS/Trainium.


Their former competition wasn't any different. I still remember ATI/AMD pencil mods to unlock things they disabled.


They have to do that to serve different segments. How would you price a 4090 that can be used for crypto, gaming, AI, CAD, and video editing, if it turns out it's cheaper to create the same chip for all of them, but the segments are really different, with 90% coming from datacenter and 10% from gaming?

We're lucky they still do gaming at all, by selling limited versions of the datacenter chips.

It's like getting a speed-limited Ferrari for $20,000 and then complaining that I don't get the acceleration of the $100,000 model. They sold the product cheaper, they cared, they adapted. I'm happy they are still improving year after year for the same dollar value.


> They have to do that to serve different segments.

No, they don't.

That they are able to price discriminate this way is a sign that they are functionally a monopoly exercising pricing power, otherwise, they would be easily undercut in the market where they charge premium prices by a competitor.


They just shouldn't do that. If they can afford to sell the Ferrari for $20k, they should do so. To everyone who wants one, for whatever reason.


They would not be able to afford to sell them at that price if they sold all of them at that price. (Or at least, I doubt they would.)


That is why we need competition, to drive the price down towards the costs. And you must be able to provide enough volume to meet demand.


The danger, of course, is that they decide they can't sell a single SKU at $20,000 and make the same money they're making now. So then the price goes up 2.5x to 10x, depending on how greedy they want to be.


I disagree, and this line of thinking is positively dangerous.

Just because Ferrari might be capable of making that car for $20k, I don't have a fundamental right to demand it from them any more than I have a fundamental right to demand that you make me a sandwich right now for $5.

> they can afford

Before using the word "they" in a prescriptive sentence, think about whether you could substitute "I" and you would still be happy with it.


The goal isn’t to directly force them, but to create a market competitive enough that the only way to compete is to sell the best product they can with a minimal markup.

I have no issue selling into a competitive market, that’s just how things work for individuals. It’s only at the scale of countries and giant companies that the ability for anti competitive behavior really shows up.


There's no fundamental right. But wishing for competition is certainly reasonable! We should all be rooting for competition to improve the efficiency of our markets.


So company profit margins should be capped? At what level and how would that work exactly?

What about all other stuff? i.e. maybe you or somebody else can "afford" to sell their labour at 10-80% of what they are paid?


If NVIDIA had real competition they would do this naturally to gain market share. The GP saying they 'should' do x isn't something we can expect companies to do out of the goodness of their hearts, it's what the market should force them to do.


No, they wouldn't. Nvidia focuses more on premium and margin than on unit share. Nvidia looks at Apple. Apple has 75% of profit share with 25% of unit share of the whole smartphone market. Apple makes 3x more profit than all other smartphone makers combined. Why should Apple reduce pricing in such a situation?


I'm not sure if everything turning into a commodity and no companies having any surpluses would be ideal either. That would probably significantly slow down innovation in some ways.


> So company profit margins should be capped? At what level and how would that work exactly?

That a competitive market drives prices to zero economic profit is a fairly basic result; no active measures besides the existence of competition are necessary for this.

> What about all other stuff?

Yes, this applies in all competitive markets. If it doesn't apply in a market, there is a constraint on competition causing it.


> prices to zero economic profit is a fairly basic result

Yes, and that's not necessarily a good thing in all markets. Very low profit margins can result in less innovation and would certainly discourage companies from taking risks (basically by definition).


So, are you going to tell your boss to decrease your salary to the level where it covers just your basic needs and no more?


That is precisely what would happen if there was infinite competition for job openings.


Nah, no thanks. I prefer this because I now have some damned fast GPUs in a rig on the cheap. I’m happy to have my $20k acceleration limited Ferrari.


It's a protected company...


speaking of driver quality...my pc has been regularly blue screening after the latest release and the transition from the Experience app to the Nvidia app...


There was a time when Intel seemed unbeatable. In 2000 they had a 500 billion USD valuation. That's almost a trillion dollars in today's (2024) USD. Today they are valued at 90 billion USD and Broadcom was thinking about buying them...

My point is these things don't seem to last in tech.


Over 23 years almost every person working there probably either moved on or changed roles. Corporations are made of people. It lasted as long as it should.


Let's not ignore context. In 2000 JDS Uniphase was worth $125 billion. A lot of things were "worth" a lot of money in the year 2000. Anyway, Intel got their ass handed to them in 2003 with the AMD Opteron. Today is not Intel's first struggle.


What made Intel seem unbeatable was its process node advantage. Nvidia does not have fabrication plants, so it is able to get the best process node from whoever has it. Nvidia is therefore not vulnerable to what befell Intel.

What makes Nvidia seem unbeatable is that Nvidia does the best job on hardware design, does a good job on the software for the hardware and gets its designs out quickly such that they can charge a premium. By the time the competition makes a competitive design, Nvidia has the next generation ready to go. They seem to be trying to accelerate their pace to kill attempts to compete with them and so far, it is working.

Nvidia does not just do the same thing better in a new generation, but tries to fundamentally change the paradigm to obtain better-than-generational improvements. That is how they introduced SIMT, tensor cores, FP8 and more recently FP4, just to name a few. While their competitors are still implementing the last round of improvements Nvidia made to the state of the art, Nvidia launches yet another round of improvements.

For example, Nvidia has had GPUs on the market with FP8 for two years. Intel just launched their B580 discrete GPUs and Lunar Lake CPUs with Xe2 cores. There is no FP8 support to be seen as far as I have been able to gather. Meanwhile, Nvidia will soon be launching its 50 series GPUs with FP4 support. AMD’s RDNA GPUs are not poised to gain FP8 until the yet to be released RDNA 4 and I have no idea when Intel’s ARC graphics will gain FP8. Apple’s recent M4 series does have FP8, but no FP4 support.

Things look less bad for Nvidia's competitors in the enterprise market: CDNA 3 launched with FP8 support last year. Intel had Gaudi 2 with FP8 support around the same time as Nvidia, and even launched Gaudi 3. Then there is Tenstorrent, with FP8 on the Wormhole processors they released 6 months ago. However, FP4 support is nowhere to be seen from any of them, and they will likely not release it until well after Nvidia, just like nearly all of them did with FP8. This is only naming a few companies, too. There are many others in this sector that have not even touched FP8 yet.

In any case, I am sure that in a generation or two after Blackwell, Nvidia will have some other bright idea for changing the paradigm and its competition will lag behind in adopting it.

So far, I have only discussed compute. I have not even touched on graphics, where Nvidia has had many more innovations, on top of some of the compute oriented changes being beneficial to graphics too. Off the top of my head, Nvidia has had variable rate shading to improve rendering performance, ray tracing cores to reinvent rendering, tensor cores to enable upscaling (I did mention overlap between compute and graphics), optical flow accelerators to enable frame generation and likely others that I do not recall offhand. These are some of the improvements of the past 10 years and I am sure that the next 10 years will have more.

We do not see Nvidia’s competition put forward nearly as many paradigm changing ideas. For example, AMD did “smart access memory” more than a decade after it had been standardized as resizeable bar, which was definitely a contribution, but not one they invented. For something that they actually did invent, we need to look at HBM. I am not sure if they or anyone else I mentioned has done much else. Beyond the companies I mentioned, there are Groq and Cerebras (maybe Google too, but I am not sure) with their SRAM architectures, but that is about it as far as I know of companies implementing paradigm changing ideas in the same space.

I do not expect Nvidia to stop being a juggernaut until they run out of fresh ideas. They have produced so many ideas that I would not bet on them running out of new ideas any time soon. If I were to bet against them, I would have expected them to run out of ideas years ago, yet here we are.

Going back to the discussion of Intel seeming to be unbeatable in the past, they largely did the same thing better in each generation (with occasional ISA extensions), which was enough when they had a process advantage, but it was not enough when they lost their process advantage. The last time Intel tried to do something innovative in its core market, they gave us Itanium, and it was such a flop that they kept doing the same thing incrementally better ever since then. Losing their process advantage took away what put them on top.


> In any case, I am sure that in a generation or two after Blackwell, Nvidia will have some other bright idea for changing the paradigm and its competition will lag behind in adopting it.

This is the most important point. Everyone seems to think that Nvidia just rests on its laurels while everyone and their dog tries to catch up with it. This is just not how (good) business works.


Nvidia once rested on their laurels and that was the Geforce FX over 20 years ago. Jensen was so pissed, he literally screamed at every person in the company.

But he also made sure that resting won't ever happen again. Andy Grove from Intel once said that only the paranoid survive, and I bet Jensen is the most paranoid CEO alive. You won't see him that way in public, because in public Jensen is Nvidia's marketeer.

Nvidia also understood early on how important marketing and brand recognition are. They learned it the hard way with the utter failure of their first chip, the NV1, which was a technological masterpiece that no one wanted. With the bad GeForce FX, Nvidia even made marketing videos to make fun of themselves, and that helped. ATI didn't crush Nvidia as much as expected because of such activities, despite having the ultra-superior Radeon 9700/9800 series back then.


Nvidia has been very smart to be well prepared, but the emergence of Bitcoin and then AI were two huge strokes of good luck for them. It's very unlikely that there is another once-in-a-lifetime event that will also benefit Nvidia in that way. Nvidia will be successful in the future, but it will be through more normal, smart business means.


Those are both computational challenges. Nvidia is well positioned for those due to their push into HPC with GPGPU. If there is another “once-in-a-lifetime” computational challenge, it will likely benefit Nvidia too.


I would never bet against the market finding a use for higher performance products.

~"Good fortune favors those who are prepared to take advantage of it", etc.


I've been using Gaudi chips for a little bit and they are totally fine (and the software stack is even pretty good, or at least the happy path is mostly covered for me). For example I set up training with autocasting, activation checkpointing, fused ops, profiling etc., without too much trouble. I'll write a long blog post about it soon but I think their issue with the Gaudi chips is simply making enough and convincing people to buy them before Falcon Shores (which will, I think, be Xe slice based, so more like a better PVC chip than a Gaudi).

In summary the software story was very surprisingly better than I expected (no Jax though).


> What made Intel seem unbeatable was its process node advantage. Nvidia does not have fabrication plants, so it is able to get the best process node from whoever has it. Nvidia is therefore not vulnerable to what befell Intel.

It's able to get the best process node from /whoever is willing to sell it to Nvidia/: it's vulnerable (however unlikely) to something very similar -- a competitor with a process advantage.


Exactly this, hard to grok why people think somehow the fab is the boat anchor around Intel's neck. No, it was the golden goose that kept Intel ahead until it didn't.

BK failed to understand the moat Intel had was the Fab. The moat is now gone and so is the value.


Intel didn't have a software stack moat.


In 2000 Intel had a huge software moat: Microsoft Windows, and the large install base of x86-only software.

Rich webapps hadn't been invented. Smartphones? If you're lucky your flip phone might have a colour screen. If you've got money to burn, you can insert a PCMCIA card into your Compaq iPAQ and try out this new "802.11b" thing. Java was... being Java.

Almost all the software out there - especially if it had a GUI, and a lot of it did - was distributed as binaries that only ran on x86.


So many devs are too young to remember a time before you would expect to just download some open source and compile it for x86/amd64/arm/emscripten/etc and be good to go. In the old days, if you didn't want to write that library code yourself, chances are all your AltaVista search would turn up was a guy selling a header file and a DLL and OCX[0] for $25. If you were lucky!

A vast amount of code was only intended to compile and run on a single OS and architecture (circa 2000, that was usually x86 Win32; Unix was dying and Wintel had taken over the world). If some code needed to be ported to another platform, it was as good as a from-scratch re-write.

[0] in case you wanted to use the thing in Visual Basic, which you very well might.


>In 2000 Intel had a huge software moat

"had". That's what helped prop up their monopoly but it didn't last. These days if can't run your software on another architecture, like ARM, you can run at least on AMD. AMD can basically run the same software as Intel. This isn't the situation for NVIDIA vs everyone else, so far.


Other than the huge amount of enterprise software which was only supported on Intel, most of the high-end server business below the mainframe level after the mid-90s, and the huge install base of x86 software keeping everyone but AMD out? Even their own Itanium crashed and burned on x86 compatibility.


Then there were software libraries and the Intel C/C++ Compiler that favored Intel. They would place optimized code paths that only ran on Intel hardware in third party software. Intel has stopped doing that in recent years as far as I know (the MKL has Zen specific code paths), but that is a fairly recent change (maybe the past 5 years).

There were also ISA extensions. Even if Intel had trouble competing on existing code, they would often extend the ISA to gain a temporary advantage over their competitors by enabling developers to write more optimal code paths that would run only on Intel’s most recent CPUs. They have done less of that ever since the AVX-512 disaster, but Intel still is the one defining ISA extensions and it historically gained a short term advantage whenever it did.

Interestingly, the situation is somewhat inverted as of late given Intel's failure to implement the AVX-512 family of extensions in consumer CPUs in a sane way, when AMD succeeded. Intel now is at a disadvantage to AMD because of its own ISA extension. They recently made AVX-10 to try to fix that, but it adds nothing that was not already in AVX-512, so AMD CPUs after Zen 3 would have equivalent code paths from AVX-512, even without implementing AVX-10.


>Intel C/C++ Compiler that favored Intel

That's where Nvidia learned to "optimize" the CUDA software path: single-threaded x87 FPU on SSE2-capable CPUs.

https://arstechnica.com/gaming/2010/07/did-nvidia-cripple-it...

https://www.realworldtech.com/physx87/3/ "For Nvidia, decreasing the baseline CPU performance by using x87 instructions and a single thread makes GPUs look better."

They doubled down on that approach with 'GameWorks', crippling performance on non-Nvidia GPUs; Nvidia paid studios to include GameWorks in their games.


NVidia has software moat for specialized applications but not for AI, which is responsible for most of their sales now. Almost everyone in AI uses pytorch/jax/triton/flash attention and not CUDA directly. And if Google can support pytorch for their TPU and Apple for their M1 GPU, surely others could.


> NVidia has software moat for specialized applications but not for AI, which is responsible for most of their sales now. Almost everyone in AI uses pytorch/jax/triton/flash attention and not CUDA directly

And what does pytorch et al. use under the hood? cuBLAS and cuDNN, proprietary libraries written by NVidia. That is where most of the heavy lifting is done. If you think that replicating the functionality and performance that these libraries provide is easy, feel free to apply for a job at NVidia or their competitors. It is pretty well paid.


Did you read the last part? PyTorch uses drivers, and drivers exist for Google's TPU and Apple's M1 GPU as well, and both work pretty well. I have tested both and they reach MFU similar to Nvidia's.


Maybe on a particular model/dataset but extremely unlikely in general. Again, like another commenter pointed out: if you truly believe it isn't that hard we would love to hire you at Meta ;)


Yes, some operations are not supported on MPS/TPU and fall back to the slower CPU. But for common architectures like transformers and convnets, it works very well for all the datasets.

I never claimed it was easy. I meant that, in my opinion, it is on the order of tens of millions of dollars of investment, not the trillion-dollar CUDA moat that people describe here.


Are M1 GPUs available for data center deployment at scale? Are Google TPUs available outside of Google? Can Amazon or Microsoft or other third parties deploy them?

Anyone that wants off the shelf parts at scale is going to turn to Nvidia.


So Nvidia's moat is mainly hardware and not software?


That's the point I am making. And the reason Amazon or Microsoft can't deploy them is the hardware, not CUDA.


Yeah, and if you're using Nvidia, you're using CUDA.


And because the world buys all the off-the-shelf stuff from Nvidia and uses CUDA, Nvidia gets the largest debugging user base there is. This way Nvidia can continue to make CUDA even better, and the cycle repeats.


Intel didn't do a lot of things...


The Pentium math bug, Puma cable modems, their shitty cellular modems that are far worse than Qualcomm's, gigabit chipset issues, 2.5GbE chipset issues, and now the 13th/14th gen CPUs that destroy themselves.

And we just gave them billions in tax dollars. Failing upwards...


Atom C2000 waves from behind all this crowd!


I still have five C2758 nodes running just fine, though!


Nvidia is helping power the tool that destroys the software moat.


Interesting that it's entirely failed to do that so far.

Nvidia is helping power the next generation of big brother government programs.


Aren’t we like 0.000001% in the journey?


That is the largest goalpost move I've ever seen, but sure, at that order of magnitude anything is plausible.


> and Broadcom was thinking about buying them...

I'm thinking about buying Nvidia

(this is bullshit)


Between EVGA getting out of the Nvidia card business, Nvidia continuing to be problematic under Linux (even if that’s improving), all the nonsense with the new power connector, and the company’s general sliminess, I’m increasingly leaning towards an AMD (or potentially Intel) card for my next tower upgrade.

AMD and Intel might only be competing in the entry-to-midrange market sector but my needs aren’t likely to exceed what RX 8000 or next-gen Intel cards are capable of anyway.


AMD has inexplicably decided not to invest in software. Just like car manufacturers don't realize that a shitty infotainment system can keep people from buying the $100k car, AMD doesn't seem to realize that people aren't buying a GPU for ML if their ML framework doesn't run on it...

And this goes down to consumer drivers too. I've sworn to myself that I'm not buying AMD for my next laptop, after endless instability issues with the graphics driver. I don't care how great and cheap and performant and whatever it is when I'm afraid to open Google Maps because it might kernel panic my machine.


I have AMD in my desktop and my laptop and it has been pretty good under Linux (I use Fedora) the past year or two. AMD definitely was late to the game, and I still don't think they care as much as they should, but they are definitely working on it. I've been easily running GPU accelerated Ollama on my desktop and laptop through ROCm.

AMD is definitely not perfect but I don't think it's fair to say they decided not to invest in software. Better late than never, and I'm hoping AMD learned their lesson.


I thought they had finally learned their lesson... then they cancelled the funding of ZLUDA... then they seem to have gone back on an agreement and demanded that the open-sourced version be taken down.

The years it took them to get their Linux drivers into a usable shape are another issue.


How long ago was this? I bought an AMD laptop this year and it's been great with both windows and Linux. I can't say the same for my Nvidia pc ...


I think it went away either with Ubuntu 23.10 or 24.04 - but I don't know if they actually fixed it or just changed something random that masks the bug for now, only to come back with the next kernel version (I've had that issue before).

Given that the issue (or variants thereof, because there were at least 10 different workarounds to try) was somewhat widely reported, the time it took to get this fixed far exceeded anything I would consider tolerable.


> AMD has inexplicably decided not to invest in software

Perhaps they were distracted by dismantling Intel's CPU hegemony? I wouldn't fault them for that, fighting 2 Goliaths simultaneously isn't a sound strategy.


If that was the case, then they shouldn't have bought ATi.


AMD's acquisition of ATI was a net detriment to AMD for at least a decade. They ended up with a ton of debt, had to sell off their fabs but still had to use those fabs even after they were uncompetitive, and struggled to field high-end products in either CPU or GPU product lines. They didn't start to reap significant benefits from having both product lines under one corporate umbrella until they started scoring design wins for game console SoCs, and those provided sales volume but not much profit margin.


I made that choice several years ago. All new PCs I buy/build are AMD only.

The hardware is a bit finicky, but honestly I prefer a thing to just be broken and tricky as opposed to Nvidia intentionally making my life hard.


I wish I could say that was a realistic alternative for compute work (and on workstations and servers rather than consumer PCs). Unfortunately, it doesn't look like it - both in terms of the hardware offering (AFAICT), and ecosystem richness. Which is really a shame; not because AMD are saintly, but because NVIDIA have indeed been slimy and non-forthcoming about so much, for so long.


EVGA also significantly reduced warranty on their PSUs. Changed PSU components without changing model number.


I am tempted to try the new Intel GPU as an upgrade for my current ~5yo build. I don’t need something high end, and I don’t need any AI stuff. But I use a dual boot Windows/Linux, and I am a bit worried about how it will behave under Linux.


Intel is by far the best out-of-the-box experience under Linux. I have 3 cards. I will get one of the new Battlemage cards for my gaming PC.

Edit: the only downside is that the HW H.265 encoder is pretty bad. AV1 is fine though.


I don’t have any experience with discrete Intel cards, but yes their iGPUs have been flawless for me under Linux. Same for other components, to the point that I’d say a reasonable way of gauging how well a laptop will work with Linux is to look at how much of its hardware is Intel (CPU excluded). Intel wifi/bluetooth and ethernet are also great under Linux for example.


What does a "pretty bad" h265 implementation look like? Buggy? Inefficient or what?


Video encoders can vary widely in quality. There are lots of parameters that allow for a wide range of "correct" encodings of the same source video file. Realtime hardware encoders in general have lower visual quality for the same bitrate than software encoders that may be slower but more thoroughly search through different options for encoding each group of frames.

Decoding is much more deterministic, so speed and power efficiency are the main ways hardware decoders can differ.


Encoded media either comes out blocky, with artifacts, or plain old slow. Some also have bugs related to the encoder that app developers have to contend with.


The h265 quality is just not up to par. The av1 encoder does a nice enough job, so does the h264 one.


I was reading that Intel GPU firmware cannot be upgraded under Linux, only Windows, is that still the case?


AMD's 7xxx series cards were almost universally worse than their 6xxx equivalents. AMD cut memory bus width and reduced compute units, all in a quest to reduce power consumption because they're so power-hungry. They're still not as good as NVIDIA cards for power consumption.

The drivers are unreliable, Adrenalin is buggy, slow, and bloated; AMD's cards have poor raytracing, and AMD's compute is a dumpster fire, especially on Windows; ROCm is a joke.

None of the LLM or Stability Matrix stuff works on AMD GPUs under Windows without substantial tweaking, and even then it's unreliable garbage, whereas the NVIDIA stuff Just Works.

If you don't care about any of that and just want "better than integrated graphics", especially if you're on Linux where you don't need to worry about the shitshow that is AMD Windows drivers - then sure, go for AMD - especially the cards that have been put on sale (don't pay MSRP for any AMD GPU, ever. They almost always rapidly discount.)

AMD simply does not have the care to compete with NVIDIA for the desktop market. They have barely a few percent of the desktop GPU market; they're interested in stuff like gaming consoles.

Intel are the only ones who will push AMD - and it will push them to either compete or let their product line stagnate and milk as much profit out of the AMD fanboys as they can.


I'm ambivalent about this sort of thing (or, as another example, Intel's CPUs many years ago that offered paid firmware upgrades to enable higher performance).

On one hand, it's very bad because it reduces economic output from the exact same input resources (materials and labor and r&d).

On the other hand, allowing market segmentation, and more profits from the higher segments, allows more progress and scaling for the next generation of parts (smaller process nodes aren't cheap, and neither is chip R&D).


I want to add that Nvidia sales are 90% data center and 10% gaming, and the author, being part of the 10% that wasn't abandoned, is complaining that they got a product at half the speed for a way lower price, instead of the same specs as the datacenter clients at half the price.

man.


Or the 90% are charged an absurd markup, because clearly they can deliver the hardware for the 10% use case at $1,000 and still make money on top, but they would rather charge the data centers $50k for the same product.


There are companies lined up around the block to hand over $$$$$ for the only competitive GPUs on the market. Is the markup actually absurd?


There are investors lined up around the block to hand over $$$$$ for SFHs on the market while people delay starting families because they cannot afford housing, or end up on the streets.

So, is the market really absurd?

Just because some billionaires are desperate for growth to grow their hundred billions into trillions outbidding each other does not mean that 90% of humanity cannot make use of ML running locally on cheaper GPUs.


I consider those two very different problems. One is a supply problem, the other is a competition problem.

Also, housing is a basic human right, whereas fast GPUs probably are not.


Different markets, similar semantics. Both are artificially supply restricted.

In fact, you can argue that something is really wrong with our governance if housing is a human right and yet there are people profiteering from how unaffordable it has become.

I am more appalled at how long it has taken for big tech other than Google to standardize ML workloads and not be bound by CUDA.


As far as I know, there is no political barrier preventing Intel and AMD from making competitive GPUs.

There is, however, a political barrier to increasing the housing supply.


The artificial limit does not have to be political, it can be as simple as nvidia blowing fuses to reduce compute on lower tier cards.


Or they can't afford to sell the cards at consumer prices. If they take a loss in the consumer segment, they can recoup by overcharging the datacenter customers.

That's how this scheme works. The card is most likely not profitable at consumer price points. Without this segmentation, consumer cards would trail many years behind the performance of datacenter cards.


You can theorize a million scenarios, but clearly no one here will know what really transpired for Nvidia to hobble their consumer chips. I really don't think the consumer segment is a loss leader; GPUs for AI are a fairly recent market, while Nvidia has existed churning out consumer GPUs since the 90s.

But clearly, lack of competition is one thing that supports whatever rent Nvidia seeks.


“On one hand, it's very bad because it reduces economic output from the exact same input resources (materials and labor and r&d).”

This is not true with the economies of scale in the semiconductor industry.


Interesting that no one considers the environmental impact. This creates tonnes of e-waste with a shortened useful life.

We should demand that it's unlockable after a certain time.


Maybe not demand that it be unlockable, but rather if Nvidia were to provide a paid upgrade path to unlock these features that would help. They would need some way to prevent the open source drivers from accessing the features, though.


> but rather if Nvidia were to provide a paid upgrade path to unlock these features that would help.

You do not revert a blown e-fuse with a software update.


Yes, but an e-fuse would not be the only way to lock the feature if Nvidia were to want to unlock it later.


The point being that they knowingly went down the e-fuse route to make sure nobody could unlock it later.

They probably learnt their lesson in the early 2000s, when users could do a bit of soldering and apply a patch to unlock more performance on cards that had been artificially throttled.


E-waste is mostly a fake concept. By the time a 4090 has outlived its usefulness for gaming it's also likely that no one wants it for AI (if they ever did).


Aren't you giving a good reason why it's not a fake concept there?


Nope. Putting electronics in landfills simply isn't that harmful.


Okie dokie, well, not really going to touch that bit, but I still want to point out that the way you have it written, it sounds like the second claim is an argument for the first, and that makes it a little confusing.


Halving the clock also reduces heat dissipation and extends component life.


In this case, the 2-slot RTX 6000 consumes 300 W whereas the "nerfed" 3.5-slot 4090 can draw 450 W.

So I don't think the nerfing here was to lower power consumption. It's just market segmentation to extract maximum $$$$ from ML workloads.

nvidia have always been pretty open about this stuff - they have EULA terms saying the GeForce drivers can't be used in data centres, software features like virtual GPUs that are only available on certain cards, difficult cooling that makes it hard to put several cards into the same case, awkward product lifecycles, contracts with server builders not to put gaming GPUs into workstations or servers, removal of nvlink, and so on.


I didn’t say they don’t do artificial segmentation. I just noted that, in this case, it might have an upside for the user. There might also be some binning involved- maybe the parts failed as A300 parts.


Yeah, somebody knew the new power connectors were going to be sus, so halving the power was at least a somewhat safe thing to do.


Yeah, this is far from new too


Binning and market segmentation are not mutually exclusive. Of course they're going to put their best-performing chips in the most expensive segment.


The difference is whether chips with no defects get artificially binned.

In a competitive market, if you have a surplus of top-tier chips, you lower prices and make a little more $$ selling more power.

With a monopoly (customers are still yours next upgrade cycle), giving customers more power now sabotages your future revenue.


I don't think anyone has ever gone to TSMC and said "hey we're short on our low end chips, can you lower your yields for a bit?"


Can this be fixed by removing the efuse or having a custom firmware?


You can't, lol. How do you wanna restore a blown fuse at the nanometer level INSIDE the GPU die? It's simply not possible.

By the way, AMD also uses fuse blowing if you e.g. overclock some of their CPUs, to mark them as warranty-voided. They give you a warning in the BIOS, and if you proceed, a fuse inside the CPU gets blown that will permanently indicate that the CPU has been used for overclocking (and thus void the warranty).


> You can't, lol. How do you wanna restore a blown fuse at the nanometer level INSIDE the GPU die? It's simply not possible.

I wouldn't dismiss this so aggressively.

Frequently (more frequently than not), efuses are simply used as configuration fields checked by firmware. If that firmware can be modified, the value of the efuse can be ignored. It's substantially easier to implement a fused feature as a bit in a big bitfield of "chicken bits" in one-time programmable memory than to try to physically fuse off an entire power or clock domain, which would border on physically irreversible (this is done sometimes, but only where strictly necessary and not often).
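
As a conceptual sketch (every name here is invented, this is not NVIDIA's real fuse map), the "chicken bit" pattern is just firmware reading a one-time-programmable word at boot and deciding how to configure the hardware:

  # Purely illustrative: the bit position and the rates are made up.
  HALF_RATE_FP32_ACC = 1 << 7  # imaginary chicken bit in the fuse word

  def boot_config(fuse_word, firmware_honors_fuses=True):
      # Returns the FP16-with-FP32-accumulate rate the firmware would program.
      # If the limit only exists because firmware checks this bit (rather than
      # the datapath being physically fused off), a patched firmware could
      # simply ignore it -- which is exactly why the firmware is signed.
      if firmware_honors_fuses and (fuse_word & HALF_RATE_FP32_ACC):
          return 0.5  # throttled SKU, e.g. what the tweet claims for the 4090
      return 1.0      # full rate, e.g. RTX 6000 Ada

  print(boot_config(HALF_RATE_FP32_ACC))                                # 0.5
  print(boot_config(HALF_RATE_FP32_ACC, firmware_honors_fuses=False))   # 1.0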


NVidia is smarter than this - they sign all their firmware, so you can't just modify the firmware and bypass this. No signed firmware means no functioning card. The famous example of this was the 'reduced hash functionality' RTX 3060 cards that accidentally had the 'reduced hash' feature disabled in a signed copy of firmware that Nvidia released. If they hadn't accidentally released this firmware, the reduced hash stuff would have worked forever.


I am indeed well aware of how firmware validation works. Finding a vulnerability in firmware validation is however much more likely than reversing OTP for almost all varieties of OTP, even NVidia’s firmware validation which is generally regarded as pretty strong.


This reminds me of when Motorola introduced the first carrier-enforced signature lock on Android with no "OEM Unlock" option and a bunch of ... questionably informed... people insisted it was bypassable. (To my knowledge, it has not been bypassed, and jesus that was 15 years ago probably)

Granted, it's Nvidia, and they've been featured in devices that were notoriously hackable, but also, it's not 2018 anymore.

Needless to say, people should understand when they buy an Nvidia card, they should fully expect to use Nvidia firmware, with whatever that entails.

EDIT: I'd really like to remember the name of this device. It was the same era as Blackberry releasing... the Storm? Some resistive-touch device with a physically clickable screen. Motorola Storm? I really wish I could recall. (sub-edit: I think the Storm was the Blackberry device. So something else...)


Maybe the Fire? It looks like a Storm. I don't think they made any resistive Android devices though, even the original Droid was capacitive.

I think there might have been one or two OMX devices in this era with locked bootloaders that weren't bypassed due to a lack of research, but I actually find this example a bit amusing: early Qualcomm Motorola Android phones were touted as "unhackable" due to their use of fuses (Qualcomm even went on a marketing pitch calling them "Q-Fuses"), but were extremely quickly unlocked using trivial TrustZone supervisor vulnerabilities (iirc, there was an SMC that literally had a write-what-where primitive in it).


This is true but that's why the firmware is signed so you can't patch it.


> How do you wanna restore a blown fuse at the nanometer level INSIDE the GPU die? It's simply not possible.

Bullshit. There will be hackers in the future who can do it in their garage. Just... not anytime soon.

> By the way, AMD also uses fuse blowing if you e.g. overclock some of their CPUs, to mark them as warranty-voided. They give you a warning in the BIOS, and if you proceed, a fuse inside the CPU gets blown that will permanently indicate that the CPU has been used for overclocking (and thus void the warranty).

Emphasis on "some." You can buy plenty of CPUs from them made for overclocking.


Not anytime soon as in 50? 100 years? The 4090 is on a 5nm node; how about you first demonstrate it's possible on something 100x bigger? For example, Intel started locking CPU multipliers in Pentiums manufactured at 0.8 μm (800 nm). Unlock that fuse :)


> Bullshit. There will be hackers in the future who can do it in their garage. Just... not anytime soon.

I'll take your bet on this. Silicon designers aren't unaware of this potential vulnerability, and if you want to prevent eFuses from being un-blown, you can design for that. I would place money on there not being any commercially viable way to restore an eFuse in a 4090 die at any point in the future. You can probably do it, but it would require millions of dollars in FIB and SEM equipment and likely would destroy the chip for any useful purpose.

Usually the only useful reason to attempt to recover/read/un-blow fuses is to read out private keys built into chips.


> You can probably do it, but it would require millions of dollars in FIB and SEM equipment and likely would destroy the chip for any useful purpose.

The price tag and size of these things are what I'm talking about. SOME day it will get much cheaper and smaller. A 4090 will be useless at that point, but I still play with 8086s and vacuum tubes, so...

No point in betting though. We'll both be dead by then.


I’d still take the bet. FIB and SEM stuff is highly specialized and miniaturizing it to be obtainable by a garage user seems unlikely even in the distant future. Either way, you still couldn’t take a 4090 and make it functional as a 6000 series. You’d destroy it in the process if it was even possible at all.


Nah, it's going to become democratized. It's already started.

There's been a few different stories like this lately:

https://interestingengineering.com/videos/guy-builds-integra...

People said the same thing you're saying now about computers. You're just being silly and forgetting history.


Making a hobby etching thing that isn’t anywhere close to the state of the art is cool, but not exactly anywhere close to what is needed to look at a modern chip.

You’re trivializing the challenge of modifying something that is on the order of 50nm wide and specifically designed to not be able to be tampered with.


I feel like you don't understand how time works. ;) We've barely had any of this technology for 50 years. Give me 500 years and I absolutely guarantee you that I'll fuck up some 4090s with some gadget the size of a mobile phone that costs 10 cents, and it'll work perfectly fine.


e-fuses (https://en.wikipedia.org/wiki/EFuse) are typically etched into the silicon as an exactly-once operation, meant to irrevocably set a configuration. Some devices, for instance, have an e-fuse that makes it impossible to change cryptographic trust signatures after the e-fuse has been blown.


That's intriguing; one might assume that a key feature of eFuse would be the ability to reset easily. But I guess it could be implemented without it


There are two kinds of thing that are both called eFuse: current-limiting devices which are almost always resettable, and one-time programmed bits which are intentionally one-time programmable (some are implemented as physical fuses which are blown by overcurrent, others are implemented using various other types of write-once nonvolatile memory, and some bad ones are implemented by making normal flash pretend to be write-once using firmware-level protections).

Professionally I usually see OTP referred to as "fuses," "OTP," "straps," or "chicken bits," with the specific word "eFuse" reserved for the current-limiting device. But in popular media the trend seems the opposite.


No, the essence of a fuse is that it can never be reset. A circuit breaker is different.


Why would that be the assumption? An eFuse is a literal trace that gets burnt up, you can't connect it back again.


There is competition, but the competition isn't winning 1st, 2nd or 3rd.

There was a day when Intel was top dog, and for a long time; now AMD is competing in 1st place.

AMD and/or other competitors will have their day, but today its Nvidia.


This is so sketch. Although I have not seen any reports of misrepresentation, I hope the EU looks into this.


  https://xcancel.com/realGeorgeHotz/status/1868356459542770087


Did they do this to the 3090 Ti too?


So the next question: how to deposit a small bit of metal to fix the fuse?


How do you go from that screenshot to this conclusion?


You go from the eFuse mentioned at the beginning to the conclusion in the screenshot at the end, not the other way around.


Am I missing something then? Is there some context to the linked tweet and screen shot that didn't come up?


Perhaps this part isn't appearing above the image for you?

"NVIDIA is so far ahead that all the 4090s are nerfed to half speed.

There's an eFuse blown on the AD102 die to halve the perf of FP16 with FP32 accumulate. RTX 6000 Ada is the same die w/o the blown fuse.

This is not binning, it's segmentation. Wish there was competition."


I wonder why everyone on here isn’t saying “copyright is stupid”. You know it grants a monopoly?


What?


Imagine running this card at full speed, with fully unlocked potential. What would happen to new tiny power connectors? I am betting insta-fire.


It would just throttle like it already does.


Throttling means the performance gain would not be there. Unthrottled power is needed for a (hypothetical) unthrottled GPU chip. Unthrottled power is impossible on the current power design, unless melted connectors are not a concern.


stock must go up


Yeah but we can only use 10% of our brain too, so.


It's interesting how we the people (broadly) accept this practice in software and even some hardware, but not in other areas. Note how frustrated people are when you hear about "unlocking" sensors and services available on cars.

If a product is made, and the cost to provide that product is the same one way or the other but you cripple it to create segmentation, then that is greed. Period. Objectively. And if you're okay with that, then fine, no problem. Just don't try to tell me it isn't maximization of profit.

There are no heroes in the megacorp space, but it would be nice for AMD and Intel to bring Nvidia to heel.


Maximization of profit is what all companies do. For publicly traded companies, it is considered a duty to their shareholders and not doing so will result in executives getting booted out and even lawsuits.

With that out of the way, market segmentation is often good for budget customers, who, in the case of Nvidia GPUs, are gamers. They get GPUs that run their games just as well as the uncrippled model, for a much lower price. Without market segmentation, all the GPUs would go to Amazon, Microsoft, Google, etc., since they are the ones with the big budgets; gamers would be left with GPUs they can't afford, and Nvidia with less profit, as it would lose most of the gamer market.

With market segmentation, Nvidia wins, gamers win, AI companies and miners lose. And I don't know about you, but I think that AI companies and miners deserve the premiums they pay.

It sounds stupid to pay for crippled hardware, but when you buy a GPU, the silicon is only a small part of the price; the expensive part is all the R&D, and that cost is the same no matter how many chips they sell. It makes sense to maximize those sales, and segmentation is how they do it without sacrificing their profits.

Of course, should AMD or Intel come back, they would do their own market segmentation too; in fact, they already do.
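
To put toy numbers on the segmentation argument (all made up): say a million gamers will pay $2k for the die, a hundred thousand datacenter buyers will pay $8k, and the marginal cost is $500 per unit.

  # Made-up numbers, just to show why segmentation beats any single price.
  unit_cost = 500
  gamers     = (1_000_000, 2_000)  # (buyers, willingness to pay)
  datacenter = (  100_000, 8_000)

  def profit(price, segments):
      units = sum(n for n, wtp in segments if wtp >= price)
      return units * (price - unit_cost)

  single_low  = profit(2_000, [gamers, datacenter])  # one price, everyone buys
  single_high = profit(8_000, [gamers, datacenter])  # one price, only datacenter buys
  segmented   = profit(2_000, [gamers]) + profit(8_000, [datacenter])

  print(single_low, single_high, segmented)
  # 1.65e9 vs 0.75e9 vs 2.25e9: segmentation earns the most, and gamers still get a card

The crippling (fuses, driver limits, whatever) is just what stops the $8k buyers from quietly buying the $2k part.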


Aside from possible technical explanations (e.g., the binning of products based on defects which permit sub-optimal performance as creato describes: <https://news.ycombinator.com/item?id=42435397>), there's market segmentation.

French polymath (economist, engineer, bureaucrat) Jules Dupuit famously described this concerning railway carriage accommodations and the parlous state of third-class carriages:

It is not because of the several thousand francs which they would have to spend to cover the third class wagons or to upholster the benches. ... [I]t would happily sacrifice this [expense] for the sake of its popularity.

Its goal is to stop the traveler who can pay for the second class trip from going third class. It hurts the poor not because it wants them to personally suffer, but to scare the rich.

<https://www.inc.com/bill-murphy-jr/why-does-air-travel-suck-...>

More on Dupuit:

<https://en.wikipedia.org/wiki/Jules_Dupuit>

Market segmentation by performance is a long-standing practice in the information technology world. IBM would degrade performance of its mainframes by ensuring that a certain fraction of CPU operations were no-ops (NOPs), meaning that for those clock cycles the system was not processing data. A service engineer would remove those limits on a higher lease fee (IBM leased rather than sold machines, ensuring a constant revenue stream). It's common practice in other areas to ship products with features installed but disabled and activated for only some paying customers.

Another classic example: the difference between Microsoft Windows NT server and workstation was the restriction of two registry keys:

We have found that NTS and NTW have identical kernels; in fact, NT is a single operating system with two modes. Only two registry settings are needed to switch between these two modes in NT 4.0, and only one setting in NT 3.51. This is extremely significant, and calls into question the related legal limitations and costly upgrades that currently face NTW users.

<https://landley.net/history/mirror/ms/differences_nt.html>


People have wrong intuitions about a lot of things and price discrimination is one of them. That's the modern world for you.


How is this modern world? LOL



