...what? 60 thousand dollars for a dedicated computer is not something everyone can afford, certainly not on their own computers, and it's a crazy large amount of money for nearly everyone. Sure, there are some who could, but that's not what I said.
Most of the cost of a phone isn't the processor, so probably closer to 1000x. Hardware may get that much cheaper, but it was never guaranteed, and we're not making progress as fast as we used to.
Moore's law didn't stop, just Dennard scaling. Expect graphics and AI to continue to improve radically in performance/price, while more ordinary workloads see only modest improvements.
GPU TDP seems on the verge of going exponential, cost per transistor isn't really decreasing much at the very latest nodes, and even that article seems to suggest it'd likely be decades before we see 300x FLOPS/$.
Plus, there are the energy costs involved in running a computer worth 60k today; I'm pretty sure that in the current socio-economic climate those power costs will surpass the initial acquisition cost (those 60k, that is) pretty easily.
I wanted to add that I meant it somewhat metaphorically, as in: given how high those energy bills will be, they might as well add up to 60k.
Not sure about most of the people in here, but I would get really nervous at the thought of running something that draws 3x300 watts, 24/7, just as part of a personal/hobby project. The incoming power bills would be too high; you'd have to be in the wage percentile for which dropping 60k on a machine just to carry out some hobby project is OK, i.e. you'd have to be "high-ish" middle class at least.
The recent increases in consumer power prices are a heavy blow for most of the middle class around Europe (not sure how things are in the States), so a project like this is just a no-go for most middle-class European programmers/computer people.
At full power, three of those would cost me ~$3.50 per day ($0.15 per kWh is what I paid for last month's electricity, though I could pay less if I made some different choices). I occasionally have a more expensive coffee order, or a cocktail worth three times as much.
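For anyone who wants to plug in their own rate, it's just wattage x hours x price per kWh. A quick sketch in Python with the numbers from this thread (3x300 W, $0.15/kWh):

    # Back-of-the-envelope electricity cost: three ~300 W cards at full load, 24/7.
    watts = 3 * 300
    price_per_kwh = 0.15  # USD, last month's rate

    kwh_per_day = watts / 1000 * 24
    cost_per_day = kwh_per_day * price_per_kwh
    print(f"{kwh_per_day:.1f} kWh/day -> ${cost_per_day:.2f}/day, ~${cost_per_day * 30:.0f}/month")
    # 21.6 kWh/day -> $3.24/day, ~$97/month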
Things are getting more expensive here, but nothing like the situation in Europe (essentially none of our energy was imported from Russia; historically ~10% of oil imports came from there, but that was mostly to refine and re-export, and we have all the natural gas we need locally). The US crossed the line into being a net hydrocarbon energy exporter a while ago (unsure what the case is recently, but at worst it's roughly at parity).
Eh, 60k is just a bit more expensive than your average car, and lots of people have cars, and that's just how things are today. I imagine capabilities will be skyrocketing and prices will fall drastically at the same time.
You could just run this on a desktop CPU; there's nothing stopping you in principle, you just need enough RAM. A big-memory (256GB) machine is definitely doable at home. It's going to cost $1-2k on the DIMMs alone, less if you use 8x32GB, but that'll come down. You could definitely do it for less than $5k all in.
Inference latency is a lot higher in relative terms, but even for things like image processing, running a CNN on a CPU isn't particularly bad if you're experimenting, or even for low-load production work.
But for really transient loads you're better off just renting a VM for a few seconds or minutes.
There isn't any reason you can't run a neural net on a CPU. It's still just a bunch of big matrix operations. The advantage of the GPU is it's a lot faster, but "a lot" might be 1 second versus 10 seconds, and for some applications 10 seconds of inference latency is just fine (I have no idea how long this model would take). All the major ML libraries will operate in CPU-only mode if you request it.
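As a minimal sketch (assuming PyTorch plus a stand-in torchvision model; swap in whatever net you actually care about), CPU-only inference is just a matter of picking the device:

    import time
    import torch
    import torchvision.models as models

    device = torch.device("cpu")                     # no GPU required
    model = models.resnet50(weights=None).to(device).eval()
    x = torch.randn(1, 3, 224, 224, device=device)   # dummy input batch

    with torch.no_grad():                            # inference only
        start = time.time()
        out = model(x)
    print(f"CPU inference took {time.time() - start:.2f}s, output shape {tuple(out.shape)}")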
I believe you're confusing the number of A100 graphics cards used to train the model (the cluster was actually made up of 800 A100s) with the number you need to run it:
> The model [...] is supposed to run on multiple GPUs with tensor parallelism.
> It was tested on 4 (A100 80g) and 8 (V100 32g) GPUs, [but should work] with ≈200GB of GPU memory.
I don't know what the price of a V100 is, but at $10k apiece for A100s we would be closer to the $60k estimate.
The $10k price is for an A100 with 40GB of RAM, so you need 8 of those. If you can get your hands on the 80GB variant, 4 are enough.
Also, if you want a machine with eight of these cards, it will need to be a pretty high-spec rack-mounted or large tower build. To feed these GPUs you will want a decent number of PCIe 4.0 lanes, meaning EPYC is the logical choice. So that's $20k for an AMD EPYC server with at least 1.6kW PSUs, etc.
You don't need a "decent number" of PCIe 4.0 lanes. You just need 16 of them, and they can be PCIe 3.0 and will work just fine. Deep learning compute boxes predominantly use a PCIe switch, e.g. the ASUS 8000 box, which handles eight cards just fine. You only need a metric tonne of PCIe bandwidth if you are constantly shuttling data in and out of the GPU, e.g. in a game or with exceedingly large training sets of computer vision data. A few hundred milliseconds of latency moving data to your GPU in a training session that will take hours if not days to complete is neither here nor there.

I suspect this model, with a little tweaking, will run just fine on an eight-way RTX A5000 setup, or completely unhindered on a five-way A6000 setup. That puts the price around $20,000 to $30,000. If I put two more A5000s in my machine, I suspect I could figure out how to get the model to load.
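Back-of-the-envelope on those configs (the VRAM sizes are the cards' actual specs; the per-card prices are rough street-price guesses on my part, apart from the $10k A100 40GB figure mentioned above):

    # (card count, VRAM per card in GB, assumed price per card in USD)
    configs = {
        "8x A100 40GB": (8, 40, 10_000),
        "4x A100 80GB": (4, 80, 15_000),  # 80GB price is a guess
        "8x RTX A5000": (8, 24, 2_500),   # guess
        "5x RTX A6000": (5, 48, 5_000),   # guess
    }
    needed_gb = 200  # "≈200GB of GPU memory" per the quote above

    for name, (count, vram, price) in configs.items():
        print(f"{name}: {count * vram} GB total vs ~{needed_gb} GB needed, ~${count * price:,}")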
It also sounds like they haven't optimized their model or done any splitting of it, but if they did, I suspect they could load it up and have it infer more slowly on fewer GPUs by using main memory.
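Hugging Face accelerate-style offloading is the usual way to do that kind of split. Purely a sketch, assuming the checkpoint were available in a transformers-compatible format (which may well not be the case here; the repo name below is a placeholder, and the accelerate package needs to be installed):

    from transformers import AutoModelForCausalLM, AutoTokenizer

    name = "some-org/some-100b-model"  # placeholder, not a real repo
    model = AutoModelForCausalLM.from_pretrained(
        name,
        device_map="auto",         # put what fits on the GPU(s)...
        offload_folder="offload",  # ...and spill the rest to CPU RAM / disk
        torch_dtype="auto",
    )
    tokenizer = AutoTokenizer.from_pretrained(name)

    inputs = tokenizer("Hello, world", return_tensors="pt").to("cuda")
    out = model.generate(**inputs, max_new_tokens=20)  # slower, but it runs
    print(tokenizer.decode(out[0], skip_special_tokens=True))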
Which will work just fine with NVIDIA's NVSwitch and a decent GPU compute case from ASUS or IBM, or even by building your own out of an off-the-shelf PCIe switch and a consumer motherboard.
And also, NVIDIA does not sell them to the consumer market whatsoever. Linus Tech Tips could only show one because someone in the audience sent theirs over for review.
You're grossly overestimating. People who make 60k annually are getting a bit rarer nowadays; it's not like everyone can afford it. For the majority of people it'd be a multi-decade project, for a few it might only take 7 years, and very few could buy it all at once.
Isn’t that already the case? Sure, it costs $60K, but that is accessible to a surprisingly large minority, considering the potency of this software.