artemisart's comments

That's very true and it's what's segmenting the market, but I don't understand why you're saying the 5090 supports only a 12B model when it can go up to 50-60B (= a bit less than 64B, to leave room for inference), as it supports FP4 as well.

It's for comparison using raw, non-optimized models. Both can do much better when you optimize for inference.

The information is in the ratio of these numbers, and the ratios stay the same.


Ok then just to clarify: you can fit 4x larger models on the Spark vs 5090, not 17x.

@nabla9 has tried to tell you that, for the DGX Spark, you can also use optimized models; this means the Spark can also be used for inference with bigger models, such as those exceeding 200B.

Please compare the same things: carrots VS carrots, not apples VS eggs.


I don't understand what's not optimized on the 5090. If we're comparing with Apple chips or AMD Strix Halo, then yes, you'll have very different hardware + software support, no FP4, etc. But here everything is CUDA, Blackwell vs Blackwell, same FP4 structured sparsity, so I don't get how it would be honest to compare a quantized FP4 model on the Spark with an unoptimized FP16 model on a 5090.

I think what they're saying is that the Spark can use an unoptimized FP16 model with 200B parameters. However, I don't really know.

You can't. The Spark has 128GB VRAM; the highest you can go in FP16 is 64B — and that's with no space for context.

200B is probably a rough estimate of Q4 + some space for context.

The Spark has 4x the VRAM of a 5090. That's all you need to know from a "how big can it go" perspective.


from the NVidia DGX Spark datasheet:

  With 128 GB of unified system memory, developers can experiment, fine-tune, or inference models of up to 200B parameters. Plus, NVIDIA ConnectX™ networking can connect two NVIDIA DGX Spark supercomputers to enable inference on models up to 405B parameters.

The datasheet isn't telling you the quantization (intentionally). Model weights at FP16 are roughly 2GB per billion params. A 200B model at FP16 would take 400GB just to load the weights; a single DGX Spark has 128GB. Even two networked together couldn't do it at FP16.
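To make that arithmetic concrete, here's a back-of-the-envelope calculator. It only counts the weights; a real deployment also needs room for KV cache, activations, and runtime overhead (and the byte-per-parameter figures are the usual rough values, not anything from the datasheet):

```python
# Rough model memory calculator: params (in billions) * bytes per param.
# 1e9 params * N bytes/param = N GB, so the formula is just a product.
def weight_gb(params_b: float, bytes_per_param: float) -> float:
    """Gigabytes needed just to hold the weights."""
    return params_b * bytes_per_param

SPARK_GB = 128  # DGX Spark unified memory

for fmt, bytes_pp in [("FP16", 2.0), ("FP8", 1.0), ("FP4", 0.5)]:
    gb = weight_gb(200, bytes_pp)
    verdict = "fits" if gb <= SPARK_GB else "does not fit"
    print(f"200B @ {fmt}: {gb:.0f} GB -> {verdict} in {SPARK_GB} GB")
```

Only the FP4 row lands under 128 GB, which is why the 200B headline number implies a ~4-bit quant.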

You can do it, if you quantize to FP4 — and Nvidia's special variant of FP4, NVFP4, isn't too bad (and it's optimized on Blackwell). Some models are even trained at FP4 these days, like the gpt-oss models. But gigabytes are gigabytes, and you can't squeeze 400GB of FP16 weights into only 128GB (or 256GB) of space.

The datasheet is telling you the truth: you can fit a 200B model. But it's not saying you can do that at FP16 — because you can't. You can only do it at FP4.


I never claimed the 200B model was FP16.

If the 200B model were at FP16, marketing could've turned around and claimed the DGX Spark could handle a 400B model (with an 8-bit quant) or an 800B model at some 4-bit quant.

Why would marketing leave such low-hanging fruit on the tree?

They wouldn't.


You and nabla9 are the ones comparing apples and eggs. 4x more RAM means 4x larger models when everything else is held the same, which is the only fair comparison.

The economics don't make sense: each video is stored ~once (+ replication etc., but call it O(1)) yet viewed n times, so server-side upscaling on the fly is way too costly, and client-side upscaling currently isn't good enough.
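The O(1)-vs-O(n) point as a toy cost model (all numbers and function names here are mine, purely for illustration):

```python
# Toy comparison: upscale once at upload time vs. on the fly per view.
# Dollar figures are made up; only the scaling behavior matters.
def upload_time_cost(cost_per_video: float) -> float:
    return cost_per_video            # paid once per stored video, O(1)

def on_the_fly_cost(cost_per_view: float, views: int) -> float:
    return cost_per_view * views     # paid again on every view, O(n)

views = 100_000                      # hypothetical view count
once = upload_time_cost(1.00)        # hypothetical $1 to upscale once
streaming = on_the_fly_cost(0.01, views)
print(once, streaming)               # per-view cost dwarfs the one-time cost
```

Even with a per-view cost 100x cheaper than the one-time job, the on-the-fly path overtakes it after 100 views; popular videos get millions.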

Are you considering that the video needs to be stored for potentially decades?

Also, shorts seem to be increasing exponentially... but YouTube viewership is not. So compute wouldn't need to increase as fast as storage.

I obviously don't know the numbers. I'm just saying it could be a good reason why YouTube is doing this AI upscaling; I really don't see why otherwise. There's no improvement in image quality, quite the contrary.


> Then, once that is perfected, they will offer famous content creators the chance to sell their "image" to other creators, so less popular underpaid creators can record videos and change their appearance to those of famous ones, making each content creator a brand to be sold.

I'm frightened by how realistic this sounds.


Yes, I don't understand how they can claim it's optimized for legibility when the base font does the opposite.


Gitless is this fork: https://marketplace.visualstudio.com/items?itemName=maattdd.... It's no longer updated but still works well.


Pyrefly is not tied to VS Code? Also, please try to be more considerate of people's preferences; PyCharm is not strictly better. Remote dev on VS Code is very convenient for me. Should I go on the Internet saying PyCharm is trash? No.


I'm not saying VS Code is trash, but I think it's closer to a text editor than an IDE. I even use it for some non-Python things, but I remain curious why people use it for Python.

It might not be tied to VS Code but the title clearly says “New […] IDE experience for…” which is why I commented. I had hoped to see something for PyCharm or even a new IDE.


That's what you think; other people think something else.


But it is, as long as the positional embeddings are sufficient, i.e. use relative positional embeddings here.
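A minimal sketch of what "relative" buys you, assuming the usual transformer-attention setting (the T5-style bias table is just one variant of the idea):

```python
import numpy as np

# With relative positional embeddings, attention between positions i and j
# depends only on the offset (j - i), not on the absolute positions, so the
# same learned bias is reused along every diagonal of the attention matrix.
def relative_positions(seq_len: int) -> np.ndarray:
    pos = np.arange(seq_len)
    return pos[None, :] - pos[:, None]  # rel[i, j] = j - i

rel = relative_positions(4)
# Each diagonal holds a single offset value, so one learned bias per offset
# covers the whole sequence, at any length.
print(rel)
```

Absolute embeddings, by contrast, tie what the model learns to specific positions seen in training, which is exactly what breaks when lengths change.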


But do you understand it will harm Hollywood? This is the economic, country-scale equivalent of saying "fuck your movies". Do you think the answer will be "oh sorry, I'll keep buying yours" or "fuck your movies too"?


It will reduce the profit margins of Hollywood studios. Maybe reduce the bonuses of their CEOs.

It'll help the thousands of Americans who used to be employed by those studios but were fired, not because they couldn't do the job, but because the greed of studio execs made them move production outside of the US, playing financial-arbitrage games against the interests of the US.

Trump is trying to reverse this by making movies made outside of the US, especially by US companies, more expensive. The goal here is not some statement about Hollywood but to bring movie production jobs back to the US: the extras, the people who build sets, the people who make food for actors, etc. And jobs bring tax revenue and grow GDP, which helps every American.


It’s categorically impossible that this could bring movie production jobs to the US. If a policy along these lines is implemented, both foreign and domestic consumers will launch a massive retaliatory boycott. (I will personally begin that boycott today unless Hollywood executives denounce the proposal.) The current dominance of American media offers absolutely no leverage, because it’s trivial and cost-free to go cold turkey on it.


This kills the movies.


You really think CEOs will choose to reduce their bonuses to help thousands of common Americans?


ChatGPT free gets it right without reasoning mode (it still explained some steps): https://chatgpt.com/share/6810bc66-5e78-8001-b984-e4f71ee423...


The first sentence of the introduction ends with "we introduce Dynamic-Length Float (DFloat11), a lossless compression framework that reduces LLM size by 30% while preserving outputs that are bit-for-bit identical to the original model", so yes, it's lossless.
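For intuition on how a lossless 30% is even possible: exponent bits in trained BFloat16 weights are highly non-uniform, so a variable-length entropy code (DFloat11 uses a Huffman-style one) can shrink them while decoding back bit-for-bit. A toy sketch with synthetic numbers:

```python
import math
from collections import Counter

# Shannon entropy of a symbol stream, in bits per symbol. If the entropy is
# well below the fixed width used to store each symbol, an entropy code can
# compress the stream losslessly down toward that bound.
def entropy_bits(symbols) -> float:
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

# Synthetic "exponent" stream concentrated on a few values, the way real
# weight distributions tend to be (these counts are made up):
exponents = [126] * 700 + [125] * 200 + [127] * 90 + [120] * 10
print(entropy_bits(exponents))  # far below the 8 bits stored per exponent
```

The compression only changes the encoding, never the values, which is why decoded outputs stay bit-for-bit identical.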

