Lots of AI HW is focused on RAM (512GB!). I have a cost-sensitive application that needs speed (300+ TOPS), but only 1GB of RAM. Are there any HW companies focused on that space?
Like others have said, basically traditional GPUs (the RTX 40/50 series in particular; the 20/30 series have much weaker tensor cores).
In terms of software, recent NVIDIA and AMD research has focused on fast evaluation of small (~4-layer) MLPs with FP8 weights, for things like denoising, upscaling, radiance caching, and texture/material BRDF compression and decompression.
NVIDIA has just put out some new graphics API extensions and samples/demos for loading a chunk of neural net weights and performing inference from within a shader.
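To give a feel for how tiny these nets are, here's a rough NumPy sketch of that kind of inference: a ~4-layer MLP whose weights are stored quantized and dequantized per layer, much as a shader would. The layer widths, activation, and int8-with-scale quantization are my assumptions standing in for FP8; this is illustrative, not NVIDIA's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def quantize(w, bits=8):
    # Crude stand-in for FP8 storage: per-layer symmetric int8 + scale.
    scale = np.abs(w).max() / (2 ** (bits - 1) - 1)
    q = np.round(w / scale).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

# A small MLP like those used for radiance caching / texture decompression.
# Sizes are made up: 16 inputs -> three 32-wide hidden layers -> RGB output.
dims = [16, 32, 32, 32, 3]
layers = []
for fan_in, fan_out in zip(dims[:-1], dims[1:]):
    w = rng.normal(0, fan_in ** -0.5, (fan_in, fan_out)).astype(np.float32)
    layers.append(quantize(w))

def mlp_forward(x):
    # Weights stay quantized "in memory"; dequantize layer by layer.
    for i, (q, scale) in enumerate(layers):
        x = x @ dequantize(q, scale)
        if i < len(layers) - 1:
            x = np.maximum(x, 0.0)  # ReLU on hidden layers only
    return x

out = mlp_forward(rng.normal(size=(4, 16)).astype(np.float32))
print(out.shape)
```

The point of the tiny width is that the whole weight set fits in on-chip memory, so per-pixel inference is mostly a handful of small matrix multiplies that map well onto tensor cores.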
Just buy any gaming card? Even something like the Jetson AGX Orin boasts 275 TOPS (but they add in all kinds of different subsystems to reach that number).
The problem with that TOPS figure is that it includes ~100 TOPS from the "Deep Learning Accelerator" coprocessors, which have a lot of awkward limitations on what they can do (and terrible software support). The GPU itself is Ampere generation, but there's no exact consumer GPU equivalent.