
Lots of AI HW is focused on RAM (512GB!). I have a cost-sensitive application that needs speed (300+ TOPS), but only 1GB of RAM. Are there any HW companies focused on that space?


Isn't that just any discrete (Nvidia, AMD) GPU?


Like others have said, basically traditional GPUs (RTX 40/50 series in particular; the 20/30 series have much weaker tensor cores).

In terms of software, recent NVIDIA and AMD research has focused on fast evaluation of small ~4 layer MLPs using FP8 weights for things like denoising, upscaling, radiance caching, and texture and material BRDF compression/decompression.
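For a sense of scale, here's a toy NumPy sketch of that kind of tiny MLP. The layer widths are made up for illustration, and FP8 (E4M3) is only crudely emulated by clamping to the format's range and rounding the significand; real implementations use hardware FP8 and run inside shaders on tensor cores.

```python
import numpy as np

rng = np.random.default_rng(0)

def quantize_e4m3(w):
    # Crude emulation of FP8 E4M3: clamp to roughly its dynamic range
    # (~±448) and keep ~3 mantissa bits by rounding the significand.
    # Real hardware uses proper FP8 encodings; this is just illustrative.
    w = np.clip(w, -448.0, 448.0)
    m, e = np.frexp(w)             # m in [0.5, 1), w = m * 2**e
    m = np.round(m * 16) / 16      # round significand to 4 bits
    return np.ldexp(m, e).astype(np.float32)

# A tiny ~4-layer MLP of the kind used for radiance caching or
# BRDF compression; widths are arbitrary, not from any paper.
widths = [64, 64, 64, 64, 3]
weights = [quantize_e4m3(
               rng.standard_normal((widths[i], widths[i + 1]),
                                   dtype=np.float32) * 0.1)
           for i in range(len(widths) - 1)]

def mlp(x):
    for w in weights[:-1]:
        x = np.maximum(x @ w, 0.0)  # ReLU hidden layers
    return x @ weights[-1]          # linear output (e.g. an RGB value)

out = mlp(rng.standard_normal((1, widths[0]), dtype=np.float32))
print(out.shape)  # (1, 3)
```

A net this small is only a few thousand multiply-accumulates per evaluation, which is why it can be run per-pixel inside a shader.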

NVIDIA has just put out some new graphics API extensions and samples/demos for loading a chunk of neural net weights and performing inference from within a shader.


Just buy any gaming card? Even something like the Jetson AGX Orin boasts 275 TOPS (but they add in all kinds of different subsystems to reach that number).


The Jetson is interesting!

Can you elaborate on how the TOPS value is inflated? What GPU would be the equivalent of the Jetson AGX Orin?


The problem with that TOPS figure is that they add in ~100 TOPS from the "Deep Learning Accelerator" (DLA) coprocessors, which have a lot of awkward limitations on what they can do (and terrible software support). The GPU itself is Ampere generation, but there is no exact consumer GPU equivalent.
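Back-of-envelope, using the approximate numbers above (both are rough marketing figures, not measurements):

```python
# NVIDIA's headline figure for the Jetson AGX Orin vs. the ~100 TOPS
# contributed by the hard-to-use DLA coprocessors (both approximate).
total_tops = 275
dla_tops = 100
gpu_tops = total_tops - dla_tops
print(gpu_tops)  # ~175 TOPS left for the Ampere GPU itself
```

So if your workload can't use the DLAs, you should budget against the GPU-only number, not the headline one.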


Most recent GPUs will do. An older RTX 4070 is over 400 TOPS, the new RTX 5070 is around 1000 TOPS, and the RTX 5090 is around 3600 TOPS.


Yeah, that's basically where I'm at with options. Not ideal for a cost sensitive application.


Tenstorrent Grayskull cards might be a fit. I think they're not entirely plug-and-play, though.



