
Each H100 can do about 60 TFLOPS of fp32 operations, while a single RTX 3080 can do roughly half that (just under 30). So a complete back-of-the-envelope answer would be 16x as long: 8 GPUs times 2x per GPU, since nanochat is targeting four hours with 8xH100.

64 hours isn’t too bad at all!

(An RTX 2080 can only do about 10 TFLOPS for fp32, so that would be 3x as long again, roughly 192 hours.)
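
For anyone who wants to plug in their own card, here is the same arithmetic as a rough sketch. It assumes training time scales inversely with peak fp32 throughput and nothing else, so it ignores memory capacity, mixed precision, and multi-GPU overhead. The function name and constants are made up for illustration; the TFLOPS figures are the rounded ones from above.

```python
# Back-of-the-envelope estimate: time scales inversely with peak fp32 TFLOPS.
# All figures are the rounded numbers from the comment, not measurements.
BASELINE_HOURS = 4.0   # nanochat target run time
BASELINE_GPUS = 8      # 8xH100
H100_TFLOPS = 60.0     # approximate fp32 throughput per H100


def estimated_hours(gpu_tflops, num_gpus=1):
    """Scale the baseline run time by the ratio of total peak throughput."""
    baseline_total = BASELINE_GPUS * H100_TFLOPS
    return BASELINE_HOURS * baseline_total / (gpu_tflops * num_gpus)


if __name__ == "__main__":
    for name, tflops in [("RTX 3080", 30.0), ("RTX 2080", 10.0)]:
        print(f"{name}: about {estimated_hours(tflops):.0f} hours")
```

Treat the result as a lower bound at best: a card with much less memory than an H100 may need smaller batches, or may not fit the run at all.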


