6.7B is pretty small, no? Do you even need offloading for that on a 3090? I'd be curious to see what's needed to run opt-30b or opt-66b with reasonable performance. The README suggests that even opt-175b should be doable with okay performance on a single NVIDIA T4 if you have enough RAM.
It is entirely possible to run 6.7B parameter models on a 3090, although I believe you need 16-bit weights: at fp16 the weights alone come to roughly 6.7B × 2 bytes ≈ 13.4 GB, which fits comfortably in the 3090's 24 GB of VRAM. I think you can squeeze a 20B parameter model onto the 3090 if you go all the way down to 8-bit quantization, since 20B × 1 byte ≈ 20 GB, which leaves only a few GB of headroom for activations and the KV cache.
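If it helps, here's a quick Python sketch of that arithmetic (the model sizes and the 24 GB figure are just illustrative assumptions; it counts weights only, so real usage runs a few GB higher once activations and the KV cache are in the mix):

    # Rough back-of-the-envelope: VRAM needed for the weights alone,
    # ignoring activations, KV cache, and framework overhead.

    def weight_gb(params_billions: float, bytes_per_param: int) -> float:
        """Approximate weight footprint in GB (decimal)."""
        # params_billions * 1e9 params * bytes / 1e9 bytes-per-GB
        return params_billions * bytes_per_param

    VRAM_3090_GB = 24  # assumed card; a T4 would be 16 GB

    for name, params in [("6.7B", 6.7), ("20B", 20.0), ("30B", 30.0), ("66B", 66.0)]:
        for precision, nbytes in [("fp16", 2), ("int8", 1)]:
            gb = weight_gb(params, nbytes)
            verdict = "fits" if gb <= VRAM_3090_GB else "needs offloading"
            print(f"{name} @ {precision}: ~{gb:.1f} GB -> {verdict} on a 24 GB 3090")

By this estimate, 30B needs ~30 GB even at int8, so anything at that scale or above is where offloading to CPU RAM starts to earn its keep.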