This is so fun. A question for you (or anyone else familiar with this topic): what hardware would you recommend for someone just getting into training GPT-2 models? Would a Radeon RX 580 be enough?



You cannot train any GPT-2 models with an AMD GPU. Nvidia's CUDA is still the de facto toolkit.

Either use Colab (free), or a preemptible GPU instance on GCE w/ the Deep Learning VM image (relatively cheap). Using consumer GPUs is a recipe for frustration.


>You cannot train any GPT-2 models with an AMD GPU.

It seems like you can. I know of at least one person who has fine-tuned the 1.5B model on a 16GB AMD GPU. I think u/sillysaurusx had some part in it, but apparently translating the code from CUDA was fairly easy.


There are also several people on Twitter who have mentioned training it on AMD GPUs.


Works fine on AMD. Grab a TensorFlow-ROCm image and go to town.
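
Before starting a long training run, it's worth checking that TensorFlow actually sees the card. A minimal sketch, assuming TensorFlow 2.1 or newer (where `tf.config.list_physical_devices` is available) and assuming the ROCm build reports AMD cards under the same "GPU" device type as the CUDA build. `visible_gpus` is just a helper name made up for this example.

```python
def visible_gpus():
    """Return the names of the GPUs TensorFlow can see.

    Returns an empty list if no GPU is visible or if TensorFlow
    is not installed in this environment.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return []
    # Assumption: ROCm builds expose AMD cards as "GPU" devices,
    # the same device type the CUDA builds use for Nvidia cards.
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    gpus = visible_gpus()
    print(gpus if gpus else "No GPU visible to TensorFlow")
```

If this prints an empty result inside the container, the usual suspects are the driver or the device passthrough rather than the GPT-2 code itself.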



