
Large-scale clusters are only needed for training. For inference, 8x NVIDIA A100 80GB (640 GB total) is enough to serve a ~300B-parameter model in fp16 (GPT-3 is 175B), or ~1200B parameters with 4-bit quantization (whose accuracy impact is small for large models), so a single machine is sufficient.
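The arithmetic behind those numbers is just weights-only memory: parameters times bits per parameter. A rough sketch (an assumption here is that we ignore KV cache, activations, and framework overhead, which would eat into the headroom in practice):

```python
def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Weights-only memory footprint in GB (ignores KV cache / activations)."""
    return params_billion * 1e9 * bits_per_param / 8 / 1e9

total_gpu_gb = 8 * 80  # 8x A100 80GB = 640 GB

print(weight_memory_gb(300, 16))   # 600.0 GB in fp16 -- fits in 640 GB
print(weight_memory_gb(1200, 4))   # 600.0 GB at 4-bit -- also fits
print(weight_memory_gb(175, 16))   # 350.0 GB for GPT-3-sized fp16
```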


