
Yeah. I just mean in terms of VRAM usage.


Yes, that's what I mean as well.

Its VRAM usage sits between a 7B and a 13B model, while its performance is comparable to a 70B model.

Tim Dettmers (the QLoRA creator) released code to run Mixtral 8x7B in 4GB of VRAM, and it still benchmarks better than Llama-2 70B.
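For context, here's a minimal sketch of what 4-bit loading looks like with Hugging Face transformers + bitsandbytes. This is not Dettmers' exact 4GB code (which additionally relies on offloading experts to CPU), and the model ID and prompt are just placeholders:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"  # placeholder checkpoint

    # 4-bit NF4 quantization config (QLoRA-style), computing in fp16
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,
    )

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=bnb_config,
        device_map="auto",  # spills layers that don't fit in VRAM onto CPU
    )

    inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=20)
    print(tokenizer.decode(out[0], skip_special_tokens=True))

With plain 4-bit quantization alone Mixtral still needs far more than 4GB; the 4GB figure comes from additionally streaming the MoE experts between CPU and GPU on demand.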



