By whom? The only comparison I have seen shows it losing to EXL2: https://oobabooga.github.io/blog/posts/gptq-awq-exl2-llamacp...



The issue is that benchmarks for LLMs or model formats are tough to compare, as there are many factors at play. But beyond ooba's comparison, many other sources recommend GPTQ or AWQ for GPU inference because they give better quality at the same quant level (AWQ apparently takes more VRAM, though, in exchange for that quality). Given how many models are available, I would take these tests with a grain of salt.
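For what it's worth, both formats load through the same Hugging Face transformers call once pre-quantized weights are published, which makes running your own quality/VRAM comparison cheap. A minimal sketch, assuming the autoawq and auto-gptq backends are installed; the repo names are illustrative, not from this thread:

    # A minimal sketch (not from the thread) of A/B-ing AWQ vs. GPTQ output
    # and VRAM with Hugging Face transformers. The repo names are assumed
    # examples; requires the autoawq and auto-gptq backends installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    for repo in ("TheBloke/Mistral-7B-v0.1-AWQ",    # assumed repo name
                 "TheBloke/Mistral-7B-v0.1-GPTQ"):  # assumed repo name
        # The quantization config ships in each repo's config.json, so
        # from_pretrained dispatches to the matching backend automatically.
        model = AutoModelForCausalLM.from_pretrained(repo, device_map="cuda")
        tok = AutoTokenizer.from_pretrained(repo)

        inputs = tok("The quick brown fox", return_tensors="pt").to("cuda")
        out = model.generate(**inputs, max_new_tokens=32)
        print(repo, tok.decode(out[0], skip_special_tokens=True))

        # Rough data point for the "AWQ takes more VRAM" claim above.
        print(repo, f"{model.get_memory_footprint() / 2**30:.2f} GiB")
        del model  # free before loading the next checkpoint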



