
> maximum-2B model

If you are resource limited, remember that you can also play with the quantization to fit more parameters into less RAM. Phi-3-mini [1] (a 3.8B model) is 7.64GB at full (16-bit floating point) precision, but only 2.39GB when quantized to 4 bits.
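The arithmetic behind those numbers is just parameters × bits per weight. A rough sketch (real GGUF files run a bit larger than the flat estimate because 4-bit formats also store per-block scale factors and keep some layers at higher precision):

```python
# Back-of-the-envelope size of model weights under quantization.
# This counts weights only; actual GGUF files add metadata and
# per-block quantization scales, so real files are somewhat larger.

def model_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in decimal GB."""
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

# Phi-3-mini: 3.8B parameters
print(model_size_gb(3.8, 16))  # 7.6 -- matches the 7.64GB fp16 file
print(model_size_gb(3.8, 4))   # 1.9 -- the 2.39GB q4 file is larger
                               # because of scales and mixed precision
```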

That being said, I haven't personally tested it, but I have heard good things about CodeGemma 2B [2].

[1] https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf...

[2] https://huggingface.co/google/codegemma-2b-GGUF



CodeGemma-2b does not come in an "-it" (instruction-tuned) variant, so it can't be used in a chat context. It is just a base model designed for tab completion of code in an editor, and I agree it is pretty good at that.
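"Tab completion" with a base model like this is usually done via fill-in-the-middle (FIM) prompting rather than chat. A minimal sketch of building such a prompt is below; the control tokens shown match what CodeGemma's model card documents, but treat the exact token names as an assumption and verify against the model card before relying on them:

```python
# Sketch: constructing a fill-in-the-middle (FIM) prompt for a base
# code model. The token names below are CodeGemma-style and are an
# assumption here -- check the model card for the exact spelling.

FIM_PREFIX = "<|fim_prefix|>"
FIM_SUFFIX = "<|fim_suffix|>"
FIM_MIDDLE = "<|fim_middle|>"

def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Ask the model to generate the code between prefix and suffix.

    The model completes text after FIM_MIDDLE; the editor splices that
    completion in at the cursor position.
    """
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"

# Cursor sits after "return " -- the model fills in the expression.
prompt = build_fim_prompt(
    prefix="def add(a, b):\n    return ",
    suffix="\n\nprint(add(2, 3))\n",
)
print(prompt)
```

This is why the base model works well in an editor plugin but falls flat in a chat UI: it was trained to continue code in this format, not to follow conversational instructions.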



