I fixed some chat template issues for llama.cpp and other inference engines! To ...

diggan · 2025-07-09T09:10:28 1752052228

> fixed some chat template issues

This seems to be a persistent issue with almost all weight releases, even from bigger companies like Meta.

Are the people who release these weights not testing them in various inference engines? Seems they make it work with Huggingface's Transformers library, then call it a day, but sometimes not even that.

danielhanchen · 2025-07-14T07:23:07 1752477787

Oh so chat template issues yes are quite pervasive sadly - for eg Llama as you mentioned, but also Qwen, Mistral, Google, the Phi team, DeepSeek - it's actually very common!

My take is large labs with closed source models also did have issues during the beginning, but most likely have standardized the chat template (for eg OpenAI using ChatML). The OSS community on the other hand keeps experimenting with new templates - for example adding tool calling causes a large headache. For example in https://unsloth.ai/blog/phi3 - we found many bugs in OSS models.

clarionbell · 2025-07-09T11:12:34 1752059554

No they don't. Why would they? Most of them are using a single inference engine, most likely developed inhouse. Or they go for something like vLLM, but llama.cpp especially is under their radar.

The reason is simple. There isn't much money in it. llama.cpp is free and targets lower end of the hardware spectrum. Corporations will run something else, or even more likely, offload the task to contractor.

danielhanchen · 2025-07-14T07:25:08 1752477908

The chat template issues are actually not on llama.cpp's side, but on all engines (including vLLM, SGLang etc) For eg see https://www.reddit.com/r/unsloth/comments/1l97eaz/deepseekr1... - which fixed tool calling for DeepSeek R1

segmondy · 2025-07-09T01:54:27 1752026067

doing the good work, thanks daniel!

danielhanchen · 2025-07-09T02:35:00 1752028500

Thank you!

v5v3 · 2025-07-09T13:15:31 1752066931

Thanks

danielhanchen · 2025-07-14T07:23:20 1752477800

Thanks!