Is the tokenizer the same? It may "work" without actually working optimally until llama.cpp patches in proper support for it.
And the instruct model was just uploaded.