
Is Llama 2 currently the way to go for fine-tuning your own models? Are there other open-source LLMs worth considering?


Depends on your use case. If you're doing pure classification, then smaller encoder-only models like DeBERTa might get you better performance at a fraction of the size (so cheaper inference).
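
To give a sense of how little ceremony that takes, here's a rough sketch of a DeBERTa classification fine-tune with the standard Hugging Face Trainer loop. The dataset and label count are stand-ins for your own data:

    from datasets import load_dataset
    from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                              TrainingArguments, Trainer)

    model_name = "microsoft/deberta-v3-base"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    # num_labels is a placeholder; set it to your own label count
    model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

    # swap in your own labeled dataset here
    dataset = load_dataset("imdb")

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, max_length=256)

    tokenized = dataset.map(tokenize, batched=True)

    args = TrainingArguments(
        output_dir="deberta-clf",
        per_device_train_batch_size=16,
        num_train_epochs=3,
        evaluation_strategy="epoch",
    )

    trainer = Trainer(
        model=model,
        args=args,
        train_dataset=tokenized["train"],
        eval_dataset=tokenized["test"],
        tokenizer=tokenizer,
    )
    trainer.train()

That runs comfortably on a single consumer GPU, which is a big part of the appeal over a 7B decoder for classification-only tasks.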

But if you need text generation and are ok with a 7B+ parameter model, Llama 2 or one of its derivatives is what I'd strongly recommend. The community around it is much larger than any of the alternatives so the tooling is better, and it's either state of the art or close to it on all evals when compared to other similarly-sized open models.

If you're comfortable sharing more details of the task you're trying to do I might be able to give more specific advice.


The Hugging Face Open LLM Leaderboard is mostly dominated by Llama 2 variants: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderb...

It depends a lot on what you're trying to do. If you have a focused use case for the type of fine-tuning you want, you can probably get away with one of the smaller models.

Another thing to look out for is Retrieval Augmented Generation (RAG). I don't see it in wide use yet, but it may turn out to be more useful than fine-tuning in a lot of situations.
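
The basic pattern is simpler than it sounds: embed your documents, retrieve the closest ones for each query, and paste them into the prompt. A toy sketch below; the embedding model choice is arbitrary and llm_generate is a hypothetical stand-in for whatever model you actually call:

    from sentence_transformers import SentenceTransformer, util

    # your document chunks; in practice these come from a chunked corpus
    docs = [
        "Our refund policy allows returns within 30 days.",
        "Support is available Monday through Friday, 9am-5pm.",
        "Enterprise plans include SSO and audit logging.",
    ]

    embedder = SentenceTransformer("all-MiniLM-L6-v2")
    doc_embeddings = embedder.encode(docs, convert_to_tensor=True)

    def answer(question, top_k=2):
        # retrieve the top_k most similar chunks for this question
        q_emb = embedder.encode(question, convert_to_tensor=True)
        hits = util.semantic_search(q_emb, doc_embeddings, top_k=top_k)[0]
        context = "\n".join(docs[h["corpus_id"]] for h in hits)
        prompt = (
            "Answer using only this context:\n"
            f"{context}\n\nQuestion: {question}"
        )
        return llm_generate(prompt)  # hypothetical: plug in your own LLM call here

The nice property is that updating the knowledge base is just re-embedding documents, with no retraining involved.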


RAG is THE rage right now. Everybody is talking about it in the enterprise world because they want to make all their legacy documents searchable.


We've found Flan-T5 to be useful for text-to-text (mostly document QA). Haven't done a lot of testing on fine-tuning yet though.
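
For reference, zero-shot document QA with it is only a few lines with transformers. A rough sketch, with an obviously made-up document:

    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    model_name = "google/flan-t5-large"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    # placeholder document and question; substitute your own text
    document = "The contract term is 24 months and renews automatically unless cancelled."
    question = "How long is the contract term?"

    prompt = (
        "Answer the question based on the document.\n\n"
        f"Document: {document}\n\nQuestion: {question}"
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=50)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))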


It's one of the most widely fine-tuned models right now. Take a look at this colab for fine-tuning it on your own dataset: https://github.com/mlabonne/llm-course/blob/main/Fine_tune_L...
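
If you just want the shape of that kind of recipe without opening the notebook, it's QLoRA with peft + trl, roughly like the sketch below. The dataset and hyperparameters are placeholders, and the API shown is what trl/transformers looked like around the time Llama 2 was released:

    import torch
    from datasets import load_dataset
    from transformers import (AutoTokenizer, AutoModelForCausalLM,
                              BitsAndBytesConfig, TrainingArguments)
    from peft import LoraConfig
    from trl import SFTTrainer

    base_model = "meta-llama/Llama-2-7b-hf"  # requires accepting Meta's license on HF

    # load the base model in 4-bit so it fits on a single GPU
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )

    tokenizer = AutoTokenizer.from_pretrained(base_model)
    tokenizer.pad_token = tokenizer.eos_token

    model = AutoModelForCausalLM.from_pretrained(
        base_model, quantization_config=bnb_config, device_map="auto"
    )

    # LoRA adapters: only these small matrices get trained
    peft_config = LoraConfig(
        r=16, lora_alpha=32, lora_dropout=0.05,
        task_type="CAUSAL_LM",
        target_modules=["q_proj", "v_proj"],
    )

    # placeholder instruction dataset; swap in your own with a "text" column
    dataset = load_dataset("timdettmers/openassistant-guanaco", split="train")

    trainer = SFTTrainer(
        model=model,
        train_dataset=dataset,
        peft_config=peft_config,
        dataset_text_field="text",
        max_seq_length=512,
        tokenizer=tokenizer,
        args=TrainingArguments(
            output_dir="llama2-7b-qlora",
            per_device_train_batch_size=4,
            gradient_accumulation_steps=4,
            learning_rate=2e-4,
            num_train_epochs=1,
            logging_steps=10,
        ),
    )
    trainer.train()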




