7B LLaMA is a terrible general-purpose model, but the finetunes are pretty good at very specific roles, like dialogue/roleplay, a dungeon-master bot, or even code completion.
The metrics are good though, perhaps placing this closer to 13B.
And 8K context is huge. When you can stuff that much example text in, it gives the model more to "latch onto," and it's also the point where you would start worrying about RAM/VRAM consumption for a ~13B model.
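For a rough sense of why 8K is where memory starts to bite, here's a back-of-the-envelope sketch assuming a standard LLaMA-13B shape (40 layers, hidden size 5120) and fp16 everywhere; the numbers are estimates, not measurements of any specific build:

```python
# Rough VRAM estimate for a LLaMA-13B-class model at 8K context (fp16 assumed).
n_layers, hidden, bytes_fp16 = 40, 5120, 2
n_params = 13e9

weights_gb = n_params * bytes_fp16 / 1e9            # ~26 GB just for the weights
kv_per_token = 2 * n_layers * hidden * bytes_fp16   # K and V cached for every layer
kv_cache_gb = kv_per_token * 8192 / 1e9             # ~6.7 GB at 8K tokens

print(f"weights: {weights_gb:.1f} GB, KV cache @ 8K: {kv_cache_gb:.1f} GB")
```

Quantizing the weights shrinks the first number a lot, but the KV cache still grows linearly with context length.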
You must have missed the memo... It's now super easy to extend the context of 2k llama models to 8k, 16k, or even 32k with just a small fine tune and a tweak to the code.
You still need the memory to be able to go that high, but it's totally doable.
But I assumed full training would give better perplexity for large contexts, and perhaps this method would be more effective at 16K+ with an 8K model to start with.
Possibly, but perplexity has been shown to decrease while fine-tuning a 2048 model on larger context sizes, even for outputs within its original context limit... so, more research needed.
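The "tweak to the code" mentioned above is the RoPE scaling / position-interpolation trick: compress the new, longer position indices so they land inside the range the base model was pretrained on, then do a short fine-tune at the longer length. A minimal sketch of the idea (not any particular repo's implementation; the scale factor is an assumption):

```python
import torch

def rope_frequencies(head_dim, max_pos, base=10000.0, scale=4.0):
    """Rotary embedding angles with position interpolation.

    scale=4.0 squeezes an 8K context onto the 0..2K position range a stock
    LLaMA was trained with; a small fine-tune then adapts the model to it.
    """
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    positions = torch.arange(max_pos).float() / scale  # this division is the whole "tweak"
    angles = torch.outer(positions, inv_freq)
    return torch.cos(angles), torch.sin(angles)

cos, sin = rope_frequencies(head_dim=128, max_pos=8192, scale=4.0)
```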
The datasets for models like Manticore, Chronos, or the infamous Pygmalion are more "secretive," but you can find the dataset-gathering scripts on GitHub or in community chats.
That blog post demonstrates that it's not "easily" finetunable, just that it's possible to finetune. There are many technical considerations beyond hardware (dataset formatting, training hyperparameter nuances) that make it inaccessible to a newbie experimenting with LLMs.
It's a rabbithole, and unfortunately there's no good shortcuts.
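To make the "dataset formatting" point concrete: instruction-tuning data is usually flattened into a single prompt/response template before tokenization, and a template that doesn't match between training and inference is a classic beginner failure mode. A hypothetical Alpaca-style record might look like this (field names and wording are illustrative, not from any specific dataset):

```python
import json

# Hypothetical instruction-tuning record, one JSON object per line.
record = {
    "instruction": "Summarize the following paragraph in one sentence.",
    "input": "LLaMA is a family of open-weight language models...",
    "output": "LLaMA is a set of openly released language models.",
}

# The prompt the model actually sees -- exact wording and spacing must be
# identical at training and inference time.
PROMPT = (
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n{output}"
)

print(PROMPT.format(**record))
print(json.dumps(record))
```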
It isn't script kiddie level, but it isn't hard. I finetuned a 15B parameter reddit bot with an afternoon of time and a day of training on a 3090. The bot got a few thousand karma in a couple of days before I turned it off (proof of concept done).
If all you have is an M1 or whatever, yeah, you need a real workstation, and depending on your use case ChatGPT might be cheaper/better.
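For scale, the kind of setup that makes a single-3090 fine-tune like that plausible is parameter-efficient training (LoRA/QLoRA) rather than a full fine-tune. A rough sketch with the Hugging Face peft library; the base model, rank, and target modules are assumptions, not the commenter's actual recipe:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "openlm-research/open_llama_13b"  # illustrative base model choice
tok = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base,
    load_in_4bit=True,    # quantize the frozen base weights to fit in 24 GB
    device_map="auto",
    torch_dtype=torch.float16,
)

lora = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # only small adapter matrices get trained
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% of the base weights
```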
Why have there been thousands of overnight AI/GPT startups and products in the last few months and NOT a single simple intuitive "fine tuning wizard" app? That seems like such an obvious glaring gap.
Because the ChatGPT API (and analogous competitors) is cheap enough that it's both faster and more cost effective to just use it instead of running your own model, with maybe some shenanigans to handle its shortcomings without increasing cost much, if at all. And that was before gpt-3.5-turbo-0613, which dropped the price further and is about 2-3x faster.
There are startups that do finetuning on your own data, but with zero guidance on how to preprocess that data and absurd costs (both upfront training and GPUs for serving inference), it's extremely difficult to justify from a customer's business perspective compared to just using an API.
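To put the comparison in concrete terms, the API path is a few lines of code and no GPUs to provision. A sketch using the openai Python client of that era (0.27.x-style interface); the prompt and parameters are made up:

```python
import openai  # pip install openai

openai.api_key = "sk-..."  # placeholder

resp = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-0613",
    messages=[
        {"role": "system", "content": "You are a customer-support assistant."},
        {"role": "user", "content": "My order arrived damaged, what do I do?"},
    ],
    temperature=0.2,
)
print(resp.choices[0].message["content"])
```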
> Why have there been thousands of overnight AI/GPT startups and products in the last few months and NOT a single simple intuitive "fine tuning wizard" app?
Vapourware GPT startup inc is valued at $2bn the afternoon after you form the company and buy your first macbook.
Actual usage of AI, fine tuning, etc.? I can offer you $100,000 for 30% of your company if you can demonstrate a fully working product.