llama3 at 8billion params is weak sauce for anything serious, it just isn't in t...

llama3 at 8billion params is weak sauce for anything serious, it just isn't in the same galaxy as Sonnet 3.5 or GPT-4o. The smaller and faster models like Phi are even worse. Once you progress past asking trivial questions to a point where you need to trust the output a bit more, its not worth effort in time, money and/or sweat effort to run a local model to do it.

A novice isn't going to know what they need because they don't know what they don't know. Try asking a question to LLaMA 3 at 8 billion and the same question to LLaMA 3 at 70 billion. There is a night and day difference. Sonnet, Opus and GPT-4o run circles around LLaMA 3 70b. To run LLaMA at 70 billion you need serious horse power as well, likely thousands of dollars in hardware investment. I say it again... the calculus in time, money, and effort isn't favorable to running open models on your own hardware once you pass the novice stage.

I am not ungrateful that the LLaMA's are available for many different reasons, but there is no comparison between quality of output, time, money and effort. The API's are a bargain when you really break down what it takes to run a serious model.