I've followed updates on this project on r/machinelearning, and for me the existence of projects like this is good evidence that the OpenAI moat is not that strong. It gives some hope that you are not going to need massive computers and GPUs to run decent language models.
The moat between OpenAI and projects like this is access to expensive computing resources, an organizational mandate to invest in that compute, and finally, access to high-quality training data. They were first to market, so they're only getting further ahead thanks to the invaluable real-world usage data.
I don’t think their moat ever had anything to do with innovative architecture. If anything, projects like this will widen the chasm since it’s easier for OpenAI to implement new architecture in their models than it is for an independent researcher to scale their ideas.
If projects like this are seeds then the problem is that OpenAI owns all the land.
OpenAI is not the be-all and end-all, and projects like this will only inspire more challengers. OpenAI copying this architecture would delegitimize them as hardcore innovators, which they're not.
I see this as a credible challenge because it most certainly is.
I agree, I sounded more defeatist than I intended. No doubt the moat is expansive, but it's not impossible to cross. At least until the regulators get involved. [I did it again, didn't I?]
Regulators exist so that the big guys won't have to worry that the small guys will out-innovate them. If somebody asks for regulation, they've become too big to innovate themselves.
Yeah, my sense is that the OpenAI moat right now is primarily the RLHF dataset. Most of the other pieces – foundational models and datasets, embeddings (roughly the plugins announcement today) – have generally already been commoditized. Getting that fine-tuning dataset is one of the last major hurdles for the ChatGPT geist.
The Open Assistant community ( https://open-assistant.io/ ) is building a crowdsourced dataset for RLHF, with an apparently high volume of high-quality contributions.
How is this "proven"? Can you point to some benchmarks that demonstrate this? Everything I've seen thus far has been fairly mediocre/terrible outside of a few cherry-picked prompt/response combinations.
They're awesome technical achievements and will likely improve, of course, but you're making some very grand statements.
There are benchmarks in the original LLaMA paper[1]. Specifically, on page 4 LLaMA 13B seems to beat GPT-3 on the BoolQ, HellaSwag, WinoGrande, ARC-e and ARC-c benchmarks (not by much, though). The examples you've seen were likely based on some form of quantisation or a poor prompt that degrades output. My understanding is that the only quantisation that doesn't seem to hurt the output is llm.int8 by Tim Dettmers. You should be able to run LLaMA 13B (8-bit quantised) on a 3090 or 4090 consumer-grade GPU as of now. Also, you'd need a prompt/sampling preset such as LLaMA precise[2] in order to get ChatGPT-like output.
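For what it's worth, here's a minimal sketch of what that setup looks like with Hugging Face transformers plus bitsandbytes for llm.int8 loading. The model path is a placeholder for locally converted weights, and the sampling values are only my recollection of the "LLaMA precise" preset, not anything official:

```python
# Sketch: load LLaMA 13B with 8-bit (llm.int8) weights via transformers + bitsandbytes.
# Assumes the weights were already converted to the Hugging Face format locally.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path/to/llama-13b-hf"  # hypothetical local path to converted weights

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    load_in_8bit=True,          # llm.int8 quantisation via bitsandbytes
    device_map="auto",          # place layers across available GPU memory
    torch_dtype=torch.float16,
)

prompt = "Explain, step by step, what a moat means in a business context."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sampling roughly in the spirit of the "LLaMA precise" preset (assumed values).
output = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=True,
    temperature=0.7,
    top_p=0.1,
    top_k=40,
    repetition_penalty=1.18,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

The point is just that the whole thing fits in ~24 GB of VRAM once the weights are 8-bit, which is why a single 3090/4090 is enough.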
I had a similar impression from what I saw. Maybe it does perform as well as GPT-3 on narrow tasks that it was explicitly fine-tuned on, but that similarity in performance seems to collapse as soon as you go off the beaten track and give it harder tasks that involve significant reasoning. Consistent with that I've seen a few different sources claim that a small model fine-tuned off the outputs of a large one would likely struggle with unfamiliar tasks or contexts that require transfer learning or abstraction.
After seeing how it actually performs in practice, it's hard to have confidence that these benchmarks are reliable measures of model quality.
OpenAI's moat is that the primary interfaces to the world are Google, Microsoft, and Apple, and they are attached to one of those. The others will almost certainly replicate their success, and we will have three primary AI interfaces: Google, Microsoft, and Apple. That's the power of a collusive monopoly as your moat.
I hope this project will thrive.