
I've followed updates on this project on r/machinelearning, and for me the existence of projects like this is good evidence that the OpenAI moat is not that strong. It gives some hope that you are not going to need massive compute clusters and GPUs to run decent language models.

I hope this project will thrive.



The moat between OpenAI and projects like this is access to expensive computing resources, an organizational mandate to invest in that compute, and finally, access to high-quality training data. They were first to market, so they're only getting further ahead thanks to the invaluable real-world usage data.

I don’t think their moat ever had anything to do with innovative architecture. If anything, projects like this will widen the chasm since it’s easier for OpenAI to implement new architecture in their models than it is for an independent researcher to scale their ideas.

If projects like this are seeds then the problem is that OpenAI owns all the land.


That's a bit defeatist, don't you think?

OpenAI is not the be-all and end-all, and projects like this will only inspire more challengers. OpenAI copying this architecture would delegitimize them as hardcore innovators, which they're not.

I see this as a credible challenge, because it most certainly is.


I agree, I sounded more defeatist than I intended. No doubt the moat is expansive, but it's not impossible to cross. At least until the regulators get involved. [I did it again, didn't I?]


Regulators exist so that the big guys won't have to worry about the small guys out-innovating them. If somebody asks for regulation, they've become too big to innovate themselves.


I don't think so. OpenAI doesn't do anything but AI. You can't ask GPT to actually do anything for you but generate output.

Siri, crap as it is, can actually do things for you. Alexa can do things for you.

The real value of this system is as a control interface and OpenAI has nothing to control.


Did you see the plugins announcement?


Yeah (after my comment) but the big players will compete instead of doing plug-ins.


But Microsoft does.


Yeah, my sense is that OpenAI's moat is primarily just the RLHF dataset right now. Most of the other things – foundation models and datasets, embeddings (roughly the plugins announcement today) – have generally already been commoditized. Just getting that dataset for fine-tuning is one of the last major hurdles for replicating the ChatGPT experience.


The Open Assistant community ( https://open-assistant.io/ ) is building a crowdsourced dataset for RLHF, with an apparently high volume of high-quality contributions.
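To make concrete what such a dataset feeds into: RLHF reward models are typically trained on preference pairs (a prompt plus a preferred and a rejected response) with a pairwise ranking loss. Below is a minimal PyTorch sketch of that idea; the record fields and the tiny reward model are hypothetical stand-ins, not anything Open Assistant actually ships.

import torch
import torch.nn as nn

# Hypothetical preference record, roughly the shape RLHF datasets use:
# a prompt plus one human-preferred ("chosen") and one "rejected" reply.
record = {
    "prompt": "Explain LoRA in one sentence.",
    "chosen": "LoRA adds small low-rank adapter matrices so only a few weights are trained.",
    "rejected": "LoRA is a kind of radio protocol.",
}

class TinyRewardModel(nn.Module):
    """Toy stand-in for a transformer reward head: maps a text embedding to a scalar score."""
    def __init__(self, dim: int = 16):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, emb: torch.Tensor) -> torch.Tensor:
        return self.score(emb).squeeze(-1)

def embed(text: str, dim: int = 16) -> torch.Tensor:
    # Placeholder embedding; a real setup would use the language model's hidden states.
    torch.manual_seed(hash(text) % (2**31))
    return torch.randn(dim)

model = TinyRewardModel()
chosen_score = model(embed(record["prompt"] + record["chosen"]))
rejected_score = model(embed(record["prompt"] + record["rejected"]))

# Standard pairwise (Bradley-Terry style) ranking loss: push the chosen score above the rejected one.
loss = -torch.nn.functional.logsigmoid(chosen_score - rejected_score)
loss.backward()
print(float(loss))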


Running models is one thing, but surely lots of power will remain with those who have the capital/hardware to train them?


As of yesterday you can train 33B parameter models on a single consumer 24GB VRAM GPU with https://github.com/johnsmith0031/alpaca_lora_4bit.
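For context, the trick is parameter-efficient fine-tuning on top of a 4-bit-quantised base model, so only small LoRA adapter matrices need gradients and optimizer state. Here's a rough sketch of that recipe using the Hugging Face transformers/peft/bitsandbytes stack as a stand-in; it is not the linked repo's actual API (which uses its own GPTQ loader), and the model id is purely illustrative.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

base = "decapoda-research/llama-7b-hf"  # illustrative model id only

# Load the base model with its weights quantised to 4 bits to fit in consumer VRAM.
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, quantization_config=bnb, device_map="auto")

# Keep the quantised weights frozen; train only low-rank adapters on the attention projections.
lora = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
                  lora_dropout=0.05, bias="none", task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% of the base model's weights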

It's already proven that 13B parameters is enough to beat GPT-3 175B quality. It's likely that 33B parameters is enough for GPT-4.


How is this "proven"? Can you point to some benchmarks that demonstrate this? Everything I've seen thus far has been fairly mediocre/terrible outside of a few cherry-picked prompt/response combinations.

They're awesome technical achievements and will likely improve, of course, but you're making some very grand statements.


There are benchmarks in the original LLaMA paper[1]. Specifically, on page 4 LLaMA 13B seems to beat GPT-3 in the BoolQ, HellaSwag, WinoGrande, ARC-e and ARC-c benchmarks (not by much, though). The examples you've seen are likely based on some form of quantisation or a poor prompt that degrades output. My understanding is that the only quantisation that doesn't seem to hurt the output is llm.int8 by Tim Dettmers. You should be able to run LLaMA 13B (8-bit quantised) on a 3090 or 4090 consumer-grade GPU as of now (a loading sketch follows the links below). Also, you'd need a prompt such as LLaMA Precise[2] in order to get ChatGPT-like output.

[1] https://arxiv.org/pdf/2302.13971v1.pdf

[2] https://www.reddit.com/r/LocalLLaMA/comments/11tkp8j/comment...
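For what it's worth, here is a minimal sketch of that 8-bit route using transformers with bitsandbytes' llm.int8 integration; the model path is illustrative, and you still need your own converted LLaMA weights.

# Minimal 8-bit loading sketch (llm.int8 via bitsandbytes); a 24GB GPU fits 13B this way.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "path/to/llama-13b-hf"  # illustrative; point this at your converted LLaMA weights

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    load_in_8bit=True,    # weights stored in int8, outlier features handled per llm.int8
    device_map="auto",    # lets accelerate place layers on the available GPU(s)
)

prompt = "Explain the difference between fine-tuning and pre-training in two sentences."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(out[0], skip_special_tokens=True))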


I had a similar impression from what I saw. Maybe it does perform as well as GPT-3 on narrow tasks it was explicitly fine-tuned on, but that parity seems to collapse as soon as you go off the beaten track and give it harder tasks that involve significant reasoning. Consistent with that, I've seen a few different sources claim that a small model fine-tuned on the outputs of a large one will likely struggle with unfamiliar tasks or contexts that require transfer learning or abstraction.

After seeing how it actually performs in practice, it's hard to have confidence that these benchmarks are reliable measures of model quality.


Alpaca is fine-tuning LLaMA, not training from scratch.


Holy moly. Just two nights ago I used that repo to train the 13B model on my RTX 4090. I thought 30B was weeks away.


OpenAI's moat is that the primary interfaces to the world are Google, Microsoft, and Apple, and they are attached to one of those. The others will almost certainly replicate their success, and we will have three primary AI interfaces: Google, Microsoft, and Apple. That's the power of a collusive monopoly as your moat.



