They trained on publicly-available (no signup with TOS agreement) data, on the theory that training is fair use.
You signed up and agreed to their TOS to use GPT-4.
The legal situations are not similar.
OTOH, lots of people are openly using GPT-4 in one way or another to develop models, though they might generally be at arm’s length from people intending to sell services.
So set up a shell company that uses GPT-4 to make public domain examples of what RLHF data would look like, and then the parent company takes that data afterwards since it's public domain. The shell company didn't break the TOS.
Great question! I don’t know the end game there. Maybe if they suspected their model was used they would sue, and in discovery find you used their model for training?
Maybe we don't need to worry: OpenLLaMA is being trained right now. It will be a commercially usable version of LLaMA.
> Update 05/22/2023
> We are happy to release our 700B token checkpoint for the OpenLLaMA 7B model and 600B token checkpoint for the 3B model. We’ve also updated the evaluation results. We expect the full 1T token training run to finish at the end of this week.
If they think they can prove you used it to develop a competing service, they could sue you for breaking the TOS and seek to recover the greater of the harm it did to their business or the amount of your profits from the service that are due to the use of GPT-4 in violation of the agreement.
What I don't understand - is there anything that would prevent Alice from publishing ChatGPT prompts and outputs for anyone to use, with no T&C attached?
Once Alice has done that, is there anything to prevent Bob, who has never agreed to the ChatGPT ToS, from using those prompts and outputs to train his own models to compete with OpenAI's?
(Purely from a contractual/legal/IP angle rather than ML/technical.)
Right, but a cease and desist usually relates to intellectual property or copyright matters, typically not TOS violations. Please correct me if I am mistaken.
Cease and desist can be used for any issues where the person or entity issuing the C&D thinks they have a legal right that is being violated and wants to put the violator on notice in the hopes of securing a change in behavior short of legal action.
I wonder how they reconcile naming themselves "Open"AI, and telling people that generated works can be used however they please, with forbidding their use for training a potential competitor.