
OpenAI should pay creators, but:

1. scraping the internet and making AI out of it

2. using the AI from #1 to create another AI

are not the same thing.


I agree; (2) seems much less problematic, since AI outputs are not copyrightable and since OpenAI gives up ownership of the outputs. [1]

So, if you really, really care about the ToS, then just never enter into a contract with OpenAI. Company A uses OpenAI to generate data and posts it on the open Internet. Company B scrapes the open Internet, including the data from Company A [2].

[1]: From OpenAI's Terms of Use: "Ownership of content. As between you and OpenAI, and to the extent permitted by applicable law, you (a) retain your ownership rights in Input and (b) own the Output. We hereby assign to you all our right, title, and interest, if any, in and to Output."

[2]: This is not hypothetical. When ChatGPT was first released, several big AI labs accidentally (and not so accidentally) trained on the contents of the ShareGPT website (a site made for sharing ChatGPT outputs). ;)


#1 destroys people's willingness to publish and unfairly hogs bandwidth / creates costs for small hosts

#2 makes a big corp a bit angry

Indeed not the same thing


Yes, they are different actions.

But arguably these actions share enough characteristics that it’s reasonable to place them in the same category. Something like: “products that exist largely/solely because of the work of other people”. The nonconsensual nature of this and the lack of compensation are what people understandably take issue with.

There is enough similarity that it evokes specific feelings about OpenAI when they suddenly find themselves on the other side of the situation.


Number 2 is already possible with open models. You can do distillation using Llama, whose creators likely did #1 to build their models (I'm not sure that's the case, though).
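
For concreteness, here is a minimal sketch of what that kind of distillation loop looks like, assuming the Hugging Face transformers/datasets libraries; the model names, prompts, and hyperparameters below are placeholders for illustration, not anyone's actual recipe:

    # Sequence-level distillation sketch: sample text from a "teacher" model,
    # then fine-tune a smaller "student" on those outputs with a plain LM loss.
    from datasets import Dataset
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer,
                              TrainingArguments)

    teacher_name = "gpt2-large"  # placeholder stand-in for a big teacher model
    student_name = "gpt2"        # placeholder student model

    tok = AutoTokenizer.from_pretrained(teacher_name)
    tok.pad_token = tok.eos_token
    teacher = AutoModelForCausalLM.from_pretrained(teacher_name)

    # 1. Generate synthetic training text from the teacher.
    prompts = ["Summarize the history of copyright law:",
               "Write a short explanation of model distillation:"]
    texts = []
    for p in prompts:
        ids = tok(p, return_tensors="pt").input_ids
        out = teacher.generate(ids, max_new_tokens=64, do_sample=True,
                               pad_token_id=tok.eos_token_id)
        texts.append(tok.decode(out[0], skip_special_tokens=True))

    # 2. Fine-tune the student on the teacher's outputs.
    student_tok = AutoTokenizer.from_pretrained(student_name)
    student_tok.pad_token = student_tok.eos_token
    student = AutoModelForCausalLM.from_pretrained(student_name)

    ds = Dataset.from_dict({"text": texts}).map(
        lambda ex: student_tok(ex["text"], truncation=True, max_length=128),
        remove_columns=["text"])

    trainer = Trainer(
        model=student,
        args=TrainingArguments(output_dir="distilled-student",
                               num_train_epochs=1,
                               per_device_train_batch_size=1),
        train_dataset=ds,
        data_collator=DataCollatorForLanguageModeling(student_tok, mlm=False))
    trainer.train()

Real distillation runs are vastly larger, and some match token-level logits rather than just training on sampled text, but the shape of the loop is the same.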

I'm genuinely not sure which one you think is worse (if either). (1) seems worse to me, but your reply suggests you might think (2) is worse.

Not that poster, but I think both are equally fine.

It would be funny if OpenAI complained about this, but at least on Twitter I don't see much whining about it from OpenAI employees. Sam publicly praised DeepSeek.

I do see some of them spreading the "they're hiding GPUs they got through sanction evasion" theory, which is disappointing, though.


You're right: (1) violates the rights of a large portion of the population, while (2) violates the rights of one company.

> are not the same thing.

You’re right. The second one is far more ethical. Especially when stealing from a thief.

Doesn’t Sam Altman keep parroting that they’re developing AI “for the good of humanity”? Well then, someone taking their model, improving on it, making it open source, making it consume less, and offering a cheaper API should make him delighted. Unless he *gasp* was full of shit the whole time. Who could have guessed?


> Doesn’t Sam Altman keep parroting they’re developing AI “for the good of humanity”?

“I don't want to live in a world where someone else makes the world a better place better than we do”

- Gavin Belson


#1 is stealing from all the average joes who have ever lived on earth.

#2 is taking advantage of closedAI.

They are indeed different.


> 2. using the AI from #1 to create another AI

2. scraping the AI from #1 and making AI out of it


Yeah, #1 is way worse and #2 falls under “turnabout is fair play.”

