
As far as I'm aware, training does not currently constitute "piracy".

It's fine to advocate for a redefinition but be explicit about it.




I think the point here is about how the training data was procured, in violation of copyright laws ("piracy"), rather than a claim that the training itself is piracy.

The suspicion[0] is that OpenAI trained their models on a large text dump including libgen (in the so-called "books2").

If a person downloads a book from Library Genesis, they're a pirate; if OpenAI does it, so are they.

[0] https://twitter.com/theshawwn/status/1320282152689336320



