> distributing that model is, in fact, copyright infringement
Is it? If I distributed the digits of pi (to the umpteenth billion decimals), they would theoretically contain copyrighted material somewhere in those digits.
Distributing the copyrighted material itself is the infringement, but not when the data is _meant_ to produce other effects and it is reasonable to expect that the data will be used for some purpose _other than_ replicating the copyrighted works.
> the training data provides value.
And so does a textbook. A student reading the book (regardless of how that book was obtained, paid for or not) does not pay royalties on the knowledge obtained.