Researchers in our lab created a huge dataset of facial expressions from images on the web, annotated it, and published the image URLs and the annotations for research, but they made sure to search only for images with proper licenses. I don't think that you are allowed to just go download any old image and train on it. I understand that many, many people do it, but it's not legal (as far as I know; please correct me if I'm wrong).
> I don't think that you are allowed to just go download any old image and train on it.
My understanding, as someone who has studied ML for two years, is that in the US you are allowed to go download any old image, train on it, and then release the model, as long as the outputs are "sufficiently transformative."
To be clear, "transformative" here means not merely "altered" but "repurposed": if the new work is something people could feasibly use instead of the old work (harming the market for the author's original), it isn't "transformative".
Yes. For example, arfa ran into this question when launching https://thisfursonadoesnotexist.com/. Lots of furry artists had exactly the same concerns with his work there, but that work is decisively transformative.
Copilot seems ... well, less transformative. I'm still not sure how to feel.