ChatGPT was trained on data scraped from various sources across the internet.
> The model was trained using text databases from the internet. This included a whopping 570GB of data obtained from books, webtexts, Wikipedia, articles and other pieces of writing on the internet. To be even more exact, 300 billion words were fed into the system.
I believe it's unfair to these sources that ChatGPT drives away their clicks, and in turn the ad income that would come with them.
Scraping data seems fine in contexts where clicks aren't driven away from the very sites the data was scraped from. But in ChatGPT's case, it seems really unfair to these sources and the work the authors put in, since people would no longer even attempt to visit them.
Could this start to break the ad-based model of the internet, where many sites rely on ad income to keep their servers running?
People get internet hostile at me for this question, but it really is that simple. They've automated you, and it's definitely going to be a problem, but if it's acceptable for your brain to do the same thing, you're going to have to find a different angle to attack it than "fairness".