Not just Meta: 40 EU companies urged the EU to postpone the rollout of the AI Act by two years due to its unclear nature. This code of practice is voluntary and goes beyond what is in the act itself. The EU published it with the message that there would be less scrutiny if you voluntarily sign up for the code of practice. Meta would face scrutiny on all fronts anyway, so signing something voluntary does not seem a plausible case.
One of the key aspects of the act is that a model provider is responsible if downstream partners misuse the model in any way. For open source, that is a very hard requirement to meet[1].
> GPAI model providers need to establish reasonable copyright measures to mitigate the risk that a downstream system or application into which a model is integrated generates copyright-infringing outputs, including through avoiding overfitting of their GPAI model. Where a GPAI model is provided to another entity, providers are encouraged to make the conclusion or validity of the contractual provision of the model dependent upon a promise of that entity to take appropriate measures to avoid the repeated generation of output that is identical or recognisably similar to protected works.
The quoted text makes sense when you understand that the EU provides a carveout for training on copyright-protected works without a license. It's quite an elegant balance they've suggested, despite the challenges it fails to avoid.
Copyright is not a God-given right. It's an economic incentive created by government to make desired behavior (writing and publishing books) profitable.
Yes, 100%. And that's why throwing copyright selectively in the bin now, when there's an ongoing massive transfer of wealth from creators to mega corps, is so surprising. It's almost as if governments were only protecting the economic interests of creators when the creators were powerful (e.g. movie studios), going after individuals for piracy and DRM circumvention. Now that the mega corps are the ones pirating at scale, they get a free pass through a loophole designed for individuals (fair use).
Anyway, the show must go on, so we're unlikely to see any reversal of this. It's a big experiment, and not necessarily anything that will benefit even the model providers themselves in the medium term. It's clear that the "free for all" policy on grabbing whatever data you can get is already having chilling effects: from artists and authors not publishing their works publicly, to the locking down of the open web with anti-scraping measures. We're basically entering an era of adversarial data management, with incentives to exploit others for data while protecting your own data from others accessing it.
Why? Copyright is 1) presented as being there to protect the interests of the general public, not creators, and 2) the Statute of Anne, the birth of modern copyright law, protected printers, that is, "big business", over creators anyway, so even that has largely always been a fiction.
But it is also increasingly dubious that the public gets a good deal out of copyright law anyway.
> From artists and authors not publishing their works publicly
The vast majority of creators have never been able to get remotely close to making a living from their creative work, and instead, when factoring in time, often lose money hand over fist trying to get their works noticed.
I generally let it slide because these copyright discussions tend to be about America, and as such it can be assumed American law and what it inherits from British law is what pertains.
>Copyright is 1) presented as being there to protect the interests of the general public, not creators,
Yes, in the U.S. In the EU, creators have moral rights to their works, and the law exists to protect their interests.
There are actually moral rights and rights of exploitation; in the EU you can transfer the latter but not the former.
>But it is also increasingly dubious that the public gets a good deal out of copyright law anyway.
In the EU's view of copyright the public doesn't need to get a good deal, the creators of copyrighted works do.
> There are actually moral rights and rights of exploitation, in EU you can transfer the latter but not the former.
And when we talk about copyright we generally talk about the rights of exploitation, where the rationale used today is about the advancement of arts and sciences, a public benefit. There's a reason the name in English is copy-right, while the other Germanic languages focus more on the work: in the Anglosphere the notion of moral rights as separate from rights of exploitation is well outside the mainstream.
> In the EU's view of copyright the public doesn't need to get a good deal, the creators of copyrighted works do.
Most individual nations' copyright law still does uphold the pretence of being for the public good, however. Without that pretence, there is no moral basis for restricting the rights of the public the way copyright law does.
But it has nevertheless been abundantly clear all the way back to the Statute of Anne that any talk of either public goods or rights of exploitation for the creator are excuses, and that these laws if anything mostly exist for the protection of business interests.
>Most individual nations' copyright law still does uphold the pretence of being for the public good, however. Without that pretence, there is no moral basis for restricting the rights of the public the way copyright law does.
I of course do not know all the individual EU countries' rules, but my understanding was that the EU's view was what it was because it derived, at least in part, from the prior understanding of its member nations. So the earlier French laws, before ratification and implementation of the EU directive on author's rights in Law no. 92-597 (1 July 1992), were also focused on the understanding that creators have creator's rights and that protecting these was the purpose of copyright law, and this pattern generally held throughout EU lands (at least any lands currently in the EU; I suppose pre-Brexit this was not the case).
You probably have some other examples but in my experience the European laws have for a long time held that copyright exists to protect the rights of creators and not of the public.
> So the earlier French laws before ratification and implementation of the EU directive on author's rights in Law # 92-597 (1 July 1992) were also focused on the understanding of creators having creator's rights
French law, similar to e.g. Norwegian and German law, separated moral and proprietary rights.
Moral rights are not particularly relevant to this discussion, as they relate specifically to rights to e.g. be recognised as the author, and to protect the integrity of a work. They do not relate to actual copying and publication.
What we call copyright in English is largely proprietary/exploitation rights.
The historical foundation of the latter is firmly one of first granting rights on a case-by-case basis, often to printers rather than creators, and then the Statute of Anne, which explicitly stated the goal of "encouragement of learning" right in the title of the act. This motivation was later made explicit in, e.g., the US constitution.
Since you mention France, the National Assembly after the French Revolution took the stance that works by default were public property, and that copyright was an exception, in the same vein as per the Statute of Anne and US Constitution ("to promote the progress of science and useful arts").
Depository laws etc., which are near universal, are also firmly rooted in this view that copyright is a grant provided on a quid pro quo basis: the work needs to be secured for the public for the future, irrespective of continued commercial availability.
> Why? Copyright is 1) presented as being there to protect the interests of the general public, not creators
Doesn't matter; both the "public interest" and "creator rights" arguments have the same impact: you're either hurting creators directly, or you're hurting the public benefit when you remove or reduce the economic incentives. The transfer of wealth and the irreversible damage are there, whether you care about Lars Ulrich's gold toilet or our future kids, whose culture and libraries need protecting from adversarial and cynical tech moguls.
> 2) Statute of Anne, the birth of modern copyright law, protected printers - that is "big business" over creators anyway, so even that has largely always been a fiction.
> The vast majority of creators have never been able to get remotely close to make a living from their creative work
Nobody is saying copyright is perfect. We’re saying it’s the system we have and it should apply equally.
Two wrongs don’t make a right. Defending the AI corps on basis of copyright being broken is like saying the tax system is broken, so therefore it’s morally right for the ultra-rich to relocate assets to the Caymans. Or saying that democracy is broken, so it’s morally sound to circumvent it (like Thiel says).
You've put into words what I've been internally struggling to voice. Information (on the web) is a gas, it expands once it escapes.
In limited, closed systems, it may not escape, but all it takes is one bad (or hacked) actor and the privacy of it is gone.
In a way, we used to be "protected" because it was "too big" to process, store, or access "everything".
Now, especially with an economic incentive to vacuum literally all digital information, and many works being "digital first" (even a word processor vs a typewriter, or a PDF that is sent to a printer instead of lithograph metal plates)... is this the information Armageddon?
Copyright is the backbone of modern media empires. It allows both small creators and massive corporations to seek rent on works, but since works stay under copyright for a century, it's quite nice for corporations.
It is a "right" created by law, is the point. This is not a right that is universally recognised, nor one that has existed since time immemorial, but a modern construction of governments that governments can choose to change or abolish.
What is a right that has existed since time immemorial? Generally, rights that have existed "forever" are codified rights that are, in the codification, described as being eternal. Hence Jefferson's reference to inalienable rights, which probably came as some surprise to King George III.
On edit: if we had a soundtrack, the Clash's "Know Your Rights" would be playing in this comment.
Except of course that the point is that copyright is generally not described this way.
See my more extensive overview in another response.
The history of copyright law is one where it is regularly described either in the debates around the passing of the laws, or in the laws themselves, as a utilitarian bargain between the public and creators.
E.g. since you mention Jefferson and "inalienable": notably, copyright in the US is not an inalienable right at all, but a right that the US constitution grants Congress the power to enact "to promote the progress of science and useful arts". It says nothing about being an inalienable or eternal right of citizens.
And before you bring up France, or other European law, I suggest you read the other comment as well.
But to add more than I did in the other comment: e.g. in Norway, the first paragraph of the copyright law ("Lov om opphavsrett til åndsverk mv.") gives three motivations: 1 a) to grant rights to creators to give incentives for cultural production, 1 b) to limit those rights to ensure a balance between creators' rights and public interests, 1 c) to provide rules to make it easy to arrange use of copyrighted works.
There's that argument about incentives and balancing public interests again.
This is the historical norm. It is not present in every copyright law, but they share the same historical nucleus.
Early copyright was a take on property rights, applied to supposed labour of the soul and subsequent ownership of its fruits.
Copyright stems from the 15-1600s, while utilitarianism is a mid-1800s kind of thing. The move from explicitly religious and natural rights motivations to language about "intellect" and hedonism is rather late, and I expect it to be tied to an atheist and utilitarian influence from socialist movements.
The first modern copyright law dates to 1709, and was most certainly not a "take on property rights". Neither were pre-Statute of Anne monopoly grants.
I can find nothing to suggest a "religious and natural rights" motivation, nor any language about "intellect and hedonism".
The Statute of Anne, which specifically gives a utilitarian reason 150 years before your "mid-1800s" estimate, also predates socialism by a similar amount of time, and dates to a time when there certainly wasn't any major atheist influence either, so this is utterly ahistorical nonsense.
"I. Whereas printers, booksellers, and other persons have of late frequently taken the liberty of printing, reprinting, and publishing, or causing to be printed, reprinted, and published, books and other writings, without the consent of the authors or proprietors of such books and writings, to their very great detriment, and too often to the ruin of them and their families: for preventing therefore such practices for the future, and for the encouragement of learned men to compose and write useful books; may it please your Majesty, that it may be enacted, and be it enacted by the Queen's most excellent majesty, by and with the advice and consent of the lords spiritual and temporal, and commons, in this present parliament assembled, and by the authority of the same;"
This is all about ownership, and protecting the state from naughty texts being printed, which was the actual driving force behind the legislation. There is nothing utilitarian in this.
Copyright originates in the Statute of Anne[0]; its creation was therefore within living memory when the United States declared their independence.
No rights have existed 'forever', and both the rights and the social problems they intend to resolve are often quite recent (assuming you're not the sort of person who's impressed by a building that's 100 years old).
George III was certainly not surprised by Jefferson's claim to rights, given that the rights he claimed were copied (largely verbatim) from the Bill of Rights 1689[1]. The poor treatment of the Thirteen Colonies was due to Lord North's poor governance, the rights and liberties that the Founding Fathers demanded were long-established in Britain, and their complaints against absolute monarchy were complaints against a system of government that had been abolished a century before.
You should probably reread the text I responded to and then what I wrote, because you seem to think I believe there are rights that are not codified by humans in some way, and you are on a mission to correct my mistake.
>George III was certainly not surprised by Jefferson's claim to rights, given that the rights he claimed were copied (largely verbatim) from the Bill of Rights 1689
to repeat: Hence Jefferson's reference to inalienable rights, which probably came as some surprise to King George III.
inalienable modifies rights here, if George is surprised by any rights it is inalienable ones.
>Copyright originates in the Statute of Anne[0]; its creation was therefore within living memory when the United States declared their independence.
The title of the post is "Meta says it won't sign Europe AI agreement"; I was under the impression that it had something to do with how the EU sees copyright, not how U.S. and British common law sees it.
Hence multiple comments referencing the EU. But I see I must give up, and the U.S. must have its way; evidently the Europe AI agreement is all about how copyright works in the U.S., prime arbiter of all law around the globe.
At any rate, if a government were to casually violate rights it has heretofore described as inviolate (eternal, inalienable, or, in the case of copyright, moral and intrinsic), that government would be declaring the nullification of its own previously stated rules.
Not to say this doesn't happen; I believe we can see it happening in some places in the world right now. But these are classes of laws that cannot "just" be changed at the government's whim, and in the EU copyright law is evidently one of those classes of law, strange as it seems.
A lot of cultures have not historically considered artists’ rights to be a thing and have had it essentially imposed on them as a requirement to participate in global trade.
Even in Europe, copyright has been protected for only the last 250 years, and over the last 100 years it's been constantly updated to take new technologies into consideration.
The only real mistake the EU made was not regulating Facebook when it mattered. That site caused pain and damage to entire generations. Now it's too late. All they can do is try to stop Meta and the rest of the lunatics from stealing every book, song and photo ever created, just to train models that could leave half the population without a job.
Meta, OpenAI, Nvidia, Microsoft and Google don't care about people. They care about control: controlling influence, knowledge and universal income. That's the endgame.
Just like in the US, the EU has brilliant people working on regulations. The difference is, they're not always working for the same interests.
The world is asking for US big tech companies to be regulated more now than ever.
Facebook's power comes from how it gathered and monetised data, how it acquired rivals like Instagram and WhatsApp, and how it locked in network effects.
If regulators had blocked those acquisitions or enforced stricter antitrust and data privacy rules, there's a chance the social media landscape today would be more competitive. Politicians and regulators either received some kind of incentive or simply didn't get it. They didn't see how dangerous Zuck's greedy algorithms would become. They thought it was just a social site. They had no idea what Facebook employees were building behind the scenes. By the time they realised, it was already too late.
China was the only one that acted. The US and EU looked the other way. If they'd stepped in back in 2009 with rules on privacy, neutrality, and transparency, today's internet could've been a lot more open and competitive.
To be fair, "copy"right has only been needed for as long as it's been possible to copy things. In the grand scheme of human history, that technology is relatively new.
Copyright predates mechanical copying. However, people used to have to petition a King or similar to be granted a monopoly on a work, and the monopoly was specific to that work.
The Statute of Anne, the first recognisable copyright law in anything remotely the modern sense, dates to 1709, long after the invention of movable type. Mechanical in the sense of printing with a press using movable type, not anything highly automated.
Having to petition for monopoly rights on an individual basis is nothing like copyright, where the entire point is to avoid having to ask for exceptions by creating a right.
"Intellectual property" only exists because society collectively allows it to. It's not some inviolable law of nature; society (or the government that represents it) can revoke it or give it away.
Yes, that is why (most?) anarchists consider property that one is not occupying and using to be fiction, held up by the state. I believe this includes intellectual property as well.
A person being alive is not at all similar to the concept of intellectual property existing. The former is a natural phenomenon, the latter is a social construct.
Sounds like a reasonable guideline to me. Even for open source models, you can add a license term that requires users of the open source model to take "appropriate measures to avoid the repeated generation of output that is identical or recognisably similar to protected works"
This is European law, not US. Reasonable means reasonable and judges here are expected to weigh each side's interests and come to a conclusion. Not just a literal interpretation of the law.
> This is European law, not US. Reasonable means reasonable and judges here are expected to weigh each side's interests and come to a conclusion. Not just a literal interpretation of the law.
I think you've got civil and common law the wrong way round :). US judges have _much_ more power to interpret law!
It is European law, as in EU law, not law from a European state. In EU matters, the teleological interpretation, i.e. intent, applies:
> When interpreting EU law, the CJEU pays particular attention to the aim and purpose of EU law (teleological interpretation), rather than focusing exclusively on the wording of the provisions (linguistic interpretation).
> This is explained by numerous factors, in particular the open-ended and policy-oriented rules of the EU Treaties, as well as by EU legal multilingualism.
> Under the latter principle, all EU law is equally authentic in all language versions. Hence, the Court cannot rely on the wording of a single version, as a national court can, in order to give an interpretation of the legal provision under consideration. Therefore, in order to decode the meaning of a legal rule, the Court analyses it especially in the light of its purpose (teleological interpretation) as well as its context (systemic interpretation).
In the US, for most laws, and most judges, there's actually much less power to interpret law. Part of the benefit of the common law system is to provide consistency and take that interpretation power away from judges of each case.
My claim is that at a system-level, judges in the US have more power to interpret laws. Your claim is that "in each individual case, the median amount of interpretation is lower in the US than the EU". But you also concede that this is because the judges rely on the interpretations of _other_ judges in cases (e.g. if the Supreme Court makes a very important decision which clarifies how a law should be interpreted, and this is then carried down throughout the rest of the justice system, then this means that there has been a really large amount of interpretation).
Instead of a license term you can put that in your documentation - in fact that is exactly what the code of practice mentions (see my other comment) for open source models.
An open source cocaine production machine is still an illegal cocaine production machine. The fact that it's open source doesn't matter.
You seem to not have understood that different forms of appliances need to comply with different forms of law. And you being able to call it open source or not doesn't change anything about its legal aspects.
And every law written is a compromise between two opposing parties.
I'm not sure what you mean. The statement "open source models can just add a clause to the terms of use, restricting how it can be used" is false, because in that case it won't be open source. Does it mean that open source needs to comply with the laws? Absolutely, but that might mean that open source models are effectively illegal.
Except that it's seemingly impossible to protect against prompt injection. The cat is out of the bag. Much like a lot of other legislation (e.g. the cookie law, or being responsible for user-generated content when millions of items are posted per day), it's entirely impractical, albeit well-meaning.
I don't think the cookie law is that impractical? It's easy to comply with by just not storing non-essential user information. It would have been completely nondisruptive if platforms agreed to respect users' defaults via browser settings, and then converged on a common config interface.
It was made impractical by ad platforms and others who decided to use dark patterns, FUD and malicious compliance to deceive users into agreeing to be tracked.
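The "respect users' defaults via browser settings" idea already has concrete signals a server could honour today. Below is a minimal sketch, in Python, of a server deciding which cookies to set based on those signals; the `DNT` and `Sec-GPC` request headers are real browser-level opt-out mechanisms, while the helper and cookie names are hypothetical illustrations.

```python
# Sketch only: helper and cookie names are hypothetical; the DNT and
# Sec-GPC request headers are real browser-level opt-out signals.

def analytics_allowed(headers):
    """Return False if the browser signals an opt-out via Do Not Track
    (DNT: 1) or Global Privacy Control (Sec-GPC: 1)."""
    normalized = {k.lower(): v.strip() for k, v in headers.items()}
    return normalized.get("dnt") != "1" and normalized.get("sec-gpc") != "1"

def cookies_to_set(headers, consent_recorded=False):
    """Essential cookies need no consent; analytics cookies are set only
    with recorded consent AND no browser-level opt-out signal."""
    essential = ["session_id"]   # strictly necessary for the service
    analytics = ["_ga"]          # non-essential, consent required
    if consent_recorded and analytics_allowed(headers):
        return essential + analytics
    return essential
```

Under a scheme like this, a user who sets the signal once in the browser would never need to see a banner; the site would simply fall back to the essential-only default.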
I recently received an email[0] from a UK entity with an enormous wall of text talking about processing of personal information, my rights and how there is a “Contact Card” of my details on their website.
But with a little bit of reading, one could ultimately summarise the enormous wall of text simply as: “We’ve added your email address to a marketing list, click here to opt out.”
The huge wall of text email was designed to confuse and obfuscate as much as possible with them still being able to claim they weren’t breaking protection of personal information laws.
If you ask someone if they killed your dog and they respond with a wall of text, then you’re immediately suspicious. You don’t even have to read it all.
The same is true of privacy policies. I’ve seen some companies have very short policies I could read in less than 30s, those companies are not suspicious.
Companies won't be talking about killing your dog, though, so if you project this onto text that lawyers may have required, that suspicion may be completely unfounded pigeonholing.
Long policies can be needed depending on the litigiousness of the working environment. I used to work in an industry where that was beyond common, and it was required to be defensible in court. Accounting for factors that aren't commonly known like biases of judges towards individuals vs companies.
Because they track usage stats for site development purposes, and there was no convergence on an agreed upon standard interface for browsers since nobody would respect it. Their banners are at least simple yes/no ones without dark patterns.
But yes, perhaps they should have worked with e.g. Mozilla to develop some kind of standard browser interface for this.
This is actually not true. I just read the European commission's cookie policy.
The main reason they need the banner is because they show you full page popups to ask you to take surveys about unrelated topics like climate action. They need consent to track whether or not you've taken these surveys
Their banner is just as bad as any other I have seen, it covers most of the page and doesn't go away until I click yes. If you're trying to opt out of cookies on other sites, that's probably why it takes you longer (just don't do that).
They create profiles of visitors, e.g. through polls, and that is what requires consent.
It's usually a click or two to "reject all" or similar with serious organisations. Some German corporations are nasty and conflate paywalls with data collection and processing consent.
It is impractical for me as a user. I have to click on a notice on every website on the internet before interacting with it, and these are often very obtuse and don't have a "reject all" button, but a "manage my choices" button which takes you to an even more convoluted menu.
Instead of exactly as you say: a global browser option.
As someone who has had to implement this crap repeatedly - I can’t even begin to imagine the amount of global time that has been wasted implementing this by everyone, fixing mistakes related to it and more importantly by users having to interact with it.
Yeah, but the only reason for this time wastage is that website operators refuse to accept what would become the fallback default of "minimal", for which they would not need to seek explicit consent. It's a kind of arbitrage, like those scammy websites that send you into redirect loops with enticing headlines.
The law is written to encourage such defaults if anything, it just wasn't profitable enough I guess.
Not even EU institutions themselves are falling back on defaults that don't require cookie consent.
I'm constantly clicking away cookie banners on UK government or NHS (our public healthcare system) websites. The ICO (UK privacy watchdog) requires cookie consent. The EU Data Protection Supervisor wants cookie consent. Almost everyone does.
And you know why that is? It's not because they are scammy ad funded sites or because of government surveillance. It's because the "cookie law" requires consent even for completely reasonable forms of traffic analysis with the sole purpose of improving the site for its visitors.
This is impractical, unreasonable, counterproductive and unintelligent.
> It's because the "cookie law" requires consent even for completely reasonable forms of traffic analysis with the sole purpose of improving the site for its visitors
Yup. That's what those 2000+ "partners" are all about if you believe their "legitimate interest" claims: "improve traffic"
This is a personal decision to be made by the data "donor".
The NHS website cookie banner (which does have a correct implementation in that the "no consent" button is of equal prominence to the "mi data es su data" button) says:
> We'd also like to use analytics cookies. These collect feedback and send information about how our site is used to services called Adobe Analytics, Adobe Target, Qualtrics Feedback and Google Analytics. We use this information to improve our site.
In my opinion, it is not, as described, "completely reasonable" to consider such data hand-off to third parties as implicitly consented to. I may trust the NHS but I may not trust their partners.
If the data collected is strictly required for the delivery of the service and is used only for that purpose and destroyed when the purpose is fulfilled (say, login session management), you don't need a banner.
The NHS website is in a slightly tricky position, because I genuinely think they will be trying to use the data for site and service improvement, at least for now, and they hopefully have done their homework to make sure Adobe, say, are also not misusing the data. Do I think the same from, say, the Daily Mail website? Absolutely not, they'll be selling every scrap of data before the TCP connection even closes to anyone paying. Now, I may know the Daily Mail is a wretched hive of villainy and can just not go there, but I do not know about every website I visit. Sadly the scumbags are why no-one gets nice things.
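The "strictly required, purpose-limited, destroyed afterwards" case mentioned above can be made concrete. Here is a minimal sketch (all names hypothetical) of a session store that keeps only what login-session management needs and destroys the data when the session ends or expires, i.e. the case where no banner is needed:

```python
import time

# Sketch only: all names are hypothetical. The point is purpose limitation:
# store only what session management strictly needs, and destroy it when
# the purpose (the login session) is fulfilled or expires.

class SessionStore:
    def __init__(self, ttl_seconds=1800):
        self._ttl = ttl_seconds
        self._sessions = {}

    def create(self, session_id, user_id):
        # No analytics fields, no tracking identifiers: just what is
        # strictly required to keep the user logged in.
        self._sessions[session_id] = {
            "user_id": user_id,
            "expires_at": time.monotonic() + self._ttl,
        }

    def get_user(self, session_id):
        entry = self._sessions.get(session_id)
        if entry is None:
            return None
        if time.monotonic() >= entry["expires_at"]:
            del self._sessions[session_id]  # purpose fulfilled: destroy
            return None
        return entry["user_id"]

    def logout(self, session_id):
        self._sessions.pop(session_id, None)  # destroy on explicit end
```

The contrast with an analytics hand-off is that nothing here ever leaves the service or outlives its purpose, which is exactly why this category escapes the consent requirement.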
>This is a personal decision to be made by the data "donor".
My problem is that users cannot make this personal decision based on the cookie consent banners because all sites have to request this consent even if they do exactly what they should be doing in their users' interest. There's no useful signal in this noise.
The worst data harvesters look exactly the same as a site that does basic traffic analysis for basic usability purposes.
The law makes it easy for the worst offenders to hide behind everyone else. That's why I'm calling it counterproductive.
[Edit] Wrt NHS specifically - this is a case in point. They use some tools to analyse traffic in order to improve their website. If they honour their own privacy policy, they will have configured those tools accordingly.
I understand that this can still be criticised from various angles. But is this criticism worth destroying the effectiveness of the law and burying far more important distinctions?
The law makes the NHS and the Daily Mail look exactly the same to users as far as privacy and data protection is concerned. This is completely misleading, don't you think?
I don't think it's too misleading, because in the absence of any other information, they are the same.
What you could then add to this system is a certification scheme to permit implicit consent of all the data handling (including who you hand it off to and what they are allowed to do with it, as well as whether they have demonstrated themselves to be trustworthy) is audited to be compliant with some more stringent requirements. It could even be self-certification along the lines of CE marking. But that requires strict enforcement, and the national regulators so far have been a bunch of wet blankets.
That actually would encourage organisations to find ways to get the information they want without violating the privacy of their users and anyone else who strays into their digital properties.
>I don't think it's too misleading, because in the absence of any other information, they are the same.
But other information not being absent we know that they are not the same. Just compare privacy policies for instance. The cookie law makes them appear similar in spite of the fact that they are very different (as of now - who knows what will happen to the NHS).
I do understand the point, but other than allowing a process of auditing to allow a middle ground of consent implied for first-party use only and within some strictly defined boundaries, what else can you do? It's a market for lemons in terms of trustworthy data processors. 90% (bum-pull figures, but lines up with the number of websites that play silly buggers with hiding the no-consent button) of all people who want to use data will be up to no good and immediately try to bend and break every rule.
I would also be in favour of companies having to report all their negative data protection judgements against them and everyone they will share your data with in their cookie banner before giving you the choice as to whether you trust them.
If any rule is going to be broken and impossible to enforce, how can that be a justification for keeping a bad rule rather than replacing it with a more sensible one?
I said they'd try to break them. Which requires vigilance and regulators stepping in with an enormous hammer. So far national regulators have been pretty weaksauce which is indeed very frustrating.
I'm not against improving the system, and I even proposed something, but I am against letting data abusers run riot because the current system isn't quite 100% perfect.
I'll still take what we have over what we had before (nothing, good luck everyone).
Then we clearly disagree on what they should be doing.
And this is the crux of the problem. The law helps a tiny minority of people enforce an extremely (and in my view pointlessly) strict version of privacy at the cost of misleading everybody else into thinking that using analytics for the purpose of making usability improvements is basically the same thing as sending personal data to 500 data brokers to make money off of it.
I would draw the line where my personal data is exchanged with third parties for the purpose of monetisation. I want the websites I visit to be islands that do not contribute to anyone's attempt to create a complete profile of my online (and indeed offline) life.
I don't care about anything else. They can do whatever A/B testing they want as far as I'm concerned. They can analyse my user journey across multiple visits. They can do segmentation to see how they can best serve different groups of users. They can store my previous search terms, choices and preferences. If it's a shop, they can rank products according to what they think might interest me based on previous visits. These things will likely make the site better for me or at least not much worse.
Other people will surely disagree. That's fine. What's more important than where exactly to draw the line is to recognise that there are trade-offs.
The law seems to be making an assumption that the less sites can do without asking for consent the better most people's privacy will be protected.
But this is a flawed idea, because it creates an opportunity for sites to withhold useful features from people unless and until they consent to a complete loss of privacy.
Other sites that want to provide those features without complete loss of privacy cannot distinguish themselves by not asking for consent.
Part of the problem is the overly strict interpretation of "strictly necessary" by data protection agencies. There are some features that could be seen as strictly necessary for normal usability (such as remembering preferences) but this is not consistently accepted by data protection agencies so sites will still ask for consent to be on the safe side.
>This is impractical, unreasonable, counterproductive and unintelligent.
It keeps the political grifters who make these regulations employed; that's kind of the main point of the EU/UK's endless stream of regulations upon regulations.
The reality is the data that is gathered is so much more valuable and accurate if you gather consent when you are running a business. Defaulting to a minimal config is just not practical for most businesses either. The decisions that are made with proper tracking data have a real business impact (I can see it myself - working at a client with 7 figure monthly revenue).
I'm fully supportive of consent, but the way it is implemented is impractical from everyone's POV, and I stand by that.
Are you genuinely trying to defend businesses unnecessarily tracking users online? Why can't businesses sell their core product(s) and you know... not track users? If they did that, then they wouldn't need to implement a cookie banner.
Retargeting etc. is massive revenue for online retailers. I support their right to do it if users consent to it. I don't support their right to do it if users have not consented.
The conversation is not about my opinion on tracking, anyway. It’s about the impracticality of implementing the legislation that is hostile and time consuming for both website owners and users alike
Plus with any kind of effort put into a standard browser setting you could easily have some granularity, like: accept anonymous ephemeral data collected to improve website, but not stuff shared with third parties, or anything collected for the purpose of tailoring content or recommendations for you.
Are you genuinely acting this obtuse? what do you think walmart and every single retailer does when you walk into a physical store? it’s always constant monitoring to be able to provide a better customer experience. This doesn’t change with online, businesses want to improve their service and they need the data to do so.
If you're talking about the same jurisdiction as these privacy laws, then this is illegal. You're only allowed to retain videos for 24h and only use them for basically calling the police.
walmart has sales associates running around gathering all those data points, as well as people standing around monitoring. Their “eyes” aren’t regulated.
The question still stands then: Does it happen in Tesco in the EU? Because that is illegal.
The original idea was that it should be legal to track people, because it is ok in the analog world. But it really isn't, and I'm glad it is illegal in the EU. I think it should be in the US also, but the EU can't change that, and I have no right to political influence over foreign countries, so that doesn't matter.
it’s illegal for Tesco to have any number of employees watching/monitoring/“tracking” in the store with their own eyes and using those in-store insights to drive better customer experiences?
Making statistics about sex, age, number of children, clothing choice, or walking speed without consent sounds illegal. I think it isn't only forbidden for the company but already for the individual, because that's voyeuristic behaviour.
Watching what is bought is fine, but walking around to do that is useless work, because you have that in the accounting/sales data already.
There is stuff like PayPal and now per-company apps that work the same as on the web: you need to first sign a contract. I would rather that be cracked down on, but I see that it is difficult, because you can't forbid individual choice. But I think the incentive is that products become cheaper when you opt in to data collection. This is already forbidden though: you can't tie consent to other benefits, because then it isn't free consent anymore. I expect a lawsuit in the next decades.
EXTREMELY curious to see where in EU law it states that a store creating internal reports based on purely VISUAL statistics that employees can observe like walking speed, sex, number of children, etc is illegal.
That is only true if you agree with ad platforms that tracking ads are fundamentally required for businesses, which is trivially untrue for most enterprises. Forcing businesses to get off privacy violating tracking practices is good, and it's not the EU that's at fault for forcing companies to be open about ad networks' intransigence on that part.
Regulators often barely grasp how current markets function and they are supposed to be futurists now too? Government regulatory interests almost always end up lining up with protecting entrenched interests, so it's essentially asking for a slow-moving group of the same mega companies. Which is very much what Europe's market looks like today. Stasis and shifting to a stagnating middle.
And also to prevent European powers trying to kill each other for the third time in a century, setting the whole world on fire in the process - for the third time in a century.
Contrary to the constant whining, most of them are actually quite wealthy. And thanks to strong right to repair laws, they can keep using John Deere equipment without paying extortionate licensing fees.
They're wealthy because they were paid for not using their agricultural land. So they chopped down all the trees on the parts of their land they couldn't use, to classify it as agricultural, got paid, and as a side effect caused downstream flooding.
Well, the topic is really whether or not the EU's regulations are effective at producing desired outcomes. The comment you're responding to is making a strong argument that it isn't. I tend to agree.
There's a certain hubris to applying rules and regulations to a system that you fundamentally don't understand.
For those of us outside the US, it's not hard to understand how regulations work. The US acts as a protectionist country, it sets strict rules and pressures other governments to follow them. But at the same time, it promotes free markets, globalisation, and neoliberal values to everyone else.
The moment the EU shows even a small sign of protectionism, the US complains. It's a double standard.
So the solution is to allow the actual entrenched interests to determine the future of things when they also barely grasp how the current markets function and are currently proclaiming to be futurists?
The best way for "entrenched interests" to stifle competition is to buy/encourage regulation that keeps everybody else out of their sandbox pre-emptively.
For reference, see every highly-regulated industry everywhere.
You think Sam Altman was in testifying to the US Congress begging for AI regulation because he's just a super nice guy?
That's a bit oversimplified. Humans have been creating authority systems trying to control others lives and business since formal societies have been a thing, likely even before agriculture. History is also full of examples of arbitrary and counter productive attempts at control, which is a product of basic human nature combined with power, and why we must always be skeptical.
As a member of 'humanity', do you find yourself creating authority systems for AI though? No.
If you are paying for lobbyists to write the legislation you want, as corporations do, you get the law you want - that excludes competition, funds your errors etc.
The point is you are not dealing with 'humanity', you are dealing with those who represent authority for humanity - not the same thing at all. Connected politicians/CEOs etc are not actually representing 'humanity' - they merely say that they are doing so, while representing themselves.
No, regulation exists because we all agreed it's not such a great thing to allow any company to do anything they want with no consequences. e.g, pouring toxic waste into rivers.
But what started as a good thing has become a tool of those same companies to prevent competition. How we regulate needs to be rethought beyond the simplistic "more = better"
eu resident here. i’ve observed with sadness what a scared and terrified lots the europeans have become. but at least their young people can do drugs, party 72 hours straight, and graffiti all walls in berlin so hey what’s not to like?
one day some historian will be able to pinpoint the exact point in time that europe chose to be anti-progress and fervent traditionalist hell-bent on protecting pizza recipes, ruins of ancient civilization, and a so-called single market. one day!
No, that... that's exactly what we have today. An oligarchy persists through captured state regulation. A more free market would have a constantly changing top.
Depends on the time horizon you look at. A completely unregulated market usually ends up dominated by monopolists… who last a generation or two and then are usurped and become declining oligarchs. True all the way back to the Medici.
In a rigidly regulated market with preemptive action by regulators (like EU, Japan) you end up with a persistent oligarchy that is never replaced. An aristocracy of sorts.
The middle road is the best. Set up a fair playing field and rules of the game, but allow innovation to happen unhindered, until the dust has settled. There should be regulation, but the rules must be bought with blood. The risk of premature regulation is worse.
Calculated, not callous. Quite the opposite: precaution kills people every day, just not as visibly. This is especially true in the area of medicine where innovation (new medicines) aren’t made available even when no other treatment is approved. People die every day by the hundreds of thousands of diseases that we could be innovating against.
You're both right, and that's exactly how early regulation often ends up stifling innovation. Trying to shape a market too soon tends to lock in assumptions that later prove wrong.
Sometimes you can't reverse the damage and societal change after the market has already been created and shaped. Look at fossil fuels, plastic, social media, etc. We're now dependent on things that cause us harm, the damage done is irreversible and regulation is no longer possible because these innovations are now embedded in the foundations of modern society.
Innovation is good, but there's no need to go as fast as possible. We can be careful about things and study the effects more deeply before unleashing life changing technologies into the world. Now we're seeing the internet get destroyed by LLMs because a few people decided it was ok to do so. The benefits of this are not even clear yet, but we're still doing it just because we can. It's like driving a car at full speed into a corner just to see what's behind it.
I think it’s one of those “everyone knows” things that plastic and social media are bad, but I think the world without them is way, way worse. People focus on these popular narratives but if people thought social media was bad, they wouldn’t use it.
Personally, I don’t think they’re bad. Plastic isn’t that harmful, and neither is social media.
I think people romanticize the past and status quo. Change is scary, so when things change and the world is bad, it is easy to point at anything that changed and say “see, the change is what did it!”
People don't use things that they know are bad, but someone who has grown up in an environment where everyone uses social media for example, can't know that it's bad because they can't experience the alternative anymore. We don't know the effects all the accumulating plastic has on our bodies. The positive effects of these things can be bigger than the negative ones, but we can't know that because we're not even trying to figure it out. Sometimes it might be impossible to find out all the effects before large scale adoption, but still we should at least try. Currently the only study we do before deciding is the one to figure out if it'll make a profit for the owner.
> We don't know the effects all the accumulating plastic has on our bodies.
This is handwaving. We can be pretty well sure at this point what the effects aren’t, given their widespread prevalence for generations. We have a 2+ billion sample size.
No, we can't be sure. There's a lot of diseases that we don't know the cause of, for example. Cancers, dementia, Alzheimer's, etc. There is a possibility that the rates of those diseases are higher because of plastics. Plastic pollution also accumulates, there was a lot less plastic in the environment a few decades ago. We add more faster than it gets removed, and there could be some threshold after which it becomes more of an issue. We might see the effect a few decades from now. Not only on humans, but it's everywhere in the environment now, affecting all life on earth.
You're not arguing in a way that strikes me as intellectually honest.
You're hypothesizing the existence of large negative effects with minimal evidence.
But the positive effects of plastics and social media are extremely well understood and documented. Plastics have revolutionized practically every industry we have.
With that kind of pattern of evidence, I think it makes sense to discount the negatives and be sure to account for all the positives before saying that deploying the technology was a bad idea.
I agree that plastics probably do have more positives than negatives, but my point is that many of our innovations do have large negative effects, and if we take them into use before we understand those negative effects it can be impossible to fix the problems later. Now that we're starting to understand the extent of plastic pollution in our environment, if some future study reveals that it's a causal factor in some of our diseases it'll be too late to do anything about it. The plastic is in the environment and we can't get it out with regulation anymore.
Why take such risks when we could take our time doing more studies and thinking about all the possible scenarios? If we did, we might use plastics where they save lives and not use them in single-use containers and fabrics. We'd get most of the benefit without any of the harm.
I'm sure it's very good the first time you take it. If you don't consider all the effects before taking it, it does make sense. You feel very good, but the even stronger negative effects come after. Same can be said about a lot of technology.
Addiction is a matter of degree. There's a bunch of polls where a large majority of people strongly agree that "they spend too much time on social media". Are they addicts? Are they "choosing to use it"? Are they saying it's too much because that's a trendy thing to say?
WHAT?! Do you think we as humanity would have gotten to all the modern inventions we have today like the internet, space travel, atomic energy, if we had skipped the fossil fuel era by preemptively regulating it?
How do you imagine that? Unless you invent a time machine, go to the past, and give inventors schematics of modern tech achievable without fossil fuels.
Maybe not as fast as we did, but eventually we would have. Maybe more research would have been put into other forms of energy if the effects of fossil fuels were considered more thoroughly and usage was limited to a degree that didn't have a chance to cause such fast climate change. And so what if the rate of progress would have been slower and we'd be 50 years behind current tech? At least we wouldn't have to worry about all the damage we've caused now, and the costs associated with that. Due to that damage our future progress might halt, while a slower, more careful society would continue advancing far into the future.
I think it's an open question whether we can reboot society without the use of fossil fuels. I'm personally of the opinion that we wouldn't be able to.
Simply taking away some giant precursor for the advancements we enjoy today and then assuming it all would have worked out somehow is a bit naive.
I would need to see a very detailed pipeline from growing wheat in an agrarian society to the development of a microprocessor without fossil fuels to understand the point you're making. The mining, the transport, the manufacture, the packaging, the incredible number of supply chains, and the ability to give people time to spend on jobs like that rather than trying to grow their own food are all major barriers I see to the scenario you're suggesting.
The whole other aspect of this discussion that I think is not being explored is that technology is fundamentally competitive, and so it's very difficult to control the rate at which technology advances because we do not have a global government (and if we did have a global government, we'd have even more problems than we do now). As a comment I read yesterday said, technology concentrates gains towards those who can deploy it. And so there's going to be competition to deploy new technologies. Country-level regulation that tries to prevent this locally is only going to lead to other countries gaining the lead.
You might be right, but I wasn't saying we should ban all use of any technology that has any negative effects, but that we should at least try to understand all the effects before taking it into use, and try to avoid the worst outcomes by regulating how to use the tech. If it turns out that fossil fuels are the only way to achieve modern technology then we should decide to take the risk of the negative effects knowing that there's such a risk. We shouldn't just blindly rush into any direction that might give us some benefit.
Regarding competition, yes you're right. Effective regulation is impossible before we learn global co-operation, and that's probably never going to happen.
Very naive take that's not based in reality but would only work in fiction.
Historically, all nations that developed and deployed new tech, new sources of energy and new weapons, have gained economic and military superiority over nations who did not, which ended up being conquered/enslaved.
The UK would not have managed to be the world power before the US without its coal-fueled industrial era.
So as history goes, if you refuse to take part in, or cannot keep up in the international tech, energy and weapons race, you'll be subjugated by those who win that race. That's why the US lifted all brakes on AI, to make sure they'll win and not China. What EU is doing, self regulating itself to death, is ensuring its future will be at the mercy of US and China. I'm not the one saying this, history proves it.
You're right, in a system based on competition it's not possible to prevent these technologies from being used as soon as they're invented if there's some advantage to be gained. We need to figure out global co-operation before such a thing is realistic.
But if such co-operation was possible, it would make sense to progress more carefully.
There is no such thing as "global cooperation" in our reality for things beyond platitudes. That's only a fantasy for sci-fi novels. Every tribe wants to rule the others, because if you don't, the other tribes will rule you.
It's been the case since our caveman days. That's why tribes that don't focus on conquest end up removed from the gene pool. Now extend tribe to nation to make it relevant to current day.
The internet was created by the military at the start of the oil era, so there is no reason why it should depend on oil. If we didn't travel as much, because we wouldn't use cars and planes as much, the internet would be even more important.
Space travel does need a lot of oil, so it might be affected, but its beginnings were in the 40s, so the research idea was already there.
Atomic energy is also from the 40s and might have been the alternative to oil, so it would have thrived more if we hadn't used oil that much.
Also all 3 ARE heavily regulated and mostly done by nation states.
How would you have won the world wars without oil?
Your argument only works in a fictional world where oil does not exist and you have the hindsight of today.
But when oil does exist and you choose not to use it, you will long since have been steamrolled by industrialized powers who used their superior oil-fueled economies and militaries to destroy or enslave your nation, and you wouldn't be writing this today.
I thought we were arguing about regulating oil, not about not using oil at all.
> How would you have won the world wars without oil?
You don't need to win world wars to have technological advancement; in fact my country didn't. I think the problem with this discussion is that we all disagree about what to regulate; that's how we ended up with the current situation after all.
I interpreted it to mean that we wouldn't use plastic for everything. I think we would be fine having glass bottles and paper, cardboard, or wood for grocery wrapping. Packaging wouldn't be so individualized per company, but that is not important for the economy or consumers, and it would also result in a more competitive market.
I also interpreted it to mean that we wouldn't have so many cars and wouldn't use planes except for really important things (i.e. international politics). Cities simply expand to the travel speed of the primary means of transportation. We would simply have more walkable cities and would use more trains. Amazon probably wouldn't be possible and we would have more local producers. In fact, this is what we currently aim for, and it is hard, because the transition means that we have larger cities than we can support with the primary means of transportation.
As for your example inventions: we did have computers in the 40s and the need for networking would arise. Space travel is in danger, but you can use oil for space travel without using it for everyday consumer products. As I already wrote, we would have more atomic energy, not sure if that would be good though.
Depends what those assumptions are. If they're about protecting humans from AI gross negligence, then the assumptions are predetermined to side with ordinary humans (just one example). Let's hope logic and an understanding of the long-term situation precede the arguments in the rulesets.
You're just guessing as much as anyone. Almost every generation in history has had doomers predicting the fall of their corner of civilization from some new thing. From religious schisms, printing presses, radio, TV, advertisements, the internet, etc. You can look at some of the earliest writings by English priests in the 1500s predicting social decay and destruction of society which would sound exactly like social media posts in 2025 about AI. We should at a minimum understand the problem space before restricting it, especially given the nature of policy being extremely slow to change (see: copyright).
I'd urge you to read a book like Black Swan, or study up on statistics.
Doomers have been wrong about completely different doom scenarios in the past (+), but that says nothing about this new scenario. If you're doing statistics in your head about it, you're wrong. We can't use scenarios from the past to make predictions about completely novel scenarios like thinking computers.
(+) although they were very close to being right about nuclear doom, and may well be right about climate change doom.
I'd like for you to expand your point on understanding statistics better. I think I have a very good understanding of statistics, but I don't see how it relates to your point.
Your point is fundamentally philosophical, which is you can't use the past to predict the future. But that's actually a fairly reductive point in this context.
GP's point is that simply making an argument about why everything will fail is not sufficient to have it be true. So we need to see something significantly more compelling than a bunch of arguments about why it's going to be really bad to really believe it, since we always get arguments about why things are really, really bad.
> which is you can't use the past to predict the future
Of course you can use the past to predict (well, estimate) the future. How fast does wheat grow? Collect a hundred years of statistics of wheat growth and weather patterns, and you can estimate how fast it will grow this year with a high level of accuracy, unless a "black swan" event occurs which wasn't in the past data.
Note carefully what we're doing here: we're applying probability on statistical data of wheat growth from the past to estimate wheat growth in the future.
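The estimation idea above can be sketched as a toy least-squares fit; all numbers here are made up purely for illustration:

```python
# Toy illustration (made-up numbers): estimate next season's wheat growth
# from past observations via a simple least-squares linear fit.
years = list(range(1, 11))  # 10 past seasons
growth = [2.0, 2.1, 1.9, 2.2, 2.0, 2.3, 2.1, 2.2, 2.4, 2.3]  # tonnes/ha

n = len(years)
mean_x = sum(years) / n
mean_y = sum(growth) / n

# slope = covariance(x, y) / variance(x)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, growth)) \
        / sum((x - mean_x) ** 2 for x in years)
intercept = mean_y - slope * mean_x

# Extrapolate one season past the data - valid only if no "black swan"
# (an event outside the past data) breaks the trend.
estimate = slope * 11 + intercept
print(round(estimate, 2))  # → 2.36
```

The fit is only as good as the assumption that next season resembles the past ones, which is exactly the assumption a black swan violates.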
There's no past data about the effects of AI on society, so there's no way to make statements about whether it will be safe in the future. However, people use the statistics that other, completely unrelated, things in the past didn't cause "doom" (societal collapse) to predict that AI won't cause doom. But statistics and probability doesn't work this way, using historical data about one thing to predict the future of another thing is a fallacy. Even if in our minds they are related (doom/societal collapse caused by a new technology), mathematically, they are not related.
> we always get arguments about why things are really, really bad.
When we're dealing with a completely new, powerful thing that we have no past data on, we absolutely should consider the worst, and of course, the median, and best case scenarios, and we should prepare for all of these. It's nonsensical to shout down the people preparing for the worst and working to make sure it doesn't happen, or to label them as doomers, just because society has survived other unrelated bad things in the past.
Ah, I see your point is not philosophical. It's that we don't have historical data about the effect of AI. I understand your point now. I tend to be quite a bit more liberal and allow things to play out because I think many systems are too complex to predict. But I don't think that's a point that we'll settle here.
The experience with other industries like cars (especially EVs) shows that the ability of EU regulators to shape global and home markets is a lot more limited than they like to think.
Not really. China made a big policy bet a decade early and won that battle: they put the whole government behind buying this new tech before everyone else, for example by forcing buses to be electric if you wanted the federal-level thumbs up, or via the lottery system.
So I disagree; Europe would probably be even further behind in EVs if it hadn't pushed EU manufacturers to invest so heavily in the industry.
You can see, for example, that the only legacy manufacturers in the global top ten are European (3 out of 10 companies), not Japanese or Korean. And in Europe, Volkswagen already overtook Tesla in Q1 sales, and Audi isn't that far behind either.
Regulating it while the cat is out of the bag leads to monopolistic conglomerates like Meta and Google.
Meta shouldn't have been allowed to usurp instagram and whatsapp, Google shouldn't have been allowed to bring Youtube into the fold. Now it's too late to regulate a way out of this.
It’s easy to say this in hindsight, though this is the first time I think I’ve seen someone say that about YouTube even though I’ve seen it about Instagram and WhatsApp a lot.
The YouTube deal was a lot earlier than Instagram, 2006. Google was way smaller than now. iPhone wasn’t announced. And it wasn’t two social networks merging.
Very hard to see how regulators could have the clairvoyance to see into this specific future and its counter-factual.
Technically untrue, monopoly busting is a kind of regulation. I wouldn't bet on it happening on any meaningful scale, given how strongly IT benefits from economies of scale, but we could be surprised.
> before we have any idea what the market is going to look like in a couple years.
Oh, we already know large chunks of it, and the regulations explicitly address that.
If the chest-beating crowd would be presented with these regulations piecemeal, without ever mentioning EU, they'd probably be in overwhelming support of each part.
But since they don't care to read anything and have an instinctive aversion to all things regulatory and most things EU, we get the boos and the jeers
I literally lived this with GDPR. In the beginning every one ran around pretending to understand what it meant. There were a ton of consultants and lawyers that basically made up stuff that barely made sense. They grifted money out of startups by taking the most aggressive interpretation and selling policy templates.
In the end the regulation was diluted to something that made sense(ish) but that process took about 4 years. It also slowed down all enterprise deals because no one knew if a deal was going to be against GDPR and the lawyers defaulted to “no” in those orgs.
Asking regulators to understand and shape market evolution in AI is basically asking them to trade stocks by reading company reports written in mandarin.
> In the end the regulation was diluted to something that made sense(ish) but that process took about 4 years.
It is the same regulation that was introduced in 2016. The only people who pretend not to understand it are those who think that selling user data to 2000+ "partners" is privacy.
It doesn't seem unreasonable. If you train a model that can reliably reproduce thousands or millions of copyrighted works, you shouldn't be distributing it. If it were just regular software that had that capability, would it be allowed? Just because it's a fancy AI model, is it OK?
> that can reliably reproduce thousands/millions of copyrighted works, you shouldn't be distributibg it. If it were just regular software that had that capability, would it be allowed?
LLMs are hardly reliable ways to reproduce copyrighted works. The closest examples usually involve prompting the LLM with a significant portion of the copyrighted work and then seeing whether it can predict a number of tokens that follow. It’s a big stretch to say that they’re reliably reproducing copyrighted works any more than, say, a Google search producing a short excerpt of a document in the search results or a blog writer quoting a section of a book.
It’s also interesting to see the sudden anti-LLM takes that twist themselves into arguing against tools or platforms that might reproduce some copyrighted content. By this argument, should BitTorrent also be banned? If someone posts a section of copyrighted content to Hacker News as a comment, should YCombinator be held responsible?
They're probably training them to refuse, but fundamentally the models are obviously too small to usually memorise content, and can only do it when there are many copies in the training set. Quotation is a waste of parameters better used for generalisation.
The other thing is that approximately all of the training set is copyrighted, because that's the default even for e.g. comments on forums like this comment you're reading now.
The other other thing is that at least two of the big model makers went and pirated book archives on top of crawling the web.
LLMs even fail on tasks like "repeat back to me exactly the following text: ..." To say they can exactly and reliably reproduce copyrighted work is quite a claim.
You can also ask people to repeat a text and some will fail.
What I want to say is that even if some LLMs (probably only older ones) fail at this, that doesn't mean future ones will fail in the majority of cases. Especially if benchmarks indicate they are becoming smarter over time.
It is entirely unreasonable to prevent a general-purpose model from being distributed for the largely frivolous reason that some copyrighted works could maybe be approximated using it. We don't make metallurgy illegal because it's possible to make guns with metal.
When a model that has this capability is being distributed, copyright infringement is not happening. It is happening when a person _uses_ the model to reproduce a copyrighted work without the appropriate license. This is not meaningfully different to the distinction between my ISP selling me internet access and me using said internet access to download copyrighted material. If the copyright holders want to pursue people who are actually doing copyright infringement, they should have to sue the people who are actually doing copyright infringement and they shouldn't have broad power to shut down anything and everything that could be construed as maybe being capable of helping copyright infringement.
Copyright protections aren't valuable enough to society to destroy everything else in society just to make enforcing copyright easier. In fact, considering how it is actually enforced today, it's not hard to argue that the impact of copyright on modern society is a net negative.
If the Xerox machine had all of the copyrighted works in it and you just had to ask it nicely to print them I think you'd say the tool is in the wrong there, not the user.
Xerox already went through that lawsuit and won, which is why photocopiers still exist. The tool isn't in the wrong for being told to print out the copyrighted works. The user still had to make the conscious decision to copy that particular work. Hence, still the user's fault.
You take the copyrighted work to the printer; with an LLM you don't upload the data first, it is already in the machine. If you got an LLM without training data (however that would work) and the user had to provide the data, then it would be OK.
You don't "upload" data to an LLM, but that's already been explained multiple times, and evidently it didn't soak in.
LLMs extract semantic information from their training data and store it at extremely low precision in latent space. To the extent original works can be recovered from them, those works were nothing intrinsically special to begin with. At best such works simply milk our existing culture by recapitulating ancient archetypes, a la Harry Potter or Star Wars.
If the copyright cartels choose to fight AI, the copyright cartels will and must lose. This isn't Napster Part 2: Electric Boogaloo. There is too much at stake this time.
One of the reasons the New York Times didn't supply the prompts in their lawsuit is because it takes an enormous amount of effort to get LLMs to produce copyrighted works. In particular, you have to actually hand LLMs copyrighted works in the prompt to get them to continue it.
It's not like users are accidentally producing copies of Harry Potter.
Helpfully the law already disagrees. That Xerox machine tampers with the printed result, leaving a faint signature that is meant to help detect forgeries. You know, for when users copy things that are actually illegal to copy. Xerox machine (and every other printer sold today) literally leaves a paper trail to trace it back to them.
You're quite right. Still, it's a decent example of blaming the tool for the actions of its users. The law clearly exerted enough pressure to convince the tool maker to modify that tool against the user's wishes.
According to the law in some jurisdictions it is (notably most EU member states, and several others worldwide).
In those places, fees (a "reprographic levy") are actually included in the price of the appliance and the needed supplies, and public operators may need to pay additionally based on usage. That money goes towards funds created to compensate copyright holders for loss of profit due to copyright infringement carried out through the use of photocopiers.
Xerox is in no way singled out or discriminated against. (Yes, I know "Xerox machine" is an Americanism.)
If I've copied someone else's copyrighted work on my Xerox machine, then give it to you, you can't reproduce the work I copied. If I leave a copy of it in the scanner when I give it to you, that's another story. The issue here isn't the ability of an LLM to produce it when I provide it with the copyrighted work as an input, it's whether or not there's an input baked-in at the time of distribution that gives it the ability to continue producing it even if the person who receives it doesn't have access to the work to provide it in the first place.
To be clear, I don't have any particular insight on whether this is possible right now with LLMs, and I'm not taking a stance on copyright law in general with this comment. I don't think your argument makes sense, though, because there's a clear technical difference that seems like it would be pretty significant as a matter of law. There are plenty of reasonable arguments against things like the agreement mentioned in the article, but in my opinion, your objection isn't one of them.
You can train an LLM on completely clean data, creative commons and legally licensed text, and at inference time someone can just put a whole article or chapter into the model and have full access to regenerate it however they like.
Re-quoting the section the parent comment included from this agreement:
> > GPAI model providers need to establish reasonable copyright measures to mitigate the risk that a downstream system or application into which a model is integrated generates copyright-infringing outputs, including through avoiding overfitting of their GPAI model. Where a GPAI model is provided to another entity, providers are encouraged to make the conclusion or validity of the contractual provision of the model dependent upon a promise of that entity to take appropriate measures to avoid the repeated generation of output that is identical or recognisably similar to protected works.
It sounds to me like an LLM like the one you describe would be covered if the people distributing it put a clause in the license saying that people can't do that.
Yes, it is technically covered. But practically nobody knows what counts as infringing in the non-literal infringement case. It all depends on the judge and the context. Was this idea sufficiently original, or was it a necessity or a generic pattern? Each level of abstraction can get protection from copyright. You only find out if you sue or get sued.
I find non-literal copyrights (total concept and feel, abstraction filtration comparison/AFC) to be a perverse way to interpret "protected expression" as "protected abstraction". It is a betrayal of future creative activities to prop up the past ones.
It's a trojan horse; they are trying to do the same thing that is happening in the banking sector.
Through this, they want AI model providers to have a strong grip on their users, controlling their usage so as not to risk issues with the regulator.
Then the European technocrats will be able to control the whole field by controlling the top providers, who in turn will overreach by controlling their users.
> One of the key aspects of the act is how a model provider is responsible if the downstream partners misuse it in any way
AFAICT the actual text of the act[0] does not mention anything like that. The closest to what you describe is part of the chapter on copyright of the Code of Practice[1], however the code does not add any new requirements to the act (it is not even part of the act itself). What it does is present a way (which does not mean it is the only one) to comply with the act's requirements (as a relevant example, the act requires respecting machine-readable opt-out mechanisms when training but doesn't specify which ones, while the code of practice explicitly mentions respecting robots.txt during web scraping).
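For what it's worth, honoring a robots.txt opt-out like the one the code of practice mentions is mechanically simple with Python's standard library. A minimal sketch (the crawler name "ExampleTrainingBot" and the URLs are made up for illustration):

```python
from urllib.robotparser import RobotFileParser

def may_crawl(rp: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the parsed robots.txt permits user_agent to fetch url."""
    return rp.can_fetch(user_agent, url)

# Parse a robots.txt body directly; in practice you'd fetch it
# from the site root with rp.set_url(...) and rp.read().
rp = RobotFileParser()
rp.parse([
    "User-agent: ExampleTrainingBot",  # hypothetical AI-training crawler
    "Disallow: /",                     # site opts out of training crawls entirely
    "User-agent: *",
    "Disallow: /private/",
])

print(may_crawl(rp, "ExampleTrainingBot", "https://example.com/article"))  # False
print(may_crawl(rp, "OtherBot", "https://example.com/article"))            # True
```

The hard part isn't parsing, of course; it's that the act doesn't say which opt-out mechanisms count, which is exactly the gap the code of practice tries to fill.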
The part about copyright-infringing outputs in the code is actually (Measure 1.4):
> (1) In order to mitigate the risk that a downstream AI system, into which a general-purpose AI model is integrated, generates output that may infringe rights in works or other subject matter protected by Union law on copyright or related rights, Signatories commit:
> a) to implement appropriate and proportionate technical safeguards to prevent their models from generating outputs that reproduce training content protected by Union law on copyright and related rights in an infringing manner, and
> b) to prohibit copyright-infringing uses of a model in their acceptable use policy, terms and conditions, or other equivalent documents, or in case of general-purpose AI models released under free and open source licenses to alert users to the prohibition of copyright infringing uses of the model in the documentation accompanying the model without prejudice to the free and open source nature of the license.
> (2) This Measure applies irrespective of whether a Signatory vertically integrates the model into its own AI system(s) or whether the model is provided to another entity based on contractual relations.
Keep in mind that "Signatories" here is whoever signed the Code of Practice: obviously if i make my own AI model and do not sign that code of practice myself (but i still follow the act requirements), someone picking up my AI model and signing the Code of Practice themselves doesn't obligate me to follow it too. That'd be like someone releasing a plugin for Photoshop under the GPL and then demanding Adobe release Photoshop's source code.
As for open source models, the "(1b)" above is quite clear (for open source models that want to use this code of practice - which they do not have to!) that all they have to do is to mention in their documentation that their users should not generate copyright infringing content with them.
In fact the act has a lot of exceptions for open-source models. AFAIK Meta's beef with the act is that the EU AI office (or whatever it is called, i do not remember) does not recognize Meta's AI as open source, so they do not get to benefit from those exceptions, though i'm not sure about the details here.
[1] https://www.lw.com/en/insights/2024/11/european-commission-r...