Legally, it needs to be opt-in in order to protect downstream consumers of code written by Copilot.
Copilot sometimes reproduces code verbatim. You can't use open source code except under the terms of the license. Authors whose code may be reproduced by Copilot need to grant a license to downstream consumers, and republishers of Copilot-generated code need to adhere to the terms of that license.
Copilot is inserting ticking time-bombs into its users' codebases.
Nope. Copilot users are inserting "ticking time-bombs" into their own codebases.
The buck stops with the user. When they use code from any source at all, whether it's their head, the internet, some internal library, lecture notes, a coworker, a random dude off the street, or who knows what else, it is their own responsibility to ensure the code they're using has been released under a license they can use. They don't get to go back and point fingers just because they didn't do their own due diligence.
The exception would be if a vendor provided code under a legal contract that includes liability and an agreed license. That has not happened in this case, so there's no reason to expect any legal protections.
We agree that downstream users who redistribute copyrighted code regurgitated by Copilot are in violation of copyright.
It doesn't seem to me as though the distinction between "Copilot reproduced the code and the engineer copy/pasted/saved it" versus "Copilot inserted the code" is crucial.
There's a separate question about Microsoft's own liability. When Copilot reproduces open source code without adhering to the terms of the license, that's redistribution and thus copyright infringement. A copyright owner might not be able to get substantial monetary damages, but they ought to be able to get a copyright injunction.
I wonder what happens to Copilot should a GitHub user secure injunctive relief, forcing Microsoft to exclude their code from Copilot.
> It doesn't seem to me as though the distinction between "Copilot reproduced the code and the engineer copy/pasted/saved it" versus "Copilot inserted the code" is crucial.
I think that could be crucial.
If I read a computer science book, and from that produce a unique piece of code which was not present in the book, I have created a new work which I hold copyright over.
If I train a machine learning algorithm on a computer science book, and that ML algorithm produces some output, that output does not have a new copyright.
Similarly, if Copilot synthesizes a bunch of MIT code and produces a suggestion, that suggestion may still be under MIT, while if a human does the exact same reading and writing, the result, if it is an original enough derivative, may be free of the original MIT license.
The way I'm reading your reply seems like sophistry, so I expect I'm misunderstanding you.
Scenario 1: Copilot, operating as an IDE plugin, placed the suggestion directly into the text. To accept the suggestion, the engineer hit save.
Scenario 2: Copilot placed its suggestion in an external file. The engineer copy/pasted the suggestion verbatim into their IDE, then hit save.
These don't seem as though they materially affect the situation. Regardless, the downstream user who somehow brought the copyrighted code into their codebase (which they subsequently redistribute) is infringing.
This theoretical case where Copilot is not involved and the user synthesizes something on their own is not germane. Copilot is involved.
What are you folks getting at? That Microsoft is in the clear? That the end user is in the clear? That "I'm just making suggestions" is akin to "I'm just asking questions" and absolves the suggester of liability? I don't get it.
Thanks for giving me the benefit of the doubt, but I do not deserve it in this case. I misread what I was responding to and my response was off the mark.
You're right to be confused, and my reply can be ignored as off-topic for the thread I'm in.
That's generous of you, since you were not alone. It seems as though I could have done a better job of emphasizing from the get-go that I thought infringement by the end user was the key point, rather than infringement by Microsoft.
> It doesn't seem to me as though the distinction between "Copilot reproduced the code and the engineer copy/pasted/saved it" versus "Copilot inserted the code" is crucial.
Yes, the only thing that matters is who authorized the code to be published, which is never Copilot. (An automated system that takes tickets, has Copilot craft patches from them, then publishes them with no human review would be a) very cool and b) an incomprehensibly terrible idea; but even then there is still a human authorizing the code to be published, just residing a level of abstraction removed from the process itself.)
> It doesn't seem to me as though the distinction between "Copilot reproduced the code and the engineer copy/pasted/saved it" versus "Copilot inserted the code" is crucial.
Is Google responsible when they index licensed code, then others steal it? It's surely the liability of the programmer to "check" (not really sure how this would work, either).
What matters is that users of Copilot (the "others" who "steal it" in your scenario) are liable for infringement. That renders Copilot impractical as a tool for production use, regardless of whether Microsoft has any liability.
If it is relevant at all, the threshold of originality applies to the allegedly copyrighted source consumed by Copilot (as regards bare infringement, not willful infringement). If that source doesn't meet the threshold, it is not copyrightable. If it does, unauthorized copying not within a copyright exception (e.g., fair use) is infringement.
I can't see any case where the originality of the snippet presented by Copilot matters. (If Copilot were a person, it would matter in determining whether the snippet on its own was copyrightable by Copilot, but it still wouldn't be relevant to whether the original copyright was violated.)
How much of the code needs to be "unique" across a single codebase for Copilot to be illegally pasting it downstream?
For a great deal of Copilot insertions, it's like the equivalent of me writing the sentence, "the man gasped in surprise" in a novel I'm working on. Yeah, maybe that sentence came from somebody else's novel, but you can find the same damn sentence in a thousand other books/papers/etc. as well.