
gavinhoward · on Oct 28, 2021

> Frankly, I think it's on you to lay out your argument in full rather than assume everyone is privy to your thought process.

No, it's on you to not assume you know everything about my thought process before I show you otherwise.

Could I have communicated better? Yes. But I didn't assume you knew everything about my thought process. I thought it wasn't necessary for you to until you assumed that you knew my argument better than I did.

> You seem to be coming at this as if the law is a purely mechanistic thing that can quickly resolve disputes, overlooking how these things play out in the real world, like Oracle v. Google going on for a decade or the even longer litigation involving SCO and IBM.

Once again, you are assuming. Yes, I know law is not mechanistic. Yes, I know going to court would take a long time.

Going to court is not the only thing I am doing. I also created new licenses, which I would not have if I only cared about what happened in court.

Going to court would be to attempt to argue for and enforce my viewpoint (indirectly). It would be a last-ditch attempt.

The first thing I am doing is creating new licenses specifically meant to "poison the well" for machine learning on code in general and Copilot in particular. [1]

With those licenses, I hope to make companies nervous about using Copilot for anything that might be using my licenses. This hesitation may only apply to code with my licenses, but the FAQs for those licenses ([2] is an example) are also designed to make lawyers nervous about the GPL and other licenses.

If I succeed in making the hesitation big enough, then Copilot as a paid service would be dead, and hopefully enough companies will prohibit the use of Copilot, as is already being done. [3]

Going to court, then, would only happen if I found someone infringing.

This will be especially helped by the fact that the vast majority of the code under those licenses will be in a language I'm building right now. If there's open source code in the language, then I can search that code for infringements caused by Copilot.

> I mean, what makes you so sure the court is going to give you a quick judgment on the infringement, or that it's going to agree with you about the size of code fragment that is sufficient to infringe?

Do you think I would be stupid enough to pick an example to bring before court that would not be obviously infringing?

Winning in court is not just about being right, it's also about picking your battles, and I would be very choosy.

> Surely you can agree that sufficiently small code fragments won't meet this threshold because they're too basic or obvious.

Yes, and as I said above, I won't use any of those.

> Because your whole argument here rests upon that assumption, it comes off as a wish fulfillment scenario where Copilot disappears because nobody likes the risk calculus;

You realize that this is the entire basis for the cybersecurity industry? The entire point is to make it economically infeasible for bad guys to do bad things in cyberspace; it's to make the "risk[/reward] calculus" skew in favor of the good guys so much so that bad guys just stop operating.

Making the risk calculus riskier for your opponent is how wars and legal cases are fought too, but such tactics are not confined to the war room or courtroom. That's why my opening salvo is licenses to sow doubt, to change the perception of the risk calculus. Battles like this are won by "winning minds," which in this case means convincing enough people to be nervous about it.

> your stated goal of 'making Copilot a dead product' seems more emotional than rational.

This is something where you are partially right. There is a lot of emotion behind it, not because I'm an emotional person (I'm actually on the spectrum and less emotional than the average person), but because I objectively considered the ramifications of what GitHub is doing with Copilot, realized how bad those ramifications were, and that lit a fire under me.

I wrote about the ramifications and refuted the dubious legal justifications in a whitepaper [4] for the FSF call for papers [5]. (Intro blog post at [6].)

But if you will read through the paper, you will find that there is rationality in my thoughts. I just happen to think this is a fight worth taking. Thus, the emotion.

> In reality it will take you a long time to get a result, and if enough people find Copilot useful (which I suspect they will), legal departments will adapt to that risk calculus and just figure out the cost of blowing or buying you off in the event that their developers carelessly infringe.

"Buying me off" would include checking that Copilot didn't output my code and, if it did, following the license. I'm not sure they would like the added work to use something that is supposed to save work on the easiest part of programming. But even if they did, I would be satisfied.

And that points to another part of my "thought process": the reason that I think I've got a chance is because I think the "reward" side of the risk/reward calculus is not very high with Copilot because it only helps with the easiest part of programming.

Almost everything in programming is harder than writing boilerplate, and as I said in another comment [7], I think there are still better ways of reducing boilerplate. In fact, the language I am working on is designed to help with that. So my perception, which I acknowledge could be wrong, is that the reward for using Copilot is not high, which means I may not have to raise the risk level much for people to change their minds about it.

But the most important point would be to make legal departments and courts recognize that copyright still has teeth, or rather, argue well enough to convince people of that fact, despite what GitHub is saying.

> If it sufficiently improves industrial productivity, it will become established while you're trying to litigate and afterwards people will just avoid crossing the threshold of infringement.

This would be a win in my book too. I am going to be the first person to write boilerplate code in my language, which means that anyone who writes in this language will be "copying" me. I don't care about the boilerplate, though; they can copy that as much as they want.

> Honestly, this exchange makes me glad that I don't publish software and thus don't care about license conditions on a day to day basis.

I feel you on that. The only reason I do is because I feel like my future customers deserve the blueprints to the software they are using the same way the buyers of a building deserve to get the building's blueprints from the architect. If I did not have that opinion, I would probably not publish either.

[1]: https://gavinhoward.com/2021/07/poisoning-github-copilot-and...

[2]: https://yzena.com/yzena-network-license/#frequently-asked-qu...

[3]: https://news.ycombinator.com/item?id=27714418

[4]: https://gavinhoward.com/uploads/copilot.pdf

[5]: https://news.ycombinator.com/item?id=27998109

[6]: https://gavinhoward.com/2021/10/my-whitepaper-about-github-c...

[7]: https://news.ycombinator.com/item?id=29019777

Edit: Clarification and fix typo.

throwaway675309 · on Oct 29, 2021

If Microsoft noticed that a substantial number of contributors were putting in these "Poisoned Well license restrictions" in their repositories, it would be relatively trivial to automatically filter out those code bases using some basic heuristics to determine if the license was biased against systems like Copilot.
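
To make "basic heuristics" concrete, here is a rough sketch of what such a filter might look like. Everything in it is an assumption for illustration: the phrase lists are guesses rather than text from any real license, and the function names `looks_training_restricted` and `filter_repos` are hypothetical. Nothing here describes how Copilot's training data is actually selected.

```python
import re

# Hypothetical phrases suggesting a license talks about ML training.
# These are illustrative guesses, not wording from any real license.
TRAINING_PATTERNS = [
    r"machine[\s-]+learning",
    r"\bcopilot\b",
    r"train(?:ing|ed)?\s+(?:an?\s+|any\s+)?(?:model|algorithm|neural network)",
]

# Hypothetical phrases suggesting the license restricts something.
RESTRICTION_PATTERNS = [
    r"\bmay\s+not\b",
    r"\bshall\s+not\b",
    r"\bprohibit(?:s|ed)?\b",
    r"\bderivative\s+work\b",
]


def looks_training_restricted(license_text):
    """Flag a license that mentions both ML training and a restriction."""
    text = license_text.lower()
    mentions_training = any(re.search(p, text) for p in TRAINING_PATTERNS)
    mentions_restriction = any(re.search(p, text) for p in RESTRICTION_PATTERNS)
    return mentions_training and mentions_restriction


def filter_repos(repos):
    """Given {repo_name: license_text}, return the names that pass the filter."""
    return [
        name
        for name, text in repos.items()
        if not looks_training_restricted(text)
    ]
```

A filter this crude would both over- and under-match (for instance, it would flag a permissive license that merely mentions machine learning next to an unrelated "may not" clause), so in practice it would presumably need human review or a curated list of license identifiers on top.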

gavinhoward · on Oct 29, 2021

And that would be a win for me too.

In fact, I'm really just trying to put them between a rock and a hard place so that I win no matter what.

Of course, if that does happen, I expect more people would want to use my licenses...