Hacker News

Does GitHub give users a toggle to opt out of being in the Copilot training dataset?

I feel like that would be a decent compromise for folks not down with their code being used in Copilot.



I should not have to opt out. GitHub should have to respect my license. I already said they can use my code, as long as they keep an attribution intact (via a BSD license, for example).

GitHub is taking my code and ignoring the license. I don’t understand why anyone would think that is ok.


No, you're ignoring what you agreed to when you accepted the terms of service. GitHub can display your code, and YOU granted them that license by accepting their terms.

I find the only people making these OSS claims haven't used Copilot and tend to lack any real contributions to OSS. What you're describing is simply not the case for 99.9 percent of the code snippets being generated based on data from GitHub.

I actually care more about putting code into people's hands than about someone copying a license file; that's probably why I use the Unlicense... "Because you have more important things to do than enriching lawyers or imposing petty restrictions on users"


Are you saying the FSF accepted GitHub's terms of service when someone mirrored Emacs to GitHub?


You mean when a maintainer of Emacs set up the mirror?

https://sachachua.com/blog/2015/12/2015-12-10-emacs-chat-joh...


Of course in this case as well, it even says that mirror is not endorsed by the copyright owner.


I have used Copilot in its free phase, and have made substantial OSS contributions to various projects as well as shepherding my own projects (a few of which have attained a degree of success). Building an AI model off the community's code and selling the model (and code generated by rearranging statements and patterns found in the training data) back to the community is odious.


Not everyone who has code at GitHub uploaded it personally. Plenty of code was written before GitHub even existed and that code is still uploaded there.


That’s what fair use doctrine is about. Copyright doesn’t say “you can’t do anything without permission”, it says “you can’t do anything without permission, except for a few categories of things which cannot be forbidden”, and Copilot claims that what they’re doing fits in one of those categories.


Do you not think there are limitations on the copyrights you claim? The courts certainly do!


Given that Git is distributed (as is most source control today, for that matter), even if you stopped pushing your code to GitHub, would that stop Copilot from pulling code from sources like gitlab.gnome.org, kernel.org, gitlab.kde.org, etc.?

I think the underlying discussion should be about licensing, not about which website you push open source code to, because that can easily be worked around.


There is a hidden problem with licensing here. Developers are giving GitHub permission to use the code under a different license [1]. The clause sounds broad enough for them to justify training Copilot with it. This allows them to disregard the license under which the project is published. Developers no longer have the protection of a FOSS license when they host there.

[1] https://docs.github.com/en/site-policy/github-terms/github-t...


I don't see anything in that licence that would allow it to be used as a corpus for machine learning

most likely they're relying on fair use, which would apply regardless of where it's hosted


a nice, easy compromise they could have proposed and implemented quite some time ago.



This is completely unrelated; it allows you to use Copilot.



