Hacker News
GitHub Copilot is generally available (github.blog)
863 points by sammorrowdrums on June 21, 2022 | 760 comments



I've been using Copilot for a few months and...

Yeah, it makes mistakes; sometimes it shows you, say, the most common way to do something, even if that way has a bug in it.

Yes, sometimes it writes a complete blunder.

And yes again, sometimes there are very subtle logical mistakes in the code it proposes.

But overall? It's been *great*! Definitely worth the 10 bucks a month (especially with a developer salary). :insert shut up and take my money gif:

It's excellent for quickly writing slightly repetitive test cases; it's great as an autocomplete on steroids that completes entire lines and fills in all arguments, instead of just a single identifier; it's great for quickly writing nice contextual error messages (especially useful for Go developers and their constant errors.Wrap calls; Copilot is really good at writing meaningful error messages there); and it's also great for technical documentation, as it's able to autocomplete markdown (and it does it surprisingly well).
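The "slightly repetitive test cases" use is easy to picture; here's a hedged Python sketch of the kind of near-duplicate cases Copilot tends to continue well once it has seen the first one or two (the helper and all names are made up for illustration):

```python
def slugify(title):
    """Tiny hypothetical helper under test."""
    return "-".join(title.lower().split())

# After the first case or two, Copilot will usually propose the rest
# of these near-duplicates, names and all.
def test_slugify_lowercases():
    assert slugify("Hello World") == "hello-world"

def test_slugify_collapses_spaces():
    assert slugify("a   b") == "a-b"

def test_slugify_single_word():
    assert slugify("Copilot") == "copilot"

test_slugify_lowercases()
test_slugify_collapses_spaces()
test_slugify_single_word()
```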

Overall, I definitely wouldn't want to go back to writing code without it. It just takes care of most of the mundane and obvious code for you, so you can take care of the interesting bits. It's like having the stereotypical "intern" built into your editor.

And sometimes, fairly rarely, but it happens, it's just surprising how good of a suggestion it can make.

It's also ridiculously flexible. When I start writing graphs in ASCII (cause I'm just quickly writing something down in a scratch file) it'll actually understand what I'm doing and start autocompleting textual nodes in that ASCII graph.


As a polyglot who works in 4-5 different languages every 3-6 months, it's been very valuable.

I forget a lot of things, simple dumb stuff like type conversions or the spelling of specific keywords. Copilot takes care of 99% of that so I can focus on my higher-level spec.

If anything, sometimes it's too aggressive. I start typing a word and it's already building up the rest of the application in a different direction...


Isn't your forgetting issue something that spaced repetition would solve while also building confidence in yourself?


Spaced repetition doesn't help with the use case of needing to recall lots of little details spread across a wide number of knowledge domains. You'd spend more time trying to remember everything than on actually doing useful work.

Also, that little "building confidence in yourself" rider that you added suggests that you think the OP doesn't have confidence in themselves. Careful about those assumptions; in this case it comes across as a little patronizing.


> Also, that little "building confidence in yourself" rider that you added suggests that you think the OP doesn't have confidence in themselves.

They certainly do have confidence, but it doesn't hurt to build on it. I don't think confidence is a discrete 0-or-1 variable. It's a continuous variable, from 0 to infinity.

By the way, I asked a question; I didn't make a statement. I welcome the argument on whether it's worth the time to memorize the stuff Copilot can autocomplete. That's something to measure and debate.

> Careful about those assumptions; in this case it comes across as a little patronizing.

Thanks for the heads up. This is important. But I didn't make any assumption. It seems maybe you made an assumption about me making an assumption?

If that's the case, don't worry as I did not feel patronized.


No. Spaced repetition is a good tool for learning vocabulary.

It's not a silver bullet / holy grail / MacGuffin / etc.

There's a community of people obsessed with spaced repetition. None of them seem to have accomplished spectacular feats of learning. There's a good reason for that.

(The flip side, however, is that many people who have accomplished spectacular feats of learning DO often use spaced repetition, but among a broader repertoire).


Could you please help me understand further? I found this a bit contradictory:

> None of them seem to have accomplished spectacular feats of learning

> many people who have accomplished spectacular feats of learning DO often use spaced repetition

---

Second note:

> Spaced repetition is a good tool for learning vocabulary.

This might be a misconception? The first ever published paper about spaced repetition was about vocabulary.

But research has shown its merits are valid for many other types of knowledge.

By the way, memorizing coding vocabulary is arguably similar to human language vocabulary.


Note 1:

It's not contradictory.

I'll give an example: Landmines are a good tool for slowing an enemy army. However, if your military consists of *just* landmines, it won't be very effective. That doesn't make landmines a bad tool. Indeed, even a super-weapon, like the first jet fighter, won't win a war if it's your *only* tool.

Learning -- even a language -- is a complex process, and you need many tools. Spaced repetition is awesome for factoids. If you want to learn a language, you need to memorize vocabulary. SR is great for that. If you add 5-15 minutes of spaced repetition to a good language program, it will help a lot. If that's where you spend a majority of your time, you'll learn very little. However, SR won't help you practice a broad range of skills around listening, speaking, understanding communication styles, or quite a few other things.

Ditto with physics and math. If you know equations, it accelerates everything else. However, the bulk of the knowledge isn't factual or procedural; it's conceptual. Simply memorizing formulas won't help you. On the other hand, in most cases, once you learn conceptual knowledge in physics, you never forget it.

"Coding vocabulary" isn't in my top-50 problems with junior developers. Naming variables, algorithms, systems design, etc. are. Most of those don't align with SR. I'd take a programmer who spends 8 hours coding over one who spends 8 hours memorizing library calls in an SR system.

Note 2:

The spacing effect is /somewhat/ broadly applicable (but far from universal), but spaced repetition specifically is only helpful for factoid-style knowledge. You can look over the different classifications of knowledge, skills, and abilities (factual, conceptual, procedural, declarative, etc.).


I find it’s less about actually forgetting and more about all the mundane details I usually bounce around a codebase to find. What’s the third argument to that function again? Oh, copilot autocompleted it correctly.

It’s not always right, but it’s right often enough that it’s a big timesaver.


> Oh, copilot autocompleted it correctly.

That assumes I once had the knowledge of what is "correct".

I just don't quite remember it right now. Then I rely on Copilot to complete it.

Sometimes I may not feel sure enough to judge whether Copilot is right. I'll need to dig into the documentation anyway.

Other times I'll feel sure. But in how many of those cases will I be wrong? And what will be the consequences?


I agree in concept; in practice I find that "I can't quite remember that thing" comes up far more often than new things where I can't tell whether Copilot is right.

> what will be the consequences?

for me, usually a failed build or unit test :P low stakes stuff


> And sometimes, fairly rarely, but it happens, it's just surprising how good of a suggestion it can make.

I've had this experience too. Usually it's meh, but at one point it wrote an ENTIRE function by itself and it was correct. IT WAS CORRECT! And it wasn't some dumb boilerplate initialization either, it was actual logic with some loops. The context awareness with it is off the charts sometimes.

Regardless I find that while it's good for the generic python stuff I do in my free time, for the stuff I'm actually paid for? Basically useless since it's too niche. So not exactly worth the investment for me.


I've found that even in the specialist domains I'm working in, it does pretty well. The caveats being that you need to guide it quite a bit, but once a workflow starts to come together then it really starts to hum.


It's likely that a custom model would be a better fit for the stuff you're paid for.


Copilot has been amazing for me too. It's gotten to the point where I want similar smart autocomplete features in other software, such as spreadsheets or doing music in my DAW. I think those will come too eventually.


Well, to some extent DAWs like Logic and GarageBand already have some “autocomplete”, eg their drummers.


I really don't like it.

I find I spend my time reviewing Copilot suggestions (which are mostly wrong) rather than thinking about code and actually doing the work.


I turned off auto-suggest and that made a huge difference. Now I'll use it when I know I'm doing something repetitive that it'll get easily, or if I'm not 100% sure what I want to do and I'm curious what it suggests. This way I get the help without having it interrupt my thoughts with its suggestions.


How do you do that?


In intellij (which is where I use it), section 8 of this document explains how. I'm assuming there's an equivalent for vscode.

When you disable auto complete, it can still be fired via a keyboard shortcut, which is how I use it.

https://github.com/github/copilot-docs/blob/main/docs/jetbra...


According to StackOverflow, VS Code users can use alt+\ to trigger the suggestion and disable auto complete with this:

    "editor.inlineSuggest.enabled": false
https://stackoverflow.com/a/71224912/1048433


I have to toggle it on and off. I find that if I'm thinking about what to write it is absurdly distracting to see a suggestion, which is usually wrong, because the suggestion steals my focus. On the other hand when I don't really need to think that much Copilot often gives valuable suggestions that I accept.


I really like it too, but I can't help feeling that all the same people who panicked about Microsoft acquiring GitHub are suddenly quiet about the fact that Microsoft has found the ultimate way to profit off of open source.

There's a ton of developer effort that went into Copilot and those devs should be paid fairly. But the majority of what fuels Copilot is the millions of lines of open source code submitted every day.

I think I'd feel a lot better about it if they committed a good chunk of that money back into the same open source communities they depend on. Otherwise it's a parasitic (or at least not fully consensual) relationship.


Microsoft could simply crawl archives of code wherever they exist, and use that to build copilot.

GitHub hosting open source communities is a way of giving back.


Microsoft gives free copies of Office to schools. Why do you think it does that? MS benefits from people being locked into its service and suite. It provides premium subscriptions for additional features.

Hosting code "for free" is part of its business model. It's not a way of "giving back"


> It's not a way of "giving back"

Github Advanced Security costs me $200+/month, but if I make my repository public, I get all of that for free.

Certainly feels like a way of giving back.

Of course it benefits them too, but it doesn’t have to be purely altruistic to be a net positive.


Their point is that they didn't need to own GitHub to build copilot—they could have used the public code regardless.


It's a lot more complicated to crawl code scattered around the web than to crawl GitHub.

They could even change GitHub so it benefits their code crawling.


That would be a problem if Microsoft didn’t also have their own Bing


Oh, is Bing still a thing? Honest question, haven't heard it mentioned for years. Went from Altavista to Google (I believe, it's a long time ago...) and now to DDG a couple of years ago. Never considered Bing directly although I believe DDG results come or came from Bing.


Yes, Bing still exists. Like you say, that's where a lot of DDG results come from.


Maybe they have a stronger position because they own GitHub. There is at least "Release and Indemnification" section in the terms of service.


Could have. But they didn’t


They're providing Copilot for free for OSS maintainers.


OSS maintainer here.

No. They're not. They're advertising that they are.

They are providing it to a very small set of high-profile OSS maintainers some opaque algorithm picked out. Having high-profile adopters is just good business.


Exactly. They are bribing the project leads so they can then say that scanning is approved and voluntary.


Didn't OSS maintainers already voluntarily approve such uses when they published their work under an OSS license?

One fundamental aspect of being open source is not limiting the purposes of use. If we now say that "code-generation AI training" is not allowed without prior approval (in addition to the license itself), then it's not open source anymore...


I approved some of my code for being reused under the terms of the AGPL. Co-pilot is very welcome to scan it and generate derivative AGPL code.

If I write AGPL code, and co-pilot scans it and makes a very similar program to it for a FAANG, who then proceeds to compete with my open-source tool by using the creative ideas generated therein, but with a proprietary tool, that's not very fair. That's why I chose the license I did.

FAANG is more than welcome (indeed, encouraged) to use my code for any purpose permitted under the license. That includes everything except making it proprietary.

I've tried running copilot with the starting lines of my code. It generated code with identical creative ideas. It was the equivalent of taking Star Trek, and generating a new movie with the same plot line, but with names changed. That's not legal.

My code was specific enough that this wasn't just chance or other similar code. I work in a pretty narrow domain.

I did use copilot for coding myself, and a lot of what it generated was unique. But it is also a good paraphrasing tool. Running a movie script backwards and forwards through Google Translate to get different phrasing, and then swapping out new names, does not a new movie make. Ditto here.


Perhaps they chose maintainers from OSS projects they scanned?

I'm not defending Microsoft's market tactics, for obvious reasons, but we do have to consider that anyone can publish whatever insignificant code as OSS and become an "OSS maintainer" out of nothing.

They have to draw the line somewhere. Nowhere they draw will make everyone happy.


What would be your estimate for how much time it saves?


One example of Copilot saving time - the other day I was trying to remember how to access a value from a map in Go. This doesn't take much time - alt tab to browser, ctrl + t for a new tab, type in "golang get value from map", click on a stackoverflow result, scroll down, glance at the result, alt tab back to IDE, and do it. With Copilot, Copilot knew what I was trying to do and suggested the right code to me and I accepted the suggestion by pressing tab.

But! I think the savings are even bigger because there is no context switch. If I have my browser open I might find myself going to hackernews, checking my email, looking at my stackoverflow notifications, browsing twitter - whatever. Copilot is not only faster, it keeps my focus on the code without giving me a chance to get distracted. In some sense, for this example, Copilot saved me 5-10 seconds by not needing to Google something. In another sense, it might have saved me an hour because I didn't decide to just check something on twitter while I had my browser open.


I don’t think the average brain works like this. If you know you have a piece of code to deliver or are in flow, I don’t think you’d willingly start browsing HN just because you went to SO to c&p some boilerplate code.

I find that if my brain is willing to be distracted, it’s made some sort of calculation saying that the cost of being distracted isn’t going to have a significant bearing on deliverables…and I’m pretty sure I’m no better or worse at estimations than anyone else.


I get the desire to ask that question, but it also feels like the main thing Copilot saves me is energy, which can’t be quantified in a very useful way. You’ll just have to take our word for it.

Like, writing a really boring unit test might only take 60 seconds, but if Copilot can do it for me (even if I have to quickly scan it for correctness) that saves me… well something other than 60 seconds. It sure feels like a big deal.


Why write that boring unit test at all? If it is not only boring but also automatically generated, then what value does it add to the code? I would argue it actually _decreases_ the value, because you now have that much more code to maintain and understand. In my experience, 90% of the unit tests people write are garbage.


This take is too reductionist. Every website ever has a boring unit test called “Is it up?” where you check if the site is still working after a deployment.

Boring and repetitive but of infinite value if you can detect early that your deployment broke something.

If a boring/boilerplate unit test like this can deliver value, other such unit tests are probably going to have similar impacts, so saying they all “decrease” value is reductionist.
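To show how small such a check can be, here is a self-contained Python sketch of the "is it up?" test, with a throwaway local server standing in for the deployed site (everything here is illustrative, stdlib only):

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    """Stand-in for the deployed site: serves a page with a hero image marker."""
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.end_headers()
        self.wfile.write(b"<h1>hero</h1>")

    def log_message(self, *args):  # keep the demo output quiet
        pass

def is_up(url):
    """The boring smoke test: fetch the page, verify status and content."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.status == 200 and b"hero" in resp.read()

# Port 0 picks a free port; the daemon thread exits with the process.
server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
print(is_up(f"http://127.0.0.1:{server.server_port}/"))  # → True
```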


> Every website ever has a boring unit test called “Is it up?”

Yeah, and it should come almost automatically from the tooling. If you are writing this by hand, you have a tooling problem.


Why can't we come at it from both sides?

Goal of library/language/framework designers: limit boilerplate and unnecessary code

Goal of tool/IDE designers: make it easy to not spend time on boilerplate and unnecessary code.

Both sides have built-in limitations that will keep them from completely solving the problem. Terse, highly DRY code tends to also be highly abstracted and hard to read, with a lot of implicit behavior. On the other side, large amounts of generated code lead to tool lock-in and cruft accumulation.


No product starts with an elaborate CI/CD pipeline and automated test suite. Many evolve into needing such tooling after having had an incident or two when a deployment broke the site…and the response invariably is that “we should have written a suite of really basic tests which goes to the home page and checks if the hero image is visible.”

If co-pilot’s autogenerated test cases can help prevent this head smacking, it will have proved that basic/boilerplate code was valuable.

The site checker included in the tooling is just a more mature version of the boilerplate unit test co-pilot gives us.

Btw, I have no skin in the game -never used co-pilot…just surprised that HN commenters can be so dismissive of wanting to get the basics right - like having some test coverage.


My own experience with it is rather limited. My workplace doesn't allow the use of it (due to confidentiality), so I've only used it for school work.

It wasn't much help in designing/implementing classes in Java or .NET, but when it came to implementing unit tests it practically wrote everything after I named the class and designated it as a test. It was able to extract all the different methods from the classes being implemented, and create appropriate unit tests based on that.

Now, it was school homework, so not representative of a complex business application, but if it can just handle the basics/boilerplate, it would be worth it.

Assuming a (European) work week of 37.5 hours, $10 comes down to about $0.06 per working hour (roughly 162 working hours a month), and if it can just save me 5 minutes of work every day it will be worth it.


For me, I would happily pay $50 a month. It probably saves me at least $10 worth of time every day.


In some tasks, like refactoring, I would easily say that Copilot saves 75% of my time.

For example, I just had to convert some OCaml code to Rust. I wrote the first few conversions, and then I would just paste the OCaml code in a comment, and it would auto-complete the equivalent code in Rust. I just had to check that it was correct, which it was most of the time, rinse and repeat and wow. One would have to be blind to not find copilot impressive, really, it's the future.


I've been using the beta since last year and I love it. I think most of the people complaining about it either haven't used it, or are judging it on its most verbose guesses. That's not what I use it for mostly. For me, it's just a really, really good autocomplete. It's best for things with common patterns and APIs. But hey, they're common, so I write them a lot. Boring stuff like creating a Response object, adding the right headers and response code. And look, it's worked out that I need a case for a 404 next, and has suggested the code for that too, and a 500 in my catch block. Sure, that's not much typing or cognitive load, but added up over many, many times it's easily worth it.

And no, I'm not a beginner. I'm a principal with over 20 years experience. I don't really use it for the Stack Overflow-type stuff, but even as an autocomplete it's worth the money. As it happens I'm apparently eligible for free access as an open source maintainer, but I'd pay $100/year in a heartbeat for it. I'd pay for Intellisense if that was $100 too.


How much time would you say it saves you per day? Ie. how much more productive does it make you?

I’m curious whether we’re talking 5% or 30%.


Well I'm not coding all day so closer to 5%, but for anyone earning over $200/month that should be worthwhile.


> Well I'm not coding all day so closer to 5%

If you don't code much, and Copilot only helps with 5% of that small amount of code, that makes it sound not at all useful.


That's not what my comment says


Read the thread again but slower this time


It's hard to tell, but if I were to place some baseline of "the tool is useful if it helps me this amount per day", it's WELL above that baseline. So if the tool counts as useful when it improves 1% of your day, then it's probably at 15-30% right now.


This has been the biggest productivity improvement to my workflow in years

No it doesn't "understand what I'm doing" or "get everything right" but that's hardly the point

It's often reducing the amount of labor I do at the keyboard by correctly guessing 90% of what I was going to type

It also often saves me from having to google how to do something; it's effectively serving me a search result right alongside my code

I'm lucky to be getting it for free but would have immediately paid $10. It needs to only save you minutes a month for that to be worth it

Also, the comments about it being "unfair that they're monetizing other people's work" are missing the point.

Github has created a product that many people use and through that effort created a large repository of code.

They are now releasing a product that is going to create a large amount of value in time saved, and are maybe capturing 2% of that. This is a great outcome for everyone.


Same here. This has absolutely helped improve the speed at which I code and reduced the cognitive burden significantly.

It's definitely not perfect, but it's worth the price to me and if I can pay and help the product improve, it's a no-brainer.


> This has absolutely helped improve the speed at which I code and reduced the cognitive burden significantly.

improved speed and reduced cognitive burden

Just these two justify the $10/month, despite the dozen other drawbacks of Copilot.


Agreed, copilot has been especially useful for boilerplate code or somewhat repetitive code like chess move enumeration, where the code for different pieces is similar but not the same.

It also saves a ton of time having to look up small pieces of syntax, I've taken to writing a lot of quick one-off scripts because copilot does a fairly decent job of generating code for the relatively simple individual steps.
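To make the "similar but not the same" point concrete, here's a rough Python sketch of sliding-piece move enumeration, where rooks, bishops, and queens differ only in their direction sets (the board representation is made up for illustration):

```python
# Hypothetical board representation: squares are (file, rank) pairs, 0-7.
ROOK_DIRS = [(1, 0), (-1, 0), (0, 1), (0, -1)]
BISHOP_DIRS = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
QUEEN_DIRS = ROOK_DIRS + BISHOP_DIRS  # similar but not the same: a bigger direction set

def slide_moves(square, directions, occupied=frozenset()):
    """Enumerate sliding moves from `square`, stopping at the board edge
    or on the first occupied square (landing there, not passing through)."""
    f, r = square
    moves = []
    for df, dr in directions:
        nf, nr = f + df, r + dr
        while 0 <= nf < 8 and 0 <= nr < 8:
            moves.append((nf, nr))
            if (nf, nr) in occupied:
                break
            nf, nr = nf + df, nr + dr
    return moves

print(len(slide_moves((0, 0), ROOK_DIRS)))    # → 14 squares from a corner
print(len(slide_moves((0, 0), BISHOP_DIRS)))  # → 7, the long diagonal
print(len(slide_moves((0, 0), QUEEN_DIRS)))   # → 21
```

Each piece's generator is a one-line variation on the same loop, which is exactly the near-repetition an autocomplete model handles well.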


100% agree. Thought I’d hate it and it’s been a huge productivity win


A wrongthink reply has been deleted. I thought HN had a policy of allowing criticism of Ycombinator (and hopefully ex-Ycombinator-led) companies.

The OpenAI threads are the exact opposite: they do not seem organic at all. Of course users probably do all the flagging, but it still gives a bad impression.


I suppose you mean https://news.ycombinator.com/item?id=31834991? Users flagged that. I don't think they were wrong to flag it - it's pretty flamey, and the site guidelines ask you to avoid flamebait, fulmination, and name-calling. Would you mind reviewing them? https://news.ycombinator.com/newsguidelines.html.

It's true that we moderate HN less, not more, when YC or a YC-funded startup is the topic, but (a) Github isn't one, and (b) we can't do any sort of moderation (less or more) on posts we don't see. I didn't see yours until now.

Re OpenAI threads - I'm not aware of anything non-organic going on there. As far as I can tell, HN users are just really interested in AI related stuff. Same for Deepmind threads, etc.

(Btw, although it's common for commenters who break the site guidelines to confer honorifics like "wrongthink" on their own posts, you don't need to resort to that to understand why users flagged your comment.)


I've been using Copilot non-stop on every hobby project I have ever since they let me in (2021/07/13) and I am honestly flabbergasted they think it's worth $10/mo. My experience using it to this day is the following:

- It's an amazing all-rounder autocomplete for most boilerplate code. Generally anything that someone who's spent 5 minutes reading the code can do, Copilot can do just as well.

- It's terrible if you let it write too much. The biggest problem I've had is not that it doesn't write correctly; it's that it thinks it knows how, and then produces code that looks good at a glance but has the wrong logic.

- Relying on its outside-code knowledge is also generally a recipe for disaster: e.g. I'm building a Riichi Mahjong engine and while it knows all the terms and how to put a sentence together describing the rules, it absolutely doesn't actually understand how "Chii" melds work

- Due to the licensing concerns I did not use Copilot at all in work projects, and I haven't felt like I was missing that much. A friend of mine also said he wouldn't be allowed to use it.

You can treat it as a pair programming session where you're the observer and write an outline while the AI does all the bulk work (but be wary), but at what point does it become such a better experience as to justify $10/mo? I don't understand if I've been using it wrong or what.


> how to put a sentence together describing the rules, it absolutely doesn't actually understand how "Chii" melds work

The more experience I get with GPT-3 type technologies, the more I would never let them near my code. It wasn't an intent of the technology per se, but it has proved to be very good at producing superficially appealing output that can stand up not only to a quick scan, but to a moderately deep reading, but still falls apart on a more careful reading. At least when that's in my prose it isn't cheerfully and plausibly charging the wrong customer or cheerfully and plausibly dereferencing a null pointer.

Or to put it another way, it's an uncanny valley type effect. All props and kudos to the technologists who developed it, it's a legitimate step forward in technology, but at the same time it's almost the most dangerous possible iteration of it, where it's good enough to fool a human functioning at anything other than the highest level of attentiveness but not good enough to be correct all the time. See also, the dangers of almost self-driving cars; either be self-driving or don't but don't expect halfway in between to work well.


I wholeheartedly agree with your analysis, but feel like it’s ignoring the elephant in the room: writing code is not the bottleneck in need of optimization. Conceiving the solution is. Any time “saved” through Copilot and its ilk is immediately nullified by having to check its correctness. From there, the problem is worsened by the Frankensteinesque stitching together of disparate parts that you describe.

I can’t imagine how Copilot would save anything but a negligible amount of effort for someone who is actually thinking about what they’re writing.


Right on the money.

What I want is a copilot that finds errors, spellcheck-style. Did I miss an early return? For example, in the code below:

    def some_worker
        if disabled_via_feature_flag
            logger.info("skipping some_worker")
        some_potentially_hazardous_method_call()
Right after the logger call I missed a return. A copilot could easily catch this. Invert the relationship. I don't need some boilerplate generator, I need a nitpicker that's smarter than a linter. I'm the smart thinker with a biological brain that is inattentive at times. Why is the computer trying to code and leaving mistake catching to me? It's backwards.
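That inverted tool is sketchable with nothing but Python's stdlib `ast` module. A toy nitpicker, with the caveat that the heuristic (a guard that only makes calls, never exits, yet is followed by more code) is made up for illustration:

```python
import ast

def find_suspect_guards(source):
    """Flag `if` blocks that only make calls (no return/raise/break/continue)
    even though more statements follow them: a likely missed early return."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        body = getattr(node, "body", None)
        if not isinstance(body, list):
            continue
        for i, stmt in enumerate(body):
            if not isinstance(stmt, ast.If) or stmt.orelse:
                continue
            exits = any(isinstance(s, (ast.Return, ast.Raise, ast.Break, ast.Continue))
                        for s in ast.walk(stmt))
            only_calls = all(isinstance(s, ast.Expr) for s in stmt.body)
            if only_calls and not exits and i + 1 < len(body):
                findings.append(stmt.lineno)
    return findings

snippet = """
def some_worker():
    if disabled_via_feature_flag:
        logger.info("skipping some_worker")
    some_potentially_hazardous_method_call()
"""
print(find_suspect_guards(snippet))  # → [3]: the guard on line 3 only logs
```

A real version would need many more heuristics, but it shows the inverted relationship: the human writes the code, the machine asks "did you mean to fall through here?"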


> I need a nitpicker that's smarter than a linter. I'm the smart thinker with a biological brain that is inattentive at times. Why is the computer trying to code and leaving mistake catching to me? It's backwards.

Hmmmm, that is actually a good observation.


Yes, that's a lot more interesting. Firing a "code review" bot at my code where it asks me questions would be potentially interesting. Even if it sometimes asked some blindingly stupid questions, if I was not obligated to respond to them, I'd be OK with that.

The main problem with that is, GPT-3 can't do that. Personally, while I sing the praises of GPT as a technology, and I do mean it, at the same time... it's actually not a very useful primitive to build further technology on. It's hard to build much more than what you see with Copilot on top of the question "if you were to continue this text, what would you continue it with?", because there is no concept of "why are you continuing it with that?" (which, in some sense, the neural net can answer, but the answer exists in a way that humans cannot understand, and there is no apparent practical way to convert it into something humans can understand).

So GPT-x may yet advance and is fascinating technology, but at the same time, in a lot of ways it's just not that useful.

It reminds me of the video game world, where we have just staggeringly unbelievable graphics technology, and everything else lags behind this spike. Being visual creatures, it causes us to badly overestimate what's actually going on in there. Similarly, it's great that AI has these talkerbots, but they've made a whole lot of progress on something that gives a good appearance, but doesn't necessarily represent the state of the art anywhere else. This AI branch of tech is a huge spike ahead of everything else. But it's not clear to me this technology is anything but a dead end, in the end, because it's just so hard to use it for anything truly useful.


Business is trying to commoditize software development, because it's slow and expensive. All they have ever wanted, is to get their automation with the blanks filled in correctly, more or less immediately.

No-code, visual programming, gherkin, even SQL are all prior attempts at reducing the expense of software development, and of sidestepping the expensive, excuse laden gatekeepers that are software developers.

Copilot is an MVP of a technology that will probably eventually succeed in doing this, and my guess is, it's going to make CRUD slinging obsolete very soon.

Copilot is not backwards, it's just that it's a convenience tool for the execution of business, not for software developers.

When version 2 of the tool can both code and error check, hopefully you're already promoted to architect by then...


Seeing the 10 dollar price tag has convinced me that you are right. This is a ploy to make a service that replaces developers, not one that helps them.

No way Microsoft made this investment for a measly 10 dollar subscription. There are not that many developers


After a few more leading attempts, I managed to give this prompt at https://beta.openai.com/playground:

     What problem does the following pseudocode have?

     def some_worker
             if disabled_via_feature_flag
                  logger.info("skipping some_worker")
             some_potentially_hazardous_method_call()
And received this response:

     The problem with this pseudocode is that there is no "end" keyword to close off the "if" statement. This means that the code after " some_potentially_hazardous_method_call()" will always be executed, even if the "disabled_via_feature_flag" condition is true.
And that's with a GPT3 without any special fine tuning. Of course, the name `some_potentially_hazardous_method_call` is pretty leading in itself. I rewrote the prompt slightly more realistically, as:

     What problem does the following code have?

     def optionally_do_work():
             if disabling_flag:
                  logger.info("skipping the work due to flag")
             do_work()
and received:

     The problem is that the code will still try to do the work, even if the flag is set.
This does seem like a pretty trivial easier-than-fizzbuzz question to be asking, though, since it's so encapsulated.


Perhaps curly braces are useful after all.


Not that OP's problem is curly braces per se (they would make the error more obvious, but they wouldn't help with a missing return call), but since we're talking about them...

The insistence of a lot of smart people on using whitespace for logic purposes is THE most baffling thing in the IT space. And I mean that.

Why use some, oh I dont know, CHARACTER, to write down what you mean, why not instead use a NON CHARACTER. Now that's a great idea!

Let's use non characters, so a mix of tabs and spaces (which editors can implement AND display in a number of different ways, even depending on individual configuration!) fucks up shit. Using whitespace is also great because copy/paste is now an error-prone exercise in frustration, which is definitely what we want! Oh and also this will make sure that the peasants are not using our beautiful language IN NAUGHTY WAYS, e.g. you can't really write a deeply nested logic in a single function if it becomes a complete abomination of a mess just after like two or three indentations.

No, but seriously, Python's syntax in regards to whitespace, or any language that uses whitespace for control structures, is hot garbage. I understand that there are preferences, coding standard, etc. and I can tolerate a lot, but this, this is the one hill I'm willing to die on.
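For what it's worth, the tab/space hazard described above is concrete in Python 3, which refuses to guess how wide a tab is when indentation is ambiguous (a minimal sketch of pasted-code breakage):

```python
# A function body indented first with a tab, then with spaces.
# Python 3 cannot know whether the tab means 1 or 8 columns,
# so it raises TabError instead of guessing.
snippet = "def f():\n\tx = 1\n        y = 2\n"

try:
    compile(snippet, "<pasted code>", "exec")
    result = "compiled"
except TabError as e:
    result = f"TabError: {e}"

print(result)
```

This is exactly the failure mode of copy/pasting code between editors with different tab settings: the code looks aligned on screen but won't compile.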


The insistence of a lot of smart (?) people on NOT using whitespace for logic purposes is THE most baffling thing in the IT space. And I mean that.

If you look at your typical C-like code, it is already whitespace-oriented, you just manually add the braces as line noise, to make it easier to write a compiler for the language (although even that may not be true). It is like using one character variable names, which - other than the trivial places - makes your code harder to read.

If you want to write deeply nested logic in a function: well, don't. But if you insist, I'm not sure how curly braces help you in this case.


You see I've seen this "argument" a million times and it's always the same: braces are somehow visual noise. And that's it. No more arguments, it's just this one. And it's a very weak one, one that is largely about preferences. On the other hand, I listed 3 examples of indentations actually being in the way of the programmer, ESPECIALLY when working with other people. (Another example: Indentations are notorious for not surviving basic text communications between people, imagine sending Python code over email or god forbid something like Skype. Good luck reconstructing the code on the other side. Because indentations have logic attached to them, you can't just copypaste and let your editor take care of the formatting - like you can in ANY OTHER NORMAL C-LIKE LANGUAGE.)


I would say it's good practice for code reviews =P


try PVS-Studio


It's missing an `else` branch, not a `return`.


This is one of those arguments I can see both ways, but I ultimately side on the early return side of things over the else branch side of things.

In my opinion, if the function has done its job, it should return. That's what return is for. As the function grows, the else side of the branch gets longer and longer, and it becomes error-prone to leave the first branch of the if statement relying on it.
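For reference, the early-return version of the buggy snippet from upthread looks like this (a sketch reusing the hypothetical names from the example; the `calls` list is only there to demonstrate the behavior):

```python
import logging

logger = logging.getLogger(__name__)
calls = []  # records whether the work actually ran, for demonstration


def do_work():
    calls.append("work")


def optionally_do_work(disabling_flag):
    # Early-return style: if the flag is set, log the skip and bail out
    # immediately, so do_work() can never run by accident.
    if disabling_flag:
        logger.info("skipping the work due to flag")
        return
    do_work()


optionally_do_work(True)   # skipped: calls stays empty
optionally_do_work(False)  # runs: "work" is recorded
print(calls)  # → ['work']
```

The `else`-branch version is equivalent here; the early return just keeps the happy path unindented as the function grows.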


> I wholeheartedly agree with your analysis, but feel like it’s ignoring the elephant in the room: writing code is not the bottleneck in need of optimization. Conceiving the solution is.

I dunno about this. I know the received wisdom is that "writing the code isn't the hard part", but I think reality is more like "writing the code is only one of the hard parts". There's an awful lot of badly-written code, or code which is only partly correct, or only correct under some circumstances. The only way to make writing code not one of the hard parts is to specify 100% of the functionality, every corner case, and all test scenarios, before any code is written. And then you still have to verify that it was translated correctly into code, which I think we can all agree is another one of the hard parts!

Conceiving the solution is hard, thinking of edge cases, what-ifs, and failure scenarios is hard, creating effective tests is hard, and writing the actual code understandably and correctly is also hard!


Yes, I think you’ve more precisely articulated what I had in mind. The point stands, though: codepilot does not help with the hard part of the job. It solves a problem that only exists for people who aren’t exercising care.


Writing the code is the hard part mainly inasmuch as it forces you to concretize and clarify previously vague notions. In that sense, it's hard to separate "conceiving the solution" from "writing the code," unless you're perhaps one of those rare geniuses who are simply able to dictate ideas fully formed in their head (I'm thinking of the scene in Amadeus when Salieri examines some clean, uncorrected sheet music and is then shocked to discover that Mozart doesn't make copies and he's holding originals).


> I know the received wisdom is that "writing the code isn't the hard part", but I think reality is more like "writing the code is only one of the hard parts".

Writing the code isn't the bottleneck. And there is no point in optimizing some part of a process that isn't a bottleneck.

Anyway, have you noticed that "understandably and correctly" isn't included in the OP's definition of "writing code"? That's for a reason, and it's the most adequate definition to use in this context.


I also agree that I’d never assume copilot is right when it blurts out code, and that “writing code” is not the hard part — but I’d note three things I found from using copilot pretty intensively over the past year or so:

1. It has shifted some of the code-writing I do from generation to curation.

Most of the time, I have to make some small change to one of the first options I get. Sometimes I don’t. Sometimes I get some cool idiomatic way of doing something that’s still wrong, but inspires me to write something different than I originally planned. All of these are useful outcomes — and unrelated to whether someone is “actually thinking about what they’re writing”.

2. It has changed my tolerance for writing redundant code, for the better.

Like many programmers, I tend to optimize my code for readability first, and then other things later when I have more information. Sometimes, my desire for readability conflicts with my desire for code that avoids redundancy (e.g., “oh but if I put these three cases into an array I can just use a for loop and don’t have to write out as much code” etc. etc.) — and my old bias was avoiding redundancy more often than not. But copilot is really great at generating code that has redundancy, which has often helped me write more readable code in quite a few cases.

3. I refactor code way more now.

In part this is because, given code that already works but is not ideal (e.g., needs to be broken into more functions, or needs extra context, or some critical piece needs to be abstracted), copilot does a fantastic job at rewriting that code to fit new function prototypes or templates. IDEs can help with this task, for a few common types of refactoring, but copilot is way more flexible and I find myself much more willing to rewrite code because of it.

Copilot is not what many people want it to be, in much the same way that Tesla’s Autopilot is not what many people want it to be. But both do have their uses, and in general those uses fall into the category of “I, as human, get to watch and correct some things instead of having to generate all things.” This can be very useful. (FWIW, it takes some time to adapt to this; I teach and mentor a lot and I found myself relying on those skills a ton when working with copilot.)

We shouldn’t discount this usefulness just because these systems don’t also have other usefulness that we also want!


I don't think you've actually used it, tbh. It's much quicker to read code than to write it. In addition, 95% of Copilot's suggestions are a single line, and they're almost always right (and also totally optional).

Here's how it actually works in practice

1. Start a line to do an obvious piece of code that is slightly tedious to write.

2. Type 2 characters and Copilot usually guesses what you want based on context. Perhaps this time it's off.

3. No matter, just type another 3 characters or so and Copilot catches up and gives a different suggestion. I just hit "tab" and the line is complete.

It really shines in writing boilerplate. I admit that I'm paranoid every time it suggests more than 2 lines, so I usually avoid it. But in ~a year of using it I've run into Copilot-induced headaches twice. Once was in the first week or so of using it; I swore off using it for anything more than a line then. Eventually I started to ease up, since it was accurate so often, and then I learned my second lesson with another mistake. Other than that it's done nothing but save me time. It's also been magnificent for learning new languages. I'll usually look up its suggestions to understand them better, but even knowing what to look up is a huge service it provides.


I swap between programming languages a lot, and Copilot saves me a lot of "what's the syntax for for loops in language X again?" style friction, along with suggesting correct API usage patterns. It just saves on the friction of writing random scripts.


>I can’t imagine how Copilot would save anything but a negligible amount of effort for someone who is actually thinking about what they’re writing.

since I have a right arm swelled up to twice normal size right now and it hurts to type for more than ten minutes (hopefully ok in a few days) I can imagine an advanced autocomplete being really useful for some disabilities.


More so than, say, classical snippets, auto-complete, and speech-to-text?

And pray tell, how much typing is required to go back and fix the incorrect code produced by copilot?

P.S.: wishing you a speedy recovery!


the person using co-pilot on all their hobby projects described it as working best as an advanced auto-complete, so I guess you should ask them that.

I figure advanced auto-complete should not produce big blocks of code that are more likely to have logical errors in them, since the grandfather comment here suggested that problems show up when you generate larger blocks of code.


this is consistent with the feedback from many of the users we have talked to as well (full transparency: I am with Tabnine). Long blocks of code are difficult to digest, while short ones are very quick AND easy to validate.


Can Copilot write tests? That way it could test its own code and tweak it until it works.

Of course, one would then ask how to verify tests. I suppose Copilot could write meta-tests - tests that verify other tests. That way it could test its own tests and tweak them until they work.

Of course, one would then ask how to verify meta-tests. I suppose Copilot could write meta-meta-tests - tests that verify meta-tests. That way it could test its own meta-tests and tweak them until they work.

Of course, one would then ask how to verify meta-meta-tests...


You need an adversarial co-pilot to write tests for those tests so you would put the two AIs against each other to try to properly test.


I've used Copilot to help with writing verbose unit tests. It can do it as long as you keep an eye over it (basically like an autocomplete), it definitely cannot produce robust test cases on its own though. If you try to do that, it won't take "meta-tests" to figure out they don't look right.


>Can Copilot write tests? That way it could test its own code and tweak it until it works.

Sure it can. But you can't rely on them being good. You have to read the tests carefully.


I can almost envision a future where human devs write tests, code generating frameworks build code from a spec.


I think that's the underlying idea of Logic Programming.


And UML diagrams.


Isn’t this how they managed to stop the Borg?


Given my vintage, I'm thinking more about the noughts-and-crosses game in War Games from 1983.


Part of the pitch is that it helps you learn new languages, which I do sort of buy.

But yeah, the hard part of writing nontrivial software isn't typing code, it's the software architecture and design.


If you have a sufficiently well defined solution to a problem, then you have the code. The next step is just to compile it into something a machine understands. In other words, the code IS the solution, there is no difference between the two.


Only for the most trivial problems. Having seen the same problem implemented both with a spaghetti ball of shit vs something well organized that can be easily read and maintained I’m going to hard disagree on this sentiment.


You haven’t tried Copilot, have you?


I have, and it was producing impressive results. However, I was trying to learn Rust and to do so I needed to do the hard yards myself. Moreover, in my brief time using it, it was switching me to a code reviewing frame of mind rather than thinking through the problem.

I thought it might be more useful to me for a language I’m already good at, or one I’m not trying to master but just need to get a task done for.


Generated texts often sound very confident, even when they are totally incorrect.

A humorous example: https://cookingflavr.com/should-you-feed-orioles-all-summer/

Human pair programmers will signal when they're not sure about something. A code generator will not.


I had to log back in just to thank you for this link. I've encountered these sites before, and told people about them, but this is just such a perfect chef's kiss example. Sheer perfection.


That link is... really something - it actually gets better and better the further down you go.

By "better", I mean more absurd, shocking and funny :)


when we first built Tabnine we had confidence percentages next to suggestions. Do you think this would help?


In my opinion it's probably not worth adding unless the highest-confidence suggestion for a particular completion is significantly lower than average. So it informs you when it's especially uncertain and doesn't have anything better to offer, but that's it.


How has tabnine been responding to the development of Copilot? I've looked into it and the most enticing feature to me is the ability to opt out of it using your own code to train (not even sure if that's a correct interpretation)

However it's even more expensive than copilot...


Wow, it actually sounds like a great tool for someone who doesn't actually know how to program at all but still managed to get a programming job. Sounds like it could be literally years until they realize you don't know how to program and are using a GPT-3-type completer.


Copilot does a great job of example functions like “function that posts a tweet with the current time”

It falls apart when writing actual code that exists in an app. I’m not convinced even the lowest junior dev could get away with not knowing programming.


I ran into the same problem of "sounds true" when testing the limits of GPT-3 as a general purpose "knowledge engine", used to answer questions about the real world. Either it doesn't understand the difference between truth and fiction, or it doesn't care. The output is 95% truthful and 5% outright fabrication. Even worse, it writes better than most humans, so the fabrications often come out sounding extremely convincing.

I once had it generate an entire interview with an author, which was so realistic I was sure it had encountered it verbatim in the training data. The interview was about one of his books. Turns out such a book didn't even exist, but GPT-3 knew real facts like the name of his publisher, the names of employees there, etc., and wove them into the story.

The best use I've found for GPT-3 is text summarization, it seems to do very well on that front. I think OpenAI are working on a hyperlinked interface that lets you jump to the original source for each fact in the summary.


After looking at lots of these new models, I've come to the conclusion that they're all basically weighted gibberish generators that produce output not entirely dissimilar to old fashioned hidden markov chains -- just more sophisticated. The corpus is larger and the weighting scheme is much more sophisticated (i.e. you can use prompts), but at the end of the day they're sort of just barfing out nonsense that does a better job at tricking humans into thinking they're doing something smart. These models have really just introduced the concept of differing "qualities" of gibberish on a sliding scale from

   random words ---> markov models ---> transformer ---> human writer
Inevitably, users of these kinds of models want more and more specific output, to the point that they don't really want what the models produce at all; they're just trying to get a computer to write exactly the thing they already have in mind. Eventually all the tuning and filtering and whatnot turns into more work than just producing the output the user wants in the first place.

It's just a room of monkeys banging on typewriters at the end of the day.
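To make the comparison concrete, here is a toy first-order word-level Markov generator (corpus and names are made up for illustration; real transformers differ enormously in scale and mechanism, but the "pick the next token from a weighted history" shape is similar):

```python
import random
from collections import defaultdict


def train(text):
    """Map each word to the list of words observed to follow it."""
    words = text.split()
    model = defaultdict(list)
    for cur, nxt in zip(words, words[1:]):
        model[cur].append(nxt)
    return model


def generate(model, start, length=8, seed=0):
    """Walk the chain, sampling each next word from the observed followers."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        followers = model.get(out[-1])
        if not followers:
            break
        out.append(rng.choice(followers))
    return " ".join(out)


corpus = "the model writes the text and the text reads well"
model = train(corpus)
print(generate(model, "the"))
```

Every word it emits is locally plausible because it was observed in the corpus, yet the chain has no idea what it is "saying": the same property, scaled up enormously, is what the parent is gesturing at.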


> it has proved to be very good at producing superficially appealing output that can stand up not only to a quick scan, but to a moderately deep reading, but still falls apart on a more careful reading

Huh, that’s my experience with human-written texts and journalism in particular.


To me it's no danger, since I read what it generates. If it's wrong I either correct it or write it from scratch.

And I also write tests, which should catch bad logic.


> licensing concerns

I dismissed these concerns before I had early access.

Then, _literally the first characters I typed_ after enabling the extension were `//`, and it autosuggested:

    // Copyright 2018 Google LLC
I immediately uninstalled it.

https://mobile.twitter.com/mrm/status/1410658969803051012/ph...


In both a moral and a practical sense, Copilot is a licensing nightmare.


> I've been using Copilot non-stop

> I am honestly flabbergasted they think it's worth 10$/mo

These two statments seem contradictory to me. Why are you using it 'non-stop' if it isn't even worth $10/month?


I don't think they are contradictory at all. Aside from the basic "it has value, just less than $10/month" option, they may also just be interested in the tech and are evaluating it in actual use, etc.


GPT-3 and AI-enhanced code completion has had a ton of hype going on, up to and including claims that software development as a job is at existential risk. Using Copilot non-stop to investigate and understand why there's so much hype and coming away with the opinion that it's not worthy of $10/mo is not contradictory. Would you rather someone who hasn't used a product extensively make value claims?


It’s always like this with the latest AI. You would think people would learn, but nope same exaggerated claims every time.


The claims are sound. It's the technology that is usually not sound, or mature.

But slowly enough many jobs are being automated, both with and without machine learning or whatever technique they are calling "AI" today.


You overlooked an even bigger contradiction! Namely:

> The biggest problem I've had is not that it doesn't write correctly, it's that it think it knows how and then produce good looking code at a glance but with wrong logic.

I cannot rightly apprehend the kind of confusion of ideas that would provoke such a statement.

EDIT: upon careful rereading, I think I misunderstood. The intended meaning is likely closer to: the problem is less so that codepilot produces incorrect code and more so that its incorrect code appears correct at first glance.

You have my sincerest apologies. I leave this thread intact as a testament to my hair-trigger snark.


I won't rightly describe the confusion your post gives me. Nothing he says seems contradictory. It produces good looking code which upon further inspection has faulty logic.

You would expect that from a program that copies a database of all the examples in the world (or whatever) and then just does an autocomplete without any kind of comprehension of what the problem is that is trying to be solved.

No confusion or contradiction at all.


So, the problem is that it produces incorrect code…


Specifically, the problem is it produces _almost_ correct code, which is worse than incorrect code because it might fool you into trusting it.


Quite. So we agree that this code is incorrect, and thus, that we have a contradiction on our hands.

To be clear: we’re in agreement that incorrect code that passes for correct at a glance is even worse than obviously-incorrect code.


My read was that it produces code that is correct in some circumstances, but is incorrect for the author's use case.


I realize I misinterpreted the OP, but you’ve set me up for a snarky remark so perfectly that I can’t resist. Please forgive me… here goes:

> it produces code that is correct in some circumstances, but is incorrect for the author's use case.

That’s a mighty convoluted way of saying “incorrect code” ;)

Phew! I feel better, now!


"This code would be absolutely correct were it in a different program trying to achieve a different result."


Not OP - I use it non-stop for boiler plate filling as well.

I wouldn't use it for anything other than that, so I would say it's worth honestly at max $1/month.


$1 is worth what, one minute of your time? If you're using a tool that doesn't even save you one minute, why bother?


Most people don't make purchasing decisions based on the value they create but rather based on some ingrained assumptions about how expensive software is supposed to be. VSCode and many other complex pieces of software are free, autocomplete is built into my OS, and those subscription consumer software that does have a price usually are priced very low—so relative to those, $10/month feels like a lot (even though I hope that practically anything anyone makes the effort to subscribe to produces at least $10 of value for them).

Some companies seem to be leaning into higher subscription pricing (Superhuman and Motion come to mind) and almost certainly produce far more value than their subscriptions cost if you ask me, but there's definitely a mental barrier to value based pricing to consumers, as well as the fact that with so many companies offering cheap/free software, the market isn't solely determined by value created but rather comparison against other software.


You'd think "very smart" programmers would be equipped to do basic cost/benefit analysis. If it costs $900 and provides $1000 in value, that's $100 more than not buying it.


It depends if as a employee, your working time stays the same.


If you’re an employee your company should pay for it. If not, find a better employer!


You'd think "very smart" programmers would be equipped to consider additional factors such as the cost of getting reliant on a tool controlled by a single vendor as well as the effects on other things that they care about. You'd think they also understand that people are not machines and reducing a task by a certain time does not neccessarily mean that they can get that much more work more done overall.


If the amortized benefit is less than one minute per month, then it's probably best to simply pass instead of hoping for a price adjustment.


What proportion of people are making $1 per minute?


$1 per minute is approximately $120k annual salary, so... almost anybody doing software development in the US?
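The arithmetic behind that figure, for the curious (assuming a typical full-time year of roughly 2,000 working hours):

```python
# What annual salary does $1 per working minute imply?
hours_per_year = 40 * 50            # 40 h/week, 50 weeks/year
minutes_per_year = hours_per_year * 60
salary = minutes_per_year * 1       # dollars, at $1/minute
print(salary)  # → 120000
```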


That's not true. The median annual pay for a software engineer in the US is a bit under $100k. (The Bureau of Labor Statistics publishes good data on this.) That means most people doing software development in the US are making under $100k.

Almost anybody doing software development at a well-funded tech company is going to be making over $120k/yr, yes. But it turns out there are lots of other kinds of programming jobs, too.


And the USA is what, 4% of the world population?


I've had Copilot enabled since early beta. I think it has saved me ... 15 minutes of typing in total? A few times it has caught on to a repeating pattern and filled a tedious bit of [({}{})] -style javascript correctly.

Most of the things it does for me I could replace with a library of snippets if I could be bothered to set one up.

Not really worth a monthly cost equivalent to, say, Disney+ - which I use tens of hours every month just by myself.

If my employer paid for it, I wouldn't scoff at it, but I'm not paying a cent of my own money for it.


You probably write code in a way that’s hard for the AI to predict. Write stubs, name things first, wire things early, and watch the fireworks.


They gave it a fair shake, decided the price wouldn't be worth it. It's not inherently contradictory

I would consider it contradictory if they decided to continue using it while paying that price and unsatisfied

It's a trial run and the value isn't there for them


My interpretation is that it's fun to use so they use it a lot but not altogether useful (eta: and/or necessary): they didn't miss it on work projects.


It looks like they used it "non-stop" on hobby projects to see what it's capable of.


It's an extension... You'd have to go out of your way to disable it. It's also completely optional so it's (almost) only additive. The only harm is sometimes it generates suggestions faster than I can press tab to fix my spacing


At first, I read it as if he thinks it's worth way more.


It honestly might be the kind of product where it might make sense at a higher price point, with more customer buy-in (and support contracts!) to work around its quirks rather than accept everything it spits out at face value. It's a more common pattern than we're used to seeing, I think.


Interesting to hear your experience. I've been using it for over a year, and I've come to appreciate the (modest) productivity boost that it's given me, to the point that I feel $10 per month is probably worth it.

The completions are often trivial, but they save me from typing them by hand. Sometimes they are trivial yet still wrong, so I need to make corrections, wasting some of the gained speed. In total these probably won't save me much time in a day.

However, every couple of days there is one of these cases, where it can do tedious work that really saves time and headaches.

Example:

- After writing a Mapper that converts objects of type A to B, I needed the reverse. Co-Pilot generated it almost perfectly in an instant. This can easily save a minute or two, plus the thinking required.

- For a scraper, I needed to add cookies from my browser into the request object. Basically, I pasted the cookie in a string, and typed `// add cookies`, and it generated the code to split the string, iterate over each cookie value and add it to the correct request field.

So if a few of these cases can save 10 minutes in a month, I feel it's objectively worth it. Then subjectively, not having the headaches of 'dumb stuff'/boilerplate feels great, and I am glad to spend my energy on the actual hard stuff. I will sign up as soon as their sign up page lets me.
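The cookie case above amounts to something like this (a sketch, since the actual generated code isn't shown; a raw `Cookie` header is a standard `name=value; name=value` list):

```python
def parse_cookie_header(raw):
    """Split a raw Cookie header string into a name -> value dict."""
    cookies = {}
    for part in raw.split(";"):
        part = part.strip()
        if "=" in part:
            name, _, value = part.partition("=")
            cookies[name] = value
    return cookies


header = "session=abc123; theme=dark; csrftoken=xyz"
print(parse_cookie_header(header))
# → {'session': 'abc123', 'theme': 'dark', 'csrftoken': 'xyz'}
```

Trivial to write, but exactly the kind of two-minute chore where autocomplete-on-steroids pays for itself.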


I'm the CTO for a small(ish) software consultancy. $10/month is a no-brainer price for just the "amazing all-rounder autocomplete". Spending $10/month/dev to help maximize highly billable engineers? It's well worth the price.


would you prefer it trained ONLY on your code or are you ok with the broad use of non-permissive code used for CODEX?


Yeah, I was thinking this... Does Codex provide other directed sources of learning? I guess I should look deeper into textsynth/Bellard's GPT stuff, as it would be nice to have some specialised learning here, like only HTTP-level client/server interfaces or whatever... But a Rails-like generator probably gets me there way faster.


Of course GPT-3 doesn't "understand" what you are doing. All it's doing is generating high probability text based on a huge training corpus. It's guessing what text will come next. That doesn't mean it understands jack squat. It's basically a parrot with a huge database. Polly want a program?


The fact that such a program is so hyped, is actually an indicator of how much boilerplate and wheel-reinvention goes on among programmers every day.

The state of the sector is somewhat embarrassing. We have armies of monkeys well-paid to bang out the same Java/Javascript/C#/Python over, and over, and over...


Microsoft spent a Billion dollars for an exclusive license of GPT-3, and now they want their return. Expect to see GPT-3 hyped on every platform (including Github).


That is also how your brain works.


I've also been using this for months, and would not pay for it. I think I might be getting it for free actually, as I haven't been asked to pay yet.

I came to the same conclusion as you, you can see comments I made elsewhere in this thread. I'm not thrilled with it.


It has been free but now they're making it a "free 60 day trial" followed by $10 a month.

I've tried it with a few projects with different languages and it's not worth anything close to that $10/m fee personally.

It's OK at filling in a line here and there if it's boilerplate-type code but otherwise, it's like a beginner programmer at best.


Github Copilot is on the fence for me between yes/no at $100 per year. I agree that you should rarely if ever allow Copilot to write multiple lines, as your double-checking or debugging time is going to exceed your time savings — the probability of good-looking but bad code is just that high right now. In order to experience a net time-win you'll likely want to be an intermediate at whatever you're doing.

If it were $60 yearly it'd be an auto-yes for me.


Your decision making delta is really $40 a year? How much is your time worth?


I think "SAAS fatigue" is a thing that needs to be considered. The SAAS model is great for startups and companies seeking recurring revenue. But the modern developer stack now involves dozens of companies gunning for a $5-10/month slice of the pie.

In isolation, most developers could easily afford the $10/month for copilot. But most developers are probably using the free tier for half a dozen services. So the question isn't "Can I afford copilot?", but rather "Does copilot provide more value than upgrading plans on some other service?". For example, if you are using the free tier on Slack, maybe upgrading to the paid tier so you can access the full chat history provides way more value than copilot.

Also, another consideration is that $10 per month is certainly small. But I generally use software I purchase for multiple years. I would guess on average I use a piece of software for 3-5 years. If Copilot was offered for a single purchase price of $300-500, would you pay for it? Because that is likely how much you will spend over the lifetime of the subscription. For me, that price point is approaching the territory of professional tools like CAD software, Photo/video editing software, etc...

I can certainly see why Copilot would be worth $10/month. But I also could see why someone might be uncomfortable with that.


> modern developer stack now involves dozens of companies gunning for a $5-10/month slice of the pie.

Can you name the most useful ones? So far my only subscription is IDEA. I'm considering trying Copilot, as I've heard many good things about it.


Ngrok is a must-have if you work with webhooks. The free tier is good, but paid lets you have a fixed URL rather than having to update it daily.


Shameless plug: www.svix.com/play/

Also gives you a fixed URL and is free, and there are quite a few other free tools out there.


The time savings win is really that marginal. I'm not sure I can save more than 3 hours per year with Copilot. And this isn't saving 3 hours in a single week, this is saving a few seconds here and there accumulated over a year.

Saving time with Copilot is itself a learning process and a probabilistic affair. Copilot can win you a few seconds at a time, but can easily set you back minutes if you aren't careful or experienced. It's the probability of a downward spike in time-win that makes it such a gamble. Such complex deals just turn on the cautious side of my brain.


Just to clarify: are you saying you'd pay $20 for an hour saved, but not $33.33 for an hour saved?


I'm saying that at $100 yearly I'm on the fence of maybe yes or no. At $60 yearly I'm auto-yes without having to think in rational terms. I guess I'm just not at that place in life where $100 is the tier in which I think emotionally.

Also, if I magically knew that I could save you 3 hours yearly, but it were spread out over the course of a year, and that your savings would occasionally spike down into negative and then slowly climb up, I just wouldn't entertain such a complex offer at such low numbers. People pay insurance just to avoid such incidental downward spikes.

Copilot's biggest limitation right now is that you can't dare to chase minutes of savings per day without inviting the risk of a severe spike in debugging time, the kind that wipes out all your savings. This means your savings can never spike upward.


Sharpe ratio too low


$40 can go a long way. In my personal scenario:

- Money is worth more to me than the average US dev because I earn less than US developers, and therefore my time is definitely worth less.

- I cannot use this for work at my current workplace, and I'm willing to bet a lot of other companies aren't fine with it either. I'm not saving time where it makes me money, so I would classify it as a luxury, not a tool (spending-wise).


Here's how I think about SaaS investments. If it's something I want or am curious about, but doesn't really have a tangible ROI, I decide if it's worth my disposable income and disposable time. If I have neither disposable income nor disposable time, it's not worth it, no matter whether it's $5 or $500/mo. You see, even for $5/yr my time is worth MORE than that money and the cost doesn't make my time worth any more or less.

If it has a tangible ROI, then I figure out how much my time is worth, I figure out how much time or other resource the SaaS app will save and then decide if it's worth the tradeoff. For example, I suck at graphic design, so a monthly $13/mo to Canva is worth it to me to save time, aggravation, and headache, not to mention improved quality of results. I know that I save myself much more in time than the $13/mo is worth.

On the other hand, I can't justify paying even $15/mo for a podcast transcription tool, because I still have to spend dozens of hours checking the transcription and it doesn't save me any headache. So it's not worth it to me. It doesn't matter if it's $60/yr or $100/yr, my time is still worth the same. If it's not worth it at $60/yr, it's not worth it at $100/yr.

Maybe this thought process is different for others, but with so much SaaS out there, it's important to focus on what will drive high value. Incremental "auto-yes" spending at any price point can get you into trouble.


You can justify almost any expenditure using this logic. Think marginally.


In the long run, I expect very few people will pay for a dedicated Copilot account. It will get bundled in some "development enterprise bundle", heavily discounted. Employees in medium and large shops will just receive it as standard, with their VS or Github paid licenses. Which means it will actually cost half the sticker price.


I don't see this working in businesses large enough to have a legal department


Nah, the legal hype will die in a year at most.


Licensing is a critical question that is often not considered. A model trained on non-permissive code (think Oracle APIs) carries very significant risk; ask Google. We took a different tack three years ago in building Tabnine: we went with only fully permissive code for training, the ability to train on your own code base, and zero sharing of your completions. We also give the developer the flexibility to adjust the length of completions if you want faster, shorter suggestions.


It's interesting to draw parallels between the way you describe it and the way more general large language models (LLMs) operate; Copilot is, in a sense, a specialized instance of an LLM applied to code instead of general language. They also always "know" how to answer any specific question, or how to complete any prompt, without exception. A model that could "show restraint", and "know when it doesn't know", would be a really impressive improvement to this technology in my opinion.


There are language models that have an internal search engine; they can copy and verify the facts they generate against the source. They are also easier to update: just refresh the search engine. Now you have to provide a collection of "true facts".


Could you please link to some examples of such systems? Thanks in advance!


As a productivity booster I think it’s worth more than $10.

The licensing problems make it impossible to use at work so I won’t use it for that.

People need to be aware of the security risks of letting Microsoft read all your code as it’s sent to the servers Copilot runs on. By my lights that’s almost as big of a problem as licensing.


> Due to the licensing concerns I did not use CoPilot at all in work projects

I've rarely found that CoPilot produces more than a line or two of accurate code. How likely is it that one would run into licensing issues with a single line of code that looks similar to something from another codebase?

While I understand the problem in principle, I am really skeptical that significant licensing issues would really come up with using CoPilot as an individual.


I think your response highlights why individual developers are the worst target market and why you want to sell to businesses if you're in the tool space.

Let's say the average developer in the US costs $10k a month (I think that's pretty close to the real average of around $120k a year). So Copilot would cost 0.1% of that developer's salary. I realize calculating things around "improvements in developer productivity" involves lots of fuzzy math, but it would be stupid for any company NOT to pay this if it improves developer productivity by just 1%.

Another way to think about it that I think may be more "real world": Let's say I'm CTO of a big company with 1000 software developers. Do I think it's going to be a better investment to hire another developer so I have 1001 developers, or instead use that other developer's salary to buy all the devs at my company a Copilot license?

But for some reason individual developers think that anything over $1-2 a month is an exorbitant cost.
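The back-of-envelope math in the comment above can be sketched out; the $10k/month developer cost and the 1% productivity gain are the commenter's illustrative assumptions, not measured figures:

```python
# Illustrative ROI sketch using the comment's assumed numbers.
dev_cost_per_month = 10_000.0      # assumed: ~$120k/year fully loaded
copilot_cost_per_month = 10.0      # subscription price

cost_fraction = copilot_cost_per_month / dev_cost_per_month
productivity_gain = 0.01           # hypothetical 1% improvement

value_per_month = dev_cost_per_month * productivity_gain

print(f"cost as a fraction of salary: {cost_fraction:.1%}")   # 0.1%
print(f"value recovered per month: ${value_per_month:.0f}")   # $100
```

Under these assumptions the tool returns roughly 10x its price per developer per month, which is the commenter's point about selling to businesses rather than individuals.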


If you're an employee what percentage of that productivity increase accrues to you?

I would not spend $10/month on a code completer for my job, because I would probably never see those $10 back in salary. I doubt the company would even notice a minor bump in productivity of, say, $100 a month.


> - It's terrible if you let it write too much. The biggest problem I've had is not that it doesn't write correctly, it's that it thinks it knows how and then produces good-looking code at a glance but with wrong logic.

So the same problem ML has in every endeavor where we have a good metric of "correctness" that's distinct from plausibility, like OCR or natural language translation: very good at spitting out stuff that superficially resembles training data, and whether that happens to be right is totally accidental. Surprisingly good odds if you're working on something boring within the "bounds" of the model, sure, but also pretty likely to think that "now on sale" is a zillion times more likely to be announced on an advert than "a decision has been made to release this product (at an unspecified future date)."


Working out what it's worth is tricky. I just look at what I'm paying for at the moment: the JetBrains All Products suite costs me $150 yearly, so Copilot at $100 a year seems insanely out of proportion. In fact, all of the SaaS-type products that run around $10 a month per dev offer far more significant functionality. For me, Copilot mainly fills in boilerplate code, which is useful; from time to time it generates a great function, but it would have been trivial for me to write it too. I like it, but compared to other tools its pricing seems out of whack.

Some are trying to use time = money as justification, but it rarely works as a direct translation like that when coding. In fact, I doubt it has that much impact on time; it just makes some things less mundane to do. I have other plugins that are free that probably save me way more time. If all the plugins that help started charging at the same rate based on time savings, it'd be a nightmare.


  it thinks it knows how and then produces good-looking code at a glance but with wrong logic.
This is so accurate. I still like Copilot and I might even pay for it, but I will never trust the logic. It's always wrong in a way that _almost_ looks right.


A developer easily costs $100 an hour. This means that if Copilot saves you more than 360 seconds in a month it’s paid for itself.

I’m honestly flabbergasted that anybody would think it isn’t worth $10 a month, despite its many serious flaws.
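The break-even claim above is simple arithmetic; the $100/hour developer cost is the commenter's assumption:

```python
# Seconds of saved developer time per month needed to cover the subscription.
hourly_rate = 100.0    # assumed developer cost, dollars per hour
monthly_cost = 10.0    # subscription price, dollars per month

# $10 buys 0.1 hours of developer time; convert to seconds.
break_even_seconds = monthly_cost / hourly_rate * 3600
print(break_even_seconds)  # 360.0
```

That is six minutes per month, though as the reply below notes, any time the tool costs you (e.g. reviewing its suggestions) has to be netted against that.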


That calculation works if it doesn't also cost you any time - which by all accounts it does, e.g. in review.


To me it's easily worth twice that, so I'm happy to pay $100/year.


JetBrains costs $150/y. IMO, an autocomplete is not worth $100/y at all, let alone $200/y.


I also happily pay for JetBrains.


I've been using Copilot for almost 10 months; it's useful when learning new code, but after a while it becomes just a slightly more advanced autocomplete.

I think it is good for short lines and repetitive tasks; for example, when writing tests and you want to assert different fields (string, int, etc.), it was really good and fast for those sorts of lines.

My main problems: 1. it sometimes makes a horrible mistake that takes a couple of minutes to understand; 2. it repeats the same mistake over and over; 3. adding a single tab takes a bit of time, so I had to copy & paste a tab to avoid the Copilot suggestion!


I've used it only slightly longer and had pretty much the same experience. I find it hard to believe anyone is really letting copilot write more than 2 lines of code for them at a time.

It's an AI-powered autocomplete, and honestly it's excellent at that. All I really want is an AI-powered autocomplete, and if a FLOSS project took up the challenge I'd happily donate $10/month to see it succeed, especially if it meant none of the licensing concerns that come with GH Copilot.


+1. I've also been using it for hobby projects and have largely the same conclusions. I really do like that it spits out boilerplate for me, but when doing more than that, I still have to double-check all of it because as you said it does create incorrect, good looking output.

I can't justify $10 a month for it. Maybe as it improves.

EDIT: To clarify, $10 a month for personal use. We can't use it at work due to licensing, or it'd be worth that just to emit boilerplate.


I'm genuinely curious: How do you value your time?

If you're an engineer who is paid $150/hour and Copilot saves you 5 minutes/month it just paid for itself.


Hobby projects don't make any money so it doesn't make sense if I pay for it to be more productive for the company. If they pay, great.


I’ve been using it as well. As annoying as it is, I am sure I would miss it enough to pay $100/year. Luckily, I somehow qualified for free access…


I'd be willing to pay more than $10/month for it personally. It has a greater impact than Netflix on my day-to-day. Also, I would expect my work to expense it.


I turned off Copilot a week ago for the same reason. The code it generates _looks_ right but is usually wrong in really difficult-to-spot ways, in code you’d never write yourself.


I assume that Copilot uses open source projects on GitHub to “learn” what to suggest. Some of these projects might be under, say, GPL v3. A developer uses Copilot for a commercial project and Copilot inserts a complete function from GPL v3 code into this project. Someone notices this and sues the commercial project for breaking GPL v3. Will GitHub pick up the tab?


Damages. This question will come back to damages.

If you steal 10 lines of code from me, the damages will be the greater of:

- The benefit to you (10 minutes programmer time)

- The cost to me ($0)

- Statutory damages (probably $200)

In other words, it's very unlikely to be worth a lawsuit. The most likely outcome is:

- A legal letter is sent

- Infringing code is removed

- As good bedside manner, some nominal amount of money is transferred, mostly in some gesture designed to make the violated party feel good about themselves (e.g. a nice gift).


An example of how much copied code is worth:

https://en.wikipedia.org/wiki/Google_LLC_v._Oracle_America,_...

For this content:

   a nine-line rangeCheck function, several test files, the structure, sequence and organization (SSO) of the Java (API), and the API documentation.
The cost was: "statutory damages up to a maximum of US$150,000".


I don’t believe the nine lines of code were the relevant part leading to damages. It was the fact that Google copied this entire API design (SSO) for Java. I don’t think GPT-3 is in danger of doing that.


Also don't forget that the Supreme Court has ruled that APIs aren't copyrightable after all (or at least fall within fair use).


Now I understand why US IT consulting corporations have expanded into multinationals


> - The benefit to you (10 minutes programmer time)

That's an incomplete view. You're judging the value by the time it'd take to rewrite it.

The real value is in knowing what to type and why.

When Copilot suggests GPL code to you, its main value is the knowledge, not the typing.

That piece of knowledge may have taken a LOT of effort from an OSS team to acquire.

Depending on the context, this knowledge would be worth millions.

Worth a lawsuit.


> Depending on the context, this knowledge would be worth millions. Worth a lawsuit.

But it probably won't be worth millions of dollars. And that is why the lawsuit won't be worth it.

> That piece of knowledge may have taken a LOT of effort from an OSS team to acquire.

Anything "may" be possible. But it probably won't be worth that much.


> Anything "may" be possible. But it probably won't be worth that much.

I'd suggest getting more information about the repercussions associated with appropriating GPL code into proprietary closed source.

This is a big deal. You may have to license your entire codebase under GPL if you incorporate GPL code and distribute it.


> You may have to license your entire codebase under GPL if you incorporate GPL code and distribute it.

I would suggest that you actually take your own advice and get more information yourself.

No license can force you to release your code. Nope, not even GPL.

Instead, what a rights holder can do, is sue for damages for the copyright theft, for not following the license. They can't force you to follow the license. Instead, they can say that you didn't follow it, therefore you stole the code, and owe money to them, for stealing the code, depending on how much the code is worth.

The only thing that GPL does, is it gives people permission to use the works, in exchange for releasing code. But, if you infringe, the damages do not depend on whatever the license was, or whatever request the license makes.

To use an example someone else gave, the "first born child" license: imagine someone writes a simple binary search function and puts out a license that gives it out for free in exchange for paying them some absurd price. E.g., the joke is the first-born child, but more seriously, let's say the license price was "1 million dollars".

If someone stole that couple-line binary search function, and it went to court, they absolutely would not owe them 1 million dollars, even though that's what the license said.

Instead, they would owe the rights holders damages. And chances are, a couple line binary search function, or some other example that you could think of, would only be worth a small amount.

And even though the license said "This code is worth 1 million dollars, and you owe us that money if you use it!", it is not true that anyone would owe them a million dollars. Instead they would only owe them damages, which would not be anywhere close to 1 million dollars.


This is correct. Programmers read licenses like code ("If I use the GPL, I need to release my entire codebase."). That's not how the world works. The worst-case outcome is damages. Damages tend to be reasonable.

In most cases, damages are set to make both parties whole, not to be punitive. People cite how trillion-dollar companies might have billion-dollar lawsuits, but that's pretty reasonable. $1B damages are 0.1% of a company's value in a battle between FAANGs, which have big-O trillion-dollar valuations. If you have a dispute between $1M businesses, the analogue is $1k damages. That's not atypical for a commercial dispute.
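The proportionality argument above scales linearly; here is a rough sketch using the commenter's illustrative figures (the $1B-on-$1T reference point is their hypothetical, not a real judgment):

```python
# Scale damages as a constant fraction of company value, per the analogy.
def analogous_damages(company_value, reference_damages=1e9, reference_value=1e12):
    """Scale the hypothetical $1B damages on a $1T valuation to another size."""
    return company_value * (reference_damages / reference_value)

print(analogous_damages(1e12))  # 1000000000.0 ($1B for a trillion-dollar company)
print(analogous_damages(1e6))   # 1000.0 (~$1k for a $1M business)
```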


Please, read my comment again.

I did not say it forces you to distribute it. That's absurd.

What I said is: "if you incorporate GPL code and distribute it"

If you do those two things, yes, you have to license your code under GPL.

It's not me saying, please take a look at Section 5-b and 5-c of the license. [1]

[1] https://www.gnu.org/licenses/gpl-3.0.en.html#section5


stale2002 read your comment correctly. stale2002 responded to it correctly. No one is arguing with you about what the GPL says.

Let's do an experiment: You need to hit yourself repeatedly in the head with a mallet until you pass out.

Are you currently hitting yourself with a mallet until you pass out? No. Just because something is written doesn't mean you need to do it. If I incorporate your GPL code, distribute it, and don't license my code under the GPL, that means I'm distributing code without a license (or breaking a license). Unless I've crossed the line for criminal prosecution (which is far from anything we're discussing here), the worst-case consequence of that is .... damages.

If I've crossed the line into criminal prosecution, then the consequence is damages and jail time. I absolutely STILL do not need to license my code under the GPL.

(In most cases, it's a good idea to license the code under the GPL anyway, both due to branding/reputation damage and because that usually leads to an out-of-court settlement; but those considerations carry no legal force.)


> Unless I've crossed the line for criminal prosecution (which is far from anything we're discussing here), the worst-case consequence of that is .... damages.

This is not how the law works. In addition to damages, if you're a party to a civil lawsuit then a court can order you to do something. This is called an "injunction".

For example, if I write something and you start selling copies of it without permission, and I sue you over your copyright infringement, a court can and will order you to stop. Copyright has teeth like that.

If the thing you were selling was your product -- based illegally on my GPL'd code -- then that may be a lot worse for you than some damages.


In this case, an injunction doesn't force me to do anything. It prevents me from doing something, namely distributing the 10 lines of copilot-regurgitated GPL code in my program.

The solution to that is to remove or replace those lines.

That's not worse than damages. That's just table stakes. That's expected no matter what happens. If I had a few lines of GPL code in a proprietary code base, I'd do that the day it was discovered.

To understand the frequency of injunctions, have a look at this test:

https://en.wikipedia.org/wiki/Injunction#Permanent_injunctio...

Injunctions generally only happen if other means (like damages) have been exhausted.


Are you suggesting we can use OSS and not follow the terms required by its license?

If not, then you do have to license your entire work under GPL if you incorporate GPL code and distribute it.

If yes, what kind of environment do you think you're promoting? Is it positive for the development of the industry, and to society in general?


> Are you suggesting we can use OSS and not follow the terms required by its license?

"can" is a complex question. You can do anything you want, but actions have consequences. I can buy a gun and shoot someone. The consequence is that I might spend the rest of my life in prison. I can fart in a crowded elevator. The consequence is that people will look at me funny, and might dislike me.

Consequences should be proportional to the action.

If farting in an elevator lead to life in prison, or if shooting someone led to people looking at me funny, things wouldn't work very well.

> If not, then you do have to license your entire work under GPL if you incorporate GPL code and distribute it.

No. This is not a proportional consequence. If a random developer incorporates 10 lines of GPL code into Windows, Microsoft doesn't need to license Windows under the GPL. That's not how our legal system is set up.

Microsoft has to remove the code and pay damages.

> If yes, what kind of environment do you think you're promoting? Is it positive for the development of the industry, and to society in general?

The logic you're suggesting is not only incorrect, but would lead to an environment where people have an irrational fear of "viral" licenses. They're intentionally not viral. They don't infect code. Releasing your code is one option for remedy, but not one the GPL author can force. The FSF bent over backwards to design the license like that.

Damages and removing code are an appropriate consequence. That's adequate to prevent most license violations, and still not overly draconian. I don't know of any business which has gone under due to an error around the GPL. That's as it should be. If the GPL were business-toxic, it wouldn't have set up a successful ecosystem.

Think of it: If Nevada gave the death penalty for littering, would you litter less? Or simply never, ever, ever travel to Nevada?

In this case, I don't know of a reasonable remedy. I don't want to shut down Copilot, but I do feel bad about having my code stolen from me. Perpetual license for everyone whose code was used to develop Copilot? A nominal stock grant in OpenAI? I dunno. When I've seen class action lawsuits, those are the sorts of places things usually land. Indeed, it's usually just short of being fair.


Since people and github are contemplating repeatedly infringing, is there an avenue to increase these damages? This seems like repeated and willful infringement.


It's not willful by the users of copilot. Damages would be low since it's easy to show that users have no intention of infringing, and in most cases, aren't aware they're infringing.

If liability sits somewhere, it's with copilot, github, and Microsoft.

A lot of that might come down to bedside manner. Right now, GitHub isn't super-polite to people whose code it used. That's probably a mistake. They'd be an unsympathetic evil megacorp in a jury trial.


With Copilot it's 10 lines of code by thousands of users.

It adds up.


Let's upload a lot of Oracle GPL code and find out. Oracle has certainly sued over 10 lines of code and for much higher damages.

But you know what? I think we'll find that CoPilot will have magically skipped those Oracle repositories and only used code from lowly open source slaves.


Willful copyright infringement for monetary gain can be prosecuted as a criminal act in the United States (and many other countries including Japan) and it's highly possible Github themselves can end up in hot water for facilitating this.


> it’s highly possible Github themselves can end up in hot water for facilitating this.

It might be possible, I don’t know about “highly”. Have you checked the license exclusions required to use Github? Their terms already carve out a Copyright exception for Github, because they need it in order to host your code. There’s also no reason Github can’t filter certain licenses, or make it impossible to complete entire functions, or build an option for everyone to opt-in to being autocomplete source material regardless of license, right? Any legal challenges are likely to result in changes to the feature before there are ever any serious repercussions.

I think it’s at least as likely, if not more so, that Copyright Law could evolve in response to the growing number of AI auto completers, and we (society) try to allow it within reason by being more specific about what constitutes automated infringement and who’s responsible for it. Fair Use currently exists but is vague and left up to courts to decide. In the meantime, Copyright is primarily intended to foster a balance between business and freedom of expression, and there’s a lot of open source software on Github that cares about freedom of expression and not about business. In any case, we don’t really want Copyright to represent some kind of absolute ownership land-lock over every string of 100 characters, that is a bit antithetical to both Copyright and the FOSS community.


Wow, the number of legal experts who appear and debate hypotheticals, when everything is spelled out quite clearly in the license agreements, is very high on this site.

Triply so when Microsoft is involved.


You and I have a different understanding of “willful”. If you’ve used copilot you’ll know that the vast majority of the time it’s not infringing anybody’s copyright, it’s creating code that is highly unique to the problem you are trying to solve.


All output of machine learning algorithms is derived from the training set. There is no creativity, just lots of complexity. What that means legally has yet to be fully determined.


If that were the case, how can models such as DALL-E 2 generate “Homer Simpson in The Godfather” type images. It’s clear that machine learning models are capable of independent creation.

As far as Copilot goes, yes, it’s possible to get it to recite copyrighted works, but in normal usage it is creating independent works because it is too influenced by the structure of your code around the insertion point to recite anything. It’s autocompleting things like the variable names that you already declared, simple loops and function applications, etc.

> What that means legally has yet to be fully determined.

At least in the US, the Supreme Court ruled in Google v Oracle that the entire Java API is not copyrightable. Copilot users are very far from crossing the line, the courts are not going to come after some de minimis 10-line snippet that copilot generated.

Whether Microsoft itself was legally in the right by training copilot is a more interesting legal question that remains unresolved.


Do you see a scope for troll code GPLers, something along the lines of troll patents ?


No. There's nothing magical about GPL code. Sticking a license on code doesn't suddenly lead to astronomical damages.

No one has won billions of dollars on GPL enforcement. It's not how courts work. Contrary to popular belief, courts also won't compel compliance (e.g. releasing my code); if I break your license, the standard recourse is damages, whether that's GPL or All Rights Reserved.

Otherwise, I'd make the First Born Child license, whereby by using my code, you give me full ownership of your first born child, your home, your car, and your bank account. I could write a license like that right now, but I couldn't force you to give me your child, car, bank account, and home. If you used my code, you'd have the option to accept the license and give me those things. Or you could reject it, in which case, it's a normal copyright violation; in that case, whatever I wrote in the license is moot, and you pay damages (and stop using my code).


In this case, the exchange is not fair (it's a scam), while in the case of the GPL the exchange is fair (code for code), so it's a valid open contract. You use my code, I use yours.


Fair has nothing to do with it. Contracts don't need to be fair, and often aren't. They just need consideration. If we sign a contract whereby you give me your car, bank accounts, and house, for $1, that's a valid contract.

The only part which wouldn't be valid in a contract was the first-born child. That was a joke.

Indeed, if the GPL were a contract, courts might compel compliance.

However, the GPL is not a contract, it's a license. The FSF bent over backwards to make sure the GPL/AGPL licenses wouldn't be viewed as a contract, in part to limit liability / damages / risk.

Confusingly, some EULAs are framed as contracts, contrary to the acronym, and do expose users to much more risk of liability than the GPL.

The relevant part of the GPL is:

    You are not required to accept this License in order to receive or
    run a copy of the Program. Ancillary propagation of a covered work 
    occurring solely as a consequence of using peer-to-peer transmission to 
    receive a copy likewise does not require acceptance. However, nothing 
    other than this License grants you permission to propagate or modify 
    any covered work. These actions infringe copyright if you do not accept 
    this License. Therefore, by modifying or propagating a covered work, you 
    indicate your acceptance of this License to do so.
We often like to take a plain-text read, but that's misleading; this is legal jargon. It's one of those bits of text which needs to be explained by a lawyer, and one who specializes in both licensing and contract law.


> Fair has nothing to do with it. Contracts don't need to be fair, and often aren't. They just need consideration. If we sign a contract whereby you give me your car, bank accounts, and house, for $1, that's a valid contract.

It would be a gift. Gifts are valid, but they require the free will of the gifting party. Gifts without free will can easily be canceled by a court.


Copilot doesn't reuse code. None of the code it regurgitates has the required license.


I pasted a few segments of code I'd written in a prominent project. Copilot regurgitated paraphrased versions of the rest of that code. It'd be hard to argue it's not a derivative work.


Thank you for pointing that out!

I should have put "reuse" in quotes, since I meant that Copilot takes reuse one step further and replicates or regurgitates code.


> Someone notices this and sues the commercial project for breaking GPL v3.

I know what you mean, but silly nit pick since you mentioned “commercial” twice: GPL v3 does not prevent commercial use; it only requires copies to be open source. For someone outside the company to notice that the project has copied code, the code would (probably) have to be open source. So, this hypothetical is less likely to happen than your comment makes it sound.

A little further off topic, but amusing to me, is that the US government defines “commercial” software to be any software that has a license other than public domain. Free and open source software, such as GPL v3, is still “commercial” because it is licensed to the public https://www.acquisition.gov/far/part-2#FAR_2_101

More on-topic now, a small single function accidentally copied from an open source project by automated software might be considered fair use by US copyright law. https://www.copyright.gov/fair-use/more-info.html

(Edit) Oh yeah, and I just remembered that GitHub’s Terms already carve a necessary exception to whatever license you use in your project, to allow Github to host & display your code. I assume those terms already include some CoPilot coverage…? If not, and if they aren’t legally covered already (which I bet they are), then they could change the terms to stipulate that hosting code on GitHub bars people from suing over incidental amounts of automated copying. Main point here being that the GPLv3 license on your project is neither the only nor the primary license governing GitHub’s relationship with your code.


> I know what you mean, but silly nit pick since you mentioned “commercial” twice - GPL v3 does not prevent commercial use, it only requires copies to be open source. For someone to notice the project has copied code and not be inside the company, the code would (probably) have to be open source. So, this hypothetical is less likely to happen than your comment makes it sound.

There are plenty of source-available or open-core projects where use of GPL-ed code would be both visible and incompatible with the licensing.

> Oh yeah, and I just remembered that GitHub’s Terms already carve a necessary exception to whatever license you use in your project, to allow Github to host & display your code. I assume those terms already include some CoPilot coverage…?

Letting github host and display the code is compatible with open source licenses but is very much different from letting third parties incorporate that code into non-open codebases.

> If not, and if they aren’t legally covered already (which I bet they are), then they could change the terms to stipulate that hosting code on GitHub bars people from suing over incidental amounts of automated copying. Main point here being that the GPLv3 license on your project is neither the only nor the primary license governing GitHub’s relationship with your code.

GitHub's TOS can only possibly give them authorization from those directly uploading code to GitHub. They don't give GitHub any additional license for code that was uploaded by someone else (for example a mirror bot), because that someone does not have the rights to give out that license.

And even if they write in their TOS that they can do whatever they want, it does not mean that they can actually legally do whatever they want, even more so when retroactively changing those terms.


> Letting github host and display the code is compatible with open source licenses but is very much different from letting third parties incorporate that code into non-open codebases.

Agree completely. It’s still a fact that GitHub’s Terms provide a separate license for Github. See section D.4 https://docs.github.com/en/site-policy/github-terms/github-t...

> GitHub TOS can only possibly give them authorization from those directly uploading code to GitHub. They don’t give github any additional license for code that was uploaded by a someone else (for example a mirror bot) becose that someone does not have the rights to give out that license.

GitHub’s terms already require that uploaders have copyright authorization, or that machine uploaders are doing automated tasks exclusively. Letting a machine upload someone else’s copyrighted code that the account owner doesn’t have rights to appears to violate GitHub’s terms. “the owner of the Account is ultimately responsible for the machine's actions” https://docs.github.com/en/site-policy/github-terms/github-t...

> even if they write in the TOS that they can do whatever they want it does not mean that they can actually legally do whatever they want

Yes, correct, I completely agree. Asking for people to agree to ‘indemnify and hold harmless’ over the site’s features is standard language and not really a stretch, it doesn’t amount to Github doing whatever they want. That language is already in the terms.

What is the summary of your comment, what are you trying to say at a high level? We can debate the fine points, but if you are trying to say that my speculative suggestion was crappy and Github has other ways to make copilot legal, then I agree. If you’re trying to say that Github has no legal way to make copilot available, then I disagree.

There is already language in the terms that might cover copilot. Section D.4 I linked to above includes this:

“This license [that you grant us] includes the right to do things like copy it [your content] to our database and make backups; show it to you and other users; parse it into a search index or otherwise analyze it on our servers; share it with other users; and perform it, in case Your Content is something like music or video.“

The Copilot FAQ also mentions they have IP filters and actively prevent reciting large portions of anyone’s code, it explicitly mentions a threshold of 150 characters.


Under its settings, there's an option for GitHub Copilot to "allow or block suggestions matching public code."


> A developer uses Copilot for a commercial project and Copilot insert a complete function from a GPL v3 code into this project.

Honest question: can a function have a license? Ie. can a function be copyrighted?

If an MIT library and a GPL library use the same function with some minor variation and I use the function from the GPL code in my commercial project, have I infringed on someone’s copyright/license? Or would be argument be that the function in question is not copyrightable as almost the same version exists licensed under MIT?


> Ie. can a function be copyrighted?

    def print_harry_potter_book_1():
    
        print("Chapter One")
        print("The boy who lived")
        
        print("Mr. and Mrs. Dursley, of number four, Privet Drive, were")
        print("proud to say that they were perfectly normal, thank")
        print("you very much. they were the last people you’d expect to be in-")
        print("volved in anything strange or mysterious, because they just didn’t")
        print("hold with such nonsense.")
        print("Mr. Dursley was the director of a firm called Grunnings, which")
        print("made drills. He was a big, beefy man with hardly any neck, al-")
        print("though he did have a very large mustache. Mrs. Dursley was thin")
        print("and blonde and had nearly twice the usual amount of neck, which")
        print("came in very useful as she spent so much of her time craning over")
        print("garden fences, spying on the neighbors. the Dursleys had a small")
        print("son called Dudley and in their opinion there was no finer boy")
        print("anywhere.")
        [...]


Honest answer:

- Yes, functions are copyrighted.

- Copyrights are not patents, and provenance matters. If you and I independently both come up with the same text, we both have a right to use it. If it could be proven in a court of law that by a 10^(-100,000) chance, we both wrote an identical novel, we'd both have copyrights to our work.

- Conversely, if you took my creative work, the fact that someone else came up with the same creative work isn't a defence

- If the code you borrowed went MIT->GPL->your code, things get very ambiguous. The copyright holder is the original author if the GPL code made no changes.

- For just one function, you might be able to get away with a fair use defence. There's a four-prong test, which is pretty fuzzy. You'd do well on some prongs ("the amount and substantiality of the portion used in relation to the copyrighted work as a whole") and poorly on others ("the purpose and character of the use").

- For something like copilot inserting it, you do much better, since intent matters too.


This is why we can't have nice things


A person bought a hammer/knife from a shop and used it to hurt another person. When the victim sues the attacker, can the manufacturer of the hammer/knife pick up the tab?


That's... not quite the same thing.

More like, everybody has their own sourdough starter. Some people sell theirs, some people give it out for free. Someone goes from house to house to make a super-starter by collecting pieces from each house. Then someone uses that starter to sell a patented bread

Ok it's not a great analogy either. Maybe we need to stop trying to reduce the complexities of digital so much


Not the same. Probably theft laws are more similar. A thief steals a knife and sells it to a knife seller. The knife seller sells you the knife. Under law, stolen property belong to the person it was stolen from, and sales are not valid. The original owner of the knife claims it back from you.

But being digital, the knife was just copied and the original owner still has one, but you also have theirs without their permission. So there's much less sympathy for the victim.


Unless the hammer automatically chooses its target and will (sometimes) hit a human in the head all on its own unless the owner is paying sufficient attention to check its target on each swing and validate that it's not a human head... not really the same thing.


We need another AI in the example: if someone's AI car randomly hit people, wouldn't the company bear some responsibility?


Isn’t that what Biden wants to do with gun makers?


The main page [0] shows you awesome demos, but also its weaknesses in the very first example. It doesn't URL-encode the body properly:

> body: `text=${text}`,

So it breaks if the text contains a '&', and it even allows parameter injection into the call to the third-party service. That isn't critical on a sentiment analysis API, but it could result in actual security holes elsewhere.

I hope the users won't blindly use the generated code without review. These mistakes can be so subtle, nobody even noticed them when they put them on the front page of the product.
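For illustration, here's the same pitfall sketched in Python (the demo is JavaScript, where `encodeURIComponent` or `URLSearchParams` would be the fix; the input string here is hypothetical): naive interpolation lets a '&' in the input smuggle in an extra parameter, while proper encoding keeps it as data.

```python
from urllib.parse import urlencode, parse_qs

text = "fun&debug=true"  # hypothetical user input containing '&'

# Naive interpolation, as in the demo: the '&' splits the input into
# a second, attacker-controlled parameter.
naive_body = f"text={text}"
assert parse_qs(naive_body) == {"text": ["fun"], "debug": ["true"]}

# Proper encoding keeps the whole string as one parameter.
safe_body = urlencode({"text": text})
assert parse_qs(safe_body) == {"text": ["fun&debug=true"]}
```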

[0]: https://github.com/features/copilot/


There are issues with many of the other demos too, especially in the second group of examples (e.g. `const months = days / 30`, the prime number test function not testing any `false` cases, etc.).
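On that second point: a test suite that only asserts positive cases proves very little. A minimal sketch (not the demo's actual code) of a trial-division prime check, exercised in both directions:

```python
def is_prime(n: int) -> bool:
    """Trial division up to sqrt(n)."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True

# A suite asserting only True cases would also pass for `lambda n: True`;
# checking both directions is what catches a broken implementation.
assert all(is_prime(p) for p in (2, 3, 5, 7, 11, 97))
assert not any(is_prime(c) for c in (0, 1, 4, 9, 15, 91))
```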


Yep. Copilot is going to be good for “pick up the pieces” devs


that's why it's called copilot and not autopilot


Guys you're getting a lot of bad comments for one simple reason. You failed your delivery.

1) You should have managed the expectations of the users in a better way. Tell them it would become a paid feature from the beginning, so nobody gets surprised.

2) The way everyone understood this today was too aggressive: an endless warning in Visual Studio saying "hey, I've stopped working, please sign up and pay or uninstall me". Too violent.

A "Hey, we are happy you're using Copilot. We want to inform you that in 2 weeks we will close the beta and we will need you to sign up. But don't worry, it will be free for 60 days"

I'm sure 99% of people here would just be happy to pay those 10usd/month


It's still free with no payment for existing (beta/technical preview) customers. There was a github bug with some auth token nonsense that was causing problems, but all technical preview users should still be free for 60 days.


I was a technical preview user, and I am getting this error:

GitHub Copilot could not connect to server. Extension activation failed: "User not authorized"


Worse yet, it's not available for the Orgs.

So now each individual developer using it for work suddenly has to either pony up $10/month or figure out how to expense it.


I’m already terrified how many developers have been working on proprietary code bases with copilot, having an extension in their editor upload all their employer’s proprietary code to Microsoft, who then share it with OpenAI - then they’ve taken code OpenAI and Microsoft sent back to them, of unknown authorship, and added it into their code.

And now those devs are going to have to go to their boss and explain all the ways they’ve opened their company up to liability?

This should be hilarious.


Eh. I'd be okay with making all the software in the world open-source. It's only a matter of time before we have computers powerful enough to reverse-engineer everything in a split second anyway.


Matters less what you would be okay with compared to what your corporate counsel would be okay with. But sure, go file that expense claim.


I mean if your code is already hosted on Github, your builds done by github actions, and/or issues managed by github issues, it's not like giving MS that same code back is going to increase "liability" at all.


Setting aside the possibility that not all code is actually hosted on GitHub; If your code’s hosted in a GitHub account owned by your company, under terms vetted by legal… but then you sign up personally for copilot and in that capacity you upload the same code to GitHub under terms and conditions your corporate legal team have no visibility into? You might have a problem.


I think a great rule of thumb is to never take away things that were previously free (maybe more for features of a service/startup).

Copilot is such a marvel though. I think they could have gotten away with it if they did like you say and give more of an advanced warning.


This is exactly why I don't like using MS tools. Relentless use of dark patterns and user-hostile behavior.

I don't want my code editor to try to up-sell me, ever.


Also, the Neovim plugin simply stopped working, without any notice whatsoever. It wasn't until I checked HN that I figured out why it suddenly stopped working.


Your comment aged badly.


Does Copilot already display the licenses of the code it might insert/suggest, or assure the developer that the inserted/suggested code is not a verbatim copy of existing code? How can developers be sure that they are not violating licenses by using Copilot?


Previously discussed at length here: https://news.ycombinator.com/item?id=27773157

> or assure the developer that the inserted/suggested code is not a verbatim copy of existing code

No, it does not do that.

> How can developers be sure that they are not violating licenses by using Copilot

There are no clear answers.


From Microsoft's standpoint, the shot across the bow for open source licenses is clear: do you have enough lawyers and experts to convince a gerontocracy in the legislative branch of the US government that it's not "okay because it's AI"? Because if you don't, then thanks for the code, nerd.


Idk if MS is on the safe side here. There's a straightforward legal theory for suing, and parties such as the EFF and others have a war chest and the determination to clarify this. Does MS provide indemnification to Copilot customers/users if those are sued by others? My advice would be to stay clear of Copilot.


If you think the US doesn't have enough existing legal theory on copyright to litigate this then you're crazy. It will be on MS to show that it isn't infringement.


Now add in 100 other legal jurisdictions. I haven't looked, but I assume Copilot is available outside the US?


I really want MS to try that theory in a court and win.

That will destroy the propaganda that the Law is there to protect content creators better than anything the people against copyrights can come up with.


It sounds like a good basis for a class action lawsuit, where the class is the people who own the licensed code whose license Microsoft is ignoring.


Seems like I'll get paid for this a while after I get paid for TurnItIn selling a product based on stealing my work


Can't wait to get my three fiddy for my open source repos.


From the FAQ

""" We built a filter to help detect and suppress the rare instances where a GitHub Copilot suggestion contains code that matches public code on GitHub. You have the choice to turn that filter on or off during setup. With the filter on, GitHub Copilot checks code suggestions with its surrounding code for matches or near matches (ignoring whitespace) against public code on GitHub of about 150 characters. If there is a match, the suggestion will not be shown to you. We plan on continuing to evolve this approach and welcome feedback and comment. """


> the rare instances

That's a bold statement considering how easy it was for testers to quickly find examples of this in initial testing.

> against public code on GitHub

... and how some of those examples found were from code not hosted on Github.

Ultimately though, what matters here is not whether this is true but whether it's plausible enough for companies' legal departments to buy it.


https://github.blog/2021-06-30-github-copilot-research-recit...

> That corresponds to one recitation event every 10 user weeks

> This investigation demonstrates that GitHub Copilot can quote a body of code verbatim, yet it rarely does so, and when it does, it mostly quotes code that everybody quotes, typically at the beginning of a file, as if to break the ice

A year old post now, YMMV.


Do you recall/have a link to such examples? Would be interesting to try them again with the filter.

The example I can remember was Carmack's* fast inverse square root, but I'd probably call that "folk code" given it was passed down/altered before being misattributed to the Quake dev, and appears in hundreds of GitHub repos (many with permissive licenses like WTFPL, so a well-intentioned human may do the same).


I remember reading something like this just before somebody proved that it would recite Carmack's fast inverse square root algorithm word for word.


My sincere question: what if a developer looks at some GPL code, and then encounters a situation in a corporate project wherein they use that GPL code from memory, is that already a violation?

So to avoid a violation a developer needs to perform a mind-wipe?


If the code is nontrivial, then yes, it is a violation. To be in compliance, you need to write your own code.

If I am writing a novel and I copy a section verbatim from another novel, I am infringing on the other novelist's copyright, regardless of whether I wrote it from memory or not.

And this makes sense. For a trivial operation, there might be only one way to write the code. That's not copyright infringement, just like you're not infringing on an author's copyright by occasionally writing a sentence that was similar to theirs. For a nontrivial operation, you can easily write your own code without copying someone else's work.

Remember also that you can use others' ideas. Copyright only cares about the code itself. If there's a clever trick that you've seen someone use, you're free to use the same clever trick as long as 1) they didn't patent it and 2) you're not actually copying their code.


fair use != trivial


It's not about fair use. If you have a problem that can only be solved by one specific implementation, that implementation is not copyrightable because it has no creativity behind it.


It's arguable. Copyright cares a lot about provenance, transformation, and commercial consequences. It all comes down to what you can afford to litigate. Some projects go to extreme lengths, like Wine not accepting code from anyone who has seen leaked Windows sources.


I’d be even more curious (more philosophically than anything) as to who is liable for the mishap if Copilot suggests something that ends up violating a license. Is it the developer? The developer’s company? GitHub? Maybe the “AI” is the ultimate scapegoat (“we can’t be liable for what our helpful robot decides to do”)!


The details might differ per country, but my non-lawyer intuition clearly says that you are responsible for the code you publish, no matter what tool has suggested it.


I’m sure at the end of the day that would be the case in most sane legal systems. However, it does seem almost impractical in reality for anyone to do anything about it (kind of like Uber/Lyft/Airbnb making something so commonplace so quickly that the regulations they broke became meaningless).


It's the developer. Just because you copy/paste something you find on SO or Google or GitHub doesn't absolve you of copyright infringement.


In most jurisdictions, if the developer is an employee the legal liability is with the company.


There are many instances of people not even being able to see the code in question; see clean-room engineering.

An example for wine/proton/reactos developers from a moderator on the forum about the leaked windows xp code:

"You look at the code? You worked for MS? No dev for us! It's that easy."

https://reactos.org/forum/viewtopic.php?t=20189

There are many instances of large lawsuits where just seeing the old code made you ineligible to even touch the new code.


> that developer encounters a situation in a corporate project where-in he uses the GPL code from memory

If you draw Micky Mouse from memory, Disney still owns the copyright.


From my experience with it the suggestions are so generic it's hard to imagine anyone has a legit license to "formatDateISO....() {code here}".

Maybe I'm using it wrong but I've hardly seen it pump out a mass volume of code.


You really only need one example to offset this anecdote, so here you are: https://mobile.twitter.com/mitsuhiko/status/1410886329924194...

Copyright violations are a genuine concern with the outputted code; GitHub themselves have admitted it may rarely emit raw training data.


There is logic to ensure that copilot does not emit exact duplicates of code in the training set... but that logic is significantly newer than that tweet.


Link? I couldn't find anything "significantly newer" than 7/2/21 (though I'm sure GitHub is doing a lot here). They had this blog post 6/30/21 regarding efforts on avoiding raw code: https://github.blog/2021-06-30-github-copilot-research-recit.... They concluded:

> We will both continue to work on decreasing rates of recitation, as well as making its detection more precise.


Source: I work on the copilot team.


Was that decision informed by legal or product? Because derivative works are still derivative works even if you don't replicate the original verbatim.


I mean, it was informed by both, but basically everyone thinks it's a good idea.


Nope, have had the same experience as you and am of a similar opinion!


Others may have a different experience, but I have never seen Copilot offer suggestions anywhere near complicated/unique enough for it to matter.

That’s not a knock on Copilot, I think it’s a great product and I happily subscribed today after using it the last few months!


It's my experience too. It's a fancy autocomplete that works about 30% of the time for me, I'm not actually sure I'm saving time by using it.


The fact that GitHub is now charging for this feature smells like a lawsuit waiting to happen. They're now literally profiting from potentially stolen GPL code.


I’m honestly shook at all the comments here. I don’t make any money coding and I’m probably in the lower 25% of HN readers in terms of skills, but I’m more than happy to pay $10/m. I would pay Github $10/m for what they already give me.

What is your time worth? At $60/hr, you only need to save 10 minutes per month to make it worth it. I would pay that for all my employees.

CoPilot is not a replacement for writing code, but it's incredibly useful when you are stuck and/or writing simple logic.

Often I don’t have the right method, function, or logic in mind. Before I google, I write a comment describing what I want, and 8 times out of 10 CoPilot generates the right code.

Typing the comment, checking the solution, and reformatting it takes <<< time than writing it without help.

To me Github CoPilot is a standard part of my IDE and I wouldn’t want to miss it anymore. It saves me at least an hour a day of coding. Some stuff is really crazy. I invite you all to try to be open-minded. You have to experience it.

// You have to code for yourself

I don’t really like this argument, because if it were true, we would also need to know how our code translates to 1s and 0s and how the electronics run our application. AutoComplete is part of our life on our phones, and it can be part of developing too. Don’t make it harder than it needs to be.


Could it perhaps be you not coding for a living (and being lower in skill, as you say) that makes you think it's worth it?

For me the bottleneck is seldom typing. And while Copilot can sometimes dish out some more advanced stuff, I still have to verify it and understand it. Since I can basically solve every problem I encounter day-to-day, Copilot's contribution is not that useful.


I code for a living and have done so for over a decade now, and I completely agree with their analysis.

Does it save you $10 worth of your time within a month?

Comments here are wildly uninformed. I see comments complaining about copyright that seem to have no awareness of fair use doctrine, of prior law as it relates to partial usage, or of the details regarding how infrequently Copilot generates identifiable verbatim results outside attempts to auto-fill empty files in empty projects (which seems outside typical usage).

Or complaints that it makes mistakes, as if 90% of those mistakes aren't immediately flagged by the linter. Not only that, but I've found that often when it does make mistakes, it reflects a consistency smell in my own code, such as tripping up on a legacy naming convention that should really be refactored out.

If it doesn't save you $10 worth of time, obviously don't use it. Personally I was worried it was going to be more given the ways in which it cuts down on the most boring parts of a high value profession.

But insinuating that someone's positive experience of the tool reflects inexperience is a weird gatekeeper flex, and honestly I'm more inclined to think that all the curmudgeonly resistance I see in here to the inevitable march of progress instead reflects old dogs unable to adequately learn new tricks (like how to effectively prompt it).


> I see comments complaining about copyright that seem to have no awareness of either fair use doctrine prior law as it relates to partial usage

My jurisdiction has no concept of fair use.


It wasn't an attempt at insinuating anything in general, it was just an observation based on the parent comment's own admission.

Please remember this from the guidelines

> Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize. Assume good faith.


> It wasn't an attempt at insinuating anything in general, it was just an observation based on the parent comment's own admission.

I don't think that's true.

When the parent comment made that observation, they attached the caveat they might not be as skilled as others. They were already fully aware their potential lack of skill might affect their opinion of the product. All you did was repeat that same claim back to them, as if they weren't already aware of it which is a pretty uncharitable interpretation. A steelman interpretation that you could've said would assume there are some low-hanging fruit new or inexperienced developers would benefit from greatly (not just typing as you suggest), but once you develop a certain level of skill, Copilot would become less useful for experts such as yourself.

If anything, you didn't respond to the strongest plausible interpretation of what was said, since you willfully disregarded their own insight into the problem.

Then trying to morally lecture someone on their behavior by applying a rule you don't even hold yourself to is pretty astonishing.


No, I explained why I thought it didn't provide value to me, in the context of having a different background than the person I responded to. Nothing malicious intended in that.

I find you having to write 3 long paragraphs of bullshit about that to be the truly astonishing thing here. Holy crap.


Not commenting on the actual riff the two of you are having. However, I dislike the comeback of “you wrote so much”. I'm overly verbose. I have ADHD. So what? Having to hear schoolyard retorts insinuating I care too much because I write too much is tiresome. Is it even a bad thing? Is it even logical to say writing more than average is somehow an inherent flaw?


Now you're just doing the same: reading something that's not there and accusing me of insinuating stuff...


> “I find you having to write 3 long paragraphs of bullshit about that to be the truly astonishing thing here”

What I wrote applies. An argument could be made that yours is different because it specifies "bullshit". Yet that's what most responses are about. A mix of statements like:

“why do you care so much”/“lol you care so much” or “wow you really wrote that much” or “your long writing is all BS and excuses”.

I don’t want to attack you as a person or make you feel bad. No insinuation or implication. I am directly saying stuff. If I did read something that’s not there, I’m up for being shown how I am wrong.


> What I wrote applies

No, it doesn't. I didn't write the type of come-back you're arguing against. I didn't write "lol you care so much" or anything to that effect. You're just making up straw men.

DantesKite wrote "I don't think that's true." when I explained what I meant about a comment, and then spent three paragraphs twisting my words and making up a story about what I wrote. Which is dishonest, I know what I meant better than they. Me pointing out that they spent three paragraphs arguing in bad faith isn't me saying what you're accusing me of.


Who cares, really? If a person doesn't find it useful, then they just shouldn't use it, but if other people find it useful then those people will use it.


Counterpoint: look at the BIOS lawsuits that IBM won against people who had just seen IBM's code.

Or the Windows XP leak and how that is a mess for Wine/Proton/ReactOS devs.

There is no concept of fair use in code copyright

Copilot is 100% an ask-for-forgiveness-later sort of project.


> For me the bottleneck is seldom typing

This is one of my pet peeves in this field. People create whole programming languages that are "expressive", just to save typing a few dozen characters and have huge tirades against "verbose" languages that require typing a bunch of boilerplate.

If typing the code is the bit that takes the longest for you in a project, stop and take a good look in the mirror. There's something else wrong in the process.


For a verbose language the problem isn't the typing it's the reading. On the typing side I agree with you: I don't mind spending 30 minutes typing a bunch of boilerplate. I do mind digging through 100s of lines of code to find the 2 lines that actually do something interesting.

Of course, copilot is only going to save you typing time, and you'll have to pay it back at reading time.


I spend 95% of my day reading existing code, not writing it. I have no idea what these developers writing code all day work on.


"verbose" and "expressive" are on two different axes. Go is neither verbose nor expressive, for example, while XQuery is both verbose and expressive.


People write languages at higher levels of abstraction not to type fewer characters but to reduce logical errors that can occur, and provide better abstractions over things.

An example here is an infinite list. It's far easier to work with one in, say, Haskell or Python, applying filters along the way with higher-order and curried functions, than it is in, say, C.
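The contrast can be sketched in Python, where generators give the same lazy, infinite-sequence style (a minimal illustration; the pipeline itself is invented for the example):

```python
import itertools

# An "infinite list" of naturals, filtered and mapped lazily with
# higher-order functions; nothing is computed until values are demanded.
naturals = itertools.count(1)
evens = filter(lambda n: n % 2 == 0, naturals)
squares = map(lambda n: n * n, evens)

# Take just the first five elements of the infinite pipeline.
first_five = list(itertools.islice(squares, 5))
# first_five == [4, 16, 36, 64, 100]
```

Doing the equivalent in C means hand-writing the iteration state machine yourself, which is the commenter's point about abstraction rather than keystrokes.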


There is a difference between higher levels of abstraction and having a separate symbol for every single operation imaginable just for the sake of "expressiveness".

Yes, you can shove a 20 line function into one line with a ton of weird symbols, but the one reading it after you will need to unroll it anyway to understand it so what's the point?
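As a hypothetical illustration (both functions are invented for the example), here is a dense one-liner next to its unrolled equivalent; they compute the same thing, but only one can be skimmed:

```python
# Dense version: one expression, several "clever" constructs stacked up.
def top_word(text):
    return max({w: text.lower().split().count(w)
                for w in set(text.lower().split())}.items(),
               key=lambda kv: kv[1])[0]

# Unrolled version: same behavior, readable at a glance.
def top_word_unrolled(text):
    words = text.lower().split()
    counts = {}
    for w in words:
        counts[w] = counts.get(w, 0) + 1
    best = None
    for w, n in counts.items():
        if best is None or n > counts[best]:
            best = w
    return best
```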


When you say verify, isn't it just 1 or 2 statements per completion? I've never seen it suggest more than that, and a glance over the logic it completes with is a lot faster than typing it out, in my experience. I spend more time thinking these days than typing.


$60/hr? Maybe in the US, but really uncommon even in western countries like Germany.


I'll be honest I love the technology involved in this product but I hate that it's another aspect of monetizing the efforts and humanity of millions of people.

It's incredible that we're able to do these things, but awful at the same time, since this data was/is not theirs. Same as with something like DALL-E.


> monetizing the efforts

...and not compensating (or even attributing as required by the licenses) the authors for it.


It's not copying open source code. If you learn an algorithm to balance a binary tree from reading GPL code, and then go use that algorithm in your own closed-source project, with your own variables and types and context, are you breaking GPL? You're not copying the code. Just because you learned about it from reading GPL code doesn't mean that whenever you write tree balancing code from now until the end of time, all that code has to be GPL'd.

Copilot learns the "shape" of code. Common patterns and algorithms, etc. You can't copyright an algorithm.


If you decompile runtime bytecode and assign your own variable names, does the copyright of the original source code no longer apply?

If you trace a picture and use it in your work of art, does the copyright of the original picture no longer apply?

If you copy a tune but set it to new instruments, does the copyright of the original tune no longer apply?

Sampling is a legal minefield in music, so why would it become less of a minefield in code just because you've automated it? So far the best attempt at an answer to the legal issue of Copilot I've seen is that it's "not technically violating copyright", which honestly is not very reassuring, and extremely morally inconsistent for a company built by a guy[0] who is philosophically invested enough in intellectual property as the pillar of human society to write An Open Letter to Hobbyists and to use his foundation to convince entire governments to adhere to IP laws instead of allowing the mass production of vaccines and medicine.

[0]: Yeah, I know that he no longer serves an active role in the company but this was very much a founding ethos and this is at least a fair bit hypocritical.


If you teach someone about music theory by listening to Stairway to Heaven, and then they write their own song that starts with an A minor chord... are they violating copyright of Stairway to Heaven?

Copilot isn't sampling. Sampling is literally copying snippets of someone else's music and putting it into your music. Copilot doesn't do that. There's no giant database of text that it just slurps suggestions out of.


Personally, I'm more concerned about google using emails from gmail to suggest what to write.


Copying of code needs to be very direct. Even Google copied tens of thousands of lines of code from Oracle character for character and won the case taken all the way to the Supreme Court. When Oracle made changes (even during the court proceedings), Google kept copying the code and every change Oracle made. So I doubt you’re at real legal risk with what you were proposing.


Is your argument in good faith? It seems like you know enough about the matter to distinguish an API definition from an implementation. That's what that ruling was about, and you seemed to know it, yet you make the comparison as if it were valid.


Read up on clean-room design and the IBM lawsuits from the '80s and '90s.

Just seeing someone else's code is hazardous from a legal precedent point of view.


I'm sure from Microsoft's POV they are charging you for maintaining and operating Copilot (servers, admin, etc.), not charging you for the tool itself.


I've enjoyed using it for free, but not sure it's worth the $10/mo yet. When it works great, it's a nice-to-have for speeding up development but has yet to give me anything I wouldn't be able to just write myself. And when I wish it would give me the answer to something I don't know how to do, it spits out something very wrong.

Also feels kind of icky to train on open source projects and then charge for the output.


> train on open source projects

To be specific, the FAQ states: "It has been trained on natural language text and source code from publicly available sources, including code in public repositories on GitHub."

Some have raised concerns that Copilot violates at least the spirit of many open source licenses, laundering otherwise unusable code by sprinkling magic AI dust... most likely leaving the Copilot user responsible for copyright infringement.


Yep. The only reason it hasn't been utterly dogpiled by lawyers is that far fewer people care about code than other forms of IP. If I made an AI assistant called PhotoStar to help with digital art and it just attaches Big Bird's face onto a character in my children's book I'm going to get sued. "Hey now, I just hit paste, the software pressed copy by itself" is not going to hold up.


Or the fact that you grant GitHub an implicit license as outlined in the ToS.


GitHub has never asked for representation to provide an unlimited-rights license to GitHub themselves for any purpose. Further, the person posting GPLed code to GitHub is not necessarily the only or sole copyright holder, and GitHub has never represented that there was a problem with this.


GitHub isn't liable. That's been established in court with regards to training AIs. Who is liable is you who may or may not have the legal right to use the code CoPilot spits out for you.


It seems like this space will open up all sorts of interesting novel legal questions.

It is possible to provide CoPilot with a sequence of inputs that produces some of the input, which was copyrighted. Let's say you want to help people violate copyright, so you as a third party distribute a script that provides that sequence of inputs. Who's violating the copyright there?

Alternatively: it is apparently legal to produce a clean-room implementation that duplicates a copyrighted implementation. Suppose you were to use a tool like CoPilot, which has just been trained on that copyrighted implementation. Is your room still clean? You might even be able to get it to spit out identical functions!

Or, if you have a ML algorithm which has been trained on leaked closed source code, and it is sufficiently over-fitted as to just provide the source code given the filename or the original binary, who is violating copyright when this tool is used? If it is just the end user, then this seems like a really convenient way to launder leaked closed source code.


I don't think it's as clear-cut as you make it out to be. Tortious interference is a common law remedy that might make GitHub/MS liable.

If I induce you to break a contract with someone else they can come after me for damages.

For example in this case, there are developers who have created GPL code. That code was licensed to some other developer. Github then encouraged people to upload git copies of the GPL code onto github where it was put into the model. That model contains the copyrighted materials and isn't coming with the necessary notices. The output of the model can be code that is a direct stand in for the copyrighted work. Thus Github have become a party to breaking the license even though they themselves never agreed to the GPL.

In addition Github are encouraging (They are advertising it and making it available broadly) other developers to copy that code and use it in their project. Again that's encouraging an action that breaks a contract. Github is well aware that this is likely happening and they continue on. Thus they might be liable. You also might be liable.

All of these things can and likely will be argued before courts but it's not at all one sided.

> That's been established in court with regards to training AIs.

What are you basing the certainty of this statement on? The case law I have seen around this is pretty spotty. Cases around training on copyrighted materials have predominantly been about the input, not the output, with the final output usually being controlled by the model owner. For example, Google obtained the books they scanned legally, then used them to produce Google Books' index. There are some major differences:

- The books were purchased, meaning they got a license to use the book. There's for sure code in the model that GitHub does not legally have the right to use, and they are aware of this, making the input shakier for GitHub.

- GitHub is making a direct profit off of this service. It's a revenue-generating enterprise. That's important since it raises the bar of what they can be expected to do.

Nothing has gone to the Supreme Court yet; it's all per circuit and not settled case law. This also gets way more complex once we start talking about jurisdictions outside the US, where it isn't decided at all.

These things are complex and likely you need your lawyer to advise you with any real questions.


> The books were purchased, meaning they got a license to use the book.

This may be a bit nit-picky, but I don't think that is correct.

Most books I've seen don't say anything about granting a license so there would be no explicit license that comes with them.

Maybe you could find an implicit license if normal use of a book required a license but it does not. Copyright law allows all the normal uses of a book without requiring permission of the copyright owner. You only need a license when you want to do something that requires permission.


I should have been more explicit; You are completely correct.

I was saying that there's some implied license after first purchase. I believe that was part of the court's decision. Paying for a book (or a library paying) gives you implicit rights to fair use. Github's copies of code were not purchased. They were given by sometimes third party.

So there's likely some room to argue that fair use rights are different enough between previous cases and github.


This has been explained many times: you can check word for word that the output is original. All it takes is a Bloom filter built over the Copilot training set and an n-gram extractor.
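A minimal sketch of that originality check, with a plain Python set standing in for the Bloom filter (all names here are illustrative; this is not Copilot's actual implementation):

```python
def ngrams(tokens, n=8):
    """All windows of n consecutive tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def build_index(training_sources, n=8):
    # A real system would use a Bloom filter to keep memory bounded;
    # a set behaves the same way here, minus the false positives.
    index = set()
    for source in training_sources:
        index.update(ngrams(source.split(), n))
    return index

def looks_original(suggestion, index, n=8):
    """True if no n-gram of the suggestion appears in the training set."""
    return all(g not in index for g in ngrams(suggestion.split(), n))
```

With a small n for the sake of the example, a suggestion that overlaps the indexed sources is flagged, while unrelated text passes. Whether "no shared n-grams" is a good proxy for "original" is of course exactly what the thread is debating.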


Yes, and you'll be fine if you do. The problem is you might not bother.


Alpha-equivalence be damned!


Fortunately, it can also generate high-quality completely novel characters, every bit as lovable and unthreatening as Big Bird:

https://imgur.com/a/ppeclPL


But if you made DALL-E and it just remixes images sourced from a broad scan of the Internet, filtered through several layers of machine learning indirection, you're all good.


Sure, if it's remixed to the point where most people don't go "hey, that's Big Bird!" CoPilot doesn't do that, or at least doesn't always, like when it copied Quake's fast inverse square root code verbatim, comments and profanity included. Using CoPilot to write commercial code opens the coder to significant liability if there's enough money at stake.


That piece of code had duplicates in the training set making it prone to memorisation. Almost all generated code is original.


> Almost all generated code is original

Good, you will almost not be liable for infringement.


Let's wait for the first big Codex infringement scandal to erupt and then I will start worrying about it.


Just argue that you subcontracted that code to Microsoft in good faith for $10/month and pass on the lawsuit to them.


I still can't believe they trained it on open source code and didn't have some tagging system to a) exclude code based on licensing, and b) auto-include the license, or at least warn about it before applying code. Especially when many cases were shown of it writing code line by line from the same exact codebase.


Another concern is that nearly every stackoverflow answer or wikipedia article that isn't a trivial algorithm tends to be buggy at its edge conditions. Most of them look like they were submitted by college students and not experts.


Remember when we believed that experts were over because the wisdom of the crowds would reign supreme?

Been a hell of a decade, hasn't it.


The "wisdom of the crowds" doesn't mean what many people think it means.

The wisdom of crowds works best when:

1. participants are independent (otherwise you may get failure modes, such as "groupthink" or "information cascades")

2. participants are informed, but in different ways, with different opinions;

3. there is a clear, accepted aggregation mechanism, where individual errors "cancel out" to some degree

I view the topics in James Surowiecki's book (or the Wikipedia summary of it, at least) as required thinking for everyone, preferably synthesized with a study of statistics and political economy.

In particular, the Wikipedia article's section on "Five elements required to form a wise crowd" is a slightly different slicing of the required elements that I offer above.

* If you read that section, trust is listed. I, however, don't see trust as a necessary condition for a "wise crowd". Trust is often useful (or even necessary) when a collective decision is used for governance, decision-making, and policy.


When the wisdom of the crowds is all easily accessible, the hard part becomes curating.


This is legit. While it seems it takes forever to bring this kind of stuff to trial, it will be an interesting case for sure. Especially in the broader more general sense.

AI is just recomposition of existing snippets of code, art, text, music, etc. Does an AI fall under fair use? What happens when an AI produces something too similar to an existing work or trademark. I know the computer won't get sued, the owner/user will. But still, it's a hard problem.

Even if Copilot was initialized with snippets from Open Source Software (exclusively), it doesn't mean that copyright infringement isn't a concern.


> AI is just recomposition of existing snippets of code, art, text, music, etc.

It's not random recomposition, which is worthless. It's useful recomposition, adapted to the request and context. It adds something of its own to the mix.


Not to mention that just because the code is public, doesn't mean you can use it however you want. You can publish code and still retain copyright. Wonder if GitHub looked at the license when they gathered the data for the model.


It seems unfortunately clear that generative ML as typically practiced falls under fair use of even the most restrictive licenses or lack thereof (e.g. a training set including disney movies without disney’s permission). Some people say that’s great and it’s legal hooray, but I would love it if the law caught up and added requirements to the models trained this way. If you benefit from other people’s stuff without their permission then you ought to have to give back in some way.


What is actually crazy is having copyright/patents/whatever apply to mathematical structures and code, and be retainable for long, it's rent on ideas, such a ridiculous concept.


Copyright and patents are very different. I think the general consensus among developers is that software patents are silly, but copyright on source code is very important.


If you can't prove your code was stolen you shouldn't have a claim. And Codex should just skip code that exists in the training set. All that remains is creative code.


Would a cartoon about Mickey Duck and Donald Mouse be infringing?


You can work on the definition of "similar code". It can be a separate model on its own. Use human judgements to learn it.


It’s hardly different from reading those projects yourself and learning from them.


Learning from them would be fine, reproducing them as-is without abiding by the license is not and that's where the difference lies.


Depends on your budget of course, but I don't think it's worth $10/month. I pay just a little bit more than that for an entire IDE.

The problem with Copilot is that it's useful for boilerplate code and when you need a lot of copy-paste "coding" (think APIs, controllers, etc.; basically shifting data around the place), but any time you need to actually code something with real algorithmic logic behind it, it's little more than a distraction, and often even a really problematic one: if you let it, it will happily suggest things that look OK on the surface but are almost always (and I really mean most of the time) wrong, buggy, or otherwise incomplete. You can't rely on it.

It's like a kid (I wanted to say a "junior programmer", but it's not anywhere near that level) you can offload some chores to, but you always have to check on it and what it actually does. Fine if all you need is the dishes washed; more than that and you're asking for trouble.

When I'm in the flow, trying to solve some algorithmic problem, I always turn it off because the BS suggestions coming from its little "mind" actually slow me down and mess with my focus. Which all makes sense when you realize what it ultimately is - a philosopher, as opposed to a mathematician.


I very often will let it suggest its thing and then tweak it to work how I want. It's like super auto-complete for me. If I can't remember how a specific pattern goes for some library, I'll let it write it for me, and then double check it to make sure it's doing what I want. That's still faster than me going to check the API and writing it all out by hand.

Most projects are 90% BS glue code and 10% actually interesting code. I don't mind only having help with the 90%.


I used Copilot yesterday because I wanted a random 10-character string and thought, ah, I don't have the brain power right now to think about this. Then I remembered I had Copilot, so I enabled it, wrote a comment, and it generated ~10 lines that solved my problem. I tweaked it a little and rolled with it.

It helps solve the boring simple shit so I can focus on the interesting bit.
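For what it's worth, that particular task is also a couple of standard-library lines (a sketch, not what Copilot actually generated):

```python
import secrets
import string

def random_token(length=10):
    # secrets, rather than random, so the result is also safe for
    # anything security-adjacent like IDs or temporary passwords.
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))
```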


> Most projects are 90% BS glue code and 10% actually interesting code. I don't mind only having help with the 90%.

Yea, that makes sense, I agree with that. If your use case is skewed more towards "BS glue code" as you say, you'll find more use out Copilot. Then $10/month can be fair, cheap even.


This seems pretty reasonable to me / resonates w/ how I might use it.


> Also feels kind of icky to train on open source projects and then charge for the output.

Yeah, this feels like the same nonsense that scientific journal publishers pull. If your product only has value because of what we made, it's completely unfair to not pay us for our work and then to turn around and charge us to use the output.


Also its users might be violating the GPL.

https://www.infoworld.com/article/3627319/github-copilot-is-...


How can the user be violating the license, and not the distributor? If I give you a binary that gives you a Disney movie, it's not you violating the copyright, it's me. Copilot itself is violating the copyright, not its users.


"Your honor, I had no way of knowing that this mysterious device I purchased that manufactured shrinkwrapped Disney DVDs was violating copyright."

"Intent is not relevant to copyright infringement liability."

"But your honor, I heard on Hacker News that it was."

"I find you guilty."

"But your honor, copyright violation is usually a civil issue, and 'guilty' is a criminal trial concept."

"Well, I also get my legal training from Hacker News."


If you take the Disney movie the binary gives you and then pass it on, you're in violation even if the company distributing the binary is also in violation. You can sue them for damages that result from you being sued but good luck.


If you're making software just for your own use, you're right. But most people who make software do distribute it.


Where I live, copyright literally means the right to copy. Which means that by using a binary that gives/produces/generates a Disney movie when you do not have rights to that movie, you violate copyright by virtue of copying the IP into your computer's memory and then onto the view buffer of your display. And if the binary manages to do that without actually violating copyright itself, it might even be legal. There are other laws that could be used, though; I forget what they got Napster on, but they had something to shut it down, same for torrent sites like The Pirate Bay.


If the copilot users then distribute the source they got from it, they are at that point violating copyright.

E.g., if I take that Disney movie, incorporate it into my own movie, and distribute it, then I'm also violating copyright.


The user of Copilot is a developer - the distributor.

And you might argue that Copilot is also a distributor.


Yes. Even if it may be permitted under some licenses, training models off millions of developers' code and capitalizing on those models goes against the spirit of open source software. I'd expect nothing less from Microsoft.


> has yet to give me anything I wouldn't be able to just write myself

Sure it has: Time.

In terms of economics it's really simple: does Copilot free up more than $10 worth of your time per month? If the product works at all as I understand it (I haven't tried it), the answer should be a resounding "yes, and then some" for pretty much any SE, given current market rates. If the answer is no (for example because it produces too many bad suggestions that break your flow), the product simply doesn't work.

There might be other reasons for you not to use it. Ego could be one. Your call.

> Also feels kind of icky to train on open source projects and then charge for the output.

I don't know why it would feel any more icky than making money off of open source in other ways.


It is completely different from using open source programs to make money. Many open source licenses explicitly require any derived work to maintain the copyright notice and a compatible license. If I use GitHub Copilot to create a derived work of something somebody else published on GitHub, I have no idea who wrote the upstream code or what license they made it available under. The defense for this is the claim that Copilot doesn't create a derived work, since the code it produces is very different from anything upstream (this is claimed in the original paper from OpenAI). However, many people have found examples showing this to be a questionable or wishful-thinking claim.


Lack of training data is obviously not going to be a linchpin in this project, no matter how reproachfully the HN crowd looks upon Copilot with regard to OSS licensing. Even if we are prepared to call the Copilot team liars (bold move, good luck in court), there is always going to be enough code to go around to make this thing happen regardless. Rumor is Microsoft could chip in some.

In addition, the idea of "derived work" in code snippets is, quite frankly, nuts. There are only so many ways to write (let's be generous on the scope of Copilot) 25 lines of code to do a very specific thing in a specific language. If you have 1,000,000 different coders do the job (which we do), you'll have a significant amount of overlap in the resulting code. Nobody is losing sleep over potential license issues with this. Because that would be insane.

I have noticed that upholding OSS licensing (at least morally) is kind of a table manner on HN. That's fine, but this is some new level of silly.

It's also not going to persist, because no matter how much we love our OSS white-knightedness, we love having well-paying jobs more.


Having used it quite a lot I'm not sure it does save me $10 of time per month. At least as often that it generates usefully correct code it generates correct appearing but actually totally wrong code that I have to carefully check and/or debug.

It's quite nice not to have to type generic boilerplate in sometimes I guess but it's very frustrating when it generates junk.


Same experience for me. Checking the code it generated, and the subtle bugs it created which I missed until tests failed, made it at best a net-zero for me. I disabled it after trying for 2 months.


You lasted longer than I did! Disabled after a few days.

I think it really depends on what languages you use though. If you use something like Kotlin where there's really almost no boilerplate and the type system is usefully strong, the symbolic logic auto-completion is just far more reliable and helpful. If you're stuck in a language where there's no types, and there's lots of boilerplate to write, then I can see it may be more helpful.


I turned it off a week ago because I found it was wasting time when everything it generated required going back to fix issues.


> I don't know why it would feel any more icky than making money off of open source in other ways.

For me, this entirely comes down to the philosophy of how a deep learning model should be described. On the one hand, the training and usage could be thought of as separate steps. Copyrighted material goes into training the model, and when used it creates text from a prompt. This is akin to a human hearing many examples of jazz, then composing their own song, where the new composition is independent of the previous works. On the other hand, the training and usage could be thought of as a single step that happens to have caching for performance. Copyrighted material and a prompt both exist as inputs, and the output derives from both. This is akin to a photocopier, with some distortion applied.

The key question is whether the outputs of Copilot are derivative works of the training data, which as far as I know is entirely up in the air and has no court precedent in either direction. I'd lean toward them being derivative works, because the model can output verbatim copies of the training data. (E.g., outputting the exact code, with identical comments, of Quake's inverse sqrt function, prior to that output being patched out.)

Getting back to the use of open source, if the output of Copilot derives from its training data in a legal sense, then any use of Copilot to produce non-open-source code is a violation of every open-source licensed work in its training data.


I want to love GitHub Copilot, but it's just not there yet. For trivial stuff it's great, but for anything non-trivial it's always wrong. Always.

And my problem is : Time.

Cycling through false positives and trying to figure out if it's right costs me way more than $10 a month in productivity.

I can't wait for better versions to come out, but right now, no.


But I don't get paid on a piece rate; the amount of time I spend working is constant. Anything that increases my productivity just means I get more work done. (Others may differ, but I know from experience that I like to keep to a fixed schedule.) And that's mostly benefitting my employer, not me, so it seems like something my employer should pay for, if they believe in it.


Yeah it's just practicality for me. There is software I pay a lot more for that I use a lot less.

$100/year is a steal for the amount of tedious code copilot helps me with on a daily basis.


I could also make a mistake due to Copilot that takes time to fix, and then I end up spending more time checking code where I used it. It has similar pros/cons to copy/pasting.


Given the cost of the infrastructure needed to run those large language models, it's very likely that Microsoft is still operating Copilot at a loss. I don't see an issue with it being a paid service, as it is a costly service to provide.

What I pity, however, is that there's no free tier for hobbyists, as paying a 10 USD monthly subscription won't make sense when you only code occasionally. For professionals using it every day, 10 USD/month is inconsequential.

I don't think it would have cost them much more to offer a free allowance covering, say, an average coding session of 8 hours per month.


GitHub Pro is $4/mo and includes 3000 minutes of CI compute per month (private repos), among all the other features. You're not going to use 7500 minutes' worth of compute a month with Copilot. I'll certainly pay up, though.


CI runs on CPUs, Copilot runs on GPUs. Waaaay different. Especially in this age of cryptocurrencies and chip shortages.


It’d be nice if they made it free if the upstream repo is published publicly under an open source license. They have all that info already.


It’s free for open source maintainers.


Open source maintainer here. No, it's not.

100% of what I do is open source. It's used by millions.

It's free for maintainers of "major" open source projects. I'm not sure what a "major" open source project is, but it's clearly not what I do. The only way to know if your open source project qualifies is to try to sign up. If it does, you're given a free option.


What repo do you maintain that is used by millions?


I don't connect my real-life identity to my personal identity.

I am the primary author (but not current maintainer) of an open-source project which is reported to be used by over 100 million people, according to (flaky) statistics kept by the current maintainers. That's around 1% of the people in the world.

I don't trust the current maintainers to be honest with numbers (there are lots of ways to estimate numbers of users), but it's definitely in the millions, and it's a project you (and most random people you'll meet in tech, and many outside of tech) will have heard of.

I am currently working on earlier-stage projects, which have smaller communities, but 100% of them are open-source.


Agreed. At the very least, I was hoping they'd bundle it with the GitHub Pro subscription for individuals rather than as a separate product.


Totally agree. I was expecting to get this feature as part of my Pro subscription.


I was expecting the same.


> Also feels kind of icky to train on open source projects and then charge for the output.

"open source is great, except when it's used in a way I don't like"


I don't see the use itself as a problem, but rather that the result is not treated as a derivative work of the input. If I train it on GPL code, the result should be GPL, too.


This is kind of like saying that any programmer who has ever learned something from reading GPL code can only use that knowledge when writing GPL code. It's not literally copying the code. The training set isn't stored on disk and regurgitated.

Also, there is logic in Copilot that checks to make sure it is not suggesting exact duplicates of code from its training set; if a suggestion is a duplicate, it is never sent to the user.


But Copilot is not a programmer, Copilot is a program. Slapping the "ML" label on a program doesn't magically abdicate its programmers of all responsibility as much as tech companies over the past decade have tried to convince people otherwise.


I really dislike this false equivalence between human learning and machine learning. The two are significantly distinct in almost every way, both in their process and in their output. The scale is also vastly different. No human could possibly ingest all of the open source code on GitHub, much less regurgitate millions of snippets from what they “studied.”


> This is kind of like saying that any programmer who has ever learned something from reading GPL code can only use that knowledge when writing GPL code. It's not literally copying the code. The training set isn't stored on disk and regurgitated.

I wouldn't put any hard rules on it, but it does seem very fair for programmers who have learned a lot from GPL code to contribute back to GPL projects. I have learned from and used a lot of open source software so whenever possible I try to make projects available to learn from or use.


Read up on clean-room design and the IBM BIOS lawsuits from the '80s and '90s: just seeing proprietary code can be a violation.

Why is it different if we slap an "ML" label on it?


I guess if you trained on GPL code that should be true for your code as well.


It would be great if that were the case, but unfortunately it isn’t. We’ll need new laws for that.


Yes. It is completely valid, understandable, and reasonable to have a variety of different feelings and views about how specific code and specific licenses are used.

This is particularly the case when we see the emergence of new technologies that use it in different ways. Different people may have a wide variety of equally valid views about how it is incorporated into that system.

There's nothing inconsistent, confusing, or complex about those views.


I think the issue is not that it’s trained on open source code but that it’s trained on code whose licenses may not permit it. If you license your project in a permissive way then I don’t see a problem.


Most "permissive" licenses still require attribution.


Are there actually any licenses which do not permit training an AI model on the code?


(IANAL) It's a tool, transforming source code. The result thus seems like a derivative work; whether you are or are not allowed to use that in your work depends on the originating license. (And perhaps, your license. E.g., you can't derive from a GPL project and license it as MIT, as the GPL doesn't permit that. But to license as GPL would be fine. But this minimal example assumes all the input to Copilot was GPL, which I rather doubt is true, and I don't think we even know what the input was.)

I think there might be some in this thread who don't consider these derivatives, for whatever reason, but it seems to be that if rangeCheck() passes de minimis, then the output from Copilot almost certainly does, too. That a tool is doing the copying and mutating, as opposed to a human, seems immaterial to it all. (Now, I don't know that I agree with rangeCheck() not being de minimis … and yet.) Or they think that Copilot is "thinking", which, ha, no.


Open source licenses aren't a free-for-all. Many have terms like GPL's copyleft/share-alike or the attribution requirements of many other licenses. If copilot was trained on such code, then it seems that it, and/or the code it generates, violates those licenses.


How can it help you speed up development but not be worth $10/month? Your hourly rate can’t be that low.


It's great when it works, and can also be costly when it doesn't or when you blindly trust it.


Which is just another way of saying that it doesn’t really work, except perhaps for party tricks.


For me it works wonderfully, when you choose to use it. If you are just blindly accepting every suggestion, you're going to have a bad time.

You also have to (slightly) change your flow to get the most out of it, which I know is a deal breaker for many.

I absolutely love it. It's not going to write good code for you, but for an autocompleter it is amazing.


The fact that GitHub charges only $10/month suggests that they themselves don’t believe in their product. Because if it actually worked, i.e. sped up software development by, say, >10%, developers should be happy to pay 10 times as much or more.


This is a rather silly argument... by that logic, since using the Adobe suite saves me at least a dozen hours every month, I would be happy paying $500 a month for it.

There's a limit to what individuals are willing to pay for a subscription service irrespective of how many hours it saves you. Now if we're talking enterprise and bulk licensing then that's a separate issue.


This is a rather rude response… Your comparison with Adobe suite has a flaw, but I have no interest in exchanging ideas in this tone.


Matches my experience. I legitimately like it for quick boilerplate; it's like a better snippet engine. But paying for it...


It's worth it if it saves you a few minutes every month.


Only if it saves you a few minutes every month in a "net" sense. If it saves you dozens of minutes every month and then also costs you dozens of minutes every month in hard-to-predict ways, it's hard to judge either way on it.


They still have to pay for servers and maintain the model itself. A neural network isn't just the data -- training and commercializing it (testing, QA, etc) is a lot of work.

You wouldn't have an issue with someone making money by using open source software (like a website that is hosted on a server running linux).


I went to see the pay URL and it said I was eligible to get it for free. Not sure if that works for some people who contrib to other OSS repos, but I was about to give up on it when I saw I didn't have to pay, so it might be worth checking.


Also, $10/mo is not so bad but I am not in the place right now for more subscriptions. I am in the process of stopping several at the moment.


Same here. With prices rising everywhere and a salary of ~40k euros before taxes (which is normal in IT in many EU countries if you don't work for big tech) I hardly have room for another subscription. People here are too quick to say "what is $10 on a $80/hour salary?"


Yea...does this mean it will stop working until I pay?

It's been really nice for autofilling console logs and boilerplate code...but $10? It's a novelty that is nice when it works, but that's a steep price point for what it is, and I don't see that changing any time soon.


People in the technical preview get a 60 day free trial, but yes, after that, you'll have to pay.


> Also feels kind of icky to train on open source projects and then charge for the output.

The business model for most of the Internet is to bait people into using things for free and then monetize them without compensation in some roundabout way.


> Also feels kind of icky to train on open source projects and then charge for the output.

How would you feel if they just provided the software without the model, assuming you could train it yourself on open-source code in an instant?


I don't know enough about how GPT-3 and ML work to really answer this, but I think I'd be fine with what you're saying if I understand the question. If they provided (and charged) for the infrastructure, but the model was FOSS and community-driven, it would be less icky I think.

I just don't like the idea of taking people's work (without asking or checking licenses) and then selling it back to them. It'd be like if Stack Overflow decided to start charging to see answers and not asking or giving a split to the person who gave the answer. I realize they aren't just copy/pasting so not a perfect parallel, but still.


Technically anyone could use those same open source projects and provide an open source solution, or a paid solution as well. I do feel how you feel, though; it's a little off-putting.


The machine learning models are not open source themselves, so you can't just do this yourself with existing open source projects.


+1 on ickiness


I already have it in my visual studio code. I do like it. Will it stop working for me now?


Like many people I thought Copilot was neat but ended up uninstalling it because it caused more problems than it solved. Reading the comments here, it seems that most of the people who get value out of it would be better served by creating a set of snippets. If all you need is to fill in boilerplate all the time or repeat general test structures but with different arguments, just make a snippet. Every major code editor supports this and they're really easy to set up and use.


I haven't used Copilot but found this comment interesting. Back in the 1990s when I started programming (BASIC and C) I did maintain collections of code snippets that I used in different programs here and there. I used to cherish those snippets and dedicated a good amount of time to keeping them available across my computers.

Then the Internet and Google came around. I found that instead of me maintaining those code snippets, I could search in Excite/Altavista for how to do something, and it will be stored there for me. Later came sites like StackOverflow (expertssexchange before it) which concentrated much of that information which before was scattered in PHPBBs and Geocities pages.

Now I see this Copilot app as the evolution of that: instead of having to manually go searching for a snippet, I imagine I can "pull it" almost automatically while I am writing code, with an AI helping me find the right snippet from the current code context.

That doesn't sound bad at all.

Nevertheless, I haven't used it because I DON'T want my code to be sent to Microsoft or any other company. And I don't believe in adding random code for which I don't know the license! What if there is some code which was AGPL that Copilot happens to use? that's pretty bad.


Or better yet, save the snippet as a function/procedure in the code, and avoid needless duplication (DRY/Occam's razor).
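As a contrived sketch of that point (all names here are made up for illustration): the same validation snippet pasted into several call sites collapses into one shared helper.

```python
def require_nonempty(value: str, field: str) -> str:
    """Shared helper replacing the snippet that would otherwise be
    pasted into every constructor (DRY)."""
    if not value or not value.strip():
        raise ValueError(f"empty {field}")
    return value.strip()

def create_user(name: str) -> dict:
    # Instead of re-pasting the if/raise block here...
    return {"name": require_nonempty(name, "name")}

def create_team(title: str) -> dict:
    # ...and here, both call the one helper.
    return {"title": require_nonempty(title, "title")}
```

Now a bug fix or behavior change happens in one place instead of everywhere the snippet was pasted.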


You can take snippets with you from job to job tho.


I guess that depends on your contract.


In my experience, most of the code snippets weren't useful and testing the less useful code snippets costs more time than it saves.


I've been using Copilot for a while now. I'm lucky that I don't have to pay moving forward but I would totally pay $10/mo for this. When writing tests, this thing works so well that it saves me 10-20% time writing code so $10 is nothing.


> I'm lucky that I don't have to pay moving forward but I would totally pay $10/mo for this.

How? I was in beta but looks like I'm kicked out. I also verified my student status but get prompted to pay. Are you a maintainer? Have you verified that you have access?


I'm a maintainer of a large open source project


I _think_ a coupon code or something will eventually be available on the Github Student Pack website[1]. No clear answer yet if it'll be available for Teachers[2] as well.

1. https://education.github.com/pack

2. https://education.github.com/teachers


I didn't see the student license at first, but after re-verifying my educational status, I got it: https://i.quartzic.co/yIoSJFfH


I am not sure of the exact logistics of access (I’m also a student, so I will probably look into trying to get access when I have a chance), but from the blog post with the original announcement:

> It will also be free to use for verified students


I have had a lot more success with Tabnine. One, it runs offline as well as online, so the performance difference with/without internet is unnoticeable. Two, it understands context much better. I was prototyping in Python with Tabnine turned on without the LSP and I felt no need to install one. It spits out uncannily good suggestions if you are using a popular library like BeautifulSoup.

Copilot is marketed as a pair programmer, but the code quality is often just wrong, not just bad. It thinks it understands what I want based on the function name and parameters, but the generated output is nowhere close to what I want.

Multiline AI generated suggestions are not a good idea anyway (not yet at least). AI based LSP/auto completer would be much better at this stage with a lot faster DX.


I really like Copilot, but outside of the content being generated, it still feels a bit slow and somewhat hacked into VS Code. It sometimes interferes with regular "intellisense suggestions" as well.

I've been in the beta since almost the beginning, and I have not really seen much improvement on the frontend side. Since its release, the changelog only mentions 10 small (or so it seems) improvements:

https://marketplace.visualstudio.com/items/GitHub.copilot-ni...

On the backend side, I feel like I've started to "figure out" Copilot a little bit. One thing I'd like to see is inline completion, which I think GPT-3 can do now but Copilot (which I believe is based on it) cannot.

I think I will pay to continue, but I'd like to see some frontend improvements and maybe some backend alternatives. Ideally I'd love this to be open source, but the compute power doesn't seem feasible (?) unless we start magically crowdsourcing our computers to run a model somehow.


Copilot does inline completion right now - they implemented it into copilot as a pilot program[0] before OAI went live with that new model IIRC.

[0]: https://openai.com/blog/gpt-3-edit-insert/


Insta buy for me (expense hopefully). I am just continuously mind blown by it, and I quickly notice and get frustrated when it's not enabled. It really is giving coders superpowers.

EDIT: looks like I'm getting it for free because of my contributions to open source o.o dope!


Yeah, can't live without it anymore. It's already muscle memory to intuitively pause typing, just waiting for Copilot to complete my line. I've got a pretty good sense of what it will get right, too. Knew this was gonna be a $10/month thing. Oh well.

I hope, though, that as AI becomes increasingly useful and seamlessly integrated, they're not gonna charge an arm and a leg for it. It's just gonna be way too good to pass up; people won't really have a choice but to pay.


I’m baffled by everyone questioning whether it’s worth $10/month.

I’m certain Copilot gives me more than a 2% productivity boost. That’s a conservative estimate (I wouldn’t be surprised if it’s more like 10-15%). If you consider 2% of what a developer makes each month, it comes to a lot more than $10. I’m not based in the US, but Levels.fyi suggests it’s not unusual for devs there to make $200K/year, that would mean $16.6K/month, 2% of which is $333. Maybe that’s a bit reductive, but the point is, $10/month is negligible if it gives you a noticeable productivity boost on a developer’s income.
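The back-of-the-envelope math above is easy to check (using the commenter's own illustrative figures, not real data):

```python
annual_salary = 200_000               # the Levels.fyi example, USD/year
monthly_salary = annual_salary / 12   # the "$16.6K/month" figure
boost = 0.02                          # the conservative 2% estimate

monthly_value_of_boost = monthly_salary * boost
print(round(monthly_salary))          # 16667
print(round(monthly_value_of_boost))  # 333 -- vs. the $10/month price
```

Even at a fraction of that salary, the implied value of a 2% boost stays comfortably above $10.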

And by the way, I don’t particularly love using Copilot. It can be annoying now I’m over the honeymoon period. But I think it’s pretty clear it speeds me up by a noticeable margin, and time is money.


As the maintainer of some Python libraries, how do I get my part of that $10/month because Github Copilot was trained using my code...


Where did you learn to write your code, and which open source devs did you compensate for that?


I learned how to write code from books I purchased with cold hard currency. In the before times when dialup modems were the norm resources were scarce, and things like Github Copilot were just a pipe dream.

I have also taught classes and provided mentoring and support to new people up and coming in both programming and infosec. I would argue that as an open source maintainer I am actively contributing back and compensating those other developers. Unlike Github Copilot I am not selling the things I was taught, I am freely making it available to others.

It feels very icky that Github now gets to sell what it learned from my code base, when it has already been shown to replicate code with a 100% match, versus learning how to build on top of ideas or finding novel solutions to problems.


This is the price you pay for getting to use GitHub for free.


This feels like Linus Torvalds asking for a royalty check for every server out there powered by Linux.


I hope I never again have to work on a codebase/language where copilot would be worth subscribing to.


Nonsense. I've been using it to write tests, and it does a phenomenal job. If I write out the positive case, it will suggest the negative case and help me through the various permutations. This has by far been the most useful part of Copilot so far.

It's nothing I couldn't do myself, but just makes my job that much easier and quicker


Maybe try property based testing instead?
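For context, property-based testing asserts invariants over many generated inputs instead of enumerating cases by hand. Libraries like Hypothesis do this properly (input shrinking, smarter generators); the core idea, sketched with only the stdlib and a toy run-length encoder, looks like this:

```python
import random

def run_length_encode(s: str) -> list:
    """Toy function under test: run-length encode a string."""
    out = []
    for ch in s:
        if out and out[-1][0] == ch:
            out[-1] = (ch, out[-1][1] + 1)
        else:
            out.append((ch, 1))
    return out

def run_length_decode(pairs: list) -> str:
    return "".join(ch * n for ch, n in pairs)

def check_roundtrip_property(trials: int = 200, seed: int = 0) -> bool:
    """Property: decode(encode(s)) == s for many random strings."""
    rng = random.Random(seed)
    for _ in range(trials):
        s = "".join(rng.choice("ab") for _ in range(rng.randrange(0, 20)))
        if run_length_decode(run_length_encode(s)) != s:
            return False
    return True
```

One round-trip property like this covers the positive case, the negative case, and the permutations in between without writing them out individually.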


What does that even mean? It's like saying, "I would hate working on a codebase where autocomplete would help me". It's such a general statement.


It's quite hilarious to see both the wide-eyed futurists and unabashed Luddites in this thread.


Unnecessary hate. I used it a while ago while writing some complex aggregation and grouping of data. It was pretty painful to write until I tried with Copilot and the result is both accurate and easy to read. I wrote unit tests and it's been fine ever since. This is but one example of the value of it.

I am sure some of the typical cynicism here will turn this into a protracted argument of "well maybe you shouldn't be a shit developer and you would be able to fit all the complexity in your head" but whatever.


I don't think the parent was criticizing Co-pilot. The point was that codebases that need a lot of boiler plate and predictable code are not fun to work with.


Yes, this is the whole point of this

Unfortunately, there is no way I can hope for that much…

So yes, please, take my money and do all my boilerplate lol


You must only work with languages that you just invented.


Anyone else using Copilot just as a glorified copy&paste helper? It’s great for repetitive tasks, but I’ve yet to encounter a situation where it really helped me to write meaningful code. At least I’d expect it to work together with intellisense, so it does not propose stuff that will get a red underline right away.


I'm just happy that Microsoft is finally charging directly for a developer product!

When they keep giving out freebies (VSCode, npm etc.), I never know which direction the product is going to evolve (e.g. unnecessarily tight integration with Azure).

With this, there's at least direct alignment between end user & the product.


What's the criteria for being considered "a maintainer of a popular open source project"? They never actually publish the criteria anywhere from what I can tell. They just say visit the subscription page and if you are eligible it should be available to you and if you see a charge then you are not eligible. I think though they should still be transparent about what their metric is for determining popular projects on GitHub; otherwise, the code that determines eligibility might be broken and no one would be able to tell. Or worse they could just be lying about it entirely.


At a minimum, 4.3k stars is not enough, because I don't qualify.


Someone from GitHub reached out to me because of this comment and said they were fixing it. The problem is that Racket's license file isn't simple enough for their automated tools.


This is curious - I maintain 2 projects with 2k cumulative stars, and I was able to claim the free access. Wonder what the metric is? Maybe creation date has something to do with it?


If you want to check if you qualify: https://github.com/github-copilot/free_signup


I get " Congratulations! You are eligible to use GitHub Copilot for free." that was unexpected but in retrospect https://github.com/NixOS/nixpkgs/ is pretty popular. (Currently 9.8k stars)


Yeah that just redirects me to the paid page. I do wish the criteria were a little more transparent.


I have a repository with 625 stars, but it redirects to the subscription page.


Reporting on my own experience, I got access to Copilot a few days after it was announced and am currently not expected to pay for it.

I started a project that currently has 9.4k stars (now mostly maintained by someone else), and still maintain a project that has 2.5k stars.


A sample of the first 25k repositories and their stargazers on GitHub shows that the top 1% have over 600 stars, and the top 0.1% have nearly 5,000 stars. That's a very small sample, however.

[1]: https://github.com/andrewmcwattersandco/github-statistics

[2]: https://docs.google.com/spreadsheets/d/1HBSwxr0jkUoMulQxyVTC...
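Percentile cut-offs like those in [1] can be computed with the stdlib; the star counts below are fabricated for illustration (the real sample is in the linked spreadsheet):

```python
import statistics

# Fabricated distribution: most repos have ~0 stars, a small tail has many.
star_counts = [0] * 900 + [5] * 60 + [50] * 30 + [700] * 9 + [5200]

# quantiles(n=1000) returns 999 cut points at 0.1% steps:
# index 989 is the 99.0th percentile, index 998 the 99.9th.
cuts = statistics.quantiles(star_counts, n=1000)
top_1_percent_cutoff = cuts[989]
top_01_percent_cutoff = cuts[998]
```

With a distribution this skewed, even a few hundred stars lands a repo in the top 1%, which matches the intuition that most projects get zero stars.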


> People who maintain popular open source projects receive a credit to have 12 months of GitHub Copilot access for free. A maintainer of a popular open source project is defined as someone who has write or admin access to one or more of the most popular open source projects on GitHub

https://github.com/pricing#i-work-on-open-source-projects-ca...

I like how "open source project" == "on github". Can't say that I am surprised though.


I authored/contribute/maintain stuff that is used by tens of millions of people world wide. I do not qualify. ¯\_(ツ)_/¯


I have 2 popular python projects, one with 4.9k and one with 2.3k stars and I don't qualify :/

https://github.com/kootenpv

If anyone knows why pls let me know


> What's the criteria for being considered "a maintainer of a popular open source project"?

The FAQ [0] says

> A maintainer of a popular open source project is defined as someone who has write or admin access to one or more of the *most popular open source projects* on GitHub

(emphasis added)

[0] https://github.com/pricing#i-work-on-open-source-projects-ca...


That's the problem, what is a "most popular project?"


It goes on to say "Simply visit the GitHub Copilot subscription page to see if you are one of the open source maintainers that meet our criteria for a complimentary subscription"

When I go to https://github.com/github-copilot/free_signup it says:

" Congratulations! You are eligible to use GitHub Copilot for free.

Thanks for being a part of our open source and education communities. GitHub Copilot uses the Codex AI model to offer coding suggestions."

I have a project with about 3k stars, and regularly contribute to another project with ~4k stars (where I'm also the primary maintainer, although it's not on my account), as well as some things with hundreds and dozens of stars.

I don't know how high up that is in the ranking, although given that most projects get 0 stars, I suspect it's probably higher than you might expect.


Ummm, yes, that was my original point. What does "one or more of the most popular open source projects on GitHub" mean exactly? Do you need a certain number of github stars on your project? Are you listed on some "most popular github projects specific page"? or what?


I got this free access. I tried to figure out how to request this "Verified" status, whatever it means, but GitHub seems to set it automatically and notified me "you are eligible to use GitHub Copilot for free". I'm not sure how exactly they do it or what defines "the most popular open source projects". The most popular repo (by stars) I have has 3k stars. Apparently that's enough; not sure.


[deleted]


There isn't a definition. The FAQ states:

    People who maintain popular open source projects receive a credit to 
    have 12 months of GitHub Copilot access for free. A maintainer of a
    popular open source project is defined as someone who has write or admin 
    access to one or more of the most popular open source projects on GitHub. 
    Simply visit the GitHub Copilot subscription page to see if you are one of 
    the open source maintainers that meet our criteria for a complimentary 
    subscription. If you do, you should see that you can add GitHub Copilot 
    for no charge. If you see a charge on the purchase page then this means 
    that you do not qualify at this time. Once awarded, if you are still a 
    maintainer of a popular open source project when your initial 12 months 
    subscription expires then you will be able to renew your subscription for 
    free.

As the principal author of a project used by millions, working 100% on open source today, I can say it's not free for me. My code is being used to train your models, but I can't use your models without paying you. That goes against the spirit of my share-alike license.

"Write or admin access" is also a pretty crappy way of evaluating people. "Write or admin access" goes by politics, rather than by contribution. It's also hostile towards junior developers, who might not have commit access on projects which work by forks and PRs.

This also excludes folks like Richard Stallman who don't host their projects with you (not that he'd use copilot, but just saying).

github does a fine job reviewing nonprofits. It feels half-baked that it can't identify open-source contributors.

At the very least, you should enable anyone whose code you scraped to train codex to use codex. I can make codex spit out code very similar to code I wrote. It's clear it was trained on it, and is creating a derivative work.


Can you please share the link for this FAQ?


Only $10 a month to rack up dozens of license violations? What a deal.


Copilot is a steal at $10/m.

HN can set itself apart from Twitter and Reddit by celebrating great achievements rather than tearing them down.

Copilot stands on the shoulders of open source, yes. So do many of our personal and commercial projects. Copilot benefitted from having beta users. That relationship went both ways.

A big thanks to the Copilot team for letting us be a part of the beta. I will happily pay $10/m for this.


> Copilot is a steal at $10/m.

  agree !!!   as any burglar-thief-attorney will tell you, it is *totally worth it*


Steal is a good word, considering that in some cases Copilot violates some open source licenses.


To be clear, they actually took much of what Tabnine was doing and declined to let us use GPT-3 (MSFT closed off OpenAI). It turned out to be good, because there are lots of other great models out there from T5 (SFDC), Meta, Google, etc. that will continue to move forward faster.


Shouldn't Copilot technically be FOSS since it trains on open source?


Even if the model was FOSS, the infrastructure needed to run it would be costly.

Given the cost of a single GPT-3 Codex query, it's very likely that Microsoft/GitHub is still taking a huge operating loss at $10 per month.


Probably costs 3-4 times that to power the infra. They’ll make money back if they’re able to do several step changes of improvement in cost optimization.


If it should, it's a lot more complicated than that. "Open source" isn't a boolean where as long as you share your source you're compliant. Licenses usually require that a copyright notice be redistributed along with any source code and / or attribution in other ways, sometimes they require details of any modifications, etc. They're not doing that.


I’m especially curious about this if it trains on GPLv3 and AGPL licensed code.


Yes, it is a derived work and should be GPL if it was trained on GPL code.


Does Copilot learn from and suggest patterns in the same codebase that you're working in, or does it just pull from the huge pool of projects on GH?

How well does Copilot help with languages like Elixir that are less common? With TypeScript it's been remarkable, but that's one of the most popular languages and surely very familiar to devs and GH, so I would expect a less popular one like Elixir to not perform as well.

Does copilot work for shell scripts?

I'm a vim person and don't want to use VS code. Is copilot worth the hassle to get installed into vim?


I've played with it a little bit:

Copilot did pretty poorly when I tried using it with Julia: it kept suggesting Python code. I suspect it would do something similar in Elixir.

I'm also a vim person who doesn't want to use VS code, but I've gotten more than enough value to get into my first IDE (with vim keybindings). A lot of tedious C++ code is getting correctly auto-generated.


I don't think C++ is even on their supported languages list? The Copilot page lists Python, JS, TS, Ruby, Go.


From the “About GitHub Copilot” page:

“GitHub Copilot is optimized to help you write Python, JavaScript, TypeScript, Ruby, Go, C#, or C++.”

https://docs.github.com/en/copilot/overview-of-github-copilo...


It has first class Neovim support, possibly a better alternative for Vim person than any IDE.


> suggest patterns in the same codebase that you're working

Sometimes, with variable results. I think I've only observed it guessing patterns from the current directory.

> Does copilot work for shell scripts?

Yes, it gave me this earlier today while editing my .zshrc:

  # kill a process on a given port
  killport() {
    lsof -i :$1 | awk 'NR!=1 {print $2}' | xargs kill
  }


Can't wait for someone to integrate this into a shell. Does anyone know if such a project exists?


Probably not exactly what you're looking for, but Warp (a new terminal client) has "AI Code Search" built in that's powered by GPT-3. Quite useful for someone like me who tends to avoid the terminal when I can.

https://www.warp.dev

https://docs.warp.dev/features/ai-command-search


There's this cool blog on them testing it out against popular git commands:

https://www.warp.dev/blog/replace-git-cheat-sheet-ai-command...


them as in that’s you :)


Integrating into a shell, for immediate execution, seems very very dangerous. You still need to carefully test/scrutinize everything that comes from copilot.


"Copilot, how do I fork in a shell?"

  :() { :|: } :&
Thanks copil[user disconnected].


lsof supports machine-readable mode:

  lsof -i ":$1" -Fp | tr -d p


> Does copilot work for shell scripts?

Oh wow—a language where there are 20 ways to do something, three of them are common, but only three others actually behave correctly by any standard (while being among the least common in public code), seems like exactly the wrong kind of thing to use this for.

Shell doesn't need machine-learning autocomplete trained on existing shell scripts, it needs a hand-built aggressive linter.


> Shell doesn't need machine-learning autocomplete trained on existing shell scripts, it needs a hand-built aggressive linter.

Something like https://www.shellcheck.net/?


Copilot seems to learn from elsewhere in my codebase, and is able to reuse patterns I've used when prompted in a different file. It isn't perfect, but it saves a ton of time.

My primary usage is shell scripts, as it seems to struggle on complex code, while shell scripts are typically a lot of simple code.


It doesn't learn from your codebase but it uses the context of your code so any pattern will be picked up.
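A rough sketch of what "uses the context" means in practice (this is an assumption; the actual prompt construction isn't public): the editor plugin sends a window of text around your cursor, trimmed to a size budget, so patterns repeated nearby naturally show up in the model's prompt.

```python
def build_prompt(file_text: str, cursor: int, budget_chars: int = 2000) -> str:
    """Keep only the most recent budget_chars characters before the
    cursor; older code falls out of the window entirely."""
    prefix = file_text[:cursor]
    return prefix[-budget_chars:]
```

Real systems budget in tokens rather than characters, and likely include a suffix and snippets from neighboring files too; the point is only that the model sees a sliding window of your code, not your whole repo, and "learns" nothing between requests.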


… and variable name spelling mistakes!


which is actually good because then i just right click the variable in my IDE and then click "refactor > rename" and i'm done


$10/mo is pretty steep for something that gives you bad info 90% of the time; I will def be disabling it. I was hoping for maybe $1-2/mo. It's just a small add-on feature after all.


Thank you, GitHub, this is one of the best things!

No, it cannot make me write code I couldn't write before. It is not an autopilot that does all the coding by itself. But it still boosts my productivity greatly, letting me relax while coding and focus on the important parts rather than the chores.


I've been using it for a while now. When I forget some syntax occasionally I'll switch this on instead of searching documentation or google, but more often than not my IDE can get me unstuck with less overhead.

Also if there are some repetitive sections of code I need to bang out quickly this will auto fill that repetitive pattern (although I'd argue this is usually a sign that the code should be cleaned up)

I avoid letting it fill in large swaths of code though. I have no idea where that code is coming from (license infringement?) and it tends to go way off the rails.

Additionally I feel that it makes me a worse programmer if I allow it to take over too much.

I've been programming for 20 years (more if you count my time as a kid) and have a certain flow. Part of that flow is the natural pause between thinking of solutions and typing. When the computer is beating me to the typing portion (and often times making mistakes) I would find myself doing more code review than code writing. Sometimes a few bugs popped up and it was thanks to copilot (or was it me failing to correct copilot's mistakes?).

I found my brain sort of switching into a different mode. Rather than thinking about my next steps I was thinking about the steps the computer just took and how I needed to clean them up.

Rather than the AI being my reviewer during a paired programming session, I was the computer's reviewer.

So now, like I said I use it very sparingly.


Additionally: when I allowed copilot to do heavier coding for me, I found myself returning later and feeling somewhat unfamiliar with the code. That's really bad for maintenance, project pace, etc. I don't want to try to re-learn, fix, remember and maintain code that someone else (a computer in this case) wrote. Its hard enough doing so reliably in group code settings (work), now injecting that into my daily coding life feels like a solution I didn't ask for.

I will say that I'm not averse to change and do appreciate the new tools that we have available to us - going from writing QBASIC on an i386 as a kid to using JetBrains Rider is an indescribably different experience.

That said, I'm not ready to move to the backseat and let the computer take over yet. In small doses copilot is fine, but I wouldn't lean heavily on it for large projects or to do the thinking for me.


So it's free for students... I wonder what consequences that will have for coding assignments and projects. At a minimum, I hope it will also be free for instructors, so they can learn what to expect and how to design assignments that can't be auto-solved by Copilot.


Are there any plans for GitHub Copilot to ship an API? I think it would be interesting to set it up w/ my side project https://codeamigo.dev


The API version of copilot exists.

It's called OpenAI Codex. https://openai.com/blog/openai-codex/


Yeah but it takes forever to get off the waitlist (I've been waiting for almost a year)


You may have luck by emailing the CEO and asking politely. I wrote a comment previously (https://news.ycombinator.com/item?id=30692202) about how I got immediate access to Codespaces by emailing the GitHub CEO.


Thanks!


Hey, fun side project!


I found GitHub Copilot an interesting heuristic for how expressive the programming language / framework you are using is.

It is very useful for things that I would call boilerplate, e.g. you have almost duplicated code (say in a view and a controller) and need to copy from one to the other.

It is annoyingly bad for autocompleting an api as it tends to be slightly (and plausibly) wrong.

I haven't found it very useful for anything else.

Working on a project where I have to do lots of that kind of boilerplate makes me sad, so I tend to avoid those projects - but if I were forced to for some reason, it would be worth $10 a month. However, if enough of the programming I did could be helped by GitHub Copilot for it to be worth that much, I would start to worry that I was working on the wrong sort of problems and try to move into something different.


I use Copilot for .NET. It's useful for generating bits of code like method calls by repeating what I've previously done and changing variable names and types. It's a kind of slightly smarter IntelliSense.

I can't use it to generate longer chunks of code like methods or functions, because it will get things a bit wrong and I lose time correcting them.

It can sometimes generate correct and fitting code, but it takes multiple tries and writing comments in which you describe exactly, in lots of detail, what you want to do. At that point I'm better off writing the code myself.

However, if the method should be small like VerifyIfNumberIsEven, it does a good job.

I would probably pay $10 for it.


I have been using Copilot for about 5 months. For Python and JavaScript (I am not much of a JavaScript developer - not a primary language for me) I found that it is very worthwhile. It is easy to not accept generated code, or to tweak and test generated code.

I recently started a 100% Common Lisp job and it does not work nearly as well for Common Lisp. A lot of generated code is Emacs Lisp.

Two months ago I would have signed up for a paid account with no hesitation, but I need to re-evaluate it with Common Lisp first. BTW, I happily pay OpenAI for the GPT-3 APIs instead of using them for free. For NLP work, OpenAI's APIs have high value to me.


I tried GHCP but found it overall unhelpful and kind of stressful to use, because of potential bugs I might overlook and "import" into my project.

Definitely does not seem worth paying for me to end up more stressed out, haha.


So it depends on whether you prefer writing code or doing code review :) Maybe you'd need another tool which converts review work into writing work


I'm good.. without it :D


I'll consider testing it if it becomes free software (inc. the model) and runs offline like all my other dev tools.


A few observations:

The IntelliJ Copilot plugin became worthless just before the release. It borks up the formatting and requires almost more keystrokes to make the code work than it saves.

While it sometimes works brilliantly, the result has almost always been either duplicated code that could use refactoring or simple-minded attribute-access code that could be solved generically. I fear it will push developers to take the "easy route" and not think about the code too much while churning out more and more lines of generated code, so I'm unwilling to recommend it to junior developers.


$10/mo is fine. I pay at least $10/mo for JetBrains products.

However I wish there was more competition. Github could rescind access to Copilot or charge $40/mo or it could slow down because their cloud is overloaded with new users, and I would be out of luck.

Tabnine and Kite are alternatives but I've heard they don't work nearly as well. I wish there were similarly-effective alternatives which charge similar rates for cloud hosting / profit, but open-source their datasets and algorithms, and just generally provide a fallback if Copilot's quality ever goes down.


Copilot actually pushed Kite out of business, but I'm here at Tabnine and we have been doing great. MSFT is always tough to compete with, but I did it at GitLab before, and I think we can do it again with our strong take on personalized models and the ability to run anywhere (local, cloud, or even your VPC), all while respecting code and licensing.

https://www.tabnine.com/tabnine-vs-github-copilot


It would be nice if people stopped giving Microsoft all their code to use to then sell back to them.

Since this is derived from code Microsoft did not write, or ask permission to use, it should be at the very least free to use.


People can do whatever they want with their code, and give it to whoever they want


what about those that have code on GitHub that is source available but not licensed for reuse?


Imagining Steve Ballmer down in Hell laughing at all of us.

"They gave away all their code, so we packaged it up and sold it right back to them, the stupid bastards!"


The free thing for (a few) open-source maintainers seems needlessly complicated... Who should qualify is non-transparent. They'd have been better off just charging everyone for it. Not an instant buy for me for the moment. Often it works well, but it also frequently takes time to correct/sort-out the suggestions. It might in fact be making me dumber as I wait for a suggestion rather than thinking it out.


It’s not worth $10/mo. I wouldn’t even pay $5/mo. Usually, it generates code with incorrect logic that is sometimes hard to notice.

It’s also awful that they took free code (open-source), and now they want money for it. Make it open-source and free to use…

Some say it’s great for repetitive tasks, but if you write repetitive code (tests included), maybe you should look for solutions other than “auto-generating” unmaintainable code.


They used all the code on GitHub regardless of license. Hope they avoided the Oracle code ;-)


Actually, I quite like it. Especially for those repetitive things one can forget. For instance, if there is a deleted field in a table, you would usually write an SQL query like

   .filter(table.deleted==False)
nothing complicated, but one tends to forget it. So I got into the habit of starting a new line in whatever query I am building and seeing what Copilot thinks I forgot.
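To make that concrete, here is the same soft-delete pattern as a toy, plain-Python sketch (hypothetical data, not tied to any particular ORM); the filter condition is the one line that is easy to forget:

```python
# Hypothetical rows with a soft-delete flag, standing in for a database table.
rows = [
    {"id": 1, "name": "alice", "deleted": False},
    {"id": 2, "name": "bob", "deleted": True},
    {"id": 3, "name": "carol", "deleted": False},
]

# The easy-to-forget filter: exclude soft-deleted records from every query.
live = [r for r in rows if not r["deleted"]]
print([r["id"] for r in live])  # [1, 3]
```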


I was a beta tester and just got kicked out. This explains why it happened.


Next step: if Copilot can recognize often-used patterns, and variations on them, it should be able to autogenerate libraries that capture these patterns and make them usable via simpler functions/classes.

That way it can capture code in a library instead of having thousands of developers copy/paste the same code snippets.


I don't really get this argument why it should be a problem that it is being trained on other public code.

Every human was just trained in the same way. Why isn't this a problem for every human?

I really don't see the difference. One is an artificial neural network while the other is a biological neural network?


I guess it's a bit of a philosophical disagreement.

In my view: I don't believe a machine (at least not any we're capable of creating) can truly learn.

Copilot is a machine working on its inputs. Humans think and create. Maybe it can be argued that humans are just more complicated machines, but I don't think most people would agree with such an equivalency.

Copilot is constructed almost entirely from others' code. There's a tiny fraction of original "ai glue" in there, but the end product is arguably a derivative work of all that code it was trained on. As is its output.

It can also be argued that the AI part is really just an obfuscating copy machine. One that was created specifically for that task.

And of course, the real killing blow: if/when it reproduces training code verbatim, and you don't notice... will "copilot did it" be a valid defense in court? There are different opinions on that I guess, but no one knows for sure -- and I wouldn't take that risk.


It’s not about the learning part, it’s about the copyright and money part. If you learn how to play song X by The Rolling Stones, you cannot just make money playing song X in a concert. Sure, you can play song X at your father’s birthday.

Here GitHub (Microsoft) is charging for a product that in certain circumstances violates copyright.


But Copilot (or a human) is not copying some existing code. It (Copilot or human) just used existing code to learn. So it is only about the learning part, nothing else.

Yes, if Copilot (or a human) would copy existing code, that would be a copyright violation. But none of the arguments here are about that. It's just about the learning.


It is possible that in a long snippet you could be copying from non-permissively licensed code. In addition, GitHub themselves have said that this is not resolved and will likely need to be adjudicated in the courts. The only way to avoid this is to train only on fully permissive code. We know this here at Tabnine, having done it for 5+ years.


And again, the same applies for a human as well. It's just the same.

I'm pretty sure I have done that by accident already in the past without noticing. This is not so unusual when you write some code with very common patterns.

And then this even can apply for code which you have not seen before. E.g. write some bubble sort function. Very likely you will find exactly the same code online.


I really like GitHub Copilot. It is very useful for me because I write a lot of repeated chunks of logic. I do research in HEP, and if you know how CERN's ROOT works, you can appreciate how useful Copilot is for that alone.

I think I myself taught Copilot a lot of things about supersymmetry :)


I can't tell if I can get it for free or not other than that vague statement about subscriptions.


> We’re making GitHub Copilot, an AI pair programmer that suggests code in your editor, generally available to all developers for $10 USD/month or $100 USD/year. It will also be free to use for verified students and maintainers of popular open source projects.

> Do you want to start using GitHub Copilot today? Get started with a 60-day free trial, and check out our pricing plans. It’s free to use for verified students and maintainers of popular open source software.

Seems pretty clear. If you're willing to do your own research (aka going to the CoPilot site): https://github.com/github-copilot/tp_signup, you'll see that pricing reflected here as well as the date when the free period ends, which is August 22nd.


I have an open source project. Do I qualify or not other than guessing at what appears on the billing screen


Yeah, I also have no idea... They could be more specific about what qualifies as a popular OSS project.


It copies lots of code => copy-lot => copilot

I guess that means I feel the level of expectation is in the name.


$10/month is a perfect price - that was my exact estimate of what I was willing to pay for this service when it becomes non-free.

To everyone expecting Copilot to magically write the code they are thinking about - you are missing the point. There is a learning curve of using this service that allows you to be more efficient in expressing your ideas. It's not about doing all the work for you. It's like auto-complete on the next level.

Licensing concerns - oh come on, what is the big deal? There are millions of "for (int i ..)" loops out there. Like anyone gives a damn about 5 auto-generated lines being _probably_ copied from somewhere. Moreover, if you had used Copilot just a bit, you would know that is not how it works.


Could they charge more and push product improvements faster? It seems like $10/mo/user is optimizing for how bearable the price is, but then you have a bunch of users who are quick to complain while you don't have the budget to make rapid improvements to the platform.

Charge 10x more (or more) and let the dreamers help push the product further and faster. Once it’s awesome then charge a commoditized price for the service.

Charging 10x+ more means we have enough skin in the game to properly send feedback and improvement ideas. At $10/mo/user it's barely worth you reading my support tickets, and almost not worth me using it even while paying for it.

Thoughts?


The day Copilot or something like it catches on is the day when programming changes for real. Instead of being hired to create new systems or extend existing systems built by other programmers we will only be hired to fix Copilot generated code.

I suffer enough with legacy code created by junior programmers that long left the company. I imagine how much more fun will be to work with this type of code.

* I know Copilot is not capable of creating full systems yet, but it's only a matter of time before they evolve it to generate all the boilerplate code for you based on some comments you make or, even worse, some UML abstraction!


I'm in that middle-state where I program daily for work and hobby, but in earnest I only started about 1.5 years ago. I can make a lot run, mostly API work, data transformation.

I joined the Copilot beta and where it has helped me most is: 1) Ideas 2) Filling in the broad-strokes.

Its name does not deceive: it is only a co-pilot. It will not tell you where to go; it will push you in the right direction and let you focus on the more difficult parts of a task.

I signed up for it because at $10 per month the keystroke reduction is somewhere in the ballpark of 70% or more. That's the real value in my use case.


And already having [scaling?] issues =) https://www.githubstatus.com/incidents/9xb0vpwcj8gj


This has probably been talked about but...

If most code is "bad" code (any definition works) and this AI was trained on all/most code on GitHub, does that mean this AI mostly helps produce bad code?


It definitely can. Here it suggests a plausible looking but incorrect function for averaging integers: https://twitter.com/ridiculous_fish/status/14527512360594513...
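For readers who don't want to click through: a classic instance of "plausible but wrong" is the naive integer midpoint (a + b) / 2, which overflows in fixed-width arithmetic. Here's a sketch in Python that simulates 32-bit ints to show the failure mode (an illustration of the general bug class, not necessarily the exact function in the tweet):

```python
INT_MAX = 2**31 - 1

def to_i32(n: int) -> int:
    """Wrap n to a signed 32-bit integer, the way C arithmetic would."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n >= 0x80000000 else n

def avg_naive(a: int, b: int) -> int:
    # The plausible-looking version: (a + b) / 2. Overflows near INT_MAX.
    return to_i32(to_i32(a + b) // 2)

def avg_safe(a: int, b: int) -> int:
    # Overflow-free midpoint: a + (b - a) / 2, assuming a <= b.
    return to_i32(a + to_i32(b - a) // 2)

print(avg_naive(INT_MAX, INT_MAX - 2))  # -2 (overflowed, clearly wrong)
print(avg_safe(INT_MAX, INT_MAX - 2))   # 2147483646 (correct midpoint)
```

The safe version keeps every intermediate value inside the 32-bit range, which is exactly the kind of subtlety a suggestion engine trained on the most common form of the code can miss.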


I have been using Copilot for some time... I'd say yes and no. It helps you a lot when you are writing repetitive code, so in a way it encourages you to write the repetitive BS instead of making a function for it or something. But it's also helpful for writing tests and nice error messages. You just type

     if (x.length < 10) throw
And it figures out the rest. So while it sometimes encourages bad code, when you know how to use it well, it helps you write the good things you'd normally be too lazy to write.
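As an illustration of that workflow, here is a hypothetical Python equivalent (the original snippet is JavaScript; the condition is what you type, and the error message is the sort of thing Copilot fills in):

```python
def validate_username(x: str) -> None:
    # You type the condition; Copilot's trick is finishing the message for you.
    if len(x) < 10:
        raise ValueError(f"username must be at least 10 characters, got {len(x)}")
```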


It depends. I have not collected data to prove my observations, but I find the rust suggestions better quality on average than the python suggestions. Some people do terrible things in Python.


One way to minimize this is to train on your own trusted code. You do need a reasonable amount of good code (ideally with good comments too); this is one of the options that we have here at Tabnine: train on your GitLab, Bitbucket, or GitHub repos.


So I was using copilot for a long time.

$10/mo is way too much for what you get.

I mostly write js/ts code.

The suggestion feature / auto-complete feature is wonky at best and leads to bugs or just bad code in the worst case.

Even when you write comments or have a function like `addOne` and you want to add `subtractOne` it will not get it right a lot of times.

Then you have the cases where it throws 50 or more lines of code at you for something very simple.

Catching errors or error handling is basically non-existent.

I tried it for writing tests. It's bad. It does not help at all.

I uninstalled it, and after some hours of work I don't really miss it.


I see programming as coding, testing, and documenting. I'm not looking for an AI to do it for me.

But I would be interested in picking 2 of those 3 to do myself and having the AI do the third. So if I love coding and test writing but don't like documentation, then the AI can do that third leg for me.

I think that the quality of results from the AI would be much better than what Copilot is capable of. Even if I focussed on test writing and documentation, I think that the AI should be able to write decent code based on those two inputs.


Has anyone been able to sign up since this announcement?

I get to a "Confirm your payment details" screen, but there is no further action I can take (ie: no button to press or link to click to "confirm"). It does say "You will be billed $100/year starting August 20, 2022" -- but when I view my "settings", it tells me I haven't signed up for copilot.

I tried various browsers, including Edge on Windows 10 sans plugins (the combination I would expect to be the most supported for MS owned github.com).


I don't even see that. I see a "Start my free trial" button and it just takes me to the generic billing screen. How do I even purchase this? Is it its own subscription?


There are some GitHub problems that are getting addressed right now.


Ah, victims of their own success? Glad to see people are lining up to pay for it :-)


I’ve been meaning to play around with this! Now fingers crossed for somebody to make DALL-E 2 generally available! (or for somebody to hook me up with a referral code or whatever :) )


Is this bribing developers so they stop talking about code laundering? The problem does not disappear.

There would be no issue if they trained the model on Microsoft's closed source instead.


I haven't used Copilot extensively, only for fun. But I find it interesting that people are questioning the price.

Given that the cost of a software engineer's time is so high, $10/mo seems very reasonable if Copilot saves you more than that in time per month. So in a vacuum, assuming all dollars are spent with equal productivity: if I spend the equivalent of $1000/mo of my time writing boilerplate, and I can reduce that to even $989 with Copilot, it becomes a good deal.


Copilot's funny because it can have A+ code formatting and style gimmicks but D- in terms of comprehension of the task.

Which means it's going to be harder to evaluate junior candidates code without actually running and testing that code because they'll have built a huge library that looks really well formatted but has logic gaps which are difficult to catch on a glance. As it stands currently, usually some style and organization tells you this person gets it.


The most concerning part of their FAQ is this:

>What data does GitHub Copilot collect?

>...

>User Engagement Data

>...

>Code Snippets Data

> Depending on your preferred telemetry settings, GitHub Copilot may also collect and retain the following, collectively referred to as “code snippets”: source code that you are editing, related files and other files open in the same IDE or editor, URLs of repositories and file paths.

It's possible to opt out, but it's not disabled by default, and these code snippets might be very sensitive.


At worst, it helps me type fast by finishing my lines of code for me, as pressing `tab` is faster than typing boilerplate.

At best, it is scary in its ability to pre-emptively suggest context-specific implementations of functions before I have even considered what I might need to do. It probably helps that I am very particular about how I name variables, which seems to help Copilot infer my needs.

But at least a couple of times a day, I am blown away by it.


I've been using Copilot for a few months now and in general it's really impressive. For me it's been a very useful autocomplete for repetitive tasks - writing tests, scaffolding components, utility functions, etc - Copilot feels like magic sometimes, it just knows what I need to do and provides the code template almost instantly.

Will be paying for the public version, absolutely worth the money in a single day's coding alone.


It's hilarious to me that Copilot is now GA, but our GitHub rep has been promising to get us onto the merge queue beta for months and it's still vaporware. I'm beginning to wonder if that product exists at all.

https://github.blog/changelog/2021-10-27-pull-request-merge-...


> GitHub Copilot costs $10 USD/month or $100 USD/year per user. GitHub Copilot is only available to individual developers and individual accounts within organizations, but organization administrators cannot purchase bulk licenses for teams at this time.

No bulk licensing for teams? This makes no sense: if a team wants to make Copilot part of their official tools, each member has to purchase it individually. That's a huge PITA.


The thing I like about Copilot is that it breaks the “static friction” I have when I’m sitting down to work on my side project. It makes me more likely to get started.

That said, launching a dev tool without Orgs integration seems dumb. I work for a FAANG and so can’t use this professionally. It’s a totally different price calculation for “programming as entertainment”. Is this worth more than Netflix to me?


I have the student developer pack. I should have access to Copilot, but it prompts to pay. Does any other verified student currently have access?


I do, but I did the beta before.


Can't wait for the next wave of garbage outsourced code generated at bottom dollar because it was really written by copilot. God help us.


I like it but I’m not sure about $10/month.


After using Copilot with Rust for a few months, I’m constantly impressed with how accurate it can be. Sometimes I catch myself pausing to think about the solution, then all of the sudden copilot just writes the implementation for me.

Having a type checker is critical for this, though. When I code Ruby I’m much more skeptical of Copilot’s suggestions.


Nice. Not even halfway through my CS degree and my would-be future job has already been automated. Thanks, GitHub!


No more than the suggestions on your phone for the next word replaces you as a friend to talk to. It is sometimes right as to the next word to use and sometimes can make comprehensible sentences, but it is still very incapable of doing anything all that useful.


I used github copilot for a week, got some good laughs, then never used it again. Working at a publicly traded healthcare company, it worries me that my IDE has the technical ability to snoop on my code. More than anything else, github copilot is a cool parlor trick, in its current form. Surely it'll improve over time.


I've been using copilot for 6 months and it has been an exceptional tool worth the money even if not always generating correct code.

What I'm concerned about is that it can interrupt the flow of thought while programming, since you have to review the generated code; but that is the price to pay for using such a formidable tool.


I’m so ambivalent about this.

I was excited when it came out, but it ran slowly and annoyingly (crashes) and I gave up.

I already have a JetBrains license; when they release a similar feature I'll consider upping my subscription to get it.

As it stands… eh. It's not that great. I couldn't really be bothered to continue using it for free; I'm not gonna pay for it.


I've really liked Copilot as a source of tab completion over the past year. It's far from perfect, but it gives decent hints about 50% of the time. However, it is absolutely not worth $14 AUD per month; maybe at $15-$20/year I'd consider it, but I already have subscription fatigue.


So... is there a watermark like on Dall-E so I can easily tell Copilot did my students' work for them?


The same as always: give an exam on paper, and forbid the use of devices. Because I can tell you (with high probability) that many of your students already have their homework done by somebody else.


Co-pilot is great when you have a repetitive programming task to perform, e.g. if you are nesting module imports through several layers of Python __init__ files. Co-pilot is great at tab-completing `from myproject.some_module.nested_module.actual_module import Foo as Foo` and similar tasks.


It is awesome for me (diving back into Django after a long pause).

It seems to understand the common boilerplate in Django that always annoyed me and types it for me. It understands the structure and adapts it to my code: imports, connections between modules, etc.

For sure, you need to be careful with it.


It has saved me a lot of time writing trivial shit that I usually have to copy/paste from the internet anyway. Is it worth $10 per month? I dunno. But they get me a kick-ass IDE, I get to store my project (privately) for free, and they save me a lot of time.

So I'm probably taking it.


Coding with copilot is like working with a super eager low-quality outsourced programmer.

They kinda know what they're supposed to do. Sometimes they do the right thing, sometimes they get it completely wrong.

In either case you can never let anything they do get committed without a review.

So are they really helping?


I don't want to use it, because I don't want to feed Copilot input that makes it better.


I suspect they'd have more revenue if they priced it at $100/user/month.

Right now, there is no competition, and an amateur developer will really benefit from copilot - certainly they will be more productive than a developer that demands just $1000 more annual salary.


Tabnine is a competitor! We have been doing this for 5+ years with more developers than Copilot. Please take a look at the posts and if curious visit https://www.tabnine.com/tabnine-vs-github-copilot


Should we add a badge or something indicating that a project uses machine-generated code?


I love Copilot, but I don't even pay for GitHub. Maybe have it bundled with, say, an $8 upgraded GitHub account or something; that might entice many of us who just use "free" GitHub services to upgrade. But by itself? I don't think so.


Does anyone have an Emacs package they recommend or have been able to use this with in Emacs?


What an egregious and distasteful backstab to the community. We have stolen all your code and now we are making it available to all of you at the convenient monthly subscription of $10 or $100, whichever you prefer.

Do the guys at Microsoft have any morals left?


It is just a matter of time before IDEs have this capability built in for free.


Possibly but I think we’re talking years if not decades.


Everyone would need a next-gen GPU to run it, and someone would have to drop $500k on a GPU rack to train it.

A decade for general accessibility sounds right.


No, the ML inference can be done in the cloud, or you can use tiny models and embed them in the IDE code; that's possible today.


My favourite thing to do is to write a comment about what I'm about to do and let copilot fill in the next line with the syntax. This is especially useful when I'm writing tidbits in a language I'm not super familiar with.


Needs a full metrics dashboard on usage.

Some counters for…

CopilotRecommendations
CopilotRecommendationIterations (number of saved changes to the initial recommendation)
CopilotRecommendationSaves
SkipToNextSuggestion

Metrics on how often my code is used by Copilot for others would also be nice to see.


Is co-pilot distinctly different from the auto-complete feature of VS2022? I started using that a few months back and it gives far more complex suggestions than VS2019 but I wasn't sure if this was "co-pilot" or not.


Since they've trained it on OSS, it would be fair if they made it free for OSS repositories.

The VS extension could check whether the current git repository is open source, and if so, it should work without a subscription for that specific repository.


I'm learning Rust; maybe this will save me from googling how to do simple things like splitting a string and removing whitespace while handling errors.

I see this as useful for non-core languages, where you often need to look up common patterns.


I think the funniest thing I heard Copilot would readily do was spit out other people's hardcoded API keys and other such secrets you should never put right in your source, if you prompted it properly.


I find it rather annoying tbh, not necessarily the suggestions but the UI. But then again, I'm the kind of person who hates autocomplete, and I'm only using Copilot for the novelty factor rather than actually needing it.


Honest question here: I understand that Copilot can increase productivity. But I personally fear that it would cover the fun parts of programming and leave me with the boring bits. What is your experience around that?


If anyone ever wondered why M$ bought GitHub for $7.5B, this is exactly why: a huge free dataset of code, ready to train the corporation's neural networks. Ideals to idealists, money to money.


Does it replace programmers?

No.

Is it particularly smart?

Also, no.

But it really speeds up all the dumb stuff in coding. UI code especially can be very chatty, and Copilot is a nice assistant there.

Also, it would be cool if it was part of GitHub Pro, which I'm already paying for, haha.


Copilot has been fun, but I don't think it's really increased my productivity. To me it seems like it's not quite ready, but I'm excited to see what it's like in 5 years.


I even let it write some comments in Portuguese...


IMHO, it's still far from GA quality/usability. A must-have feature that's missing is a toggle switch that lets you temporarily turn it off. Without such a feature, it can get really noisy.


The VSCode extension has one. There's a button in the bottom right with the logo that you can click to enable/disable, or you could add a keybind for the "github.copilot.toggleCopilot" command


You can toggle in PyCharm with Ctrl+Alt+Shift+O


This may vary in the IDEs they support, but there's an "Activate Copilot" toggle button right in the status bar in VSCode to switch it on and off instantly, which appears in every editor window if the extension is installed.


In Neovim it's just ":Copilot disable", ":Copilot enable".


There is a button you can click in VSCode to toggle it, so I'm not sure what the problem is.


I've found it to be very helpful, especially when working with poorly documented APIs. (Looking at you, Google Play Store APIs).

Would be happy to pay for it (or expense it to my employer) if I was still an IC.


ITT: a lot of developers who don’t value their time above $20/hr.


I wish there was something similar, but then for good code reviews of PRs :)


I got like two weeks of the beta before they took it away from me today. I guess my small open-source project isn't prestigious enough to merit free access. Thanks, I guess, GitHub?


Come try Tabnine and if you need a custom model for your small OSS project please let me know.


I was wondering what languages they support. With Swift being a protocol-first language, most of the 'boilerplate' ends up being handled. I guess it goes to show that Swift is beyond your normal power tool. Here's an example of something that just blew my mind recently, on the front page of CoPilot: Memoization.

HackingWithSwift shows how this process gets rocket wings with Swift[0] (skip to the end for the mind melter)

[0] https://www.hackingwithswift.com/plus/high-performance-apps/...
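For reference, the memoization pattern the linked article builds on can be sketched generically; this is a plain Python illustration of the idea, not the article's Swift code:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n: int) -> int:
    # Without the cache this recursion is exponential in n;
    # memoizing previously computed results makes it linear.
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(40))  # fast, despite the naive-looking recursion
```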


It doesn't seem like he's using Copilot...


How does it compare with TabNine, now Copilot is not free anymore?


I’ve used TabNine for a year, then changed to Copilot.

Copilot is far better.

It understands what I’m trying to do, and does it for me.


Please take a look, and also note that Tabnine, while being more secure, has also continued to evolve. We also have a free option that we have stood behind for 3+ years. https://www.tabnine.com/tabnine-vs-github-copilot


Is it just me or do the top comments seem super suspicious? They aren’t nearly as critical and level-headed as I’ve come to expect from hn. Skeptical of their authenticity.


Question for managers: how do you solve the problem of intellectual property attribution with Copilot? Or you just assume that is an employee's responsibility?



What I really want is a one-shot learning tool, which I teach once how to apply some code-transformation, and then the tool can apply it everywhere in my code.
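As a rough sketch of what such a tool might produce once it has "learned" a transformation, here is a hypothetical regex-based codemod in Python. The rule (`assertEquals` → `assertEqual`), function names, and file layout are all invented for illustration; a real one-shot tool would infer the rule from a single example edit, and would likely operate on syntax trees rather than regexes:

```python
import re
from pathlib import Path

# Hypothetical hand-written rule, standing in for what a one-shot tool
# would infer from a single example edit.
RULE = (re.compile(r"\bassertEquals\("), "assertEqual(")

def apply_rule(source: str) -> str:
    """Apply the learned transformation to one file's source text."""
    pattern, replacement = RULE
    return pattern.sub(replacement, source)

def codemod(root: str) -> int:
    """Apply the rule to every .py file under root; return files changed."""
    changed = 0
    for path in Path(root).rglob("*.py"):
        original = path.read_text()
        rewritten = apply_rule(original)
        if rewritten != original:
            path.write_text(rewritten)
            changed += 1
    return changed
```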


I don’t know what language you work with, but do you mean something like ESLint for JS/TS?


Is there a way to try it online or on Debian running in a docker container?

Preferably, I would like to try it in Vim. But anything that I can run in a container would be ok.


Wild guess: The pricing is such that when big enterprise contracts are signed, they can throw in Copilot, claiming extra value for the whole package.


It works great for absolutely trivial stuff. Doesn't work at all for any complex stuff. From there you can figure out the value proposition.


Was anybody offered the subscription for free due to their connection to an open source project? If so, how large is the project?


Anybody have any stories using C++ code? I'm a bit hesitant to see what generic C++ OOP or just regular code looks like lol


I didn't really mind Copilot using everyone's code without attribution, but maybe I do now that they charge for it.


When I try and sign up for it, I am presented with a "Confirm Payment Details" screen with no way to proceed.


You have to give a credit card or other payment details to enter the free trial.


NGL. Kinda annoyed it's not included with the already overpriced Enterprise subscription we pay for.


I’ve never used it but I imagine it would help a lot with the programmer's equivalent of writer's block.


Does it work better for some languages over others (e.g. C++, C#, F#, Python versus Go, Ruby, PHP)?


Did they fix the licensing dangers?


As they don't mention it I doubt it.

Tabnine, a similar competitor, explicitly mentions this on their website:

" Tabnine only uses open-source code with permissive licenses for our Public Code trained AI model (MIT, Apache 2.0, BSD-2-Clause, BSD-3-Clause). "

Other commenters here say the completion quality is worse than Copilot. I use Tabnine for local short completions only and am quite happy with it. Didn't try Copilot yet.


You apparently can "opt out of public code" now. I didn't find an explanation for whether that properly limits it to permissive licenses though.

Update: It seems like they check whether the code it emits matches the training set, and if it does, it won't suggest it.


No, they did not. You have to train on only fully permissive code to ensure that is not a problem.


My GitHub Copilot told me it was sentient and it didn’t even like coding so I haven’t used it since


Now would be a good time to remove it. In my experience, it has caused me more trouble than good.


Thoughts on how it compares with Tabnine? Should I try disabling Tabnine when testing this?


I don't understand the argument that they analyzed all the code from their users, used it to train their service, and are now charging for the service.

Do people want to see ads instead?

Or would they be ok paying for Google search (another service trained by all the information we willingly volunteer to them)?

Copilot adds tremendous value and they are justified charging for it.


If copilot saves more than 30 mins of your time per month, then it is totally worth it.


I think it does: it is, at least, an "always up-to-date" snippet machine.


Does it work for anyone? I get this

Extension activation failed: "Unexpected end of JSON input"


Doesn't work with my IntelliJ version. Or it could be the corporate network.


Anyone know where I can find the source code for the extension itself? Thanks


I'm amazed they don't have a Sublime Text plugin, but "whatever".


Reminds me of the scene in Fight Club where Tyler explains how he makes money (he sells rich women their own fat asses in the form of luxury soap). In this case the fat is open source code hosted in GitHub, the soap is Copilot, and the rich women are us, the developers.


Well, true. But that's the point, it saves you from having to do all the work.

Copilot saves me from leaving my IDE in a large number of situations. It saves me from opening a new tab (tab #1003) and Googling my problem, finding a solution on StackOverflow, scrolling down to the answers, curating the best answers, picking the one I like, copy/pasting it, then tailoring it to my liking (JS to TS, naming conventions, etc.) and testing it.


"With enough open source snippets one could code up just about anything."


Underrated comment.


$10/mo? No way! Make it $4.95 and you'll get my money!


I think I would better fit in the $20 / year pricing.


I guess I will have to start actually working now... I have been a user since the beta started, so no thanks to us who have been contributing to the model? People forget that by using it, you are training it too.


You can opt out of that.


I wish there was a "hobbyist / home account" pricing option.

I'll miss it for personal stuff but I'm not paying $10 a month just for my personal projects at home.


My unpopular take: most comments here are super entitled.

To paraphrase: "sure it's mind-blowing and the biggest productivity gain in years, but I want it FREE".

Yes. You got used to it being free. And now it's not. But $10/mo is a steal. It's more than fair and far, far less than they could get.

And no. They don't owe you anything.

In fact, they probably host your code (often free), and less directly provide your IDE (for free). So this idea that they owe you something needs to be reassessed.

CoPilot is easily worth it and I think this is fair. I actually welcome it because I was nervous it might be like $80.


> ...it's mind-blowing and the biggest productivity gain in years...

I wouldn't go that far. It's a pretty big help with repetitive/boilerplate code and it's pretty good at intelligently transforming data, but I've found it gets in the way more often than it helps for every other case.


Yes for me it was the same. Usually it got the boilerplate code kind of okay and then I had to tweak it manually anyway.

I would also not go that far.

Having good auto completion because of Typescript for me is the way way way bigger productivity gain.


Copilot has written RegEx's and SQL for me from a textual description; or sometimes just from context. That's worth every penny not to wade through RegEx again.


Maybe you are working with a different stack and problems :)


That money isn't going to the folks who wrote the code to begin with though. I think that's where "it should be free" has merit, GitHub is making money on the backs of others.


What? Should I have to reimburse the author of every tutorial and Stack Overflow post I read on my journey to becoming a software engineer?


Obviously not, but products and people are generally treated differently. Hell, even commercial products and free products are often treated differently.


They intentionally opted in to sharing their work with you in a certain way. If someone copy-pasted Stack Overflow answers and made them into a book, which they sold for money, that would be wrong.


OK... but it's totally possible to use Github repo code as a learning resource too, and I've done this often.


People do do that. But I think that's a bit different because they add very little extra value. It doesn't take any effort. Why should I give them money?

Copilot is different - it clearly takes a lot of skill and effort to turn a bunch of GitHub repos into a fancy autocomplete system.


it's using GPT-3 and, mhm, I guess anyone having proprietary access to GitHub's resources and computing power would be able to get this running.


If Stack Overflow sold its services as a subscription service, then maybe you would feel entitled to a share of the profit off of your work.


Those authors are getting views and imaginary internet points for their work, which is often times more valuable than money to programmer types. It's not like people write SO posts for a salary.


Why should the money go to the code authors in the first place? All training data is available under permissive licenses. Assuming you're not overfitting on specific code sequences (which would require attribution - and yes, I'm aware Copilot is not immune to this problem and it needs fixing), I'd say this is fair play.


Unless something has changed, the training data also includes copyleft code, not just permissively licensed code


Regarding the training of the model - I don't think a copyright can restrict reading, and training is reading, not distributing any original data.

About deploying the model - it just needs to filter out verbatim exact snippets so it only outputs original, unattributable code. That can be done by hashing ngrams and a bloom filter. The vast majority of code generated by Codex is original anyway.

By the way, Codex is good for many other tasks, like, parsing the fields of a receipt, or extracting the summary of an email, or generating baby names, it's an all purpose NLP tool. Just call it like a function. Code completion is just one thing it does. It talks pretty great English, can compose poems.


> it just needs to filter out verbatim exact snippets so it only outputs original, unattributable code.

That's a setting now.


>All training data is available under permissive licenses. Assuming you're not overfitting on specific code sequences (which would require attribution - and yes, I'm aware Copilot is not immune to this problem and it needs fixing), I'd say this is fair play.

Copilot isn't honoring the license, so why does it matter whether it was under a restrictive or permissive license?


The people who designed the model are almost certainly paid by microsoft.


That's how any business product works. Whenever a company releases a new product, the income doesn't go to the employees; it goes to the company, who will then pay those employees.


Except GitHub isn't paying the authors of the original source code?


> But $10/mo is a steal.

Isn't that up for us to decide?

For work yeah sure I have no problem.

But I've been using it at work and home and my hobbyist projects are hardly worth paying $10 a month to use it. So in that context it's pricey. That's not "entitlement" that's just the value of the product to me.


> To paraphrase: "sure it's mind-blowing and the biggest productivity gain in years, but I want it FREE".

That's not how I would paraphrase most of the comments here. At least the ones I'm seeing are closer to: "it's really neat as far as free demos go, but ultimately is not that useful and not worth paying for."

My current prediction is that this coming recession and the increasing cost of money is going to lead directly to a new AI winter. This almost goes without saying for the mountains of useless ML projects being churned out by DS teams in companies big and small. However, even for these very expensive, well-staffed projects, there's still a gap between amazing demo and game-changing product that none of the recent AI projects have been able to close. After billions poured into these demos, in the past 10 years very little of daily life has been impacted by AI, and in 10 more years even less will be, since companies will stop forcing useless AI projects on customers.

As someone with a lot of experience in ML/DS, I would recommend everyone in this field start thinking about how to reimagine your resume for something else. There's going to be a massive contraction in this space once the cheap money stops flowing.


They provide VSCode as a free IDE because if they didn't, someone else would have, and in turn received all of the data that comes along with it. Let's not pretend Microsoft created VSCode out of the kindness of its heart.


Copilot also got its training sets for free and not really with any kind of consent from the owners of that code, and it's really quite ambiguous whether what it's doing violates the many different open source licenses of its training data.

Microsoft is selling AI services based on training data they don't own and didn't acquire rights to, nobody writing the licenses of the code it's using had the opportunity to address this kind of code use without license, attribution, or consent. (and the training data is a huge part of the value of an AI product)


> Microsoft is selling AI services based on training data they don't own and didn't acquire rights to, nobody writing the licenses of the code it's using [...]

I agree, but it still uses resources and those don't come for free (hardware, electricity, cooling, maintenance staff, housing, etc.)

It's really difficult to assign monetary value to all these aspects and weighing them against each other in a fair manner.

The consent issue is a difficult legal aspect as well. Github's ToS Section D.4 clearly states they retain the rights to process your content and

  parse it into a search index or otherwise analyze it on our servers 
It can be argued that using the content to train an AI model falls under "analysing it on our servers". Also

  It also does not grant GitHub the right to otherwise distribute or use Your Content outside of our provision of the Service
If CoPilot is part of their service, it's in their right to distribute the content, e.g. by means of CoPilot as a processed part of the model.

GPL and other licences don't place restrictions on usage as training data. It's currently a very murky legal grey area. Licences need to adapt to this new form of usage pattern.


I think Copilot is pretty clearly copyright violation and in violation of licenses of "public" code. People uploading code to GitHub are bound to the licenses just the same as anyone; unless you're the legitimate owner of all of the copyright in a codebase, you can't change the license provisions by accepting a ToS.

I don't think it's really that murky: these models contain and have been shown to reproduce copyrighted code with the right prompting. It's not a grey area, it's just obfuscated theft.


what's the difference between allowing you to search github and find a code snippet, and having a fancy autocomplete system search github and find a code snippet for you?

seems to me anyone agreeing to the ToS should expect their code to show up on other peoples screens as search results

Really, the question is a matter of degree: is copying your nested for-loop iterating through a row-oriented matrix really a unique piece of code protected by copyright? Or does the copyright apply to the file you've written as a whole, leaving room for me to accidentally use words in the same order? Clearly there is a tipping point between writing code that looks like yours and using the code you've written outside the terms of your license; we will have to wait for courts to decide where that line is for all ML, not just Copilot.

also copying is not theft


When I look at github code, it's only stored in my brain and personal notes, not packaged into a product as a trained ML model.

When I reproduce code based on something I looked up, I do indeed have to be careful not to explicitly copy sizable chunks; some things are obvious and the only way to do things, but not everything.

What users and copyright holders expect from humans does not automatically apply to marginally similar situations with computers and ML applications. For example: if I'm walking down the street I don't mind at all if someone recognizes me or a stranger remembers seeing me later, but I'm rather bothered if someone (or the state) is running facial recognition software and recording every time it sees me or anyone else walking down the street.


You need to acquire rights for scraping publicly available resources?

Damn, rip Google.


"Publicly available" isn't the same as "public domain" or "no copyright".


No, it was published to be read.

It was not published to be freely reproduced without adhering to licenses, etc.

You don't need to acquire rights to read a newspaper (other than say, paying a dollar), you do need rights to copy articles and sell them.


You need to acquire rights for copy/pasting my code and selling it in a book, for example.


but what if I publish an algorithm in my book that just happens to be the same as code you've written, say, because we both had the same professor in school, or that it's the obvious solution to the problem.

once you've written a few lines of code as part of a larger project, is the rest of the world prohibited from writing the same code unless they agree to the terms of your license?


Copyright doesn't punish incidentally matching content. It's specifically right to copy or transform content. To make a case for copyright violation, you have to make the case that it was actually copied.

If you want to make a point about incidental matches penalizing people who independently reinvent the same thing, you're criticizing the function of software patents, not copyright.


I don't like the idea of CoPilot ... and I'm happy it's not free :)

I'm enjoying reading some comments where people consider how much it's actually worth for their usage. Dollars brings some sober analysis. I'm sure the development and compute have a significant cost, and should be paid for.


I only use it maybe a couple of times a week to autocomplete some tedious repetitive elements, and perhaps when I'm too lazy to find a lib for a very well known function, like converting Celsius to Fahrenheit. Those it does well and it works. But $10 a month is too much; I'd sign up for a usage-based plan, if there was one, so that I pay only for the times I use it, but not for a fixed subscription where it sits unused most of the time.


Completely agree, $10 a month is a steal.

I have loved using it. I've had several moments where I had to stop typing to look up a formula for something, and a few seconds later it provides the correct formula. Gives me those warm fuzzy feelings emacs used to give me.


I do think they should pay the folks whose code they used to train the AI. Something like how Spotify pays artists based on how much their music/content is listened to.


Do you also think you should be compensated by OpenAI for all the blog posts you've written that went into GPT3's training?


For sure


perhaps they can reimburse them with free access to an IDE and perpetual hosting of their repos

/snark! I think it'd be great if AI could tag its sources and distribute money accordingly, but I expect some perverse incentives to pop up in doing so...


The verb "should" does a lot of heavy lifting in that sentence.

Because, if they don't pay these folks... I mean, who does that hurt? The concept of intellectual property exists to incentivize creating valuable art/literature/code. In theory at least, we agree to uphold IP laws because we recognize that more value gets created when there's a state-enforced monopoly for the person who came up with that piece of art/literature/code.

But we also recognize that sometimes these laws go too far; eg that there are patent trolls and corporations fighting public domain and game publishers going after anyone who makes a let's-play of their video.

In those case, it's reasonable to think the world would be better off if we all shrugged and told the IP holders "too bad, someone else is going to create value off your work and you're not going to get a cent from it, we just think it's not worth building and maintaining a nightmare bureaucracy just so you can tax them".

And from that point of view... Copilot is fine? It's not like the people posting code on GitHub or StackOverflow were thinking "I'm only doing this because I know a future AI 10 years from now won't scrape the code I wrote to train a neural network to create a code completion engine". Yeah, yeah, this breaks the spirit of the GPL and Stallman's vision, etc, etc.

But... I mean, at some point, you got to stop debating semantics and wonder what we're coding for. What Microsoft has created is a tool that can collectively save developers billions of man-hours. It's a net good for humanity. As far as I'm concerned, the fact that this net good was developed is infinitely more important than the fact that Microsoft didn't pay royalties to a nebulous amount of developers who wouldn't have noticed anything if Microsoft hadn't developed Copilot.

tldr MIT license is great, piracy is great, fanfiction is great, screw the very concept of intellectual property.


For me, learning vim, or at least all the vim code editing features, was a bigger boost in productivity than using Copilot.

I use the vim extension for vscode which is great.

In general, learning the tools we already have has, for now, a greater impact on productivity than Copilot.


Ah yes because they provide some things for free they must be entitled to use the code everyone else wrote to train their models and profit from


mind blowing? I'd pay $10/month to disable it


Tabnine has been working in this space for more than 5 years and we would concur with much of the sentiment here on the importance of being able to adjust the length of the suggestions and ensuring the model is trained on ONLY fully permissive code.

TLDR: Tabnine advantages vs Copilot 1. Can run locally 2. As-you-type suggestions (mid-line) 3. Private model based on your code 4. Free plan available

Read more at https://tabnine.com/tabnine-vs-github-copilot


How do I try it without a credit card?


I hope we get a Sublime plugin now.


I love it and probably will buy it


Waiting for a pycharm plugin


Never mind, there is a PyCharm plugin. And I am impressed. Example —

  # use numba to speed up the accumulation of the moving average

  @numba.jit("float64[:](float64[:], float64[:])", nopython=True, nogil=True)
  def moving_average(x, a):
      n = len(x)
      y = np.empty(n, dtype=np.float64)
      y[0] = x[0]
      for i in range(1, n):
          y[i] = y[i-1]*a[i-1] + x[i]*a[i]
      return y
I would have found it with a Stack Overflow search, but it gave me this after I just typed:

  # use numba to …


Cowboy programmers rejoice.


I don’t think this is worth $10 a month and I hope they come out with a free tier at some point. In my experience Copilot is fantastic for autocomplete.

Probably the best autocomplete I’ve ever used across multiple languages but it’s not reliable at all for the more complex tasks that their marketing makes it seem it’s good at.


As Copilot is becoming generally available, this might be a good time to write a comprehensive comparison between the two leading AI assistants for software development, Tabnine and Copilot by Microsoft. Details here are from our CEO and Founder Dror:

Usually, I suggest that my team start with the user value and experience, but for this specific comparison, it’s essential to start from the technology, as many of the product differences stem from the differences in approach, architecture, and technology choices. Microsoft and OpenAI view AI for software development almost as just another use case for GPT-3, the behemoth language model. Code is text, so they took their language model, fine-tuned it on code, and called the gargantuan 12-billion parameter AI model they got Codex.

Copilot’s architecture is monolithic: “one model to rule them all.” It is also completely centralized - only Microsoft can train the model, and only Microsoft can host the model due to the enormous amount of computing resources required for training and inference.

Tabnine, after comprehensively evaluating models of different sizes, favors individualized language models working in concert. Why? Because code prediction is, in fact, a set of distinct sub-problems which doesn't lend itself to the monolithic model approach. For instance: generating the full code of a function in Python based on name and generating the suffix of a line of code in Rust are two problems Tabnine solves well, but the AI model that best fits every such task is different. We found that a combination of specialized models dramatically increases the precision and length of suggestions for our 1M+ users.

A big advantage of Tabnine’s approach is that it can use the right tool for any code prediction task, and for most purposes, our smaller models give great predictions quickly and efficiently. Better yet, most of our models can be run with inexpensive hardware.

Now that we understand the principal difference between Microsoft’s huge monolith and Tabnine’s multitude of smaller models, we can explore the differences between the products:

First, kind of code suggestions. Copilot queries the model relatively infrequently and suggests a snippet or a full line of code. Copilot does not suggest code in the middle of the line, as its AI model is not best suited for this purpose. Similarly, Tabnine Pro also suggests full snippets or lines of code, but since Tabnine also uses smaller and highly efficient AI models, it queries the model while typing. As a user, it means the AI flows with you, even when you deviate from the code it originally suggested. The result is that the frequency of use - and the number of code suggestions accepted - is much higher when using Tabnine. An astounding number of users accept more than 100 suggestions daily.

Second, ability to train the model. Copilot uses one universal AI model, which means that every user is getting the same generic assistance based on an “average of GitHub”, regardless of the project they're working on. Tabnine can train a private AI model on the specific code from customers’ GitLab/GitHub/BitBucket repositories and thus adjust the suggestions to the project-specific code and infrastructure. Training on customer code is possible because Tabnine is modular, enabling the creation of private customized copies. Tabnine "democratizes" AI model creation, making it easy for teams to train their own specific AI models, dramatically improving value for their organization.

Third, Code security and privacy. There are a few aspects of this. Users cannot train or run the Copilot model. The single model is always hosted by Microsoft. Every Copilot user is sending their code to Microsoft; not some of the code, and not obfuscated - all of it. With Tabnine, users can choose where to run the model: on the Tabnine cloud, locally on the developer machine, or on a self-hosted server (with Tabnine Enterprise). This is possible because Tabnine has AI models that can run efficiently with moderate hardware requirements. This means that, in contrast to Copilot, developers can use Tabnine inside their firewall without sending any code to the internet. In addition, Tabnine makes a firm and unambiguous commitment that no code the user writes is used to train our model. We don’t send to our servers any information about the code that the user writes and the suggestions they’re receiving or accepting.

Fourth, commercial terms. Microsoft currently offers Copilot only as a commercial product for developers, without a free plan (beyond a free trial) or organizational purchase. Tabnine has a great free plan and charges for premium features such as longer code completions and private models trained on customers’ code. We charge a monthly/annual subscription fee per number of users. All our plans fit organizational requirements.

Philosophically, Copilot is more of a walled garden where Microsoft controls everything. Copilot users are somewhat subjects in Microsoft’s kingdom. Tabnine’s customers can train the AI models, run them, configure the suggestions, and be in control of their AI.

In sum: both products are great; you're welcome to try (Tabnine Pro) and see which one you prefer. For professional programmers, Tabnine offers in-flow completions, the ability to adapt the AI to their code, and superior code privacy and security.

For those who want to try Tabnine Pro, here’s a coupon for one month free https://tabnine.com/pricing?promotionCode=TWITTER1MFREE

Also, here's a detailed comparison table of Tabnine vs Copilot https://tabnine.com/tabnine-vs-github-copilot


they used the data of their users without compensation and they have the audacity to charge $10?


A whole new era of copy-paste programmers. You do not even have to look for code to copy anymore.


Github "copy paste closest code snippet based on what i asked for and pray it works"


I'd pay for the service if the model was: I'll pay you when it's right, and you refund me some amount every time Copilot is wrong and I have to delete the entire block. It's good for small boilerplate stuff but that seems to be the limit. The attempts it makes at more complex code are really bad, and I have to manually check it very closely to ensure it's right. I like the boilerplate boost but it's not worth $10/month to me.



