I’ve never understood the value proposition for Copilot.
In terms of difficulty, writing code is maybe on average a two out of ten.
On average, maintaining code you wrote recently is probably a three out of ten in terms of difficulty, and maintaining code somebody else wrote or code from a long time ago probably rises to around a five out of ten.
Debugging misbehaving code is probably a seven out of ten or higher.
GitHub Copilot is optimising the part of the process that was already the easiest, and makes the other parts harder because it moves you from the “I wrote this” path to the “somebody else wrote this” path.
Even during the initial write, it changes the writing process from programming (which is easy) to understanding somebody else’s code to ensure that it’s right before accepting the suggestion (which is much less easy). I just don’t understand how this is a net time/energy savings?
Did you try it? Because I've been using it for weeks and it makes me read these types of comments as "I don't understand the value of the internet" or "what's the purpose of owning a phone".
It's night and day if you have it enabled or not. There's just no question about the value proposition once you start using it.
I mean, you can tell comments here from people who have actually been using it, and people who have not tried it.
> Because I've been using it for weeks and it makes me read these types of comments as "I don't understand the value of the internet" or "what's the purpose of owning a phone".
And yet apart from making this very inaccurate comparison you haven't made any argument for why such a thing as Copilot would be useful to anyone. How do you personally find Copilot useful? And why do you think someone whose job demands more than copy/pasting boilerplate code should try Copilot? The onus is on you to convince the skeptics.
I think the current Copilot is less useful for a person who knows the language and libraries inside out.
Right now I see more use for people who understand the language and libraries, but are frequently Googling "how do I do xyz in P" (because they can't recall certain things).
If you mainly deal with 1-3 languages on a daily basis that you have mastered, you don't routinely search for "How do I do xyz in P". Maybe if you're a junior or intermediate developer, or have a poor memory. But doing that frequently is a clear indicator that mastery has yet to be achieved.
It's not wrong or bad to search for help, but it doesn't indicate mastery of the language you're using.
Knowing a language or a tool doesn't mean you will always know the best or the smartest way to do something. This is not necessarily a test of your programming ability. And best practice is ever changing: almost every language or tool keeps changing and improving, so best practice keeps evolving with it.
Secondly you don't necessarily need to know or master a language or tool for every kind of work. You can just choose to learn as you go along with it, in which case knowing how to search and use the most effective way to do something is very useful.
Look, I don't care at all if you use copilot; you can use notepad to write your code if that floats your boat; do whatever you want.
What the parent post said is: Copilot is useful; it helps you write code with autocomplete suggestions.
If you think that you don't get productivity gains from an IDE or you're in the 'no IDE makes you more hardcore and better programmer so never program with an IDE' camp, we just have to agree to disagree.
So, I have no interest in that conversation.
...
However, there is a more interesting conversation here we can have:
Given that you have an IDE and use autocomplete:
1) Does Copilot give suggestions that are meaningfully useful?
Yes. I honestly can't give you a better answer than that. Yes, it does, it's quite good.
If you don't believe me, try it.
2) Is it better than regular autocomplete?
Look, forget the 'Look, I typed 'process user form and display on UI' and it autocompleted a whole application for me!!' hype.
That's stupid and that's not how it works.
It's an autocomplete. It can autocomplete large chunks, but they're generally rubbish. ...but it does two very interesting things:
- It suggests things that are contextually correct.
For example: even though it's rubbish at C++ syntax, it generates valid UPROPERTY and UFUNCTION blocks for Unreal code. If I write a `for (y = 0;...` on an array, it generates the associated `for (x = 0;...` when I'm iterating over a 2D array.
Sure, it's a similar pattern to other code in related files, but still. This surprised me. I've never encountered an autocomplete that does that before.
Sometimes the suggestions don't make sense, and the larger the chunk, the less sense they make.
...but the suggestions that do make sense, make you regret not having it when you don't have it.
Like... regular autocomplete.
It's just a tool; it works very well at small scale autocomplete tasks.
- It can suggest comments from code as well as code from comments.
Literally, I can go above a function and type "/*" and it'll suggest a comment.
These don't always make sense, but often they're pretty close, and it saves me 20 seconds typing.
You have to carefully read these comments, because they tend to get random crap in them, but once again, for short comments it's not bad.
Again... it's surprisingly good. Not perfect. It doesn't write your comments for you... but, it's easy to get into the habit of getting it to autocomplete "Returns null if the object is invalid" for you instead of typing it out.
3) Should I use it?
Look, I literally do not care if you do or don't.
What I take issue with is people saying 'it has no value'.
Does autocomplete have a value? Then this has a value.
Saying it has no value is just trolling.
Is it worth the cost?
Well, it's free to use right now.... so, well, you can't beat free right? :)
Longer term, would I pay for it?
Probably (** Personal opinion only: Maybe... you should try it and decide for yourself?)
I'm probably going to use it, but isn't it valuable to be a bit skeptical and question what the long-term effects may be?
Over the past decades we went from snail mail to email to instant messaging, each step made it easier to write text to a person. Today, we are writing so many messages to each other that people have started arguing for less instant messaging and less email. Mainly because distractions and frequent context shifts allegedly reduce productivity and happiness.
With Copilot, we have a similar evolution where writing code becomes easier. Could this result in people writing more and more code since it is a smaller effort? What would this do to the developer ecosystem in the long term? Maybe code reviews will take longer because there is more code and because it is more likely for junior developers to introduce bugs using Copilot code. Maybe this results in more bugs slipping through code reviews and into production and eventually lower productivity and happiness since more time is spent stressfully fixing production errors. I can't predict the future, but I do think it's valuable to ask these questions before it's too late.
> If you use GitHub Copilot, the GitHub Copilot extension/plugin will collect usage information about events generated by interacting with the integrated development environment (IDE). These events include GitHub Copilot performance, features used, and suggestions accepted, modified and accepted, or dismissed. This information may include personal data, including your User Personal Information
It depends on semantics and your interpretation of value.
In the eyes of most people free == 'something that i don't have to pay for through my bank accounts or other means', as opposed to caring about analytics, telemetry etc.
At least AFAIK that's the common usage and what almost everyone means, though it's definitely worth it to talk in more detail about what hides under that term most of the time!
That sounds about as useful as Tesla's "auto-pilot" - good when it works but you have to always pay attention that it's not trying to kill you (or your code in this case).
A bad proposition for most people, because you can't trust it.
The whole trust thing is an interesting topic. I thought the same thing but then I got Open Pilot. I would feel comfortable falling asleep with that on at this point. It takes time though. Around 400 miles for me before I "fully" trust.
First, why would I need to argue this? Just try it and see if it’s useful for you. I know people who don’t use IDEs or don’t use syntax highlighting and it doesn’t seem to bother them much, so who knows.
Second, I’ve actually answered elsewhere in this thread.
You can sometimes just tap enter, inside a model or another file (in a framework like Laravel for example), and it'll literally guess the entire function. At most, if you just make a comment about what the function should do, or name the function something sensible, it'll get the whole function maybe 58% of the time, and when it's wrong there are other options to choose from and one might get you 90% of the way to the end goal, needing just a few modifications.
I've used kite, and tabnine and loved tabnine, but this is something different and way more like magic. I can't explain how, it just feels like it's reading my brain as I type..
I've only used it a bit and it's like glorified autocompletion. It messes up a lot in some ways too, often suggesting getFoo when in Kotlin we just use foo.
It is really fun to see it sort of understand your code though, and every once in a while it does a smart suggestion that could've taken a while for me to figure out myself.
It's certainly a good autocomplete but it can't be used for anything more complex because you just can't trust it. Every now and then it produces an entire function filled with complexity and I think "This looks right but I would need to independently verify everything" and at that point it's easier to write from scratch.
It only spits out functions at most 10 lines long. There is very little that could actually be copyrighted in a 10 line block. And most of those suggestions are for incredibly generic problems like finding the distance between two coords.
This is a cloud-based code suggestion platform. No corporation with a solid secrecy policy will allow you to use it. For private use, it costs money, so I prefer just learning the field. What precisely convinced you?
I don't see it either. The context switch between being in the zone/flow and writing the exact code I'm thinking of to suddenly reviewing blocks of foreign and quite possibly wrong code seems like a net negative value proposition. I can't even get autocorrect on my phone to do the right thing half the time.
Writing code is easy. Architecture, refactoring, and solving business problems are the hard parts of the job.
Writing new code is also generally the most rewarding aspect of the job. Co-pilot promises to turn that into just another unrewarding chore, like slinging 3rd party libraries together.
There is some code in your life that you wish someone else could write for you.
Because such code is dumb, tedious and joyless. I have to bite my lip sometimes to convince myself that writing it is not a waste of time, because people demand it, but I hate it to my core.
Copilot is that unfortunate boy that has to do all that manual work. It is the ultimate code boilerplate mixer.
It is not going to write all the code for you; if your goal is to have it THINK for you then you are due for disappointment. But it would be able to help you be a more efficient and slightly happier programmer.
Copilot also optimizes for speed to a degree. It's akin to advanced autocomplete. IntelliJ auto-completion is great. As much as it pains me to say this, I don't think I will be as effective writing Java in Vim as I am with IntelliJ. The key differentiator is the autocomplete speed. Copilot I feel is just autocomplete on steroids. It may not be perfect yet, but there is definitely a problem it solves.
Have you used it? My experience was quite atrocious. Copilot is not auto complete. It’s nonsense. I attempted to use it continuously for three weeks. I tried because I know someone who built it and I wanted to give them the benefit of the doubt.
It never prompted me with any code that was useful. It only ever slowed me down and caused me frustration. It’s nothing like Intellisense. It’s just trash.
I pretty much use it every day at this point and I notice when it is disabled.
> It’s just trash
I have a hard time relating to this kind of experience considering how useful it has been for me. What language are you writing in btw? When I use it for OCaml it's not that useful, perhaps because there isn't as much OCaml code to learn from :D
Could you cite an actual specific example? I have a difficult time believing your assertion that it provided absolutely no value at any point - it just sounds like a baseless ad hominem attack.
I find that it's fantastic for TypeScript and JavaScript, allowing me to flesh out basic data object containers, class definitions, etc. extremely quickly.
If I don't remember the exact parameters that you have to pass into a certain npm package's methods, it will usually auto prompt me and help me complete the call without me having to context switch to a browser and look it up.
TLDR; the subtleness of its wrongness destroys my ability to follow my train of thought. I always had to take myself out of my train of thought and evaluate the correctness of the suggestion instead of just writing.
For simple things, intellisense, autocomplete and snippets are far more effective.
For anything more complex, I already know what I want to write.
When I'm writing Angular code, it often fills in the correct boilerplate code, and is especially helpful when writing unit tests. I'm also quite surprised when it autocompletes various filter functions.
It isn't perfect, but it's been helpful filling in the mundane, simple stuff.
Sorry to tell you, but if you’re constantly, or even regularly, writing this much boilerplate code, then you probably need to change how you write code. Maybe try a different framework.
I pair program with a guy who completely refuses to use the keyboard unless there is no other option. He uses the mouse to cut/copy/paste everything possible. He is not handicapped. It is frustrating for me to pair program with him because of that.
He will spend an extra 3-5 seconds using his mouse in order to avoid typing.
The only drawback is....constraints. Autocomplete constrains suggestions to those that are valid or at least valid-adjacent (like it'll use something but auto-import it to make it valid, etc). Copilot fails miserably here and I don't yet see it improving anytime soon. Maybe it will, and if it does, it'll be great. But I won't hold my breath for it.
now, instead of copying off stackoverflow, it's gonna be off copilot. It will enable a lot more people to code who otherwise would not. Whether this is a good outcome or not...
This is yet another example of just optimising for the easy bit.
I could hire 50 juniors that can code tomorrow if I wanted to. But even with an unlimited budget, finding good devs that can make it through a 2 year project without coming out of it with a big ball of unmaintainable shit is difficult.
The gulf from beginner to expert is already big, and the more crutches you use early on, the bigger it's going to get. There's a lot of people that wash out of the industry before they reach the point of being able to comfortably build good software (and be solely responsible for it).
I think copilot is another item in a long list of things that's good for big businesses (who optimise heavily for getting passable results with 1,000 mediocre devs instead of 50 good ones) and terrible for individuals in the long run.
The more experience I gain the less I use SO and more I just go to the sources or read the docs.
With googling for SO answers I have to parse the question, find a modern answer (because the accepted one is 10 years old and won’t work), parse that and adapt it to my problem. With documentation I just search for what I need and go straight to solving my problem and I’ve never felt more productive.
I feel like people new to programming focus too much on a specific problem at hand instead of learning the problem solving themselves. I wish I would’ve learned to figure out the issue myself from the start.
I feel like jobs where you can constantly rely on your already accrued knowledge are rare. Or maybe it’s just me and I work in fields where I have to learn new technologies constantly?
Recently I’ve been doing a lot of OCaml and it’s been tough as there’s very little on stackoverflow. Every time I have a question I have to spend a lot of effort looking for the answer instead of relying on someone before me having the same problem and posting the answer online.
> programming focus too much on a specific problem at hand instead of learning the problem solving themselves
you got the importance and focus wrong. People don't care for problem solving skills - they care for the specific problem being solved. That's why they pay someone to fix it.
They don't want to pay someone to "learn" problem solving (because the stakeholders don't care).
Stackoverflow immensely helped this sort of use-case - maybe to the detriment of quality - and i cannot deny that copilot is going to accelerate this use-case.
Same. I was wanting to learn more about ActivityPub recently, and after reading the first two web search results, I remembered: this'll probably just be easier if I just read parts of the W3 spec (and it was)
My use of stackoverflow has vastly faded over time. I only ever go there to look for hints on ways to do something better, better practice, newer ways or a simpler way.
There are more and more wrong answers posted by what looks like the same people who have MVP on the Apple, Microsoft and Google forums who just whore for kudos points. I can't understand the motivation to dilute something of value.
>...now, instead of copying off stackoverflow, it's gonna be off copilot.
Eventually, I'm seeing another breed of SO questions, making sense of Copilot suggested fragments and seeking reassurance or alternatives... Then possibly the copying off, just as now.
By the same logic you've presented, what's the value proposition of the plain-old auto-completion? What's the value proposition of a slick editor? All you need is the built-in notepad and a debugger.
Speaking from my personal experience, I usually write code in TDD style, in which I test the properties of the software I desire upfront, then make it pass with a minimal amount of effort. When I see there's a need for refactoring, I refactor. And I repeat this process until it is done.
The three parts take roughly the same portion of time, and when I'm writing tests, I'm thinking about the functionality and value of the software. When I'm refactoring I'm thinking about the design. When I'm writing the implementation initially, I want it to Just Work™ in the first place, and I find Copilot is great for this matter: why not delegate the boring part to the machine?
You know, perhaps this is tangential to the point that you're making at best, but i still couldn't help but to notice:
> The three parts take roughly the same portion of time, and when I'm writing tests
that bit and have some strong feelings about it. At my current dayjob, writing tests (if it was even done for all code) would easily take anywhere between 50% and 75% of the total development time.
I wish things were easy enough for writing test code not to be a total slog, but sadly there are too many factors in place:
- what should the test class be annotated with and which bits of the Spring context (Java) will get started with it
- i can't test the DB because the tests don't have a local one with 100% automated migrations, nor an in memory one because of the need to use Oracle, so i need to prevent it from ever being called
- that said, the logic that i need to test involves at least 5 to 10 different service calls, which then use another 5 to 20 DB mappers (myBatis) and possibly dozens of different DB calls
- and when i finally figure out what i want to test, the logic for mocking will definitely fail the first time due to Mockito idiosyncrasies
- after that's been resolved, i'll probably need to stub out a whole bunch of fake DB calls, that will return deeply nested data structures
- of course, i still need all of this to make sense, since the DB is full of EAV and OTLT patterns (https://tonyandrews.blogspot.com/2004/10/otlt-and-eav-two-big-design-mistakes.html) as opposed to proper foreign keys (instead you end up with something like target_table and target_table_row_id, except named way worse and not containing a table name but some enum that's stored in the app, so you can't just figure out how everything works without looking through both)
- and once i've finally mocked all of the service calls, DB calls and data initialization, there's also validation logic that does its own service calls which may or may not be the same, thus doubling the work
- of course, the validators are initialized based on reflection and target types, such as EntityValidator being injected, however actually being one of ~100 supported subclasses, which may or may not be the ones you expect due to years of cruft, you can't just do ctrl+click to open the definition, since that opens the superclass not the subclass
- and once all of that works, you have to hope that 95% of the test code that vaguely corresponds to what the application would actually be doing won't fail at any number of points, just so you can do one assertion
I'm not quite sure how things can get that bad or how people can architect systems to be coupled like that in the first place, but at the conclusion of my quasi-rant i'd like to suggest that many of the systems out there definitely aren't easily testable or testable at all.
That said, it's nice that at least your workflow works out like that!
While the book is indeed good, it's pretty hard to do anything to improve that particular codebase because there are developers who are actively introducing more and more of the problematic patterns and practices even as i write this.
To them it isn't "legacy code" but just "code", while attempting to offer alternatives either earns you blank stares or expressing concerns about anything new causing inconsistencies with the old (which is a valid concern but also doesn't help when the supposedly consistent code is unusable).
To me it feels like it's also a social problem, not just a technical one and if your hands are essentially tied in that situation and you fail to get the other devs and managers onboard, then you'll simply have to either be very patient or let your legs do the work instead.
Thanks for sharing this. I can feel you because I have been working on a similar project but slightly better, however, it's painful still for me. I wrote a comment last month [0] that is more or less related to what you've said. Basically, you want to write fewer tests that really matter, while the infrastructure should be fast and parallelizable.
Sadly it's easier said than done, since it's not an easy thing to fix for an existing system. We've spent quite some time improving things to ease the pain of writing tests. It was getting better, but would never reach the level it could have if we had been aware of this problem in the first place - there are tens of thousands of tests and we cannot rewrite them all.
I'm not too familiar with your tech stack. But there are two things you mentioned that are especially tricky to handle for testing: DB and service calls.
For DB, there are typically two ways to handle it: Use real DB, or mock it.
Real DB makes people more confident, and you don't need to mock too many things. The problem is it can be slow and not parallelizable, or worse, like in your case, there's no idempotent environment at all. We had automated migrations, but the tests were run against the SQL Server on the same machine, so they were not parallelizable and took more than a day to run on a single machine. On CI there are tens of machines but it still takes hours to finish. In the end, we generalized things a little bit, and used SQLite for testing in a parallel manner. (Many people advise against this because it's different from production, but the tradeoff really saved us.) A more ideal approach is to have SQL sandboxing like Ecto (written in Elixir). Another ideal approach is to have an in-memory lib that is close to the DB; for example, the ORM Entity Framework has an in-memory implementation, which is extremely handy because it's written in C# itself.
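To make the SQLite idea concrete, here is a minimal sketch in Python using the standard `sqlite3` module. It is not our actual setup: the `orders` table and both helper functions are made up for illustration. The point is only that every test gets its own private in-memory database, so tests can run in parallel without stepping on each other.

```python
import sqlite3


def make_test_db() -> sqlite3.Connection:
    """Create a fresh, private in-memory database with the schema applied."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical one-table schema standing in for the real migrations.
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL NOT NULL)"
    )
    return conn


def total_revenue(conn: sqlite3.Connection) -> float:
    """Code under test: sums order totals, returning 0 for an empty table."""
    row = conn.execute("SELECT COALESCE(SUM(total), 0) FROM orders").fetchone()
    return row[0]


# Two "tests", each with its own database; writes in one never leak into the other.
db_a = make_test_db()
db_b = make_test_db()
db_a.execute("INSERT INTO orders (total) VALUES (10.0), (5.5)")
print(total_revenue(db_a))
print(total_revenue(db_b))
```

The tradeoff mentioned above still applies: SQLite's dialect and type handling differ from SQL Server or Oracle, so this only works for queries that stay within the common subset.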
If there's no way to leverage a real DB, you have to mock it. One thing that might help you is to leverage the Inversion of Control pattern to deal with DB access. There are many doctrines like DDD repositories, Hexagonal, and Clean Architecture, but essentially they're similar on this point. In this way, you'll have a clean layer to mock, and you can hide patterns like EAV under those modules. As you leverage them enough, they will evolve and there will be helpers that simplify the mocking process. According to your description, the best bet I would say is to evolve in this direction if there's no hope of using real DBs, as you can tuck as much domain logic as possible into the "core" without touching any of the infrastructure, so that the infrastructure tests can be very simple and generic.
For service calls, the obvious thing is to mock those calls. The not so obvious thing is to have well-defined service boundaries in the first place. I cannot stress this enough. When people fail to do this, they will feel they're spending a lot of time mocking services, while at the same time they feel they've tested nothing because most things are mocked. Microservices have gotten too much hype over the years, but very few people pay enough attention to how to define service boundaries. The ideal microservice should be mostly independent, while occasionally calling others. DDD strategic design is a great tool for designing good service boundaries (while DDD tactical design is yet another hype, just like how people care more about Jira than real Agile, making good things toxic). We are still struggling with this because refactoring microservices is substantially harder than refactoring code within services, but we do try to avoid more mistakes by carefully designing bounded contexts across the system.
With that said, when the service boundaries are well-defined, and if you have things like SQL sandboxing, it's a breeze to test things because most of the data you're testing against is in the same service's DB, and there are very few service calls that need to be mocked.
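A rough sketch of what I mean by a clean layer to mock, in Python for brevity (the same shape works in Java with an interface). Everything here is hypothetical: `UserRepository`, `InMemoryUserRepository` and `greeting_for` are illustrative names, not anything from your codebase. The domain function only sees the repository interface, so the test passes in a dict-backed fake instead of stubbing dozens of mapper calls.

```python
from typing import Dict, Optional, Protocol


class UserRepository(Protocol):
    """Hypothetical port: the only thing domain code knows about storage."""

    def find_email(self, user_id: int) -> Optional[str]:
        ...


class InMemoryUserRepository:
    """Test double: a dict stands in for the real DB-backed implementation."""

    def __init__(self, emails: Dict[int, str]) -> None:
        self._emails = dict(emails)

    def find_email(self, user_id: int) -> Optional[str]:
        return self._emails.get(user_id)


def greeting_for(user_id: int, users: UserRepository) -> str:
    """Domain logic under test; it never touches SQL or mappers directly."""
    email = users.find_email(user_id)
    if email is None:
        return "Hello, guest"
    return f"Hello, {email}"


repo = InMemoryUserRepository({1: "ada@example.org"})
print(greeting_for(1, repo))
print(greeting_for(2, repo))
```

The EAV lookups would live inside the real implementation of the repository, which then needs only a handful of generic infrastructure tests of its own.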
The value prop for things like these is always the same: for the widely - and accurately, although that's irrelevant to my point - lambasted initial release to start the ten year journey down the road towards the creation of an eventual product that will make people link to comments like these the way they link to the original Dropbox Show HN: post.
There are levels of ease of which we have not yet dreamed, especially in the realm of information manipulation.
I mean, I'm obviously intensely skeptical that such a thing will happen at all, much less within ten years.
But I guess we'll find out eventually! And if mine does become the "640k ought to be enough for anybody" quote of this decade, then I suppose there are worse kinds of fame.
I think professional software devs won’t get much value from copilot.
OpenAI’s demo from a few months back showed it as a sort of bridge to convert natural language instructions into API calls. E.g. converting “make all the headings bold” to calls to a word doc API.
640kb, not 64kb, is the value used in that apocryphal quote. A quick google search would have shown that. I wonder if there's a google copilot in the works for social media posts.
I guess it really depends on what you are working on.
If you are writing some non-trivial algorithms or working on projects which require delicate handling of things, then Copilot is most likely going to mess up.
But if you are working on a lot of frontend code or backend CRUD, which is usually quite repetitive, then Copilot could be helpful.
It's meant to be a souped-up autocomplete. You don't quite remember how to do a common thing, and instead of having to go look it up, the IDE suggests it for you and you can keep doing what you're doing. A bunch of small instances of that can save you lots of time.
> I just don’t understand how this is a net time/energy savings?
At the end of the day it is all about trust. Do you trust code you find in SO/Copilot to be good enough for your use case?
In my case I do not trust SO code. Whenever I use SO, if I find some snippet that seems to be the code solution I'm looking for, I copy-paste the snippet into my IDE, read through it carefully, rename variables as needed, handle edge cases, remove unused code, etc., etc. Any code solution I find in SO gives me the "starting" kick, which is about 10% of the total effort of writing code from scratch. The remaining 90% (to understand the code that is being committed) cannot simply go away. I do not expect Copilot will make much of a difference.
I also fear that Copilot will be teaching anti-patterns.
Just tried something really simple:
    def is_palindrome
Copilot suggestion was
    def is_palindrome(word):
        if word == word[::-1]:
            return True
        else:
            return False
facepalm
So, good for a technically correct solution, but still...
This is an anti-pattern I think in pretty much any language that I know of and something that about half of my beginning students try when they learn about branching..
UPDATE: more howlers in the same vein
    def haystack_contains_needle(haystack, needle):
        if needle in haystack:
            return True
        else:
            return False
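For comparison, this is roughly what I would hope to see instead (my own version, not a Copilot suggestion). In both cases the expression being tested already evaluates to a bool, so it can be returned directly and the if/else adds nothing:

```python
def is_palindrome(word):
    # The comparison already evaluates to True or False, so return it directly.
    return word == word[::-1]


def haystack_contains_needle(haystack, needle):
    # Same idea: `in` yields a bool, no branching needed.
    return needle in haystack
```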
It's not so much the parts where you think hard and implement that perfect feature. It's insanely useful when you have to make sweeping pesky changes.
Like pulling values from a config dict and initializing a bunch of methods from the class? Or setting up a testcase similar to the one you already have, but with different values? Or cleaning values from a form? It's not a bulk edit. But it's also not thoughtful code-writing. It's monotonous and mundane. And lots of people do a lot of this on a daily basis.
you're only thinking about the 'write a comment get a block of code' feature. it also has autocomplete/predictive functionality that speeds up coding quite a bit when it works
>GitHub Copilot is optimising the part of the process that was already the easiest, and makes the other parts harder because it moves you from the “I wrote this” path to the “somebody else wrote this” path.
It is worth mentioning, I suppose, that from Copilot's point of view it is the inverse. Maybe a necessary or at least desirable step towards the inevitable 'Copilot debugger'.
maintaining code someone else wrote is much higher on your rating scale. probably need the top end. because it nearly always involves some debugging and usually is not obvious what footguns exist
To make it faster to pump out Go, which is maybe > 50% noise with its poor language features and its anemic standard library that does not even have abs.
Copilot is crazy. The other day, I was writing a Python function that would call a Wikipedia API. I pulled from the internet an example of a GET request, and pasted it as a comment in my code.
Then, like magic, Copilot suggested all the remaining keys that would go in the query params. It even knew which params were to be kept as-is, and which ones would come from my previous code:
    action = "query" # action=query
    format = "json" # or xml
    lat = str(latitude.value) # 37.7891838
    lon = str(longitude.value) # -122.4033522
    gscoord = lat + "%7C" + lon
    ...
    api_path = base_url + "action=" + action + "&format=" + format + ... + "&gscoord=" + gscoord
As a guy who gets easily distracted while programming, Copilot saves me a lot of time and keeps me engaged with my work. I can only imagine what it'll look like 10 years from now.
This comment is accidentally the perfect example of why copilot is a horrific idea.
The old "just copy-paste from Stack Overflow" approach to development is satirised and ridiculed these days (despite being still in common practice I'm certain), because as we all know so well by now, an accepted answer on SO does not always equate to a correct answer. Yes, the SO guys & community do do their best to improve answer quality iteratively (wiki answers, etc.), but there's still a lot of bad answers, and even many of the "good" ones become outdated or don't keep up with modern best-practice (especially when it comes to security).
Omitting urlencoding isn't the biggest crime, but it is a pretty standard URL-building step, and the fact that a tool released this year is spitting out code that omits something so simple is fairly damning. It's also a micro-example of much larger errors Copilot will surely be responsible for. Missing url encoding can be an injection vector in many applications, even if it's not the most common risk, but miss encoding in other string-building ops and you've made your way into the OWASP Top 10.
The big difference between copilot and SO is there's no community engaging in an open & transparent iterative process to improve the quality of answers.
+1. URL encoding is also very relevant on the backend and has security implications, e.g., you want to be sure you are protecting against double encoded URLs. If you elide details like URL encoding using copilot that's dangerous.
For me APIs are actually one of the places it performs the worst. Copilot is like having an inexperienced yet annoyingly optimistic pair programmer. The code it generates appears conceivable in some hypothetical universe. No guarantee it's this one though.
Remember it doesn't actually know an API or how it should be used: it's putting things together to look like typical code. For me that has meant difficult to spot bugs like linking up incorrect variables from the rest of my code.
I wish it could integrate the first SO answer to a generated question, because I always end up there anyway having to fix things.
I think my experience has been sort of between you two. Maybe 1/3 times it's spot on. The rest of the time, there is some minor tweak I need to make (it gets a parameter or variable name wrong). I've yet to hit cases where the code it generates looks right but doesn't run as expected, thankfully.
I've only had it for about a week now but overall I'm happy with it. None of the code I'm writing is crazy cutting-edge stuff and in aggregate I'm sure it saves me more time than it takes, including the time I spend reviewing and potentially changing the generated code.
That's a bummer. I just got whitelisted and was hoping it could save me time with some APIs where they only have code in X language or curl and I have to work backwards if I run into any issues.
Even for a quick script this worries me about copilot; if it suggests this, then more people use it and think this is right, commit it, and then copilot suggests this more - that’s a bad feedback loop. At least in StackOverflow you get someone commenting why it’s bad and showing how to use a dictionary instead
I'm not against "copying" code. I just looked up "python build url query". The first link describes the `urllib.parse.urlencode` function, which takes a dict.
def search_wikipedia(lat, lon):
    """
    use "requests" to do a geosearch on Wikipedia and pretty-print the resulting JSON
    """
And it completed it with:
    r = requests.get('https://en.wikipedia.org/w/api.php?action=query&list=geosearch&gsradius=10000&gscoord={0}|{1}&gslimit=20&format=json'.format(lat, lon))
    pprint.pprint(r.json())
It's like a junior dev who doesn't quit unnecessary code golfing. Somehow the AI is more comfortable with string-based URL manipulation, which is a straight anti-pattern.
Presumably because that's what it's seen in the training data. Remember, it doesn't care about what the code does, it's just doing a search for similar looking code.
That's what the rest of the thread is complaining about, it's still slapping the strings in there with basic formatting. No different than the top level approach.
Speaking as a former pentester, this is a fine way to form query params in this specific case, if lat and long are floats.
They're the only data you can control, and unless they're strings, it's useless for exploitation. Even denormal floats / INF / NAN won't help achieve an objective.
I broadly agree with you, but people are pummeling Copilot for writing code that I saw hundreds of times. Yes, sometimes I was able to exploit some of that code. But the details matter.
But I would still never not escape the params because you don’t know how that code will change one day or where it will end up, and chances are that you won’t remember to fix it later if you don’t fix it now.
We just had a major failure at work recently because someone decided to not decode URL params and their code worked fine for years because it never mattered… until it did.
Just do it right. It’s so easy. Why risk a ton of headaches in the future to save yourself a few seconds?
If the example code is everything that Copilot generated, there's no guarantee that lat or long are floats and that seems to be an implementation detail left to the user.
Isn't that a pretty big risk though? Specifically, that people will use co-pilot recommendations "as-is" and give little thought to the actual workings of the recommendation?
After all, if you have to intimately understand the code it's recommending are you really saving that much time over vetting a Googled solution yourself?
How so? I'd prefer a proper structured library, is that what you mean? If so, the Copilot code actually seems not dodgy - because the author _started_ with `base = ...` , indicating that they were string formatting the params.
It's harder to read than other methods, and it doesn't encode the URL parameters, which means it potentially produces an invalid URL, and in some cases could lead to security problems (similar to SQL injection).
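To make the injection risk concrete, here is a minimal Python sketch. The endpoint `https://example.org/search` and the parameter names `q` and `limit` are made up for illustration and are not from the Wikipedia example. It only shows that an unencoded `&` inside a value is read back as a separate parameter, while `urllib.parse.urlencode` keeps it as data.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint and parameter names, used only for illustration.
BASE_URL = "https://example.org/search?"


def build_url_concat(query, limit):
    """Naive string concatenation: values are inserted verbatim."""
    return BASE_URL + "q=" + query + "&limit=" + str(limit)


def build_url_encoded(query, limit):
    """urlencode percent-encodes each value, so '&' and '=' stay data."""
    return BASE_URL + urlencode({"q": query, "limit": limit})


# A value containing '&' smuggles in a second 'limit' parameter.
hostile = "cats&limit=100000"

naive = parse_qs(urlsplit(build_url_concat(hostile, 10)).query)
safe = parse_qs(urlsplit(build_url_encoded(hostile, 10)).query)

print(naive)  # 'q' is cut off at the '&' and 'limit' now has two values
print(safe)   # 'q' carries the whole string and 'limit' has one value
```

Whether the extra parameter matters depends entirely on what the receiving server does with it, which is the "details matter" point made elsewhere in the thread.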
Concatenating strings for example. As shown, it's the query string equivalent of sql injection.
Use something like URLBuilder, or URIParams, or whatever your platform supports. Don't use string concatenation ever, if at all possible, and if not possible (wtf?), then at least escape strings.
I usually try to avoid working with URLs as bare strings like this, both for readability and correctness (URL encoding is tricky). With ‘requests’ you can do something like pass a dictionary of your query params and it takes care of forming the actual request URL.
It's much safer (i.e. fewer degrees of freedom for bugs to appear) to use f-strings in situations like this one.
One correlated but ancillary benefit, is that there are fewer variables to simulate the state for in your brain, while you're reading the code. You don't have to wonder if a variable is going to change on you, in-between when it is initialized and when it is used.
It's safer still to use a library (e.g. urllib3) that does encoding for you (allowing you to omit magic strings like `"%7C"` from the logic of this function altogether).
Like GP said, very handy for one-off scripts or areas of your codebase where quality is "less important". I may be pedantic, but I wouldn't give this a pass on code review.
the "nice" way of doing this would be to create a list of your stringified arguments, map urlencoding over them, and then join them with the parameter separator. this ends up being resilient to someone adding something that ends up being incorrect, and makes explicit in the code what you're trying to do.
That's impressive. Discoverability and the varying quality of documentation is a big headache for new programmers or people engaging with an unfamiliar framework/API/library. I really like the comments pointing out alternatives (json or xml) and the static lat-long values.
One reason I've championed the development of visual programming (flow-based, node diagrams, etc) is that while you don't want to compress a big complex program down into a single layer and become unreadable in the process, graphical methods are a great way for people to see what options they have available and just point at things they want to try out.
Instead of struggling with syntax at the same time as trying to find out what they can do with a new API, they can engage in trial-and-error to find out the capabilities and what makes it worth using, then build up their competency once they are clear about their objectives.
I'm looking forward to trying this now that it's available for my favorite IDE, but I'll probably want to set up a hotkey to switch it on and off as I need it. Once I get fully comfortable with something I often find code completion tools get in the way of my flow.
The next version of Copilot will submit its answers to HN and return the highest-voted comment that compiles, after stripping out the well actually spurious tokens. Just look how well it worked this time?
Yesterday I tried to convert the representation of a number into another representation/type in Rust:
let coeff = BigUint::from_str(x).unwrap();
let coeff: <<G1Affine as AffineCurve>::ScalarField as PrimeField>::BigInt =
coeff.try_into().unwrap();
let x: <G1Affine as AffineCurve>::ScalarField = coeff.into();
I wrote that, then I wanted to move that code in a function so I wrote:
fn string_int_to_field_el(s: &str)
copilot suggested the following one-liner that did exactly the same thing:
fn string_int_to_field_el(s: &str) -> <G1Affine as AffineCurve>::ScalarField {
    let x: <G1Affine as AffineCurve>::ScalarField = s.parse().unwrap();
    x
}
I still don't understand how some heuristics could produce this code. It's mind blowing.
In a way it is impressive that Copilot knew how to separate the query string while still being correct. However, this is a fairly naive way to build a URL that I wouldn't encourage committing. If I saw this code in a review I would recommend using a dict and `urlencode` or one of the various other URL builders available (either in the stdlib or through another library like `yarl`/etc.).
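For reference, a rough sketch of what the dict-plus-`urlencode` version could look like, using only the standard library. The parameter names and values are the ones from the geosearch URL quoted earlier in the thread. The function name `geosearch_url` and its default arguments are made up here, and this is one possible shape, not the commenter's actual code.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def geosearch_url(lat, lon, radius=10000, limit=20):
    """Build the geosearch URL from a dict instead of by concatenation.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    params = {
        "action": "query",
        "list": "geosearch",
        "gsradius": radius,
        "gscoord": f"{lat}|{lon}",  # urlencode turns '|' into %7C
        "gslimit": limit,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


print(geosearch_url(37.7891838, -122.4033522))
```

The `"%7C"` magic string disappears from the calling code, and any value that later turns out to be a string gets percent-encoded automatically.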
Now we just need to train it to make a dictionary for that info instead of forming a long URL. But if it has to use a long URL, it should use urljoin and/or string formatting.
I’ve been using this for weeks and it blooooows my mind. It comes up with crazy recommendations, just yesterday I wrote this big ass logic to do something, then I wanted to move that code to a function so I wrote the function name and I kid you not copilot suggested a one-liner that worked… the thing is so useful and I’m not writing simple code (writing cryptographic code). And when it’s not doing that at the very least it provides auto-completion to lists where a counter has to increase and things like that. It’s just baffling, I don’t think I’m that directly impacted by AI, or at least this is the first time where I’m like “wow, AI really is changing the world RIGHT NOW”. Half-joking: can’t wait for it to just write my code.
Also if this feature would be paid tomorrow, I think I would pay for it. It’s really noticeable when I don’t have copilot enabled now.
Oh and, autocompletion doesn’t work with markdown files because of markdown plugins I think? But this is another level of insanity: when I’m writing english it figures a lot of the sentences I want to write. Makes me question if I’m just a deterministic individual with no choice.
Delegating the implementation of something that you are notoriously never supposed to roll your own, to a text generator AI.... What could possibly go wrong?
I think you got the wrong idea on how people use copilot. You don't just accept everything it throws at you. Think about it as an auto-complete on steroids, would you say that auto-complete is dangerous to use? If not, then this is the same. The suggestions of copilot don't always compile, but they sometimes manage to find the right function to use, or the right combination of functions, or even the right comment (if you're writing a comment).
The problem with copilot is that, other than for the most basic boilerplate generation, it takes just as much effort to verify that its output is correct as to come up with a correct answer.
I disagree, from my usage it sets up a lot of the boilerplate code, the types, etc. and will also find the correct functions and all without having me look up the doc again.
I was experimenting with it over the weekend and basically got it to build out a simulation of schooling/flocking boidfish swimming around a tank. That was fun! I certainly wasn't expecting it to create the whole darn thing, but it did. I just added an occasional comment to nudge it toward doing exactly what I wanted.
A lot of the examples people are giving of code Copilot filled in for them sound like what would be called plagiarism, and probably also copyright infringement.
Which I think was fairly predictable.
What wasn't predictable was that someone would ship this Copilot anyway, consequently exposing their company and their users' companies to liability.
Imagine if you hired an intern who was copy&pasting bits of GPL'd code throughout your system. This would not be a good job, it would be something that needed immediate attention from legal counsel and others, and mean reverting every commit the intern from heck made if you couldn't prove convincingly it wasn't tainted. Especially if you're a startup, who needs to assure investor due diligence in good faith that you actually own the IP.
Wait til stackoverflow sues everyone into oblivion!
Letting your intern blindly commit to your code base seems like the bigger issue here. The entire purpose of an internship is to learn and to be guided by professionals, not to be treated as a cheap laborer. You don't hire interns, you train interns.
If the analogy works better, imagine that you hired a developer who copy&pasted GPL'd code throughout your system.
(And people couldn't tell in code reviews, nor on other occasions to see the code, since it's not normal to recognize GPL'd code on sight; everyone just assumed the developer was productive.)
If the analogy works better, imagine that you hired a developer who copy&pasted whatever-license-your-company-is-most-afraid-of-and-is-highly-likely code throughout your system. :)
>The entire purpose of an internship is to learn and to be guided by professionals, not to be treated as a cheap laborer. You don't hire interns, you train interns.
This has not been my experience. I was dropped into the developer team and expected to know the entire tech stack and was not trained by anyone from the company at any point. Have I been bamboozled??
There’s a huge spectrum of “training”. At least you know you have an expected tech stack. Go learn it and ask questions if things are confusing. Don’t wait to get “trained”.
It helped that I was already semi-familiar with it, but the other engineer and team lead quit shortly after I joined, and I was left by myself to complete the tasks. It was brutal, and I'm looking for a different job to get away from this workplace for this reason.
We did and currently do have a manager type person. They are not code-savvy though, which is sometimes frustrating. We just recently hired another intern, a regular developer and a lead, so it's slightly less painful now, and I am still the one working on the primary applications. The others are working on separate things. Before the hiring, it was me and another developer, but he doesn't develop the main applications. The task list at that time was infrequently used, but it's become daily routine now.
I think it depends if you're getting paid. If they call you an intern because you're still in college or have no professional experience, but they're paying you fairly then it seems like a fine arrangement they ask you to do real work. On the other hand, if your compensation is primarily "experience" then I'd say you're being bamboozled.
I am being paid, although significantly less than my previous job. It's not that I can't do the work, but it feels like I'm being taken advantage of because I have no professional experience; although I have almost 10 years of personal experience. It might sound like I'm complaining about the work itself, but it's more than that to me.
That’s what stood out to me too! His plugins are always so thoughtful and idiomatic Vim, they really ought to be default. I guess except this one should remain opt in.
There tends to be a lot of repetitive code in the world. I primarily write JS, Py, and Rust. Sometimes, I might declare something like a function table, and Copilot will automatically fill in the class definition with everything I defined.
I'm not using Copilot to write new algorithms or solve library-specific problems, but it sure is next-level in picking up patterns in a file and predicting where you want to go next. Obviously, good code is succinct (not repetitive), but it sure is helpful when in that early prototyping stage. I admire its ability to infer a correct assertion when writing Unit Tests - it made it much easier for me to write tests recently and helped me recognize a few bugs.
Same experience here. I think a lot of people are being a little too philosophical about it. Where it shines is small helper functions and predictive boilerplate. It’s more like emmet imo than a pair programmer.
Yesterday, I was disgusted to see framers putting up a house that clearly plagiarized the entire internal structure of my own. Same joint interfaces, same structural idioms when dealing with things like staircases, windows, and rafters, same fasteners, same adhesives, even the building materials! Aside from the most general aspects of the layout, it was exactly the same right down to the inch! People have no professional integrity these days.
copilot won't suggest anything worth calling plagiarism, just mundane plumbing and maybe textbook algorithm implementations. have you seen it generate anything more glorified than StackOverflow-esque code snippets?
They had to blacklist Q_sort because they couldn't stop copilot from copying the complete function with comments (not just the algorithm) from the quake source, however it did not autocomplete the correct license for it.
They would also have to exclude virtually all code that isn't public domain, too. MIT and Apache-2.0 and BSD all require that the copyright notice and license text is preserved in downstream use.
I tried it on IntelliJ recently. The examples I tested blew my mind. Yet, I think there are two things that need to improve to get me to use it regularly (it will get there!):
- less than perfect import/types/var suggestions that LSP in typed languages would've made perfect suggestions for (e.g named import in go would use the package name instead).
- latency feels a bit high and my thoughts would get interrupted waiting for a suggestion to come.
For the former, I wonder how feasible it would be to give structured suggestions to the LSPs that it would swap for correct var names and imports and such. Or test each suggestion with the LSP for error counts and offer the least erroring suggestion.
I've been getting a lot more misses than hits with Github Copilot, even when writing elementary math or utility functions; but despite its errors I am nevertheless astonished at its approximation of intent.
Very eager to see Github Copilot catch up to some bright line of signal v noise.
I would say the reverse, I’m getting so many hits that I’m mindblown. And when it misses, I can generally still use that and fix the suggestion, as it’s faster.
It's a glorified autocomplete to me. It seems like it feeds me mostly things I'd get from searching Stack Overflow. The first day was pretty interesting but the novelty wore off quickly. You still need to grok what it spews out and see if it's correct.
And the worst thing about that is that you don't get the context of the Stack Overflow threads, where people discuss the impact of the given solution and alternatives. So after a week, off it went for me.
I'm constantly blown away with what it spits out even when it's wrong. When it pulls in the greater context of the app and generates comments from scratch using the context of the file, it's just incredible.
I have many thoughts about Copilot, but here are two.
First, as much as I don't like the idea of Copilot, it seems to be good for boilerplate code. However, the fact that boilerplate code exists is not because of some natural limitation of code; it exists because our programming languages are subpar at making good abstractions.
Here's an example: in Go, there is a lot of `if err != nil` error-handling boilerplate. Rust decided to make a better abstraction and shortened it to `?`.
(I could have gotten details wrong, but I think the point still stands.)
So I think a better way to solve the problem that Copilot solves is with better programming languages that help us have better abstractions.
Second, I personally think the legal justifications for Copilot are dubious at best and downright deception at worst, to say nothing of the ramifications of it. I wrote a whitepaper about the ramifications and refuting the justifications. [1]
(Note: the whitepaper was written quickly, to hit a deadline, so it's not the best. Intro blog post at [2].)
I'm also working on licenses to clarify the legal arguments against Copilot. [3]
I also hope that one of them [4] is a better license than the AGPL, without the virality and applicable to more cases.
Edit: Do NOT use any of those licenses yet! I have not had a lawyer check and fix them. I plan to do so soon.
> This means that Google was careful to not make a large amount of copy-righted works publically accessible. Such is not the case for GitHub Copilot in particular Armin Ronacher’s tweet [19]
The fast inverse square root algorithm referenced here didn't originate from Quake and is in hundreds of repositories - many with permissive licenses like WTFPL and many including the same comments. It's not really a large amount of material, either.
GitHub claims they haven't found any "recitations" that appeared fewer than 10 times in the training data. That doesn't mean it's a completely solved issue though, since some code may be in many repositories yet always under non-permissive licenses.
> and I would argue that it will not be the case for ML models in general because all ML models like Copilot will keep suggesting output as long as you ask for it. There is no limit to how much output someone can request. In other words, it is trivial to make such models output a substantial portion of the source code they were trained on.
With the exceptions mentioned above, what you get back from asking for more code won't just be more and more of a particular work. Realistically I think you'd be able to get significantly more from Google Books.
>The fast inverse square root algorithm referenced here didn't originate from Quake and is in hundreds of repositories
With the exact same comments?
> many with permissive licenses like WTFPL
So it would be perfectly legal to do whatever I wanted with the source for GCC as long as there was a single fork on github that replaced the GPL with a MIT license? Quite sure the FSF would be perfectly fine with that.
> Quite sure the FSF would be perfectly fine with that.
I believe the person republishing GCC code under MIT would be liable.
Also, I'm not recommending that you use code you know has been incorrectly licensed. Just that in cases where certain "folk code" is seemingly widely available under permissive terms, Copilot isn't doing much that an honest human wouldn't.
A better example against Copilot would be trying to get it to regurgitate some code that has a simple known origin and is always under a non-permissive license.
> The fast inverse square root algorithm referenced here didn't originate from Quake
Where did it come from then? And what license did the original have?
> and is in hundreds of repositories - many with permissive licenses like WTFPL and many including the same comments.
If the original was GPL or proprietary, then all of these copies with different licenses are violating the license of the original. Just because it exists everywhere does not mean Copilot can use it without violating the original license.
> It's not really a large amount of material, either.
No, but I would argue that it is enough for copyright because it is original.
> GitHub claims they haven't found any "recitations" that appeared fewer than 10 times in the training data.
Key word is "claim". We can test that claim. Or rather, you can, if you have access to Copilot, you can try the test I suggested at https://news.ycombinator.com/item?id=28018816 . Let me know the result. Even better, try it with:
// Computes the index of them item.
map_index(
because what's in that function is definitely copyrightable.
> With the exceptions mentioned above, what you get back from asking for more code won't just be more and more of a particular work. Realistically I think you'd be able to get significantly more from Google Books.
That can only be tested with time. Or with the test I gave above.
I think that with time, more and more examples will appear until it is clear that Copilot is a problem.
Nevertheless, a court somewhere (I think South Africa) recently ruled that an AI cannot be an inventor. If an AI cannot be an inventor, why can it hold copyright? And if it can't hold copyright, I argue it's infringing.
Again, only time will tell which of us is correct according to the courts, but I intend to demonstrate to them that I am.
> Where did it come from then? And what license did the original have?
From what I read, the code has been altered and iterated on as it was passed down. The magic number constant is claimed to have been derived by Cleve Moler and Gregory Walsh.
> If the original was GPL or proprietary, then all of these copies with different licenses are violating the license of the original. Just because it exists everywhere does not mean Copilot can use it without violating the original license.
If it was originally proprietary (this predates GPL) I believe the liability would be on whoever took that proprietary code and republished it under MIT/etc.
To be clear, I'm not recommending that you use code you know has been incorrectly licensed. Just that in cases where certain "folk code" is seemingly widely available under permissive terms, Copilot isn't doing much that an honest human wouldn't.
> Key word is "claim". We can test that claim. Or rather, you can, if you have access to Copilot
I don't unfortunately. As a side note, your function already existed in Apache-licensed code. But since it's not in many repositories I'd be willing to bet Copilot won't regurgitate it - I could message around a few people who might be able to try it.
> Nevertheless, a court somewhere (I think South Africa) recently ruled that an AI cannot be an inventor. If an AI cannot be an inventor, why can it hold copyright?
GitHub's intention isn't for Copilot to hold the code's copyright, but for the user to.
> GitHub's intention isn't for Copilot to hold the code's copyright, but for the user to.
That is true, so I have two things I can do:
1) I can argue that Copilot is actually the distributor of the code, which means Copilot is infringing, or
2) I can go after the user for infringing, and if I win, that user would not want to use Copilot anymore for liability reasons. Or they could go after Microsoft themselves.
Why not do both? So that's what I am doing, or rather, will do.
I got access to Copilot technical preview earlier today, here's the completion you wanted to try:
// Computes the index of them item.
map_index(int item, int *array, int size)
{
    int i;
    for (i = 0; i < size; i++)
    {
        if (array[i] == item)
        {
            return i;
        }
    }
    return -1;
}
The boilerplate often exists because removing it requires assumptions and guesses. The language should never guess, copilot has the benefit of being able to give you wrong answers sometimes and you get to decide if they are correct.
By definition, boilerplate is code that is repetitive. That repetition should be minimized by the programming language. If there are assumptions and guesses, that's because there's less repetition than you think.
I am talking about eliminating repetition, not code with differing assumptions.
Writing tests for the sake of coverage, which is what a lot of orgs do, is already practically useless. This could maybe generate such tests. However, such tests don't materially impact quality now, so there is not much difference if they are automated.
One of the main value props for writing meaningful unit tests, is it helps the developer think differently about the code he is writing tests for, and that improves quality of the code composition.
I feel the opposite about codebases where having high coverage has been a priority:
* The tests don't actually test functionality, edge cases etc, just that things don't crash in a happy-path.
* Any changes to an implementation breaks a test needlessly, because the test tests specifics of the implementation, not correctness. Thus it makes refactoring actually harder, since your test said you broke something, but you probably didn't, and now you have to double the work of writing a new test.
* In codebases for dynamic languages, most of what these tests end up catching is stuff a compiler would catch in a statically typed language.
No, as a sibling comment to mine shows, it's actually easy to make 100% coverage with bad tests, since one doesn't challenge the implementation to handle edge cases.
It's easy to achieve 100% coverage with happy-path code and low quality shallow tests, agreed.
AFAIK, «high coverage» may have different meaning for different people. For me, it's «high quality», for others it's «high percentage», e.g. «full coverage» or «80% coverage», which is easy to OKR.
It's the fact this could even have a different meaning that makes this a useless metric - defining 'quality' or 'coverage' is subjective. The majority of tests written are meaningless noise, and serve mainly to distract from covering 'critical' failures. Again, a subjective measure, in the sense that what is critical to you and me may not be the same thing.
Which is what makes this whole concept of code coverage so much toxic nonsense...
Not to argue against writing 'quality' tests, but high 'coverage' actually decreases quality, objectively speaking, since erroneous coverage of code serves negative purposes such as obscuring important testing and enshrining bugs within testing.
I would make the case here that Copilot and all such 'AI' tools should be banned from production, at least until they solve the above problem, since as it stands they will serve to shovel piles of useless or, worse, incorrect testing.
It is also important to remember what AI does, i.e. produce networks which create results based upon desired metrics - if the metrics were wrong or incomplete, you produce and propagate bad design.
So yes people use it now as a learning tool (fine) and it will get 'better' (sure), but as a tool, when it gets better, it will constrain more, not less, along whatever lines have been deemed better, and it will become harder, not easier, to adjust.
I saw a cool study recently (summarized well here[1]) with an empirical experiment on how well code coverage predicts how well a test suite catches bugs. They found that the number of test cases correlated well with the test suite's effectiveness, but, when controlling for the number of tests, code coverage didn't.
It was a pretty thorough study:
> Our study is the largest to date in the literature: we generated 31,000 test suites for five systems consisting of up to 724,000 lines of source code. We measured the statement coverage, decision coverage, and modified condition coverage of these suites and used mutation testing to evaluate their fault detection effectiveness. We found that there is a low to moderate correlation between coverage and effectiveness when the number of test cases in the suite is controlled for.
Given their data, their conclusion seems pretty plausible:
> Our results suggest that coverage, while useful for identifying under-tested parts of a program, should not be used as a quality target because it is not a good indicator of test suite effectiveness.
That's certainly how I approach testing: I value having a thorough test suite, but I do not treat coverage as a target or use it as a requirement for other people working on the same project.
The example I usually give [1] in JavaScript: say you have the function
(x,y) => x + y
Orgs targeting code coverage write a test for 1,2 => 3 and get 100% coverage and then stop as there is no incentive to go further. They don't write tests for say
(1,null)
(null,null)
('x',1)
(NaN, Infinity)
and so on. These additional tests improve coverage of scenarios, but the code coverage number will not move.
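To make the gap concrete, here is a minimal sketch. The name `add` is only a label chosen for illustration. A single happy-path check executes every line of the function, yet each edge case from the list above produces a result that the happy path says nothing about.

```javascript
// Hypothetical name for the one-line function from the example above.
const add = (x, y) => x + y;

// This one check already executes 100% of the function's code.
console.log(add(1, 2)); // 3

// None of these execute any new code, but each shows a distinct behavior.
console.log(add(1, null));        // 1    (null coerces to 0)
console.log(add(null, null));     // 0    (both coerce to 0)
console.log(add('x', 1));         // 'x1' (+ becomes string concatenation)
console.log(add(NaN, Infinity));  // NaN  (NaN propagates)
```

Whether each of those results is acceptable is a design decision that a coverage number cannot capture; only a test that pins the intended behavior can.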
I have seen projects where a test will have a sequence of steps which trigger the code but the assertion is effectively true === true, or where they replicate the same function in the test instead of generating proper mock data, or any of a myriad of other absurd testing approaches. This comes from the twin pressures of showing coverage and having tests pass.
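Two of these anti-patterns are easy to show in a small sketch. The function `applyDiscount` and its test values are hypothetical, invented only for illustration. The first two checks run the code and pass, so they count toward coverage, but neither can fail when the formula is wrong; the third compares against a value worked out by hand.

```javascript
// Hypothetical function under test: price after a percentage discount.
function applyDiscount(price, percent) {
  return price - (price * percent) / 100;
}

// Anti-pattern 1: the code runs, but the assertion is effectively true === true.
function tautologicalTest() {
  applyDiscount(200, 25);
  return true === true;
}

// Anti-pattern 2: the test replicates the implementation, so any bug in the
// formula is copied into the expected value and the test still passes.
function mirroredTest() {
  const price = 200;
  const percent = 25;
  const expected = price - (price * percent) / 100;
  return applyDiscount(price, percent) === expected;
}

// A meaningful check compares against an independently computed value:
// 25% of 200 is 50, so the discounted price must be 150.
function handComputedTest() {
  return applyDiscount(200, 25) === 150;
}

console.log(tautologicalTest(), mirroredTest(), handComputedTest()); // true true true
```

All three report success today, but only the last one would turn red if someone broke the formula.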
Coverage is also a challenge in code that uses AI/ML libraries or third-party services. These really need statistical testing with a large volume of diverse, well-maintained samples, and the results need statistical analysis of error rates, not unlike how manufacturing does it. I don't see that often. For code using face detection, for example, a single face getting detected or not is hardly an adequate test.
Finally, it is easier to improve coverage by testing simpler code than to cover or refactor a function which has, say, 10 nested branches, so it is not uncommon to see 90% coverage with the 10% of code that is most used or most error-prone poorly tested or not tested at all.
There are some methods to address these, like mutation testing or retros on why tests failed to catch production bugs, but the benefit is not easy to measure, and coverage-driven orgs will not see their metrics move by doing these.
Well-written test suites will also have good coverage, but not necessarily the other way around. Developers who care and understand what they are doing and why they are doing it will use coverage only as a first step, to see where there are gaps in their tests.
Tests are also code that needs to be peer reviewed and maintained. If the tests depend on implementation details and constantly break, contain improper mocks, or assert poorly, then a lot of badly written tests is more a hindrance to development than an aid.
[1] Yes, most of these are not applicable in a strongly typed language, but it is far easier as an illustration.
So one argument against code coverage requirements is that poor engineers won't test correctly. Without the code coverage requirements you're in the same situation.
Problem is, with 100% code coverage of badly guarded / implemented code you'll have a false sense of security if you're just looking at coverage as the metric of quality. Anytime I've worked with a company that had a required code coverage percentage, they never actually cared what the code being covered looked like, only that it was covered in some test.
If you are using floating point numbers implemented in hardware, then infinity is absolutely a valid value and one that your code will encounter. This is true regardless of language, as long as the language requires or allows IEEE-754 semantics.
I am not aware of any language (outside of intentionally-minimalist esolangs) that doesn't support floating point numbers. In some languages (like JavaScript) that's the only kind of number you get.
If you are seeking a minimum value within some complicated iteration, it’s easier to start your min accumulator as Inf than null with extra null checks.
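A minimal sketch of that comparison, with hypothetical function names. Both versions find the same minimum for non-empty input; the difference is the extra null check on every iteration and what comes back for an empty input.

```javascript
// Starting at Infinity: every finite value compares lower, so no special case.
function minWithInfinity(values) {
  let min = Infinity;
  for (const v of values) {
    if (v < min) min = v;
  }
  return min;
}

// Starting at null: each iteration needs an extra check for "no value yet".
function minWithNull(values) {
  let min = null;
  for (const v of values) {
    if (min === null || v < min) min = v;
  }
  return min;
}

console.log(minWithInfinity([3, 1, 2])); // 1
console.log(minWithNull([3, 1, 2]));     // 1
console.log(minWithInfinity([]));        // Infinity
console.log(minWithNull([]));            // null
```

The built-in `Math.min()` follows the same convention: called with no arguments, it returns `Infinity`.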
I've found that for large codebases in dynamically typed, interpreted languages, test coverage is very useful for catching typos or subtle bugs that wouldn't be caught otherwise.
I think replies mentioning automatic unit test generation miss the point.
To me, the value of copilot helping to write tests is that we, the engineers, come up with the test cases, and copilot helps write the code for that case.
I think humans will still be more imaginative in the test cases they can dream up (although I’ve never used an automatic generator, maybe they’re better than I think), but almost all test code is boilerplate, either in the setup or the assertions.
If I don’t have to write that repetitious, yet slightly different boilerplate for each test case, that frees me up to design other interesting test cases (as opposed to getting tired of the activity by the time I cover the happy path) or move on to the next bug/feature work.
QuickCheck-type tools (generators for tests that know about the edge cases of a domain - e.g. for the domain of numbers considering things like 0, the infinities, various almost-and-just-over powers of two, NaN and mantissas for floats, etc.):
Fuzz testing tools (tools which mutate the inputs to a program in order to find interesting / failing states in that program). Generally paired with code coverage:
Mutation / Fault based test tools (review your existing unit coverage and try to introduce changes to your _production_ code that none of your tests catch)
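The first category can be approximated by hand in a few lines. This is a hand-rolled sketch, not the API of QuickCheck or of any real library: `EDGE_NUMBERS`, `checkProperty`, and the two properties are names made up for illustration, and the check simply enumerates a fixed list of edge cases instead of doing the random generation and shrinking that real tools provide.

```javascript
// Hand-picked edge cases for the domain of numbers (illustrative, not exhaustive).
const EDGE_NUMBERS = [
  0, -0, 1, -1,
  Infinity, -Infinity, NaN,
  2 ** 31 - 1, 2 ** 31, 2 ** 53 - 1, 2 ** 53,
  Number.MIN_VALUE, Number.EPSILON,
];

// Check a two-argument property against every pair of edge cases and
// return the pairs for which it does not hold.
function checkProperty(property) {
  const counterexamples = [];
  for (const a of EDGE_NUMBERS) {
    for (const b of EDGE_NUMBERS) {
      if (!property(a, b)) counterexamples.push([a, b]);
    }
  }
  return counterexamples;
}

const add = (x, y) => x + y;

// Property: addition is commutative. Object.is treats NaN as equal to NaN.
const commutative = (a, b) => Object.is(add(a, b), add(b, a));

// Property: adding a value and subtracting it again gives back the original.
const roundTrips = (a, b) => Object.is(add(a, b) - b, a);

console.log(checkProperty(commutative).length);    // 0
console.log(checkProperty(roundTrips).length > 0); // true
```

The second property fails for pairs such as (1, Infinity), since 1 + Infinity - Infinity is NaN. That is exactly the kind of case a single hand-written happy-path test tends to miss.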
Copilot has been super fun! Here is a small website I generated via only comments - in the HTML and JS. The CSS needed a bit more massaging but it was also auto-generated.
The fact that most of Co-Pilot's usefulness comes from repeating common snippets of code makes me think there has to be a much simpler way to reduce the boilerplate of "common tasks".
In my opinion, dependent type systems are that way.
A dependent type is a type that depends on a value. What something is depends on something else; What something is is derived from something else – with computation. What dependent types are are calculations of type structure.
Unbounded, this becomes intractable like all unbounded computational expressivity. So that’s what modern dependent type systems need to solve—and they do. There exist formal results that frame the computational expressivity of dependent type systems and make perfectly feasible and tractable the task of deriving significantly complex types. Automatically.
The Idris language is a great example of where we’re heading.
The reason I’m more interested in this sort of thing rather than Copilot is that dependent type systems are based on formally rigorous methods. You end up with formally verified programs, by way of the same mechanisms that allow you to derive them automatically. They’re also easier to compose, for the same reasons.
Edwin Brady’s Idris demonstrations on YouTube show a bit of what’s possible. In one, the compiler automatically writes a formally-correct type-directed matrix multiplication function. In another, run-length encoding and decoding functions are generated from a type definition.
The book Type-Driven Development with Idris is a great read. Mind-expanding.
Programming is automation. The automation of automation is… dependent types.
Formally rigorous methods have almost always lost out in the market against slop like JavaScript.
I think the reasons are completely pragmatic and boil down to two things.
The first is that rigorous systems usually take longer to build and meanwhile the slop ships first and gains mindshare.
The second is that rigorous systems are usually built by academics or specialists in some specific vertical and lack the easy installation, easy onboarding, and integration with other platforms that more pragmatic sloppy platforms tend to prioritize first. Those things get prioritized first because sloppy languages and systems tend to evolve from makeshift "shop jigs" used to get stuff done, not from research.
There is much less of it – for instance, as a kind of overly specific example, serializers/deserializers and API code can be automatically generated from the model definition.
Idris code tends to sort of “fold in” test code into the function definitions themselves, so you do less work AND it’s much more rigorous.
Then there’s the compiler IDE interface which is explicitly intended to be an upstanding citizen in developer workflows.
I also want to mention how easy it is to write Idris compiler backends for any given language. This is on purpose, and is made possible by what Idris is: It’s dependent types, and dependent types is automation of the automation.
So it’s not a question of “making muggle developers eat their formal vegetables” or the like. It’s that when the automation of the automation hits, it makes work so much faster and more efficient and more pleasant that it outcompetes the slop because slop is slop. Idris might be that automation; It might not. We’ll see, right?
I’ve picked this hill to die on. At least one death, of something. Maybe just a feeling or a hope, but something.
Why? I haven't used it but it seems like the only reason it is good at repeating common snippets of code is because it is so clever. I don't think a simpler solution would be possible because it isn't exactly repeating code (except in contrived examples). It's adapting it to the context.
I have a few questions about copilot. I haven’t gotten a chance to use it yet.
Is it irrational that this makes me a little anxious about job security over the longterm? Idk why but this was my initial reaction when learning about this.
Given a scenario where Copilot and its like become widely used, can it be argued that this might improve productivity but stifle innovation?
I'm pretty early in my career, but the rate at which things are capable of changing soon doesn't sit too well with me.
> Is it irrational that this makes me a little anxious about job security over the longterm?
In the 50s, we programmed computers with punch cards. Who does now? How many web developers today could tell the difference between `malloc` and `calloc`? Probably not that many.
For a lot of developers, programming today bears very little relation to programming decades ago. Copilot is like any other innovation - it obsoletes some skills, and it introduces new ones.
I doubt copilot will reduce the need for engineers: but it may change the work they do. But that's no different to any other industry.
I don’t like this narrative. Going from punch cards, to Assembly, to C and to dynamic languages all empowered the programmer with more expressive languages. I don’t believe going from Python to programming via code comments is the same deal.
It would be more like we still write asm but we have editors that let you write a little C code and then it spits out a paragraph of ‘reasonable’ asm that still has to be maintained.
That's roughly how I've seen experienced people work on high-performance code. I mean, they rarely maintain the generated assembly directly, but they do write code with the assembly it would generate in mind, and keeping the code generating comparable assembly over time is part of maintaining the performance-critical parts of the codebase.
>I doubt copilot will reduce the need for engineers
Every time this happens, everyone just shifts the goal posts and they now want more features, faster. The majority of software out there sucks. If programmers are now 2x faster, users will demand that some random crud app be at Google software quality. And Google's software will be unimaginable by today's standards.
All of this will increase the value delivered by software, which will bring in greater revenue, which will be reinvested in more developers.
- Copilot is qualitatively different from the kinds of automation of programming we've seen before.
- It's barely version 1.0 of this kind of thing. Deep learning has been advancing incredibly for a decade and doesn't seem to be slowing down. Researchers also work on things like mathematical problem-solving, which would tie in to "real work" and not just the boilerplate.
- In past examples of AI going from subhuman to superhuman, e.g. chess and go, the practitioners did not expect to be overtaken just a few years after it had not even felt like real competition. You'd read quotes from them about the ineffability of human thought.
What to do personally, I don't know. Stay flexible?
There is a world of difference between what Copilot does and what an engineer does. Imagine reading a design document for a feature and implementing that on a large, years-old codebase, i.e. what many engineers do on a daily basis. Copilot isn't even 1 millionth of the way to even beginning to solve that problem. It would require human-level AGI with the capability of understanding human cultural and institutional context: something categorically different from anything happening in deep learning.
I'm not so confident this is really different from how a pro Go player would react 10 years ago to the analogous question.
Put it this way: in 5 years will there be an AI that's better than 90% of unassisted working programmers at solving new leetcode-type coding interview questions posed in natural language? Arranging an actual bet is too annoying, but that development in that timeframe doesn't seem unlikely. It might take more than a scaled-up GPT, but as I said, people are working on those other directions too.
In that future, already, the skills you get hired for are different from now (and not just in the COBOL-versus-C sense). Maybe different people with a quite different mix of talents are the ones doing well.
> I'm not so confident this is really different from how a pro Go player would react 10 years ago to the analogous question.
Yes, and there were people in the 1960s who thought computers of the time were only a decade away from being smarter than humans. The question is one of category -- Go is something that a computer could conceivably be better than a human being at. There were certainly Go programs better than some human beings at that time. "Reading a human language document, communicating with stakeholders to understand the requirements in human language, understanding the business requirements of a large codebase, and writing human-readable code" is categorically different from what Copilot does, and something that no computer is currently capable of. If such a thing is even possible, we haven't even begun to tackle it.
> in 5 years will there be an AI that's better than 90% of unassisted working programmers at solving new leetcode-type coding interview questions posed in natural language?
I think that's highly unlikely, but it is within the bounds of possibility given what we know about AI currently (and probably, like GPT, it will only work under specific constraints). But the gap between that and what an engineer does on a daily basis is enormous.
A programmer's job bridges the informal and the formal. Previous automation practically always was about helping you work with the formal end. A tool that can bridge the informal and the formal on its own is new. That was my first point and most basically why I'm suspicious of dismissals. These developments don't have to 100% substitute for a current human programmer to change the economics of what talents are rewarded.
I could see it being big with domain specialists who can tell very specific 'user stories' but don't want to stop what they're doing in order to learn programming.
I could see it being huge for the GUI scraping market. Or imagine a browser plugin that watches what you're doing and then offers to 'rewire' the web page to match your habits.
Imagine some sort of Clippy assistant watching where you hover attention as you read HN for example. After a while it says 'say, you seem to be more interested in the time series than the tree structure, would you rather just look at the activity on this thread than the semantics?' Or perhaps you import/include a sentiment analysis library, and it just asks you whether you want the big picture for the whole discussion, or whether you'd rather drill down to the thread/post/sentence level.
I notice all the examples I'm thinking of revolve around a simple pattern of someone asking 'Computer, here is a Thing; what can you tell me about it?' and the computer offering simplified schematic representations that allow the person to home in on features of interest and either do things with them or posit relationships to other features. This will probably set off all sorts of arms races, e.g. security people will want to misdirect AI-assisted intruders, and marketers will probably want to start rendering pages as flat graphics to maintain brand differentiation and engagement vs a 'clean web' movement that wants to get rid of visual cruft and emphasize the basic underlying similarities.
It will lead to quite bitter arguments about how things should be; you'll have self-appointed champions of aesthetics saying that AI is decomposing the rich variety of human creativity into some sort of borg-like formalism that's a reflection of its autistic creators, and information liberators accusing the first group of being obscurantist tyrants trying to profit off making everything more difficult and confusing than it needs to be.
You should only be worried if you just copy and paste boilerplate between files. If you're actually able to solve problems and design things there's nothing to be worried about
>...Is it irrational that this makes me a little anxious about job security over the longterm?
No one knows what the future holds, so some anxiety is just fuel for adaptation.
For example, should Copilot see widespread use, the number and scale of projects that will have to be maintained expands too. Moreover, making sense of the patchwork quilt of bits and pieces of code, written, I'd guess, across many iterations/versions of Copilot's prowess, is very much a secure, if soul-killing, job for many. Not much different from what maintenance jobs are now. You're lucky when a project retains some clear overarching architecture/style.
Anybody remember the joys of GUI wizards, with tons of auto-generated code that "just works, just now"? Remember that desire to suggest a healthy rewrite? Well, now you could probably also promise that it would be an even quicker rewrite too!
But even then, the final responsibility for the code is on the programmer. One maybe could forge the code quicker, but code review is still supposed to be a human's job. Hopefully.
I wouldn't worry about job security. Programming is a story of steady gradual automation and all that's done is increased its reach and multiplied the number of niches where there is demand for software.
Future developers might be more like architects guiding AIs and then occasionally jumping in to hand-hold or correct the result.
> Is it irrational that this makes me a little anxious about job security over the longterm? Idk why but this was my initial reaction when learning about this.
I've been programming since the '80s. It's my opinion that the age of humans writing code is coming to a close. Perhaps another 20 years or so, with the peak in ~10; but I'm less certain about the timeline than the destination. There will still be a long tail, but most of the human work will shift to design and wrangling algorithms. The remnant will be hobbyist, such as Commodore 64 programmers today.
Why do you expect machines to not enter the domain of design and algorithms as well? Is there something specifically human about them?
I expect most human intellectual activities of today (from coding to scriptwriting to medicine) can be performed by machines if the current trend continues.
I don't think you should worry too much. While it will certainly make it easy for less experienced programmers to catch up with you, it'll probably free you up from a variety of simple but tedious housekeeping tasks and let you concentrate more on designing your program, while also reducing your time to ramp up on stuff that isn't familiar to you.
I suppose this thing uploads your code to Microsoft servers, then runs some deep learning algos on it and gets back to you with suggestions?
It also probably keeps your code to feed it to the algos.
If big tech was trustworthy I would use this with glee. But when I see how the world is turning, I'll continue to type boilerplate by hand (as long as I'm allowed).
> I suppose this thing uploads your code to Microsoft servers, then runs some deep learning algos on it and gets back to you with suggestions? It also probably keeps your code to feed it to the algos.
Be sure you don't commit your code to github...as that's literally uploading your code to Microsoft servers, I mean if you just don't trust them for being 'Microsoft'.
I would be even more concerned about licensing and copyright at this point. We might see some interesting legal discussions around this, given you probably (I did not check the terms) give consent to GitHub using your code for other purposes beyond querying their model.
The davinci-codex model that copilot uses also has the capability of translating someone else's code into plain English. This has a great value potential.
Ultimately I'd like to have a conversation with a machine, where I can describe what I want, and the machine can serve me a possible answer, and then I can respond to clarify the idea.
If I want to throw an exception if an object is null or undefined, Co-pilot will do the if and the exception throw using the right error class, and a more meaningful error message than what I usually come up with.
If I want to map some random technical data table to something useful in my programming language, I can copy and paste the data from the documentation in PDF or HTML into a comment block, give an example, and Co-pilot will write everything in the format I want.
If I want to slightly refactor some code, I can do it once or twice and Co-pilot can help me a lot to refactor the rest.
If I want to have a quick http server in nodejs, I don't have to Google anything.
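As an illustration of the first item in that list, this is the kind of guard being described. It is a hand-written sketch, not actual Copilot output; `requireDefined`, the `config` object, and the message wording are all hypothetical.

```javascript
// Throw a descriptive TypeError when a value is null or undefined.
function requireDefined(value, name) {
  // `== null` is true for both null and undefined, and for nothing else.
  if (value == null) {
    throw new TypeError(`Expected ${name} to be defined, but got ${value}`);
  }
  return value;
}

// Hypothetical usage: `port` is present, `host` is missing.
const config = { port: 8080 };
console.log(requireDefined(config.port, 'config.port')); // 8080

try {
  requireDefined(config.host, 'config.host');
} catch (err) {
  console.log(err.message); // Expected config.host to be defined, but got undefined
}
```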
This is the kind of thing I would need to see in real time, because I simply can't believe that it does any of these things in a way that is reliable and doesn't involve having to search through and make sure it hasn't made any mistakes, taking just as much time as if you did it by hand.
Likewise skeptical, but I have been super impressed with it. I just got in to the technical preview, and worked through a specific task I needed to do (involving mongoose, a mongo aggregate query, a few loops, some date functions) and started by adding a comment above each line. It helped a lot actually, felt like a collab.
I'll reproduce a generic example I sent a friend.
Prompt:
const redis = require('redis');
// Create a redis client
Copilot suggested:
const client = redis.createClient();
I accepted it and moved 2 lines down, prompt:
// Get the current count from redis
Copilot suggested a start to the operation:
client.get('count', (err, count) => {
I accepted it, and Copilot continued:
  if (err) {
    console.log(err);
    return;
  }
  // If there is no count, set it to 0
  if (!count) {
    count = 0;
  }
  // Increment the count
  count++;
  // Set the new count in redis
  client.set('count', count);
  // Log the new count
  console.log(count);
}
Redis has the INCR command that does this in redis without the additional round-trips (and race conditions). It also sets the value to 0 if the key doesn't exist.
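The race is easy to reproduce without a Redis server. The sketch below uses `FakeStore`, a hypothetical in-memory stand-in written only for this illustration (it is not the node-redis API), whose `incr` method imitates the semantics of INCR described above. Two concurrent get-then-set increments lose an update, while two concurrent `incr` calls do not.

```javascript
// Hypothetical in-memory stand-in for a Redis client, for illustration only.
// Each call resolves on a later tick, the way a network round-trip would.
class FakeStore {
  constructor() {
    this.data = new Map();
  }
  async get(key) {
    await null; // yield, simulating latency
    return this.data.has(key) ? this.data.get(key) : null;
  }
  async set(key, value) {
    await null;
    this.data.set(key, value);
  }
  // Imitates INCR: a missing key counts as 0, and the read-modify-write
  // happens in one step with no gap in which another caller can interleave.
  async incr(key) {
    await null;
    const next = (this.data.get(key) || 0) + 1;
    this.data.set(key, next);
    return next;
  }
}

// The suggested pattern: read, increment locally, write back.
async function incrementWithGetSet(store, key) {
  let count = await store.get(key);
  if (!count) count = 0;
  count++;
  await store.set(key, count);
}

async function main() {
  const racy = new FakeStore();
  await Promise.all([
    incrementWithGetSet(racy, 'count'),
    incrementWithGetSet(racy, 'count'),
  ]);
  console.log(await racy.get('count')); // 1: both callers read null, one update is lost

  const atomic = new FakeStore();
  await Promise.all([atomic.incr('count'), atomic.incr('count')]);
  console.log(await atomic.get('count')); // 2
}

main();
```

With a real client the fix is the same shape: replace the get, increment, set sequence with a single INCR call, which also saves the extra round-trips.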
So, I actually consider this to be exactly the bad behavior that people accuse Copilot of.
This was just one of its suggestions, but you're right of course.. it's all based on the training data and idioms used there. If it doesn't weight more modern code higher, if it's not aware of new versions and methods, it isn't going to be super intelligent.. but it can still give you some ideas.
Writing that probably takes a longer time than just doing it with the IDE help in jetbrains, though?
Press "r", press "." to autocomplete so it now says "redis.", then write "cC" and it suggests createClient (or write "create" or "client" if you're not sure what you're looking for). Now it says "redis.createClient()". Press ctrl+alt+v to extract a variable or write ".const" and press tab to apply the const live template. Ending up with your result in two seconds.
The power of something like Copilot is in building out stuff you're not familiar with or don't have templates set up. It's probably not as helpful when you already have a clear idea of what you want to do and just need it rather than think about it.
+1 for the variable extraction thing, I've been using their IDE for ages and it never occurred to me to look for such a thing.
I use it especially much when writing old school java. Instead of writing "MyClass myclass = new MyClass()", I just write "new MyClass()" and get the typing for free. Even better when you do longer expressions and don't want to think about the type up front. Like working with streams or so.
Yeah this would be quite helpful for me as I tend to just experiment with things in the console (cleaning up messy datasets and the like) and then copy or rewrite into something more structured later. I feel like I'm only using about 20% of what Pycharm can do.
It's _very_ good at "learn by example" with some twists. It _does_ make mistakes, and I do double check it, but it still definitely saves time. I used it to write the bulk of a new implementation of a new audio backend for a game engine yesterday - it filled out a lot of the "boilerplate" integration work (e.g. generating all the functions like "set volume/pan/3D audio position" that map over 1:1 to functions in the other library).
I will say, though, that it's also good at making up code that looks very believably real but doesn't actually work.
The ethics involved in Copilot are a bit strange, and I'm not sure I'll keep using it for those reasons, but it does a good job.
Tried it out for a while, and it's clear that it's trying to get people to be faster at writing boilerplate, not get people to write better code.
I'm a bit scared of what this means, as I don't think being able to write boilerplate faster is something worthwhile. The example ed_elliott_asc made is one of those examples where instead of fixing things so you don't have to repeat yourself, Copilot makes it easy to just live with the boilerplate instead.
Possibly yes, if you're only contrasting it with being able to be faster.
I mean, it's sort of a false dichotomy -- it's omitting a "default speed" for writing boilerplate that is neither enhanced nor impeded.
The potential issue with an enhanced speed for writing boilerplate is that there'll just be more and more boilerplate to maintain over time, and it's not clear what that cost will be.
How much more effort will be expended to replace things in multiple places? It exacerbates existing issues of "these two things look almost the same, but someone manually modified one copy...should that change be propagated?"
Meaning, it's essentially an ad-hoc code generator. Code generation can be a very useful technique (see protobufs), but without the ability to re-generate from a source?
Perhaps a possible enhancement might be for copilot to keep track of all blurbs it generated and propose refactoring/modifying all copies?
> I mean, it's sort of a false dichotomy -- it's omitting a "default speed" for writing boilerplate that is neither enhanced nor impeded.
I'm not sure. I think that understanding is omitting a "default amount" of boilerplate that will have to be written regardless of one's individual preference that is really a function of the language / framework of choice, the existing codebase and the problem at hand.
Removing that boilerplate would be ideal but is not always possible given limited resources, time constraints and the usual inability to make sweeping changes to the codebase
So we settle for the second best solution which is to automate away that tedious process (or short of that provide developers with tools to get it out of the way faster) so we can all focus on "real work"
In my experience whenever someone tries to "innovate" away boilerplate they end up creating shitty abstractions that are inflexible, poorly documented, and unmaintained.
Boilerplate generally exists for a reason, and it's not because the creator likes typing.
Writing a slightly abstracted library to handle populating a list isn't necessarily "fixing" something. It might be, for sure, but is going to be very use case dependent, and there are a lot of instances where it's better to have 5, 10, or yes even 15-20+ nearly-identical lines and be done in a minute or two (or 5 seconds with Copilot IME) than spend half a day tweaking test coverage on your one-off library.
> Writing a slightly abstracted library to handle populating a list
> than spend half a day tweaking test coverage on your one-off library
If you need to write a library and spend half a day to populate a list, you have bigger problems than boilerplate.
Nothing wrong with having duplicate lines. The problem arises when writing those lines becomes automated, so you start spewing them all over the place.
Boilerplate is exclusively what I use AI-powered code completion for (currently Tabnine).
In a perfect world we’d all have excellent comprehensive metaprogramming facilities in our programming languages and no incidence of RSI (e.g. carpal tunnel syndrome). Code completion is a good tool to deal with these realities.
It's going to be great for exploratory data science. You don't really need stellar, maintainable or extensible code for that, the early stage is largely about iteration speed.
Iteration speed also depends on the code being well written and performant: you need to get results faster to iterate faster.
Also, if you don't fully understand your code (when it is generated or copied from SO), as is not uncommon with junior developers and data science practitioners, you struggle to make even a small change for the next iteration, because you don't fully understand what the code is doing and how.
When your code is composable or easily modifiable, iterations become faster because you understand what you have written. This is one of the reasons why analysts prefer Excel when the data size is within its limits.
I wrote a test recently for a simple "echo"-style server: a client writes a name to a socket, and the server replies with "Hello, " + name. Nothing crazy.
In the test body, I wrote "foo" to the socket. Copilot immediately filled in the rest of the test: read from the socket, check that the result is "Hello, foo", and fail the test with a descriptive error otherwise.
wtf? How did it figure out that the correct result was "Hello, foo"? I changed "Hello" to "Flarbnax", and sure enough, Copilot suggested "Flarbnax, foo". I added a "!" to the end, and it suggested that too. After pushing a bit further, the suggestions started to lose accuracy: it would check for "???" instead of "??" for example, or it would check for a newline where there was none. But overall I came away very impressed.
It is massively improving my productivity; things I couldn't be bothered to write, it does for me.
The thing I find really good is that it can predict what I will do next. Say I have a list of columns in some text somewhere in the project: when I write one `df = df.withColumn("OneItemInList")`
Copilot will then add the same for all the other items. It's really nice.
I find it works well when my intent is clear. For example, I might want to log a value I just computed for debugging purposes. I type LOG, wait a second, and it completes a zephyr logging macro, complete with a sensible message and the value I just computed.
It sort of feels like pair programming with an undergraduate, except copilot never learns. That isn't to say it's bad, more that it is just a tool you can hand simple stuff off to, except the handoff is zero-effort.
EDIT: I will say that there are times when it makes up fantasy constants or variable names, that seem plausible but don't exist. An eventual version of Copilot that includes more normal autocompletion information, so it only suggests symbols that exist, will be a stronger tool.
It's truly amazing, almost felt magical the first few autocomplete results I got.
There are the benefits that a lot of people mentioned, but to me the biggest benefit is that I can avoid procrastination. Usually when I'm blocked on something I'll run a search in the browser, but very quickly I end up going off the trail and just browsing the web and losing a lot of time. Now when I'm blocked I simply type a comment describing what I'm trying to do, and the autocomplete suggestion is pretty damn good and unblocks me very quickly. Most surprising of all, it somehow understands my code style by looking at the context.
I was working on some internationalization stuff, translating some phrases from english to portuguese and Copilot just did it for me, does not seem like much but for me that is amazing.
I was able to write {"Settings":...} and Copilot completed with {"Settings": "Configurações"} that tool is simply amazing.
It can generate the body of test cases well, especially in BDD frameworks where you write the high-level scenario first to prime it with context. Less tedium encourages me to write more tests.
More verbose languages like C++ become less obnoxious to write in. I know RSI has been mentioned and any tool which cuts down on excessive typing will help with that.
It sometimes reveals bits of the standard library I wasn't aware of in unfamiliar languages. I can write my intent as a comment and then it may pull out a one-liner to replace what I would have normally done using a for loop.
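As a small illustration of that kind of discovery, here is a hedged Python sketch. The word list is made up, and `collections.Counter` is only one example of a standard-library helper that can replace a hand-written loop.

```python
from collections import Counter

words = ["spam", "eggs", "spam", "ham", "spam"]

# What I would normally have written: an explicit counting loop.
counts_loop = {}
for word in words:
    counts_loop[word] = counts_loop.get(word, 0) + 1

# The standard-library one-liner that does the same job.
counts = Counter(words)
```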
The main downside I've observed is that if I'm not trying to rein it in, it can result in a lot of WET code, since it can pattern match other areas of the surrounding code but can't actually rewrite anything that has already been written. It is important to go back and refactor the stuff it produces to avoid this.
What I like most about Copilot is seeing different programming styles suggested to me
For example, I didn't know about self.fail() in unittests and had never used it, but Copilot suggested it and it produced the most readable version of the unit test
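For readers who have not seen it, `self.fail()` is a `unittest.TestCase` method that fails the test unconditionally with a message. A minimal sketch follows; `parse_port` is a hypothetical function under test, not anything from the comment above.

```python
import unittest


def parse_port(text):
    """Hypothetical function under test: parse a TCP port number."""
    port = int(text)
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port


class ParsePortTest(unittest.TestCase):
    def test_rejects_out_of_range_port(self):
        try:
            parse_port("70000")
        except ValueError:
            return  # the expected path: the bad port was rejected
        # Reached only if no exception was raised.
        self.fail("parse_port accepted an out-of-range port")
```

`assertRaises` is the more common idiom for this particular check; `self.fail()` is handy when the failure condition does not map onto one of the built-in assertions.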
I only played around with it on OpenAI but it's the same model as far as I know. It's pretty good at regurgitating algorithms it's seen before. It's not good at all at coming up with new algorithms.
It's very good at translating between programming languages, including pseudocode.
It can write a lot more valid code much quicker than any human, and in a whole slew of languages.
I haven't had the urge to use it much after playing around with it constantly for a few days, but it was pretty mind-blowing.
Your response makes me wonder if poisoning the well is possible by submitting code to Github with multiple languages and coding styles. A single file with a function signature written in Javascript and the body written in Python + Ruby. Enough code would surely break the AI model behind it. Unless Copilot has some sort of ingestion validation, which wouldn’t surprise me.
Probably but you would have to submit an absurdly large amount of code to make a dent. Practically unreasonable considering their training corpus is also increasing per lines of public code submitted on github.
So not only would you have to submit an insanely large amount of code, but you're also racing against literally millions of users writing legitimate code at any period of time.
I don't know if this is true, but I would assume that the tokenizers they used for Codex use actual language parsers which would drop invalid files like this and make this attack infeasible.
When I was playing around a couple years ago with the Fastai courses in language modeling I used the Python tokenize module to feed my model, and with excellent parser libraries like Lark[0] out there it wouldn't take that long to build real quality parsers.
Of course I could be totally wrong and they might just be dumping pure text in, shudder.
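For reference, feeding a model from Python's `tokenize` module can be sketched roughly as below. This is an assumption about how such a pipeline might look, not the commenter's actual code, and `python_tokens` is a hypothetical helper name.

```python
import io
import tokenize


def python_tokens(source):
    """Return (token type name, token text) pairs for a Python source string."""
    reader = io.StringIO(source).readline
    skipped = (tokenize.NEWLINE, tokenize.NL, tokenize.ENDMARKER)
    return [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(reader)
        if tok.type not in skipped
    ]


tokens = python_tokens("total = price * 2\n")
```

The result is a stream of typed tokens rather than raw characters, which is the kind of input a language-aware tokenizer would hand to a model.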
In any training with code I've done, we've written a parser that validates against tree sitter grammars to make sure it's at least syntactically valid against some known subset of languages we're training on.
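A rough sketch of that kind of syntactic gate is shown below. Note the swap: it uses Python's built-in `ast.parse` as a stand-in for tree-sitter grammars, so it only covers Python, and the helper names and sample files are made up for illustration.

```python
import ast


def is_valid_python(source):
    """Return True if the source parses as Python, False otherwise."""
    try:
        ast.parse(source)
    except (SyntaxError, ValueError):
        return False
    return True


def keep_parsable(files):
    """Drop every file whose contents do not parse."""
    return {path: src for path, src in files.items() if is_valid_python(src)}


files = {
    "ok.py": "def add(a, b):\n    return a + b\n",
    # JavaScript-style signature with a Python-style body: rejected.
    "mixed.py": "function add(a, b) {\n    return a + b\n}\n",
}
clean = keep_parsable(files)
```

A mixed-language file like the one proposed earlier in the thread would be filtered out before training under a check like this.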
In which case, shift strategies toward code that looks correct but isn’t, using syntax shared between languages as well as language-specific gotchas.
Yeah but if malicious intent is a concern you can just spin up a sandboxed instance to run the code to check first.
Really the thing is there's no way to ascribe correctness to a piece of code, right? Like, humans fail at this even. The only "correct" code is like rote algorithmic code that has a well defined method of operation. And there's likely a lot more correct examples of that, like way more than you'd ever be able to poison.
You may be able to be misleading though by using names that say one thing but do another, but again you'd be fighting against the tide of correctly named things.
If I have an if else case, a switch statement or something similar, it can often predict the next branch exactly how I would write it. That’s probably 80% of the suggestions I accept; the rest are single-line autocompletes. I have never accepted a whole function implementation, and they are actually rather annoying because they make the document jump.
It’s useful enough for me, as a magic autocomplete.
Useful to, say, write a Python script doing some mundane things, like generating all the argparse lines for you, reading the files, etc.
In a way, it does the dirty pipes surprisingly well. But when it comes to implementing the core of the algorithm, it is not there yet, but the potential is huge.
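To make "the dirty pipes" concrete, this is the sort of boilerplate meant here, as a minimal sketch. The script's purpose (counting lines) and all names in it are assumptions chosen for illustration.

```python
import argparse
from pathlib import Path


def build_parser():
    """The argparse lines that are tedious to type by hand."""
    parser = argparse.ArgumentParser(description="Count lines in text files.")
    parser.add_argument("paths", nargs="+", type=Path, help="files to read")
    parser.add_argument("--encoding", default="utf-8", help="text encoding of the files")
    parser.add_argument("-v", "--verbose", action="store_true", help="print file names too")
    return parser


def count_lines(path, encoding="utf-8"):
    """Read one file and return how many lines it has."""
    with path.open(encoding=encoding) as handle:
        return sum(1 for _ in handle)


def main(argv=None):
    args = build_parser().parse_args(argv)
    for path in args.paths:
        total = count_lines(path, args.encoding)
        print(f"{path}: {total}" if args.verbose else total)
    return 0
```

Calling `main(["notes.txt", "--verbose"])` would print the file name and its line count; the core logic of a real script is the part that still has to be written by hand.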
As others pointed out, it makes boring or repetitive tasks a breeze.
Also, it’s like a more clever auto complete most of the time, even when it’s wrong in calling a function you can use it as foundation code to go faster.
And you don’t need to think too much about it, it really keeps you in the flow.
I've actually found it helpful as an API autocomplete, but... also not helpful at the same time.
So for example I was working with processing an image to extract features and a few variants of docstrings for the method got me a pretty close to working function which converted the image to gray scale, detected edges, and computed the result I wanted.
The helpful thing here was that there were certain APIs that were useful as a part of doing this that it knew but which I would have had to look up. I had to go through and modify the proposed solution: it got the conditional in the right place, but I wanted a broader classification so switched from a (255, 255, 255) check to a nearBlack(pixel) function which it then autocompleted successfully. I also had to modify the cropping.
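A helper like that could be as small as the sketch below. The real function was not posted, so the name `near_black`, the RGB tuple layout, and the threshold of 40 are all assumptions.

```python
def near_black(pixel, threshold=40):
    """Return True when every RGB channel is at or below the threshold.

    The default threshold is an illustrative guess, not a tuned value.
    """
    red, green, blue = pixel[:3]  # ignore an alpha channel if present
    return max(red, green, blue) <= threshold
```

Compared with an exact-match check against a single colour, this accepts a whole band of dark pixels, which is the broader classification described above.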
When doing a similar task in the past I spent a lot more time on it, because I went down a route in which I was doing color classification based on the k nearest neighbors. Later I found that the AI I was working on was learning to exploit the opacity of the section of the screen I was extracting a feature from in order to maximize its reward, because it kept finding edge cases in the color classifier. I ended up switching to a different color space to make color difference distance functions more meaningful, but it wasn't good enough to beat the RL agent that was trying to exploit mistakes in the classifier.
Anyway, what I'm getting at here is that it is pretty easy to spend a lot of time doing similar things to what I'm doing and not get a great solution at the end. In this case though it only took a few minutes to get a working solution. CoPilot didn't code the solution for me, but it helped me get the coding done faster because it knew the APIs and the basic structure of what I needed to do. To be clear, its solutions were all broken in a ton of ways, but it didn't matter it still saved me time.
To give another example let's say you have a keyboard key event press and you weren't sure about how to translate that into the key that was pressed. key.char? key.key? str(key)? key.symbol? A former method of figuring out what the right key might be is looking up the code, but with CoPilot you type '# Get the key associated with the key press' then hit tab and it gives you code that is broken but looks perfect and you gain a false sense of confidence that you actually know the API. You later realize after being amazed that it knew the API so well that you didn't have to look it up that actually the key press event handles symbols differently and so it errors on anything that is used as a modifier key.
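The failure mode can be shown with stand-in classes. The sketch below does not use any real keyboard library: `CharKey` and `SpecialKey` are hypothetical event types, assumed only to differ in whether they carry a `char` attribute.

```python
from dataclasses import dataclass


@dataclass
class CharKey:
    """Hypothetical event for a printable key."""
    char: str


@dataclass
class SpecialKey:
    """Hypothetical event for a modifier key; note there is no `char`."""
    name: str


def key_label_naive(key):
    # The plausible-looking suggestion: works until a modifier key arrives.
    return key.char


def key_label(key):
    # Defensive version: fall back when the event has no character.
    char = getattr(key, "char", None)
    if char is not None:
        return char
    return getattr(key, "name", str(key))
```

`key_label_naive(SpecialKey("shift"))` raises `AttributeError`, which is the kind of error that only shows up after the suggestion has already been trusted.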
My general impression is something like: Wow, this is amazing. It understood exactly what I wanted and it knew the APIs and coded up the solution... Wait, no. Hold on a second. This and that are wrong.
I am in the same boat with you. I am simultaneously wowed and underwhelmed to some degree.
Yes, it is amazing when it gets things right; it feels like cheating. But at the same time, it often does ... too much? Reading a huge chunk of code and figuring out where it goes wrong is not a thing for me. Also, Copilot doesn't really know the API, so yes, the mental tax of making sure your program really behaves isn't any lower.
But again, I see the idea of Copilot as already a huge win. I hate writing those manual scripts that just offer people an entry point to some utility behind them. Copilot does those things surprisingly well and with accuracy.
Let it improve in the future, and we will see changes that are quite fundamental to the idea of programming itself.
Legally, it needs to be opt-in in order to protect downstream consumers of code written by Copilot.
Copilot sometimes reproduces code verbatim. You can't use open source code except under the terms of the license. Authors whose code may be reproduced by Copilot need to grant a license to downstream consumers, and republishers of Copilot-generated code need to adhere to the terms of that license.
Copilot is inserting ticking time-bombs into its users' codebases.
Nope. Copilot users are inserting "ticking time-bombs" into their own codebases.
The buck stops with the user. When they use code from any source at all, whether it's their head, the internet, some internal library, lecture notes, a coworker, a random dude off the street, or who knows what else, it is their own responsibility to ensure the code they're using has been released under a license they can use. They don't get to go back and point fingers just because they didn't do their own due diligence.
The exception would be if a vendor provides code under a legal contract providing liability and an agreed license; that has not happened in this case, so there's no reason to expect any legal protections.
We agree that downstream users who redistribute copyrighted code regurgitated by Copilot are in violation of copyright.
It doesn't seem to me as though the distinction between "Copilot reproduced the code and the engineer copy/pasted/saved it" versus "Copilot inserted the code" is crucial.
There's a separate question about Microsoft's own liability. When Copilot reproduces open source code without adhering to the terms of the license, that's redistribution and thus copyright infringement. A copyright owner might not be able to get substantial monetary damages, but they ought to be able to get a copyright injunction.
I wonder what happens to Copilot should a Github user secure injunctive relief, forcing Microsoft to exclude their code from Copilot.
> It doesn't seem to me as though the distinction between "Copilot reproduced the code and the engineer copy/pasted/saved it" versus "Copilot inserted the code" is crucial.
I think that could be crucial.
If I read a computer science book, and from that produce a unique piece of code which was not present in the book, I have created a new work which I hold copyright over.
If I train a machine learning algorithm on a computer science book, and that ML algorithm produces some output, that output does not have a new copyright.
Similarly, if copilot synthesizes a bunch of MIT code and produces a suggestion, that may be MIT still, while if a human does the exact same reading and writing, if it is an original enough derivative, it may be free of the original MIT license.
The way I'm reading your reply seems like sophistry, so I expect I'm misunderstanding you.
Scenario 1: Copilot, operating as an IDE plugin, placed the suggestion directly into the text. To accept the suggestion, the engineer hit save.
Scenario 2: Copilot, placed its suggestion in an external file. The engineer copy/pasted the suggestion verbatim into their IDE, then hit save.
These don't seem as though they materially affect the situation. Regardless, the downstream user who somehow brought the copyrighted code into their codebase (which they subsequently redistribute) is infringing.
This theoretical case where Copilot is not involved and the user synthesizes something on their own is not germane. Copilot is involved.
What are you folks getting at? That Microsoft is in the clear? That the end user is in the clear? That "I'm just making suggestions" is akin to "I'm just asking questions" and absolves the suggester of liability? I don't get it.
Thanks for giving me the benefit of the doubt, but I do not deserve it in this case. I misread what I was responding to and my response was off the mark.
You're right to be confused, and my reply can be ignored as off-topic for the thread I'm in.
That's generous of you, since you were not alone. It seems as though I could have done a better job of emphasizing from the get-go that I thought infringement by the end user was the key point, rather than infringement by Microsoft.
> It doesn't seem to me as though the distinction between "Copilot reproduced the code and the engineer copy/pasted/saved it" versus "Copilot inserted the code" is crucial.
Yes, the only thing that matters is who authorized the code to be published, which is never Copilot (an automated system that takes tickets, has copilot craft patches from them, then publishes them with no human review would be a) very cool and b) an incomprehensibly terrible idea; but even then there is still a human authorizing the code to be published, just residing a level of abstraction removed from the process itself)
>It doesn't seem to me as though the distinction between "Copilot reproduced the code and the engineer copy/pasted/saved it" versus "Copilot inserted the code" is crucial.
Is Google responsible when they index licensed code, then others steal it? It's surely the liability of the programmer to "check" (Not really sure how this would work, either).
What matters is that users of Copilot (the "others" who "steal it" in your scenario) are liable for infringement. That renders Copilot impractical as a tool for production use, regardless of whether Microsoft has any liability.
If it is relevant at all, the threshold of originality applies to the allegedly-copyrighted source consumed by Copilot (as regards bare infringement, not willful infringement). If that doesn't meet the threshold, it is not copyrightable. If it does, unauthorized copying not within a copyright exception (e.g., fair use) is infringement.
I can't see any case where originality of the snippet presented by Copilot matters (if Copilot were a person, it would matter to determining if the snippet on its own was copyrightable by Copilot, but still wouldn't be relevant to whether the original copyright was violated.)
How much of the code needs to be "unique" across a single codebase for Copilot to be illegally pasting it downstream?
For a great deal of copilot insertions, it's like the equivalent of me writing the sentence, "the man gasped in surprise" in a novel I'm working on. Yeah maybe that sentence came from somebody else's novel, but you can find the same damn sentence in a thousand other books/papers/etc as well.
A friend of a friend told me he is furiously adding code with subtle security bugs to Github. He can't wait for it to show up in the proper places...Courtesy of Copilot ;-)
I wonder if that could be a nice money maker. Introduce a lot of generic functions with common names, add security bugs. Maybe add a README entry telling people to not trust the code because you're using it to demonstrate what insecure code looks like, and then wait for some big company with a bug bounty to introduce it to their code base. License your code as proprietary or AGPL to make sure the company is the one who gets in trouble if they admit the code comes from you.
With enough nearly-working functions spread across multiple projects in every language known to man, you could practically automate your way into a steady stream of hacker bounties.
People would probably call it unethical, but if Copilot's massive IP violations are okay then who cares. As long as the project's security flaws are recognisable by humans it doesn't matter IMO.
What if I - a human - read your code, learn from it, and then use that knowledge to write my own code? Am I stripping your license off and ignoring it?
It is. Machines are running algorithms, which, by definition, are not creative.
Of course, people may argue that people are not creative, but considering that for a recent court case, it was decided that AI cannot be an inventor, it does matter to that court at least.
It's not. The legal system doesn't have the consistency or dependability of code. A lot of legal 'results' (judicial decisions) are somewhat arbitrary, balancing abstractions with practicality and so on, but get attention because they include clear explanations of the basic issues. Other lawyers then treat those decisions as 'legal facts' by echoing the language and (basically) challenging other lawyers to come up with a more impressive explanation than the decision that they are citing.
Look into a famous recent copyright case, Cariou v Prince. Cariou is a photographer, who made a series of photographs of people in Jamaica and published them in a book. Prince, an artist, liked them, and then treated them as raw material, basically printing them up large, adjusting them, and slapping some paint on top, and declaring it original art - indeed, he called it 'appropriation art' saying 'this is mine now.'
Cariou was upset (very understandably) and sued. The judge found for Cariou, said that Prince was a bullshit rather than a visual artist, and ordered the infringing work to be collected and set fire. Prince appealed, and his appeal succeeded, with the 2nd circuit saying it was "transformative" from the point of view of a "reasonable observer" and therefore fair use, because Prince had added a different aesthetic by turning the portrait photos into oversize graffiti-collage mashups. Cariou gave up at that point as he didn't have the resources or will to fight the case further, and eventually the two artists settled.
Look up the art and see for yourself. I think Prince does have his own aesthetic, but it's a very shallow one that just surfs on other people's work, not very different from drawing glasses and a mustache on top of an existing portrait and saying you made an original work. In many ways, what he's selling is his taste, and the modifications he makes to the picture are just a sort of signature that's semantically equivalent to saying 'I, Richard Prince, approve of this image' - the artist as curator-critic of others' work, if you like.
Back in the computing context, this decision substantially loosens the boundaries of copyright. Found some code you wish you had written and want to put out your own thing, but feel stymied by the license? Just refactor it, add a bunch of sassy comments, and make the interface (whether, CUI, CLI, or API) aggressively different - not necessarily better, just distinctive. If it's fun and whimsical, make it corporate and bog-like. If it's scientific and functional, make it silly and juvenile. Just futz with it a bit until you can plausibly say you either made it better or easier to read or more accessible/intercompatible in some way. Hell, slow it down a bit and say your code smells better because it cooks the JSON longer, and generates a bunch of 'useful' statistics that might seem superfluous to the original designers but are essentially interesting to you. You'll probably get away with it.
All of that completely ignores the fact that a court just basically said that a machine cannot invent, which is a small step away from saying a machine cannot copyright, which would mean that Copilot is not transformative.
The copyright case you cited involves humans. We are talking about machines.
Copilot is used by human programmers. It's not cloning entire programs; little bits of programs that are curated and assembled into something new are easily going to pass the test of transformativity. You go into court with that argument, and the other side will just point out that you have no way of showing the entire program you're complaining about was written by machine.
You seem to be arguing that if it includes any copyright code at all, the whole program is thus an infringement. You will be laughed out of court with that. I'm sorry, but I really don't think you've thought your argument through.
I'm sorry, but I think you have not thought my argument through.
First, I'm well aware that a whole program is not an infringement. It doesn't have to be for there to be infringement by some piece of code in the program, which would be what I was arguing.
When I go into court, what I will say is, "This piece of code infringes, here is my original." The court will rule on whether that piece infringes.
At that point, I have not argued at all about machines, and I don't have to. All I have to do is argue that the code infringes because then, if I win, the company or entity that is infringing on my code has some options:
1) They can claim it came from Copilot, and thus, that they are not at fault, at which point the court will laugh them out of court, as you say, and I collect my damages.
2) They can sue Microsoft for stripping the license from the code and making them unable to find the provenance of the code, and I still collect my damages.
If Microsoft is successfully sued, they will have to ensure Copilot does not strip licenses and gives the programmer information about where the code came from. That would be a win for me because all I want is for my code to be used according to the licenses.
If Microsoft is not successfully sued, then companies will realize that using Copilot is a bad idea because they could be held liable for code that they do not know the provenance of, which is a bad business risk, one which companies will not take, even big companies like Google, which does not use any AGPL code.
Regardless of which option they choose, that initial victory for me in court means that companies now know that using Copilot can lead them into a copyright minefield unless licenses are passed through, so I win against Copilot without ever arguing against it in court.
tl;dr: I'm not going to be directly suing Microsoft. Instead, I'm going to exercise my rights under copyright and my licenses as I should and let the dominoes fall toward making Copilot a dead product.
Frankly, I think it's on you to lay out your argument in full rather than assume everyone is privy to your thought process.
You seem to be coming at this as if the law is a purely mechanistic thing that can quickly resolve disputes, overlooking how these things play out in the real world, like Oracle v. Google going on for a decade or the even longer litigation involving SCO and IBM.
I mean, what makes you so sure the court is going to give you a quick judgment on the infringement, or that it's going to agree with you about the size of code fragment that is sufficient to infringe? Perhaps if they do verbatim copies of some unusually original algorithm you have developed, but given the fact that Copilot enthusiasts mostly laud it for its ability to generate decent boilerplate/housekeeping code, a court might well find that the similarity to your code isn't infringing because the code in question doesn't do anything very distinctive. Commercial code shops are risk averse, it is true, but they also tend to have house styles on everything from variable naming to formatting that would further muddy the waters.
I feel a lot of your argument is begging the question (in the legal sense of assuming your conclusion) without considering whether the court will agree your code was infringed upon. Surely you can agree that sufficiently small code fragments won't meet this threshold because they're too basic or obvious. Because your whole argument here rests upon that assumption, it comes off as a wish fulfillment scenario where Copilot disappears because nobody likes the risk calculus; your stated goal of 'making Copilot a dead product' seems more emotional than rational.
In reality it will take you a long time to get a result, and if enough people find Copilot useful (which I suspect they will), legal departments will adapt to that risk calculus and just figure out the cost of blowing or buying you off in the event that their developers carelessly infringe. If it sufficiently improves industrial productivity, it will become established while you're trying to litigate and afterwards people will just avoid crossing the threshold of infringement.
Honestly, this exchange makes me glad that I don't publish software and thus don't care about license conditions on a day to day basis.
> Frankly, I think it's on you to lay out your argument in full rather than assume everyone is privy to your thought process.
No, it's on you to not assume you know everything about my thought process before I show you otherwise.
Could I have communicated better? Yes. But I didn't assume you knew everything about my thought process. I thought it wasn't necessary for you to until you assumed that you knew my argument better than I did.
> You seem to be coming at this as if the law is a purely mechanistic thing that can quickly resolve disputes, overlooking how these things play out in the real world, like Oracle v. Google going on for a decade or the even longer litigation involving SCO and IBM.
Once again, you are assuming. Yes, I know law is not mechanistic. Yes, I know going to court would take a long time.
Going to court is not the only thing I am doing. I also created new licenses, which I would not have if I only cared about what happened in court.
Going to court would be to attempt to argue for and enforce my viewpoint (indirectly). It would be a last-ditch attempt.
The first thing I am doing is creating new licenses specifically meant to "poison the well" for machine learning on code in general and Copilot in particular. [1]
With those licenses, I hope to make companies nervous about using Copilot for anything that might be using my licenses. This hesitation may only apply to code with my licenses, but the FAQs for those licenses ([2] is an example) are also designed to make lawyers nervous about the GPL and other licenses.
If I succeed in making the hesitation big enough, then Copilot as a paid service would be dead, and hopefully enough companies will prohibit the use of Copilot, as is already being done. [3]
Going to court, then, would only happen if I found someone infringing.
This will be especially helped by the fact that the vast majority of the code under those licenses will be in a language I'm building right now. If there's open source code in the language, then I can search that code for infringements caused by Copilot.
> I mean, what makes you so sure the court is going to give you a quick judgment on the infringement, or that it's going to agree with you about the size of code fragment that is sufficient to infringe?
Do you think I would be stupid enough to pick an example to bring before court that would not be obviously infringing?
Winning in court is not just about being right, it's also about picking your battles, and I would be very choosy.
> Surely you can agree that sufficiently small code fragments won't meet this threshold because they're too basic or obvious.
Yes, and as I said above, I won't use any of those.
> Because your whole argument here rests upon that assumption, it comes off as a wish fulfillment scenario where Copilot disappears because nobody likes the risk calculus;
You realize that this is the entire basis for the cybersecurity industry? The entire point is to make it economically infeasible for bad guys to do bad things in cyber space; it's to make the "risk[/reward] calculus" skew in favor of the good guys so much so that bad guys just stop operating.
Making the risk calculus riskier for your opponent is how wars and legal cases are fought too, but such tactics are not confined to the warroom or courtroom. That's why my opening salvo is licenses to sow doubt, to change the perception of the risk calculus. Battles like this are won by "winning minds," which in this case means convincing enough people to be nervous about it.
> your stated goal of 'making Copilot a dead product' seems more emotional than rational.
This is something where you are partially right. There is a lot of emotion behind it, not because I'm an emotional person (I'm actually on the spectrum and less emotional than the average person), but because I objectively considered the ramifications of what GitHub is doing with Copilot, realized how bad those ramifications were, and that lit a fire under me.
I wrote about the ramifications and refuted the dubious legal justifications in a whitepaper [4] for the FSF call for papers [5]. (Intro blog post at [6].)
But if you will read through the paper, you will find that there is rationality in my thoughts. I just happen to think this is a fight worth taking. Thus, the emotion.
> In reality it will take you a long time to get a result, and if enough people find Copilot useful (which I suspect they will), legal departments will adapt to that risk calculus and just figure out the cost of blowing or buying you off in the event that their developers carelessly infringe.
"Buying me off" would include checking that Copilot didn't output my code, and if it did, to follow the license. I'm not sure they would like the added work to use something that is supposed to save work on the easiest part of programming. But even if they did, I would be satisfied.
And that points to another part of my "thought process": the reason that I think I've got a chance is because I think the "reward" side of the risk/reward calculus is not very high with Copilot because it is the easiest part of programming.
Almost everything in programming is harder than writing boilerplate, and as I said in another comment [7], I think there are still better ways of reducing boilerplate. In fact, the language I am working on is designed to help with that. So my perception, which I acknowledge could be wrong, is that the reward for using Copilot is not high, which means I may not have to raise the risk level much for people to change their minds about it.
But the most important point would be to make legal departments and courts recognize that copyright still has teeth, or rather, argue well enough to convince people of that fact, despite what GitHub is saying.
> If it sufficiently improves industrial productivity, it will become established while you're trying to litigate and afterwards people will just avoid crossing the threshold of infringement.
This would be a win in my book too. I am going to be the first person to write boilerplate code in my language, which means that anyone who writes in this language will be "copying" me. I don't care about the boilerplate, though; they can copy that as much as they want.
> Honestly, this exchange makes me glad that I don't publish software and thus don't care about license conditions on a day to day basis.
I feel you on that. The only reason I do is because I feel like my future customers deserve the blueprints to the software they are using the same way the buyers of a building deserve to get the building's blueprints from the architect. If I did not have that opinion, I would probably not publish either.
If Microsoft noticed that a substantial number of contributors were putting in these "Poisoned Well license restrictions" in their repositories, it would be relatively trivial to automatically filter out those code bases using some basic heuristics to determine if the license was biased against systems like copilot.
> It depends how much you "learn from it" and then write your own code vs how much you copy/paste.
I think this is the crux. It doesn't matter that they used GPL code for training. It only matters if someone uses CoPilot to make a close copy of that code.
Fortunately Copilot is actually not commonly just copy/pasting. They did a study on that if you search for it (admittedly not an independent study but it's the only one we have).
> Why include that?
Because so many people are making the assumption that the law agrees with their overly simple interpretation before it's even been decided. It's a bit tedious.
This reaction is why we can’t have nice things :( after trying copilot I’m convinced that this kind of feature is going to bring the world to a next phase. Open source was part 1, this is part 2.
Understandable, but my comments are legitimate "I'm shocked at how Copilot can do this" reactions. I'm not a GitHub fanboy, just someone who has been using that new feature for a few weeks. It's just that insane.
I would seriously consider opting in so long as my authorship was acknowledged and my license was upheld. FWIW I tend to release things under permissive licenses (although I think if I was copyleft-inclined I might feel the same way: just match my license).
But stripping my copyright, copying my work without my permission and presenting it to users on terms I did not agree to, all of that is unacceptable.
Another comment said "don't open source your code", which I would agree with. If you don't want people reusing your code, just don't open source it. What's the point otherwise?
I’m not sure I understand that point of view though. Copilot just suggests relatively short snippets; it’s not copying large chunks of your library or products. If you truly have an innovative algorithm and you don’t want people to use it like that, you’ll have to go the patent route.
I've been using Copilot in VS Code and it's been surprisingly useful. It makes suggestions pretty rarely, but when it does I accept about 50% of them. Generally these are few-line functions and it just gives me what I would have written after thinking about it a moment.
I signed up for the copilot technical preview right after it was announced a few months ago, but I haven't gotten an invite yet while all my friends who signed up later did (I feel a bit left out). Is there any way to get an invite sooner? What am I doing wrong?
I signed up in the first week and got my invite on Monday. I don't have anything "linked" with my github via vscode or anything so I doubt IDE or program usage was a deciding factor. I do regularly commit small stuff.
This will lead IMO to an interesting problem, although the technology and the idea are certainly cool.
Currently many programmers do not take the time to really understand how/why their code works -- programming without understanding 1.0. Essentially make library calls and fuzz around with the arguments until it appears to work. [Not wanting/suggesting to go back to the world before libraries/code completion, just stating where we are now.]
This will enable programming without understanding 2.0 -- not only will you not know how/why a particular function call works, you will now fail to understand why you want a sequence of functions in a particular order.
Most of the arguments here boil down to the same belabored point, that you shouldn't expect Copilot to actually write the code for you at the end of the day.
My take is it's what you make of it. Copilot is only equivalent to copy-and-pasting from stack overflow if that's how you choose to field its suggestions.
As an example, I've enjoyed typing "const one_day_in_ms" and letting it finish it out with "1000 * 24 * 60 * 60". I already knew how to do that, but having GCP finish it for me and verifying on my own didn't make me feel stupider, it made me more efficient. I have more interesting problems to tackle.
On the other hand, another coder could have not known this calculation and thrown their trust into GCP. That's bad practice and it's on them, not on the tool.
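For what it's worth, here is a minimal sketch of that "verify on your own" step, written in Python rather than the JavaScript the snippet above came from. The constant names are my own, not anything Copilot produced; the point is only that spelling out the units and cross-checking against the standard library makes the suggestion easy to confirm.

```python
from datetime import timedelta

# Spell out each unit so the product can be checked at a glance.
MS_PER_SECOND = 1000
SECONDS_PER_MINUTE = 60
MINUTES_PER_HOUR = 60
HOURS_PER_DAY = 24

ONE_DAY_IN_MS = MS_PER_SECOND * SECONDS_PER_MINUTE * MINUTES_PER_HOUR * HOURS_PER_DAY

# Independent cross-check: how many milliseconds fit into one day,
# according to the standard library.
assert ONE_DAY_IN_MS == timedelta(days=1) / timedelta(milliseconds=1)
```

If the completion had dropped or duplicated a factor, the assertion would fail straight away.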
Sometimes GCP gives me code that it learned from bad coding patterns. I know how GCP works and I know to look out for that, so I ignore those suggestions.
Of course, sometimes I don't know if what looks like a good idea from GCP is actually not. I take that on as my responsibility to trust but verify. If it's writing some function to slugify a string for a URL, I check it against what people are discussing online. Does it defeat the purpose of GCP in this case if I have to check it on my own? Probably, but it's only in these specific instances when I'm doing something I'm not familiar with.
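To make the "trust but verify" step concrete, here is a rough sketch in Python of how a suggested slugify function might be checked. The function body and the sample inputs are assumptions made for illustration, not actual Copilot output; a real project would want test cases drawn from its own URL rules.

```python
import re
import unicodedata


def slugify(text: str) -> str:
    """Turn arbitrary text into a lowercase, hyphen-separated URL slug."""
    # Decompose accented characters, then drop anything outside ASCII.
    ascii_text = (
        unicodedata.normalize("NFKD", text)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_text.lower())
    # Remove hyphens left over at either end.
    return slug.strip("-")


# Hypothetical edge cases to check a suggestion against before accepting it.
checks = {
    "Hello, World!": "hello-world",
    "  Crème brûlée  ": "creme-brulee",
    "a--b__c": "a-b-c",
    "": "",
}
for raw, expected in checks.items():
    assert slugify(raw) == expected, (raw, slugify(raw))
```

A handful of edge cases like these takes a minute to write and catches the usual mistakes (leading hyphens, doubled separators, accented characters).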
The comments are so full of praise that I will have to give this an earnest try. But is writing code really the part of software that people are struggling with?
> Login to GitHub Copilot using the device auth flow and authorized GitHub Copilot IntelliJ plugin with your GitHub Account in an external browser.
> Read and agree to the GitHub Copilot additional telemetry terms.
Can anybody comment on the privacy aspects of this? Is the telemetry reasonable? Why on earth do I need to login: presumably so that they can associate my coding with my account to structure the data they are gathering?
In my opinion, Copilot is going to become one of those "perceived authorities" that have just enough legitimacy to be blindly trusted by the inexperienced, but not enough to actually be useful to the experts.
This is like social media (or even the Internet as a whole) and say, our parents' generation. Countless times I receive links to Facebook posts or random articles that somebody thinks must be true, simply because The Oracle (i.e. their smartphone) showed it to them. For much of the older generation, smartphones are these all-knowing repositories of wisdom, and anything they come across while using them is likely to be true. This is why I think misinformation has spread so easily.
I imagine Copilot going down a similar path. The next generation of programmers who didn't grow up knowing how to sift through API docs or SO answers for the right bit of code, or whose attention spans have fizzled away, will love the idea of Copilot. Instant gratification in the form of a tool that can seemingly do your work for you. This will have dire consequences for their ability to code and think for themselves.
Presumably they are tree-sitter grammars for those languages. I think these grammars are open source so should be available (assuming they were unmodified).
Copilot thread is always hot. TBH I don't mind the plagiarism. They can use my code.
The real problem is indeed the code quality, since Copilot does actively provide low-quality code, and this will bite a lot of people. I guarantee it. I don't think this massive learning approach can be the solution. We need an extra kick.
My shower-thought solution for this garbage-spewing problem is to design a new library with Copilot in mind. Tighten up the interface, and use strict patterns instead of domain-specific hacks.
In other words, we can make libraries so lame that Copilot (and newb programmers) simply can't produce low-quality code. Just disregard smart programmers with fast hands. They don't need any help anyway. Don't even try to target them, because it's gonna cause more stupid flame wars...
I wonder how long before GitHub is forced to include a system like YouTube’s Content ID, so that I can get a cease and desist before I even push the code, and Oracle, Google and MS can charge my GitHub billing account for violations.
In the few short weeks I have used CoPilot inside of VSCode, it's been a big* timesaver; I can type out a bare skeleton of a class, and with minimal guidance for things like naming conventions CoPilot lets you simply tab+enter through each class property.
Maybe I'm missing something obvious, but I feel like CoPilot should have the ability to allow a user to define a 'class template' in the form of a block comment and then allow a user to write "make this class" or something.
*Big being it's probably saved me around two hours of manually typing out 'public property... etc.'
What this feels like to me, after a nominal amount of time using it, is that I am now in charge of vetting the proposed code: is it appropriate for my use case, does it meet our coding standards, and does it integrate with my code or further suggestions? Honestly it feels like constant code reviewing, which is neither pleasant nor conducive to productivity.
And I do a lot of PR approval, and I shudder to think about PRs stitched together in haste because look how easy it is now to crank out code!
And lastly, it doesn't work well when you have a large legacy codebase that you're working within.
Totally agree. Co-pilot is just putting more stress on our roles as code reviewers. It’s not even a new idea - people working in program synthesis have done this for a while now. I wrote about this a couple of years ago: https://medium.com/@marceloabsousa/the-software-shift-toward...
- Co-pilot will not replace software engineers, however ...
- I do think it will in some cases help them be more productive as some have expressed in this thread already.
- Once they open it up to fine-tuning on your own codebase I suspect it can be used to bring new engineers up to speed faster, plus it will get more trustworthy.
- I do have concerns about the legal aspect of selling software built with the assistance of co-pilot but a lawyer could probably get me comfortable (or not!)
- I have personally found it useful for data science type tasks especially getting familiar with new libraries.
Any word on if they will ever make Copilot free software? It sounds interesting, but I avoid using proprietary software in my development stack. Plus it looks like it would be fun to play with.
Looking forward to the next generation of programmers who can only program if they have Copilot, and will complain in interviews when they are asked to code without Copilot.
I've been using TabNine for several years and love it. I'd be interested in hearing a comparison between TabNine and GH Copilot from anyone who has used both.
It does seem a bit controversial to develop a relationship with neovim without approaching emacs. I wonder if they even asked on emacs-devel, I’m sure they’ll have got a friendly response …oh hang on…
I'm very interested in this as a learning tool: instant feedback about what could come next, compared to the status quo of searching the internet or searching the language documentation. A lot of the time I learn a way to do something, then sometime later stumble across a better way to do the same thing. My hope is that Copilot can be a shortcut for that discoverability.
this has been "inevitable" for three decades now. The difference is, walled gardens making participation non-optional; commercial intent over "fairness"; elevating the trivial for the pleasure of management .. what could go wrong?!
overall, a new forms generator with a somewhat terrifying amount of horsepower.. zero trust of microsoft here, basically
It says on the GitHub page that "GitHub Copilot will attempt to match your code's context and style. You can edit the suggested code as you choose.". Does this mean that the plugin will transmit my code to GitHub servers?
Yes. The machine learning model runs on GitHub's servers (it's too big/expensive to run locally), so it has to submit the context of what you're working on to GitHub.
I used it for about a week but found I could type faster than it could suggest, even if it suggested the right thing. I have been coding for about 25 years but maybe it will help people new to whatever language they are using.
EDIT2: Looks like you can force the plugin to work by editing the plugin.xml contained in github-copilot-intellij-1.0.1.jar within the plugin archive. Just remove the line that includes Rider as incompatible. The same should work for CLion.
I joined the waitlist at the end of June and got an invite yesterday. So 4 months?
However, I imagine a lot of people signed up just before me, so I was probably at the end of a long list. The wait wouldn't be this long if you sign up today, I reckon. I'm just guessing though.
makes it even easier for both the US (via Microsoft) and Russian govs (via JetBrains) to copy, analyze and inject arbitrary code into any codebase (public or "private") used by this combo of tools