When AI is used properly, it’s doing one of two things:
1) Making non-controversial fixes that save time and take cognitive load off the developer. The best example is when code completion is working well.
2) It’s making you smarter and more knowledgeable by virtue of the suggestions it makes. You may discard them but you still learn something new, and having an assistant brainstorm for you enables a different mode of thinking - idea triage - that can be fun, productive, useful and relaxing. People sometimes want completion to do this also, but it’s not well suited to it beyond teaching new language features by example.
The article makes an interesting assertion that AI tools “fail to scale” when the user has to remember to trigger the feature.
So how can AI usefully suggest design-level and conceptual ideas in a way that doesn’t require a user “trigger”? Within the IDE, I’m not sure. The example given of automated comment resolution is interesting and “automatic”, but not likely to be particularly high level in nature. And it also occurs in the “outer flow” of code review. It’s the “inner flow” that’s the most interesting to me because it’s when the real creativity is happening.
> So how can AI usefully suggest design-level and conceptual ideas in a way that doesn’t require a user “trigger”?
I'm just guessing here. But maybe make it part of some other (already natural and learned) trigger made by the user.
I'm thinking as part of refactoring, where your AI is not only looking at the code, but also at the LSP, recent git changes (both commit messages and branch names), and which code files you've been browsing.
And if you want to make it even more powerful, I guess part of your browser history would also be relevant (even if there are privacy concerns).
Clippy was on the right path at the wrong time. The issue wasn't the concept of an assistant that watches your work and provides help; the issue was that it could detect you were writing a letter and offer help, but the help it gave you was just a bunch of shallow formatting suggestions without context for the actual work.
An actual assistant that can preempt what you need and create it before you get there with a 95% success rate will not feel like Clippy.
Clippy from MSFT? This is where the techies really lose perspective... you see, it's not just a computer, a computer company, and a user. Real life includes social systems with social contracts, and the relationship of the user's logs, records and autonomy to the "master" of the economic relationship. Microsoft has made it clear that surveilling the user and restricting autonomy is as valuable or more valuable from a business perspective than the actual actions performed, AND at the same time, entire industry product lines are designed to trivialize and factor out personal skill and make job performance more like replaceable piece-work.
Lastly, people are more or less aware of these other dynamics at work. Yet, people are people and respond to other social cues. The sheer popularity of something being novel, cute or especially "cool" does move the needle on adoption and implementation. Clippy was a direct marketing response to Apple getting "cool" ratings with innovative GUI elements.
While I agree with some of your points about corporate data practices and the potential for technology to deskill labor, I'm not sure I see the direct connection with Clippy. Clippy, as limited as it was, seems like a poor example of these larger concerns.
Furthermore, saying "techies really lose perspective" comes across as dismissive and judgmental. It's important to remember that people are drawn to technology for various reasons, and many are deeply concerned about the ethical implications of what they build. Painting an entire group as lacking awareness isn't helpful or accurate.
If we want to have a productive conversation about the impact of technology, we need to avoid generalizations and engage with each other respectfully, even when we disagree.
>Clippy was a direct marketing response to Apple getting "cool" ratings with innovative GUI elements
Honestly, I think Clippy predated that; it came out in 1996 with Office 97. Macs were only on System 7.5 back then, while I think you're thinking of the early Mac OS X era, which was 6-7 years later.
Not an AI expert but this sounds a lot like code review.
Most people are using systems to facilitate code review these days (we aren't sitting around boardrooms with printouts), so I wonder if there is a way to use those code review data streams combined with diffs to train AI to "do code reviews"?
As long as LLMs are seen as patterned knowledge aggregators, they work as intended. They hallucinate answers mostly because they try to interpolate data, or because there is some pattern they didn't catch due to lack of context or training.
They're really good for type/name finding and boilerplate generation.
For larger suggestions, as you pointed out, they're too wrong to be used as is. They can give good ideas, especially if guided with comments, but usually at this stage I just use Phind.
"When AI is used properly" is a very very loaded statement.
Especially since evidence shows code is one of the worst things it's good at; it's a lot better at other tasks.
Put another way: "LLMs are great when I don't have lots of experience in the domain but want to do something in that domain. Otherwise, my own brain works better"
I'd argue it's the opposite. If you have extremely deep knowledge you can constrain the problem and get a great answer, i.e. boilerplate for X. Boilerplate is only boilerplate because you have a certain level of knowledge.
At this point, asking an LLM to "implement feature X please" is not going to give you great results. However, unless you can type at 600 wpm, an LLM doing extremely trivial boilerplate code completion is a godsend.
From the blog post:
> We observe that with AI-based suggestions, the code author increasingly becomes a reviewer, and it is important to find a balance between the cost of review and added value. We typically address the tradeoff with acceptance rate targets.
In the past year since GPT-4 came out, I've also found this to be the case. I'm an ML/backend engineer with little experience in frontend development. Yet, I've been able to generate React UIs and Python UIs with GPT-4 in a matter of minutes and simply review the code to understand how it works. I find this to be very useful!
IMHO, review is a misnomer for where software engineering is going. I'm not sure where we are going, but review implies less responsibility for the outcome.
But I do think that we will have less depth of knowledge of the underlying processes. That's the point of having a machine do it. I expect this, however, to be a good trend: the systems will need to be up to a task before it makes sense to rely on them.
This is how progress (in developer productivity) has always been made. We coded in assembler, then used macros, then a language like C or Fortran, then more of Java/Go/Python/Rust/Ruby et al. A developer writing a for loop over a list in Python doesn't necessarily need to know about linked lists and memory patterns, because Python takes care of it. This frees that developer from the abstracted details and lets them think one level closer to the problem at a higher speed.
LLMs _can_ be a good tool in the right hands. They certainly have some way to go before they become a reliable assistant. I suppose, in the way of LLMs, they need better training before they can get there.
Frankly, it's fine more often than we may care to admit.
As the parent comment suggested, UI elements are a great candidate for this. Often very similar (how many apps have a menu bar, side bar, etc) and full of boilerplate. And at the rate things change on the front-end, it's often a candidate for frequent re-writes, so code quality and health don't need to be as strict.
It'd be nice if every piece of software ever written was done so by wise experts with hand-crafted libraries, but sometimes it's just a job and just needs to be done.
UI is a terrible example to make your point. Tell me you don’t know frontend development…
Accessibility, cross-browser and cross-platform support, design systems, SEO, consistency and polish, you name it. You are most certainly not getting that from an LLM, and most engineers don't know how, or don't have a good enough eye, to catch when the agent has gone astray or suggested a common mistake.
You definitely have a point, but the reality is that LLMs are about as good as an "average" UI developer in some cases -- lots of people who work on UI every day think very little about accessibility and don't know whether their code actually runs in a non-Chromium browser.
Does everything ever written need to be crafted by an artisan? An awful lot of useful stuff written "good enough" is good enough. Depth of knowledge or understanding is irrelevant to a lot of front-end UI development, where the key is the design itself and that the behavior of the design is accurate and reliable, not that the engineer -really- understands, at the core of their soul, GraphQL and React with the passion of a Japanese craftsman when they're building a user interface for the ML backend that internal users use for non-critical tasks. There does exist a hierarchy of when depth matters, and it's not homogeneously "literally everything you do."
When someone is using an LLM they are still the author.
Think about it like someone who is searching through record crates for a good sample. They're going to "review" certain drum breaks and then decide if it should be included in an artwork.
The reviewing that you're alluding to is like a book reviewer who has nothing to do with the finished product.
Yup, that's an old reviewer/author problem. The reviewer has a huge blind spot because they don't even know what they don't know. The author knows what he knows, but more importantly also has a bigger grasp on what he doesn't know. So he understands what's safe to do and what's not.
Well, it is not actually review in the sense where you have a PR. It is more like you are guiding and reviewing a very fast typist, in the order you decide, who in any simple case handles it 99 percent of the time.
Not every developer knows exactly how his modern CPU or memory layers work, or how electromagnetic waves build up a signal.
People use tools to make things. It's okay. Some "hardcore folks" advance the "lower level" tooling, other creative folks build actually useful things for daily life, and mostly these two groups have very little overlap IMO.
I know of no review process that produces the same level of understanding as does authorship, because the author must build the model from scratch and so must see all the details, while the reviewer is able to do less work because they're fundamentally riding on the author's understanding.
In fact, in a high-trust system, e.g. a good engineering culture in a tech company, the reviewer will learn even less, because they won't be worried about the author making serious mistakes so they'll make less effort to understand.
I've experienced this from both sides of the transaction for code, scientific papers, and general discussion. There's no shortcut to the level of understanding given by synthesizing the ideas yourself.
> "the author must build the model from scratch and so must see all the details"
This is not true. With any complex framework, the author first learns how to use it, then when they build the model they are drawing on their learned knowledge. And when they are experienced, they don't see all the details, they just write them out without thinking about it (chunking). This is essentially what an LLM does, it short-circuits the learning process so you can write more "natural", short thoughts and have the LLM translate them into working code without learning and chunking the irrelevant details associated with the framework.
I would say that whether it is good or not depends on how clunky the framework is. If it is a clunky framework, then using an LLM is very reasonable, like how using IDEs with templating etc. for Java is almost a necessity. If it is a "fluent", natural framework, then maybe an LLM is not necessary, but I would argue no framework is at this level currently and using an LLM is still warranted. Probably the only way to achieve true fluency is to integrate the LLM - there have been various experiments posted here on HN. But absent this level of natural-language-style programming, there will be a mismatch between thoughts and code, and an LLM reduces this mismatch.
I believe pretty much anyone who has observed a few cycles can tell as much.
Often the major trigger for a rewrite is that the knowledge has mostly left the building.
But then there's the cognitive dissonance, because we like pretending that the system is the knowledge and thus has economic value in itself, and that people are interchangeable. None of which is true.
It is similar to how much a student learns from working hard to solve a problem versus from being given the final solution. The effort to solve it yourself tends to give a deeper understanding and make it easier to remember.
Maybe that's not the case in all fields, but it is in my experience at least, including in software. Code I've written I know on a personal level, while I can much more easily forget code I've only reviewed.
Also, people shouldn't be allowed to use computers unless they understand how transistors work. If you don't have the depth of knowledge you get nothing.
The person I'm responding to was gatekeeping. I responded by sarcastically doing the same to an extreme degree. A lot of people will have agreed with the person I'm responding to: "Oh yeah, of course you should understand these things, the things that I already understand," genuinely not realizing that there's no basis for that. When they read my response they realize what they were doing, and are left feeling embarrassed for their senseless (and pretentious!) gatekeeping.
As a noob I copied code from Railscasts or Stack Overflow or docs or IRC without understanding it just to get things working. And then at some point I was doing less and less of it, and then rarely at all.
But what if the code I copied isn't correct?! Didn't the sky fall down? Well, things would break and I would have to figure out why or steal a better solution, and then I could observe the delta between what didn't work vs what worked. And boom, learning happened.
LLMs just speed that cycle up tremendously. The concern trolling over LLMs basically imagines a hypothetical person who can't learn anything and doesn't care. More power to them imo if they can build what they want without understanding it. That's a cracked lazy person we all should fear.
In our startup we are short on frontend software engineers.
Our project manager started helping with the UI using an IDE (Cursor, a VS Code fork) with native ChatGPT integration. In the span of six months, they have become very proficient at React.
They had wanted to learn basic frontend coding for multiple years but never managed to pass the initial hurdles.
Initially, they were only accepting suggestions made by ChatGPT and making frequent errors, but over time, they started understanding the code better and actively telling the LLM how to improve/fix it.
Now, I believe they would have the knowledge to build simple functional React frontends without assistance, but the question is why? As a team with an LLM-augmented workflow, we are very productive.
With the number of anti-patterns in React, are you sure everything's OK? The thing about declarative is that it's more like writing equations. Imperative has a tighter feedback loop, and ultimately only code organization is an issue. But so many things can go wrong with declarative, as it's one level higher in the abstraction stack.
I’m not saying that React is hard to learn. But I believe buying a good book would have them get there quicker.
Yes totally. They already tried doing a few online courses and had a few books.
I'm a senior React Dev and the code is totally fine. No more anti-patterns than juniors who I've worked with who learned without AI. The contrary actually.
Not gonna fault people for learning; I think the FUD is more so in the vein of being ignorant while working.
Yeah, you don't really need to know how transistors work to code, but you haven't needed that for two generations. Personally I think (and hope) LLM code tools replace Google and SO, more so than writing SW itself.
I got my start on a no-code visual editor. Hated it because the 5% of issues it couldn't handle took 80% of my time (with no way to actually do it in many cases). I see LLM auto-generation as the same: the problems that the tool doesn't just solve will be your job, and you still need to know things for that.
Oh come on, the user is talking about building UIs. I don't know how else you learn. Your attitude just reeks of high-horse. As if it was better to learn things from stackoverflow.
Who learned stuff from Stack Overflow? In my own case, it was all books, plus a few videos. Stack Overflow was mostly for figuring out why things had gone wrong (errors not explicit enough) or very specific patterns. And there was a peer review system which lent credibility to answers.
“In my own case...” something worked for you. So what?
Are you sure that would work for others? And that other approaches might not be more effective?
I’ve learned lots of things from SO. The top voted answers usually provide quite a bit of “why” content which have general utility or pointers to more general content.
Yes, there are insufferable people on there, but there are gatekeepers and self-centered people everywhere.
Maybe I am being snarky, but saying "I don't like that" or "that's not how I did it" just isn't that interesting. I'd love to hear why books are so much more effective, for instance, or which books, or what YT channels were useful.
Because they're consistent and they follow (the good ones) a clear path to learn what you want to learn. The explanations may not be obvious at first glance, and that's when you may need someone to present them to you from another perspective (your teacher) or provide the required foundational knowledge that you may lack. You pair it with some practice or cross-reference with other books and you can get very far. Also, they can be pretty dense in terms of information.
> which books, or what YT channels were useful.
I mostly read manuals nowadays, instead of tutorials. But I remember starting with the "Site du Zéro" books (a French platform) for C and Python. As tech is moving rapidly, for tutorial-like books it's important to get the latest one and know which software versions it's referring to.
Now I keep books like "Programming Clojure", "The Go Programming Language", the "Write Great Code" series, "Mastering Emacs", "The TCP/IP Guide", "Absolute FreeBSD", "Using SQLite", etc. They're mostly for reference purposes and deep dives into one subject.
The videos I'm talking about were courses from Coursera and MIT: algorithms, Android programming, theory of computation. There are great videos on YouTube, but they're hidden under all the worthless ones.
Having a similar experience but in the other direction. I have a lot of backend and frontend development experience but no ML background. Being able to ask stupid questions to get further faster has been making a difference for me.
I would suggest we have flexible RAM. Also, we have an awful lot of it. The analogy breaks down as soon as you look at it too seriously!
In IT we largely deal with compute, persistent storage and non-persistent storage. Roughly speaking: CPU, HDD, RAM. In humans we might be considered to have similar "abilities", but unlike IT there is mostly a single thing that performs all of those functions - the brain. That organ is both compute and storage.
LLMs can be surprisingly useful but they are a tool. As with all tools they can be abused and no doubt you have spotted all those tech blogs that spout the same old thing and often with subtle failings (hallucinations).
Keep your tools sharp and know how to safely use sharp tools.
Just because the brain is a single “thing” doesn’t mean it doesn’t have distinct types of memory. Consider looking up “working memory” as it’s probably the best analogue to RAM here.
Not really, it's more like CPU registers. Very limited, and stuff has to be in there to be computed on (consciously) (lots of unconscious computation as well, of course; a lot of efficiency to be found in moving computation from conscious to unconscious).
These kinds of similes make less and less sense nowadays because we've got NVMe storage, and that can be as fast as 7 GByte/s. That's a lot faster than the RAM in most devices today. And with less latency too.
RAM's differentiating factor is increasingly just that it can handle a lot of read/write cycles, not its speed. And that doesn't map to anything in biology.
> These kinds of similes make less and less sense nowadays because we've got NVMe storage, and that can be as fast as 7 GByte/s. That's a lot faster than the RAM in most devices today. And with less latency too.
7 GB/s is the low end of the DDR3 performance range; DDR3 is 17 years old. Meanwhile, DDR5 performance ranges from about 33.5 GB/s to about 69 GB/s. RAM latency, even on DDR3, is measured in nanoseconds; NVMe latency is measured in microseconds, making it about three orders of magnitude higher.
By quantity, most devices are low-end and old Androids. But I admit I might have phrased my comment poorly. I didn't mean to imply that high-end RAM has the same performance profile as high-end NVMe storage.
For example, I asked it for a database schema given some parameters. What it gave me wasn't quite what I wanted, but seeing it written out helped me realize some flaws in the original design I had in mind. It wasn't that what it gave me was better. The act of evaluating its suggestions helped me clarify my own ideas.
It's sort of like Cunningham's Law with a party of one. Giving me the wrong answer helps me clarify what the correct answer should look like.
Or, perhaps a better way to put it:
It's easier to criticize than to create. It gives me something to criticize and tinker with. Doing so helps me hone in on the solution I want. (Provided its suggestion was at least in the right universe, of course.)
> I've been finding AI's suggestions -- even when rather wrong -- help me do that initial step faster.
I have no idea how I could even integrate AI into my workflow so that it's useful. It's even less reliable than search is for basic research and can't even cite its sources....
This argument held a lot more weight when it was a search engine playing the role of our memory.
That comment was solely about AI code suggestions.
Generative AI still has a ways to go for other forms of research, and it will never fully replace the utility of a search engine. They're two different tools for different but overlapping tasks.
> That comment was solely about AI code suggestions.
Even AI code suggestions seem to be only a minor improvement over basic LSP integration. One major exception is tedious formatting of text—say you want to copy over a table by hand into a domain value; Copilot is really good at recognizing values and situating them appropriately in the parent l-value.
If chatbots could serve as my RAM, surely they'd be able to generate code relevant to the rest of the codebase or at the very least not require deep scrutiny to ensure their RAM matches mine (it most often does not).
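To make the table-transcription point concrete, here's a minimal C++ sketch (all the names are hypothetical, purely for illustration). Once the first row or two establish the shape, completion tools tend to be good at filling in the remaining rows:

```cpp
#include <array>
#include <string_view>

// Hypothetical domain value: the kind of hand-transcribed table described
// above, where each row of a reference table becomes one initializer.
struct CountryCode {
  std::string_view name;
  std::string_view iso2;
  int calling_code;
};

constexpr std::array<CountryCode, 3> kCountryCodes = {{
    {"France", "FR", 33},
    {"Germany", "DE", 49},
    {"Japan", "JP", 81},
}};

static_assert(kCountryCodes[0].calling_code == 33);
```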
LLMs are undeniably useful for programming. The core challenge in making them more useful is the right UX for making this more seamless. I use intellij and things like codegpt.
A few weeks ago they enabled auto complete. I disabled it after a day. Reason: most of the suggestions weren't great and it drowned out the traditional auto complete, which I depend on. I just found the whole thing too distracting.
I've also had the ChatGPT desktop app installed for a few weeks. It adds a key binding (option + space) to ask it questions. I've found myself using it more because of that: copy some code, option+space, ask it a question, and there it goes.
The main issue is that it is a bit Groundhog Day, in the sense that I have to explain in detail what I want every time. I can't just ask it to generate a unit test (which it does very well). Instead I have to specify that I want it to generate a unit test, use kotlin-test and kotest-assertions, and not use backticks for the function names (doesn't work with kotlin-js). Every time. If I don't, it will just default to the wrong things because it doesn't remember your preferences for frameworks, style, habits, etc., and it doesn't look at the whole code base to infer any of that.
Mostly, progress here is going to come in the form of better IDE support and UX. The right key bindings would help. A bigger context so it can just grab your whole git project and be able to suggest appropriate code using all the right idioms, frameworks, etc. would then be possible. Additionally, it would be able to generate files and directories as needed and fill them with all the right stuff. I don't think that's going to take very long. Gpt-4o already has a quite large context and with the progress in OSS models, we might be doing some of this stuff locally pretty soon.
> Instead I have to specify that I want it to generate a unit test, use kotlin-test and kotest-assertions, and not use backticks for the function names (doesn't work with kotlin-js). Every time.
Have you experimented with GPTs much? I'd solve this problem by creating my own private "write a unit test" GPT that has my preferences configured in the system prompt.
I ask many things in a typical day. Creating custom GPTs for everything that pops into my head is not very practical. It's a workaround for the fact that it doesn't remember anything about previous chats and conversations.
In an IDE context a lot of this stuff should be solved in the UI, not by the end user.
IntelliJ also has "full line completion" built in that uses a local model. I've found it rather good and it's the only in-IDE code generation I use. It does not get in the way and usually gives me exactly what I want.
I don't know. There's an idealized vision that people have in their heads of what LLMs should be in which they're "undeniably useful". It's unclear if that already exists and is just constrained by the right UX concept, or if it doesn't exist and never will.
Seems they need to compare against "dumb" code completion. Even when they are error-free, "large" AI code completions are just boilerplate that should be abstracted away in some functions rather than inserted into your code base.
On a related note, maybe they should measure number of code characters that can be REMOVED by AI rather than inserted!
> boilerplate that should be abstracted away in some functions rather than inserted into your code base
Boilerplate is often tedious to write and just as often easy to read. Abstraction puts more cognitive load on the developer and sometimes this is not worth the impact on legibility.
Totally agree - LLMs remove the tedium of writing boilerplate code, which is often a better practice than abstracted code. But it takes years of experience to know when that's the case.
Say I'm faced with a choice right now -- repeat the same line a couple more times with minor differences, which gets checked by the IDE immediately, or create a code generator that generates all 3 lines but may not work well with the IDE and the build system, which leads to more mistakes?
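To make that tradeoff concrete, a small sketch (the Config struct and its field names are made up for illustration):

```cpp
#include <iostream>

// Hypothetical config type, just to make the tradeoff concrete.
struct Config {
  int read_timeout_ms = 0;
  int write_timeout_ms = 0;
  int connect_timeout_ms = 0;
};

int main() {
  Config config;

  // Option A: spell out the near-duplicate lines. Tedious to type, easy to
  // read, and the compiler/IDE checks every field name as written.
  config.read_timeout_ms = 500;
  config.write_timeout_ms = 500;
  config.connect_timeout_ms = 250;

  // Option B: hide the repetition behind a macro (or an external generator).
  // Shorter, but tooling like go-to-definition and rename refactors now has
  // to see through the indirection.
#define SET_TIMEOUT(field, ms) config.field##_timeout_ms = (ms)
  SET_TIMEOUT(read, 500);
  SET_TIMEOUT(write, 500);
  SET_TIMEOUT(connect, 250);
#undef SET_TIMEOUT

  std::cout << config.read_timeout_ms << "\n";
}
```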
Agree with the first sentence. I used to tell people I never type anything more than periods and two or three letters because we already had good autocomplete.
Yes. Although I’d like to see a deeper investigation. Of course, quality of completions have improved. But there could be a confounding phenomenon where newer folks might just be accepting a lot of suggestions without scrutiny.
I accept a lot of wrong suggestions because they look close enough that it's quicker for me to fix than it is to write the whole thing out from scratch. Which, IIUC, their metric captures:
> Continued increase of the fraction of code created with AI assistance via code completion, defined as the number of accepted characters from AI-based suggestions divided by the sum of manually typed characters and accepted characters from AI-based suggestions. Notably, characters from copy-pastes are not included in the denominator.
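For what it's worth, the quoted metric boils down to a simple ratio; here's a tiny sketch with made-up numbers, not Google's actual implementation:

```cpp
#include <iostream>

// Minimal sketch of the quoted metric: accepted AI characters divided by
// (manually typed characters + accepted AI characters). Copy-pasted
// characters are excluded from the denominator, per the blog post.
double AiCompletionFraction(long long accepted_ai_chars,
                            long long manually_typed_chars) {
  return static_cast<double>(accepted_ai_chars) /
         static_cast<double>(manually_typed_chars + accepted_ai_chars);
}

int main() {
  // Hypothetical numbers, for illustration only.
  std::cout << AiCompletionFraction(5'000, 5'000) << "\n";  // 0.5
}
```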
As someone who is generally pro-AI coding tools, I do the same. However sometimes it winds up taking me so much effort to fix the mistakes that it'd have been faster to do it myself. I often wonder if I'm seeing a net positive effort gain over time, I *think* so, but it's hard to measure.
Question for any Googlers in the thread - do folks speak up if they see flaws in the methodology or approach of this research or is the pressure from the top so strong on this initiative that people hush up?
There were already teams building ML-based code completion, code suggestions, and code repair before LLMs blew up a couple years ago. So the principle of it isn't driven by AI hype.
Yes, there are oodles of people complaining about AI overuse and there is a massive diversity of opinion about these tools being used for coding, testing, LSCs, etc. I've seen every opinion from "this is absolute garbage" to "this is utter magic" and everything in between. I personally think that the AI suggestions in code review are pretty uniformly awful and a lot of people disable that feature. The team that owns the feature tracks metrics on disabling rates. I also have found the AI code completion while actually writing code to be pretty good.
I also think that the "% of characters written by AI" is a pretty bad metric to chase (and I'm stunned it is so high). Plenty of people, including fairly senior people, have expressed concern with this metric. I also know that relevant teams are tracking other stuff like rollback rates to establish metrics around quality.
There is definitely pressure to use AI as much as reasonably possible, and I think that at the VP and SVP level it is getting unreasonable, but at the director level and below I've found that people are largely reasonable about where to deploy AI, where to experiment with AI, and where to tell AI to fuck off.
The code completion is quite smart and one of the bigger advantages Google has now is the monorepo and the knowhow to put together a pipeline of continuous tuning of models to keep them up to date.
The pressure, such as it is, is killing funding for the custom extension for IntelliJ that made it possible to use it with the internal repo.
Cider doesn't have the code manipulation featureset that IntelliJ has, but it's making up for that with deeper AI integration.
Fwiw, AI code completion initiatives at Google well predate the current hype (i.e. the overall research path, etc., was started a long time ago, though obviously modified as capability has changed).
So this particular thing was a very well established program, path, and methodology by the time AI hype came.
Whether that is good or bad I won't express an opinion, but it might mean you get a different answer to your question for this particular thing.
There are plenty in this post, and I'm one of them.
It may or may not be great for job prospects, but as someone who isn't one of you fancy startup/FAANG super programmers it's great for me to be able to ask it design problems, what the 'best' way to do a certain thing is, or tell it "I need a function given x,y, and z that does this and returns this."
There are plenty of instances where it doesn't make sense to use it, but I always have a tab with ChatGPT open when coding now. Always.
It's good if you have some domain knowledge and can kind of detect the bullshit there, and also just in cases where there is large amounts of training data. I am in big tech and I use it pretty much every day, mostly in cases where I would have spent a lot of time googling before.
This fails to recognize that this is a bad feature that the Abseil library would explicitly reject (hence the existence of absl::CivilDay) [0], and instead perpetuates the oversimplification that 1 day is exactly 24 hours (which breaks at least twice every year due to DST).
Which is to say: it'll tell you how to do the thing you ask it to do, but will not tell you that it's a bad idea.
And, of course, that assumes that it even makes the change correctly in the first place (which is nowhere near guaranteed, in my experience). I have seen (and bug-reported!) cases where it incorrectly inverts conditionals, introduces inefficient or outright unsafe code, causes unintended side effects, perpetuates legacy (discouraged) patterns, and more.
It turns out that ML-generated code is only as good as its training data, and a lot of google3 does not adhere to current best practices (in part due to new library developments and adoption of new language versions, but there are also many corners of the codebase with, um, looser standards for code quality).
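For concreteness, here's a minimal sketch of the distinction behind the absl::CivilDay point above (the date is chosen only to illustrate a DST transition; this is not code from any actual bug report):

```cpp
#include <iostream>

#include "absl/time/civil_time.h"
#include "absl/time/time.h"

int main() {
  // Duration arithmetic: "one day" hard-coded as exactly 24 hours. Across a
  // DST transition, adding this to an absolute time lands an hour off the
  // intended wall-clock time.
  absl::Duration naive_day = absl::Hours(24);
  std::cout << absl::FormatDuration(naive_day) << "\n";  // "24h"

  // Civil-time arithmetic: "+ 1" means the next calendar day, regardless of
  // how many hours that day actually has in any particular time zone.
  absl::CivilDay day(2024, 3, 10);  // a US spring-forward date (23-hour day)
  absl::CivilDay next = day + 1;
  std::cout << absl::FormatCivilTime(next) << "\n";  // "2024-03-11"
}
```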
So how long till AI is fully replacing SWEs at Google? That is where the drive for productivity at organizations like Google is leading.
Assuming Google has tens of thousands of software engineers (for a lower bound of 10,000) and artificial intelligence increases productivity by at least 0.01%, the first engineer has already been replaced.
I'm guessing it's 30k+, depending on how liberal you are with the job title (e.g. data engineering, SREs, etc.).
> the first engineer has already been replaced
Realistically, many engineers never got hired because of this already. Also they've had a few rounds of layoffs, so probably plenty are fired by now.
Google already was known for their use of automated code-gen long before they invented LLMs so I wouldn't be surprised if they're on the verge of major systems being gen'ed or templated with LLMs.
It's like asking "How long until a bulldozer [1] fully replaces a human construction worker?" A bulldozer is not a full replacement for a human construction worker. However, 1 worker with a bulldozer can do the work of 10 workers with shovels. And 10 workers with bulldozers can build things no amount of workers with shovels can.
[1] I'm using bulldozer as shorthand for all automated construction equipment - front end loaders, backhoes, cranes, etc.
New tech drives job changes. Always has, always will. No guarantee that the changes are good for every individual. And there’s no guarantee that they are good in the aggregate, either.
Never? Google will probably reduce their SWE headcount more and more, assuming we get exponentially better at LLMs or something better comes along, but SWEs won't ever be fully replaced.
Yes, I'm a paying customer of all three. Sometimes, for fun, I start a task with the same prompt across all three. Within three interactions Gemini starts giving woke lectures and provides worse or unusable results, while the other two shine.
The gradual, eventual, ultimate killer app of AI is to run the systems, debug apps like an SRE, manage warm datastores, and eventually write app code based on the desires, feedback, and requirements descriptions of users, design new programming languages/formats for itself, and later design silicon. The curious bit is how product managers will fit into this picture, perhaps to supervise what is allowed, only to eventually be removed. There's no conceivable job that can't be eaten partially or mostly by AI, if not nearly entirely eaten in the future.
At some point on the technological "singularity" timeline, there can and will be fully-autonomous corporations. The question is: with corporate personhood, can a corporation exist, pay (some) taxes, reinvest in itself, and legally function without any human owners? This may also be contingent on how a corporation conducts legal and business activities, i.e., whether they are performed by a human agent or through some automated means, even if directed by AI management. IANAL, but I guess an autonomous corporation could be sued, and perhaps even the creators of the software used to create the AI that runs it could be potentially at risk.
In Go it’s not uncommon to use code generation to recreate boilerplate code, especially before the introduction of generics. No human looks at this usually. And if they do, they find the code they’re looking for contained in a few files. I personally found this pattern pretty good and easy to reason about.
Well, someone (a human being) still maintains it, and ultimately someone will likely find the code unmaintainable even if LLMs help. If you use ChatGPT enough you would know it has its standards as well, actually pretty high ones. At some point the code will likely still need to be refactored, by a human or not.
It’s really not a problem at a certain point. Also, we’ll probably have “remove and replace” but for software in the next couple years with this stuff.
After reading this I'm wondering how the indy code autocomplete tools are going to be able to compete longterm with this giant feedback rich data machine Google has built...do engineering orgs of sufficient scale ultimately hoard their tooling for competitive advantage, thereby leaving independent players to cater to developers outside of Google? Feels like yes...but plenty of inventions trickle out in various forms.
I’m not seeing any evidence that any of this is actually good for Google’s business. Their observations that half of code checked in is from suggestions by an LLM is not really surprising in a regimented dev platform with tons of boilerplate. That stat tells us nothing about actual code quality, development velocity, or skill curves over time, much less business impact.
What product of Google’s has been improved by this feedback loop? The trajectory of Google search itself in the past year has been steadily downhill. And what other products of theirs I use don’t really change much, at least not in positive ways. Gmail is just gradually being crushed from every side by other app widgets scrounging for attention. Chrome has added… genAI features and more spyware? Great.
Yeah I kind of agree that when LLMs work REALLY well for autocomplete of your codebase -- that might be an indication that the language and library abstractions you use don't fit the problem very well.
Code is read more than it's written. And it should be written to be read.
If you are barfing out a lot of auto-completed stuff, it's probably not very easy to read.
You have to read code to maintain it, modify it, analyze its performance, handle production incidents, etc.
> If you are barfing out a lot of auto-completed stuff, it's probably not very easy to read.
From my experience using LLMs, I'd guess the opposite. LLMs aren't great at code-golf style, but they're great at the "statistically likely boilerplate". They max out at a few dozen lines at the extreme end, so you won't get much more than class structures or a method at a time, which is plenty for human-in-loop to guide it in the right direction.
I'm guessing the LLM code at Google is nearly indistinguishable from the rest of it for a verbose language with a strong style expectation like Java. Google must have millions of lines of Java, and a formatter that already maintains standards. An LLM spitting out helper methods and basic ORM queries will look just like any other engineer's code (after tweaking to ensure it compiles).
If you already apply a code-formatter or a style guide in your organization, I'm guessing you'd find that LLM code looks and reads a lot like the rest of your code.
Yes, it can make stuff that fits in the rest of the codebase
But I am saying it's not going to ever make the code significantly better
In my experience, code naturally gets worse over time as you add features, and make the codebase bigger. You have to apply some effort and ingenuity to keep it manageable.
So if everyone is using LLMs to barf out status quo code, eventually you will end up with something not very readable. It might look OK locally, but the readability you care about is a global concern.
What you say tracks - but I'm wondering what happens if they manage to unlock some meaningful velocity increase to the point where they can begin tackling other domains and shuttling out products at a higher rate...Agree with your thoughts on search - it's f'ing unusable now and frustrating to look at.
An extrapolation of these makes me think of programmers becoming like Tony Stark telling Jarvis what to do, and it gets done. Or Minority Report-style interfaces, where people state their intent (prompt?) and get the answers.
Happy for them to experiment with the tools, but I am afraid it will have a more negative effect through others blindly copying without reading between the lines.
I am still struggling to find where the main selling point of using LLMs for code lies. For boilerplate this is a very inefficient, although interesting, approach, and it does not help me do the thinking. Well-written, language-specific boilerplate generation in most IDEs achieves the same effects just fine.
I'm looking forward to the day that some spicy autocomplete regurgitates an obvious chunk of AGPL code that it's stolen without permission or attribution - and it ends up in some critical part of Google's money-printing machine, and the outside world finds out about it.
I wouldn't be surprised if the sole training data for autocompletes was google3. It's an absolutely massive codebase, using the libraries and patterns Googlers use, and more or less entirely safe to train on. Any training data beyond that would be whitelisted by legal.
That was my assumption also. I worked at Google as a contractor in 2013 on an AI-related project, and their entire internal development environment was so much fun to use, and massive. Their internal coding world is an ideal and probably sufficient source of end-stage training data, given early training on the usual data sources to build language understanding first. They wouldn't train on just their internal data exclusively.
People who work at your bank right now are pasting your personal details into language models and asking if you deserve a loan. People will figure out how to get this data back out.
"But the model only uses old training data" there are myriad lawsuits in flight where these companies took information they shouldn't have, in all forms. Prompt engineers have already got the engines to spew things that are legally damning.
A real hack, which might archive user inputs as well as exfiltrate training data. We're only beginning to imagine the nightmare.
Can people use these language models to get private Google info? Only if Google was dumb enough to include that in the training. Hint: Yes, it's much, much worse than anyone imagines.
And? For big companies, lawsuits are mostly about being forced to pay money. Rarely does losing a case drive real behavioral change, much less fundamental change.
IP getting coughed out of an AI will keep some lawyers busy and result in fig leaf changes.
The outcome of any IP lawsuit is going to include an order to cease use of the IP. Even a DMCA notice requires cessation.
If the infringing code powered a widely used service, replacing the code with a clean room refactor could be arduous and time-consuming and might require a court-appointed monitor to sign off.
If there's any risk of this happening, it's probably because someone imported AGPL code into the codebase, outside of //third_party, after removing the copyright notice / license. (We have an allowlist of acceptable third-party licenses; I would not expect it to include copyleft licenses like AGPL.)
We nominally have controls against this sort of thing, but if it's only weakly-enforced policy, then unfortunately it's easy enough to bypass.
You'd be surprised at how many "Make money teaching AIs to code!" job postings there are out there now. Taking "training your replacement" to whole new levels.
I work at Google, this stuff is available, entirely optional, and in fact the most recent stuff is still something you have to sign up for and get on an opt-in basis in return for providing feedback/answering surveys.
I have AI suggested edits turned off. It's the 4th setting down in the settings menu. AI suggestions in Cider can be disabled. And all of these things are web apps for which googlers have a rich history of creating Chrome extensions to change things they don't like or disagree with.
I don't feel it's obvious at all that this will become forced. What would that even mean – you don't get to edit code anymore but can only interact via chat with an AI which edits the code for you? I think it's obvious that's not going to happen.
Will there be a day when you can't turn off AI suggestions for something? Maybe, but suggestions aren't requirements, no one is forced to use them so at worst they're a minor UI annoyance.
I'm interested to hear why you have such a negative take on it? I don't particularly enjoy most AI implementations in products (not Google specific), but they're almost entirely optional everywhere, and ignoring them really isn't much of a problem.
You can turn off giving AI suggestions, not receiving them.
Not all AI affordances in cider can be disabled. There are plenty of people complaining about this on yaqs and buganizer.
> no one is forced to use them so at worst they're a minor UI annoyance.
This attitude is exactly the problem with Google and is why Google’s AI rollout has been terrible thus far. Luckily the stock is still going up, so whatever I guess.
You don't receive AI suggestions in code review, you receive suggestions from your reviewers. If they are providing bad recommendations, regardless of the source, I suggest providing them feedback.
I agree this may not be how they are used, but honestly, code review is a skill and many people are bad at it. Blindly suggesting things because an AI (or a presubmit) suggests it is bad form regardless of the source. People slam that "please fix" button without looking at what it's asking for, or without reading my detailed explanations of why I'm ignoring a recommendation, etc. This was already a problem and AI isn't changing that.
We can make code review better, but that comes through training reviewers better, not by ruling out AI tools just because they're AI tools.
> You don't receive AI suggestions in code review, you receive suggestions from your reviewers
If a reviewer makes a comment, Critique will create an AI suggestion based on their comment, even if they didn't explicitly ask for one (unless they turn off giving AI suggestions on their end, but there's no way to stop seeing them on the receiving end).
> not by ruling out AI tools just because they're AI tools.
This was not my point at all.
The point is that Google force feeds AI tools, and makes it difficult to opt out. In many cases it is not possible as shown with GenAI search.
Word of advice from someone who left: if you've reached the point you're trashing work for a root-level OKR that is considered existential, taking an absolutist exaggerated stance against it, and discussing fine-grained details of internal tools, while claiming they go against PR/"research papers"...you're well past the right time to leave.
We all have unique circumstances, but I can almost guarantee you that you'll be absurdly happier elsewhere.
Life's short, and no matter how much you save, something can take it away.
Better to start pursuing it now, than after being the sacrificial MI, or after the call from HR asking you to chill because a VP got upset and had lackeys reverse engineer your identity.
> We all have unique circumstances, but I can almost guarantee you that you'll be absurdly happier elsewhere.
Thanks for your insightful contribution - I read the post again and found this right after what you found, like, right after. I hope this helps clarify
Guarantee with what? Personal money? I know people who have more than 3+ years of experience having trouble with getting an offer after months of searching these days. What can you offer to guarantee the "happiness" "elsewhere"?
What would it mean for code suggestions to be non-optional? Like, you can't edit the code file yourself but have to talk to a chatbot to ask it to make the edits for you? I think that's fairly obviously a ridiculous notion.
I could envision some point in the future where the tooling is good enough that if you reject the suggestion, you had better have a very good reason for doing so. We're already there on the formatting front, as well as the widely-enabled clang-tidy checks (e.g. pessimizing moves). Once a tool consistently meets that high bar, it's irrelevant whether its suggestions are derived from static analysis or a LLM.
As far as being non-optional: a code owner could very much refuse to approve your change until it conforms to their standards for their code base.
ML code suggestions at Google do not inspire that high level of confidence in me today, but I have no reason for thinking that will always be the case.
I see your point, and maybe you're right, but I do think formatting tools are materially different. You can define correctness for a formatting tool and they're understood to be about enforcing a consistent style. That doesn't apply to code more generally where there's more open to interpretation, other "style" to consider, program behaviour, etc.
Also, I would say that automatic formatters weren't popular until the mid-2010s from what I've experienced, despite being technically possible since pretty much the advent of programming. I even remember having to push hard for adoption of them in ~2018. Even if the AI tools were at that level (and they're definitely not yet, any of them), it could easily take a decade for it to become the norm or for it to be mandated.
That's not true, I'm a xoogler as of October, and at least 2 of my ex-colleagues continue to generally wonder if AI can write code or not, and if it can, they haven't tried it. Last update 60 days ago.
It does look like there's an auto-installed cider extension, which is fine, the worst case for this stuff is "it's in my autocomplete list" -- that's fine!
The AI suggestions they added to our editor and review tool are pretty uninvasive. The review suggestions are right 60% of the time, but the reviewer can just turn off the bad suggestions. The autocomplete is 90% good, but it isn't very ambitious. There's also a free text LLM chat thingy which can act on selected code, but I only ever tell it "fix" or "extract function" and for those use cases it's quite good.
Google has been really lost here over the last several years. When LaMDA and Bard were already killer at producing reasonable code and OpenAI had scarcely released anything, there was an explicit embargo against using them internally.
When Blake Lemoine was talking nonsense about LaMDA being conscious in June '22, these internal models were already killer at producing code. ChatGPT wouldn't be released for another 5-6 months.
As others have mentioned, unless you have a strong conscience and really know what you're doing, it's far too tempting to just accept AI-generated suggestions without really thinking, and IMHO losing that understanding is a dangerous path to go down. AI can only increase quantity, not quality. The industry desperately needs far more of the latter.
I felt this myself as well when I tried to use Copilot, especially later in the day. I still use it for some boilerplate code / building visualizations that feel like boilerplate, but turned it off for any really important code. At the moment I find most value in AI in discussing design decisions and evaluating alternative approaches to mine. There it really has had a huge positive impact on my workflow.
It's a concern I have too, when I get tired, I start to just delegate to co-pilot suggestions as I get desperate, if I didn't have co-pilot, I'd probably just log off for the day.
I actually don't really use copilot as I didn't find it that helpful, so I don't really have the problem anymore, but I could see it was a danger. Bit like driving when tired.
I look at it as asking an intern to do some work that I don't have time for. Do I have to check their work? Yes. Might I have to correct and guide the outcome? Again, yes. Am I going to ask them to implement something novel and groundbreaking? Not really, that'd be a disaster unless they are a prodigy. None of that removes my capacity, or introduces any kind of danger.
I find this absolutely nothing like delegating to an intern personally. Interns usually do their best because they will be held accountable if they don't.
I don't know how to articulate it, but whenever I hear the financial analysts talking about how much work AI is going to do for us, I just have this spidey-sense that they're severely underestimating the social aspect of why anyone tries to achieve a good outcome.
They think they can just spend $100,000 on GPUs and get 10x the output of someone buying a house and raising kids while getting paid a six-figure salary.
At work I already had teammates who would grant LGTM to code they barely read, but which looked fine after reading the description and hopefully also skimming over the code.
I feel that this path of least resistance/effort will also apply to things that an LLM spits out, as those are highly likely to look correct at the surface level.
I disagree. I don't want to go into the docs to understand specific syntax or options of a library. Just let me write in natural language what I want and give me the result
If I can't tell based on the code what it's supposed to do, then it's a shitty library or api.
Not to be rude but I have a hard time relating. Code is its own meaning, and to read it is to understand it. The only way you can use code you don't "understand" is to lack understanding of the language you are using.
In my almost 30 years in the industry I've run into plenty of people who claimed they knew what code did by reading it, but in practice every single one of them turned out to be only another human flummoxed by unexpected runtime behavior.
I never found my experience that simple, even at times when I was paying attention, which is always in short supply when you need it most.
Libraries are the biggest pain point. You don't know what a function is really doing unless you have used it before yourself. Docs are not always helpful even when you read them.
Lots of assumptions that may not be totally wrong, but not right either. In C++, using [] on a map to access an element is really, really dangerous if you haven't read the docs carefully and assume that it does what you believe it should do.
> using [] on a map to access an element is really, really dangerous if you haven't read the docs carefully and assume that it does what you believe it should do
It's not that bad, as it just inserts a default-constructed element if it's not present. What would you expect it to do, such that it returns a reference of the appropriate type (such that you can write `map[key] = value;`)? Throw an exception? That's what .at() is for. I totally agree that the C++ standard library is full of weird unintuitive behavior, and it's hard to know which methods will throw exceptions vs have side effects vs result in undefined behavior without reading documentation, but map::operator[] is fairly tame.
Meanwhile, operator[] to access an element of a vector will result in a buffer overflow if not appropriately bounds-checked.
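A tiny example of the behaviors being described here (standard C++, nothing exotic):

```cpp
#include <iostream>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

int main() {
  std::map<std::string, int> counts;

  // map::operator[] default-constructs a value for a missing key and returns
  // a reference to it. After this line, counts contains {"missing": 0}.
  int n = counts["missing"];
  std::cout << n << " size=" << counts.size() << "\n";  // prints: 0 size=1

  // .at() throws instead of inserting.
  try {
    counts.at("also-missing");
  } catch (const std::out_of_range&) {
    std::cout << "no such key\n";
  }

  // By contrast, vector::operator[] does no bounds checking at all:
  // out-of-range access is undefined behavior, not an exception.
  std::vector<int> v{1, 2, 3};
  // v[10];     // undefined behavior if uncommented
  // v.at(10);  // throws std::out_of_range instead
}
```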
Something akin to `std::optional` would have been great. These days, folks don't come to C++ from C and are not used to such under-the-hood tricks.
The committee decided that since the reference can't be null, we'll insert something into the map when someone "queries" for a key. Perhaps two mistakes can make a right!! It is hard to make peace with it when you are used to a better alternative.
> These days, folks don't come to C++ from C and are not used to such under-the-hood tricks.
It's funny to see that, since I'm used to folks coming to C++ from C and bemoaning that C is a much more straightforward language which doesn't engage in such under-the-hood tricks.
> Something akin to `std::optional` would have been great.
So, map.find(), which returns an iterator, which you can use to check for "presence" by comparing against map.end()? (Or if you actually want to insert an element in your map, map.insert() or, as of C++17, map.try_emplace().)
Again, the standard library is weird and confusing unless you've spent years staring into the abyss, and there are many ways of doing subtly different things.
C++ does not have distinct operators for map[key] vs map[key] = val. Languages like JS and Python do, which is why they can offer different behavior in those scenarios (in the former case, JS returns a placeholder value; Python will raise an exception). But, that's only really relevant in the context of single-dimensional maps.
Autovivification is rather rare today, but back in the early 90s when the STL first came about (prior to C++'s design-by-committee process, fwiw), what language was around for inspiration? Perl. If you don't have autovivification (or something like it), then map[i][j][k] = val becomes much more painful to write. (Maybe this is intended.) Workarounds in, say, Python are either to use a defaultdict (which has this exact same behavior of inserting elements upon referencing a key that isn't present!) or to use a tuple as your key (which doesn't allow you to readily extract slices of your dict).
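A short sketch of the non-inserting alternatives mentioned above, plus the nested-map convenience that operator[] buys you:

```cpp
#include <iostream>
#include <map>
#include <string>

int main() {
  std::map<std::string, std::map<std::string, int>> nested;

  // The "autovivification" convenience: operator[] creates the intermediate
  // map on the fly, which is what makes nested[i][j] = val easy to write.
  nested["outer"]["inner"] = 42;

  // Non-inserting lookup: find() returns end() when the key is absent, and
  // nothing gets added to the map.
  if (nested.find("absent") == nested.end()) {
    std::cout << "not present, nothing inserted\n";
  }

  // C++17 try_emplace(): insert only if the key is missing; reports whether
  // an insertion actually happened.
  auto [it, inserted] = nested.try_emplace("outer");
  std::cout << it->first << " inserted=" << std::boolalpha << inserted
            << "\n";  // "outer inserted=false": the key was already present
}
```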
I have seen with my own two eyes programmers push code they didn't read because it produced the expected output, whether via ChatGPT or Stack Overflow: copy, paste, run, tests passed.
If there's anything in this press release to justify how "characters added by AI" is any truer a reflection of quality than commit count is of productivity, I didn't see it.
It's a short release and I read it twice, so if it was there I feel like I'd have noticed.
I haven't found it particularly smart, at least in its GitHub Copilot incarnation.
Per the metrics I added to the integration when I began to trial it, I accepted about 27% of suggestions.
I didn't track how many suggestions I accepted unmodified, because that would have been orders of magnitude more difficult; I would be fascinated to see Google's solution to the same problem documented, but doubt strongly that I will. I'm sure it's entirely sound, though, and like all behavioral science in no sense a matter of projection, conjecture, interpretation, or assumption.
I turned off the Copilot integration months ago, when I realized that the effort of understanding and/or dismissing its constant kibitzing was adding more friction on net to my process than its occasions of usefulness alleviated.
I do still use LLMs as part of my work, but in a "peer consultant" role, via a chat model and a separate terminal. In that role, I find it useful. In my actual editor, conversely, it was far more than anything else a constant, nagging annoyance; the things it suggested that I'd been about to write were trivial enough that the suggestion itself broke my flow, and the things it suggested that I hadn't been about to write were either flagrantly misguided for the context or - much worse! - subtly wrong, in a way that took more time to recognize than simply going ahead and writing it by hand in the first place would have.
I've been programming for 36 years, and it's been well more than two decades since I did any other kind of paying work. The idea that these tools are becoming commonplace, especially among more junior devs without the kind of confidence and discernment such tenure can confer, worries me - both on behalf of the field, and on theirs, because I believe this latest hype bubble ill serves them in a way that will make them much more vulnerable than they should need to be to other, later, attacks by capital on labor in this industry.
On a related note, Microsoft published a press release last year [1] where they seemed to suggest that the 30% acceptance rate of Copilot suggestions was a 30% productivity boost for devs.
> users accept nearly 30% of code suggestions from GitHub Copilot
> Using 30% productivity enhancement, with a projected number of 45 million professional developers in 2030, generative AI developer tools could add productivity gains of an additional 15 million “effective developers” to worldwide capacity by 2030. This could boost global GDP by over $1.5 trillion
They were probably just being disingenuous to drum up hype but if not they'd have to believe that:
1) All lines of code take the same amount of time to produce
2) 100% of a developer's job is writing code
This is kind of like answering “which software engineers are excited about Stack Overflow?” with “mediocre to bad ones”. It’s A) insulting, but more importantly B) wrong — if I’m, say, a web developer who is trying to learn graphics programming, of course I’ll need help as I build a mental model of the domain.