
When I talk to my Google Home, 50% of my brain power is engaged in predicting and working out how best to phrase something so that the "AI" understands what I mean, and the other 50% is used to actually think about what I want to accomplish in the first place. This is just about okay for things like switching lights on/off or requesting a nice song I want to listen to, but I could never be productive programming like this. When I'm in the zone I don't want to waste any mental capacity compensating for an imperfect AI; I want to be thinking 100% about what I want to code and just let my fingers do the work.

For that reason I think this will be less appealing to developers than GitHub may expect; otherwise I think it's a cool idea.




I think the biggest use case for this is accessibility. There are plenty of people who permanently or temporarily cannot use a keyboard (and/or mouse). This will be great for those users.

For the average dev, I agree this is more of a novelty.


I am highly suspicious of new tech coming in the guise of 'accessibility'. As someone going blind, a lot of things touted as good for me are cumbersome and bad.

Maybe this will be different, and that'd be neat. Though I just think having more ways to express code is neat. I also know the accessibility you're talking about isn't for blindness.

That being said I can talk about code decently well, but if you've never heard code come out of text-to-speech, well, it's painful.

I bring up the text-to-speech because if speech is input, it would make sense for speech to also be the output. Selfishly, getting a lot of developers to spend time coding through voice might end up with some novel and well thought out solutions.


For sight problems you are correct. But voice input is valuable by itself. I had chronic tendonitis in my wrists a few years ago. I looked into voice coding and it was difficult to set up. Fortunately for me I've been able to adapt with a vertical mouse and split keyboard.


You're looking at the product from your own point of view, and you are not the target group; it's that simple.


I do think there will be big advancements in the text-to-speech realm. I've noticed some ML projects imitating voices surprisingly well, and while it's not quite there yet, it's already a bit less grating than it was even a few years ago.


“I think there is a world market for maybe five computers.” - Thomas Watson

I bet if we use our imaginations, we’ll think of a lot of places where using voice to code could come in handy.

Personally, I’ve been waiting for it for a few decades.

The creator of Tcl has RSI and has been using voice input since the late 1990s:

https://web.stanford.edu/~ouster/cgi-bin/wrist.php

I thought we were really close 10 years ago when Tavis Rudd developed a system:

https://youtu.be/8SkdfdXWYaI

GitHub seems to be more high-level. It figures out the syntax and what you actually want to write.

This would help if you barely knew the language.

Time to learn Rust or Scala with a little help from machine learning.


> GitHub seems to be more high-level. It figures out the syntax and what you actually want to write.

To me, it looks like it's feeding your voice input to Copilot, which then generates the code output just as before. So the same strengths and weaknesses of Copilot apply (and you can probably mimic it locally with a voice input method you control: just dictate comments for Copilot).
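
If you wanted to try that locally, a minimal sketch might look like this (assuming OpenAI's open-source Whisper model for the speech-to-text step; the file names are placeholders):

    # Dictate a comment, append it to the file you're editing, and let Copilot
    # propose the completion in your editor as usual.
    import whisper  # pip install openai-whisper

    model = whisper.load_model("base")
    result = model.transcribe("dictation.wav")  # a recorded voice note

    with open("scratch.py", "a") as f:
        # Copilot treats a trailing comment as the prompt for its next suggestion
        f.write("\n# " + result["text"].strip() + "\n")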


> “I think there is a world market for maybe five computers.” - Thomas Watson

That statement was probably never made. The closest thing to it came about 10 years after the quote is usually supposed to have happened, and it was about a single model of a single machine: https://geekhistory.com/content/urban-legend-i-think-there-w...


As a new dad, I would love to have the voice-to-text accuracy and speed I get on my Pixel phone on my desktop OS. Done right, I could easily see myself using it, and not just when I have my youngling in one arm, as I've been WFH for the better part of the last 6 years.


This looks to be much more heavily using GPT-3/Codex/Copilot, which I've found to be eerily effective. It basically feels like a voice interface to Copilot. The main difference between these and something like Google Home is how effectively they pick up on context. "Hey GitHub" would be able to use all the code in the file as context, so when you say "wrap this in a function", it'll have an idea of what you mean, without that function having to be explicitly programmed. Voice assistants have to _always_ be in a voice space, so context is very limited. And generally the way Google Home-style voice assistants are created is by programming specific actions linked to specific phrases. ML helps make the phrase matching flexible, but the action is usually entirely explicitly coded. Using Codex would let the action be ML-influenced as well.
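
Roughly, the difference looks something like this (a toy sketch with hypothetical names, not either product's actual code):

    # Google-Home-style: ML makes the phrase matching flexible, but every action
    # is hand-coded in advance; anything outside the table just fails.
    INTENTS = {
        "turn on the lights": lambda: print("lights on"),
        "play some jazz": lambda: print("playing jazz"),
    }

    def assistant(utterance: str) -> None:
        action = INTENTS.get(utterance.lower())
        if action:
            action()
        else:
            print("Sorry, I can't do that.")

    # Codex/Copilot-style: the current file is the context, and the "action" is
    # whatever code the model generates for the spoken instruction.
    def build_prompt(file_text: str, instruction: str) -> str:
        return file_text + "\n# " + instruction + "\n"  # this goes to the model

    assistant("turn on the lights")
    print(build_prompt("def add(a, b):\n    return a + b", "wrap this in a function"))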

If Copilot is any indicator of effectiveness, then I have high hopes for this! I've always wanted to program while stationary biking :)


I think yes, this could be a real multiplier for seniors: you're doing something you've done lots of times before, just a bit different, and you know pretty much everything you need to do, so you describe it until it's in a state where you can go through and finish it off. Exactly like a stationary bike, or out in the garden with your kid type thing.

IF the voice recognition is any good, of course. But maybe it can also be better than typical voice recognition because the syntax is limited: when programming I use a much more limited vocabulary than when writing literary criticism. So while speech-to-text is total crap at handling complex literary phrasing, it might be adequate for programming structures.


I’m a senior/systems architect coming down with bad carpal tunnel, and this sounds like a godsend.


Around 1998 I broke my collarbone and had to use Dragon Dictate.

I found that for general subjects it was quite difficult to use because of the fairly poor recognition rate.

But when I talked about computers, it got almost everything right. I assumed it must have been trained by the developers, who talked about computers mostly.

This is another special purpose vocabulary, so it seems as if it would have a good chance of a high recognition rate.


It’s most likely just Cortana bolted on to Copilot.


GitHub Next here! Just clarifying that this is not the case.


Then is it Whisper feeding into a command tree sort of thing?


I don't use voice assistants any more due to privacy concerns, but I wrote some similar software in the 2010s. I'm fluent in English, but with the current tech, the success rate when I give commands to a machine is still 50/50.

> I could never be productive programming like this.

It's likely to work much better than a generic speech-to-text model due to fine-tuning.

Plus, consciously or not, we will adapt our human language to the English-ML "pidgin" (e.g. by introducing more efficient grammatical structures and using a specific subset of vocabulary).

The way I see it is that it's not much different from giving commands to your dog, writing a Google query, writing a Stable Diffusion prompt. It'll get better. Manual input is not as fast as speech though and that's where I see the issue.


I am happy to take a severe deficit over not being able to work at all. When my back was acting up, I could not physically use my left side. Dictation was the ONLY way I could code. By the end of this period, my output was back up to 95% of my typed output - especially as I don’t type code nearly as fast as I do general language writing.


GitHub Next here! We would love to hear more about your experience. Please help us out by signing up for this experiment :)


The voice interface experience (in general) so far is like trying to make a really stupid person do something for you. Out-of-context misunderstandings are the worst, because they break your flow while you try to understand why they happen and how to fix them.

I imagine that voice-to-code would be like standing over the shoulder of a junior coder who knows the syntax and just enough techniques to follow orders, but has no idea what they're doing, and when they get it wrong, they'll be very wrong.


"Writing is thinking. To write well is to think clearly. That’s why it’s so hard." ~David McCullough

This holds not only for literature but also for programming. As for the hard part, I would argue that's exactly why it is not called "talking is thinking".


"If you're thinking without writing, you only think you are thinking." -Leslie Lamport

Even though speech recognition accuracy is really high now, I wonder how many authors use speech to write articles. The comparison may make sense, and I think there are few.


I think there's a difference between communicating your intent to a machine, which is hopeless since it has no model of intention; and commanding a machine to reproduce something.

I.e., when you're managing your house you want something that can be communicated in an infinite number of ways, but the "AI" accepts only a tiny, finite set of ways.

However, when programming, it seems like we aren't asking the machine to "write a function to do X", but rather saying, "def open-paren star args...."

This seems like a pretty trivial problem to solve.


> However, when programming, it seems like we aren't asking the machine to "write a function to do X", but rather saying, "def open-paren star args...."

Click the link first and take a look at what is being showcased, because your comment is the exact opposite of what they demo when you visit the HN link.


You're right... So, yes, it will be largely useless (as shown) for actual programming.

But I suspect there'll be a subset of its features consistent with my comment that will be actually useful.

Programming, via Naur/Ryle, is always a kind of theory building. And unless you're basically copy/pasting, it's a novel theory of some area (a business process, etc.).

That's something where intentions aren't even really communicable as such, since the art of programming is sketching possible theories as a way of finding out what we ought to intend.

So this is another gimmick with maybe marginal improvements at the edges.


It's really useful for those who have challenges typing (arthritis, disabilities, etc.), but perhaps not best for a general audience, as typing with autocomplete is faster.


For repetitive tasks like preparing the report in the demo, speaking is definitely faster than typing. It's quite impressive if your boss asks you to prepare one and the report is done in less than two minutes.

However, I too really doubt there are any better use cases beyond simple tasks, not to mention that everyone in the office would hear what you ask the AI to do. Oh my, how embarrassing that would be!


The assistants (Google, Alexa, Siri) are not great at NLP. Compare how you speak to them vs. speaking to an LLM like GPT-3; there is a world of difference. The latter feels like speaking to a human, the former more like you're trying to get your voice commands into a state machine.


Blind people are already very productive using voice-to-code.


There may well be examples of this, but while the blind developers I have known (a small sample, I admit) typically use screen-reader technologies to navigate and read code, they use a keyboard to write and edit it.


I don't disagree with that. I just meant I don't think it's going to have mainstream appeal. A wheelchair also makes a disabled person super productive if the alternative is not being able to go anywhere at all, but it doesn't make wheelchairs super appealing to people with two healthy legs, if you see what I mean.


I think this is great for: a) people who are visually impaired or have issues with their hands/fingers, and b) people who aren't programmers; if you could make it more Scratch-like, this would be an amazing tool for showing off the power of programming.


The mental load would decrease very quickly with practice.


They're creating a new job, "prompt engineer", to replace the engineers. This is 2022.


The rise of the... t...talking monkey? cognizes intensely



