I was considering learning Plover and even bought a keyboard to learn, but that was about the time that Otter.ai and other speech-to-text tools started becoming excellent. So I thought, what’s the point?
Can anyone convince me? I may still like to learn, but consider that it’s much more difficult than touch typing. Each word has its own combination that must be memorized (or sometimes a series of words). Think of the difference between learning the Roman alphabet vs learning Chinese characters. There is seemingly no end to it.
I find it extremely cringy to talk to a machine and get very annoyed at correcting transcription errors in the middle of a sentence or paragraph after it was transcribed. If you don't find it cringy to use voice assistants I guess you're just a different type of person and would prefer talking instead of typing.
I can sympathize with the dislike of voice assistants as they are pretty universally crap. But recording yourself monologuing and having it AI-transcribed can feel really natural.
I still default to typing, but whenever I'm feeling writer's block, I'll just start thinking out loud and record it to get started and it works quite nicely.
Yeah I'm not sure if I'm just weird or what, but from the start with voice assistants I've been utterly confused that people want to talk to machines using a shitty interface and broadcasting to others what they are doing (not even talking about the NSA, just people in my vicinity). Imagine being in my room, saying out loud each URL I wanted to visit. It feels as cringy as if instead of turning a page silently in a book while I'm sitting in my room reading, I would instead disrupt that silence and focus by telling the book "TURN PAGE". That's how it feels to me when I hear someone say "OKAY GOOGLE", I almost physically recoil.
It's like you have a nice silent interface that will be perfectly interpreted, and decide instead to be loud and imprecise and have to guess if the machine is going to understand you. I think maybe rather than cringy another word would be that it feels in bad taste. It's half about the "style" and half about the lack of "function" of the medium. It's effectively worse in terms of reliability and you look like someone that doesn't know how to use a computer all at the same time.
Ah, I get what you mean. For any interaction, I fully agree with you. I would hate to be in a room with someone who is interacting with their machine through voice.
I was reading the parent of your post as using these speech-to-text tools for dictation, not for interaction. I think they can be quite useful for dictation (if they can interpret your voice well).
"Okay google" is so gross.
Totally the apex of imposed advertising.
Physically recoil in fact, like a bitter taste in my mouth.
However, as opposed to some assistant turning imaginary 'pages' in .epub, sitting alone in focused dictation is a really great way to explore your mind and have free roaming thought processes.
It is also not new in any sense, e.g., Dostoyevsky dictated The Idiot to a stenographer.
It suddenly stopped working on my phone. Somehow without my explicit direction it has switched to "Hey Google".
Even though my smartphone usage is minimal, something like "Hey Google ... Drive to 3 Bullshit Ave" is very convenient and causes zero sensory recoil. But I do not live my life on the phone either.
I wouldn't use the word cringy myself, but I think I get where they're coming from. Reading the output of speech recognition a few seconds behind is jarring. Constant game of anticipation. Will it get that name or term right? Nope. Now I have to pause dictation, go back, and fix it. It's more involved than tapping backspace a couple times, the feedback is less instant, and that means more things I have to keep in my working memory, which is non-existent.
Putting on my narrator’s hat for a moment, I’d also point out that speaking clearly for hours on end is hard, and not something you can do without preparation (copious amounts of water, a room with little background noise, a good microphone, etc). Any of the preparation that’s skipped makes it harder to speak clearly and be understood.
Sometimes you need to both type and listen to other people talking, so talking over them would be pretty rude. Also, dictating with speech-to-text in a public place would really rob you of privacy.
Honestly for me steno is about ergonomics. It seems like steno should be harder than normal typing but you’re making so many fewer strokes that it takes a ton of strain off of your hands. Typing faster is a happy side effect.
You don't need to memorize each word's combination, although you eventually will as you type them frequently enough. It's shorthand for typing, where you can transcribe any word. You are chording syllables and in many cases, entire words, but you are not spelling each letter of a word. I am no steno expert, and my speed is not there yet, but I am benefiting from less finger movement and stress. It is also neat to rewire your brain to do something different. I am learning Colemak layout as well for touch typing.
Otter.ai is great, but it still makes mistakes, and it tends to make them in domain- or project-specific terminology, which is typically the most important to get right. That means you need to proofread and fix the transcripts, which can take as long as the meeting itself. If there's a way to make an accurate transcript on the fly, then it's better than voice to text.
I also picked up a basic steno keyboard earlier in 2022 (the EcoSteno, I think). I'm no good with it yet (I'm still working through layout drills, honestly), but for me, the draw isn't in transcription or text input specifically, but in chord-based control over my computer more generally.
A standard computer keyboard layout has ~100 keys. You can use various modifier keys (Ctrl, Alt, Shift, Meta) and combinations thereof to assign multiple meanings to each key, which morally organizes the keyboard into layers depending on which modifiers are active, but most layers are inconvenient to reach -- anything beyond two modifiers gets annoying and sees limited use. You're almost always on the unmodified layer.
A stenotype board has only ~20 keys, but the unit of input is an entire set of keys rather than a single key. In principle, you can comfortably enter chords of up to 10 keys at once, giving ~184,756 (20 choose 10) inputs. This is modified a bit by ergonomics, but it's still orders of magnitude more possibilities than an idealized keyboard with modifiers (something like 1,600).
That kind of space for addressing commands begs for some kind of principled organization. The Plover community calls assignments of commands to chords "dictionaries", and they generally follow an internally-consistent set of rules called a "theory". If you're working with English input, for example, you'll learn a theory that lets you almost always reason out the chord for a word.
There's nothing that limits stenographic input to transcription, though. You can assemble, say, a dictionary of Emacs commands, and assign related commands chords that share a common subset of keys. (Emacs is kind of like this already IMHO, but I am not a fan of the modifier+key system -- it feels like the addressing space is too small, and I'm afraid to customize the default keymap.)
Moreover, you aren't limited to single-chord input either. Multi-chord input is common; you can easily define entries in the dictionary which are based on a sequence of chords. I believe (but lack the experience to confirm) that English dictionaries tend to be organized around syllables or syllable clusters; the normal English stenotype layout specifically has sections for initial consonant, vowel, and terminal consonant, and theories tend to organize around that structure. Again, there's no reason you can't apply the same tools to non-transcription inputs.
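To make the "dictionary of chord sequences" idea concrete, here is a rough sketch in Python. It is not Plover's actual engine or file format: the DICTIONARY contents, the chord spellings, the "emacs:" command names, and the translate helper are all made up for illustration. It only shows how a table keyed by one or more chords can be resolved with a longest-match lookup, so that a two-chord entry wins over a one-chord entry that shares its first chord.

```python
# Hypothetical toy dictionary: keys are tuples of chords, values are outputs.
# Chord spellings and command names are invented for illustration only and are
# not checked against any real steno theory or Plover dictionary.
DICTIONARY = {
    ("KAT",): "cat",
    ("KAT", "HROG"): "catalog",
    ("SAEUF",): "emacs:save-buffer",
    ("SAEUF", "A*L"): "emacs:save-some-buffers",
}
MAX_ENTRY_LEN = max(len(entry) for entry in DICTIONARY)


def translate(strokes):
    """Translate a list of chords using greedy longest-match lookup."""
    output = []
    i = 0
    while i < len(strokes):
        # Try the longest possible entry starting at position i first.
        for n in range(min(MAX_ENTRY_LEN, len(strokes) - i), 0, -1):
            entry = tuple(strokes[i:i + n])
            if entry in DICTIONARY:
                output.append(DICTIONARY[entry])
                i += n
                break
        else:
            # No entry starts with this chord: pass it through, marked as unknown.
            output.append(strokes[i] + "?")
            i += 1
    return output


print(translate(["KAT", "HROG", "KAT", "SAEUF", "A*L"]))
```

A real engine has to work incrementally as each chord arrives rather than on a finished list, so treat this batch version as a picture of the lookup idea only.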
I think this is a really cool input system, and for my interests, complementary to a regular QWERTY keyboard. I still want to learn a proper English theory (to avoid having to switch frequently between multiple keyboards!), but I mostly just want to have the option of chorded input in the first place.
I goofed on my calculations a little bit -- steno boards are designed so that one finger can depress two keys at once, so in principle you can actually hit all 20 keys simultaneously. You can't just hit any two keys with the same finger, though, so I don't believe it gets quite to 2^20 possibilities.
(Also, I computed only 20 choose 10, when you can of course have any smaller subset as well. I guess the point here is, a stenotype gives you even more than 184k inputs.)
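For anyone who wants to check the arithmetic in these comments, a quick sketch in Python. The inputs are the rough figures used above (about 20 steno keys, at most 10 keys per chord, about 100 regular keys, 4 modifiers), not measurements of any particular board, and the variable names are mine.

```python
from math import comb

STENO_KEYS = 20   # approximate key count on a steno board, per the comments above
MAX_CHORD = 10    # one key per finger

# Chords that use exactly 10 of the 20 keys.
exactly_ten = comb(STENO_KEYS, MAX_CHORD)

# Chords that use anywhere from 1 to 10 keys.
up_to_ten = sum(comb(STENO_KEYS, k) for k in range(1, MAX_CHORD + 1))

# Upper bound if every non-empty subset of keys were physically reachable.
any_subset = 2**STENO_KEYS - 1

# Idealized regular keyboard: ~100 keys under any on/off combination of 4 modifiers.
keyboard_with_modifiers = 100 * 2**4

print(exactly_ten)              # 184756
print(up_to_ten)                # 616665
print(any_subset)               # 1048575
print(keyboard_with_modifiers)  # 1600
```

The real number of usable chords sits somewhere below the 2^20 bound, since one finger can only cover certain adjacent key pairs.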
I would love to try this some time, and then I remember that I spend 90% of the time on my PC programming, and then I get sad and settle with my qwerty.
Yes, there is support for doing that with Plover, but it would be better to just use a normal keyboard. Just because it's technically possible to write Python with it, that doesn't mean that it is a good experience.
I’ve found that ZipChord offers the best trade-off between convenience (does not require N-key rollover, can continue to type normally on a QWERTY keyboard for most words) and speed. It lets you press a chord of several characters at once to type a word (e.g. I have “eml” set to type my email address and “bw” to type “between”).
The recent 2.0 beta release is a game changer. It has gotten really good at distinguishing character entry from chording (e.g. if you type need and hit the e and d key at the same time by accident, it won’t trigger your “ed” chord).
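I don't know how ZipChord actually tells chords from fast typing, so the following is only a guess at the general shape of the problem, sketched in Python rather than in ZipChord's own code. Everything here is hypothetical: the CHORDS table, the 50 ms MIN_OVERLAP_MS threshold, the (key, press_ms, release_ms) event format, and the interpret function. The assumption is that a chord means all of its keys were held down together for a minimum time, while rolled-over typing only overlaps briefly.

```python
# Hypothetical chord table; key order inside a chord does not matter.
CHORDS = {
    frozenset("bw"): "between",
    frozenset("eml"): "me@example.com",  # placeholder address
}

# Assumed threshold: all keys must be held together at least this long.
MIN_OVERLAP_MS = 50


def interpret(events):
    """Classify a burst of key events as a chord expansion or plain typing.

    events: list of (key, press_ms, release_ms) tuples for one burst.
    """
    # Keys in the order they were pressed, used for the plain-typing fallback.
    keys = [key for key, _, _ in sorted(events, key=lambda e: e[1])]
    # Time during which every key in the burst was down simultaneously.
    overlap = min(release for _, _, release in events) - max(
        press for _, press, _ in events
    )
    expansion = CHORDS.get(frozenset(keys))
    if expansion is not None and overlap >= MIN_OVERLAP_MS:
        return expansion
    return "".join(keys)


print(interpret([("b", 0, 120), ("w", 10, 130)]))  # held together: chord
print(interpret([("b", 0, 80), ("w", 60, 150)]))   # brief rollover: typing
```

With a rule like this, a short accidental overlap falls through to ordinary typing instead of firing a chord.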
I don't use ZipChord for writing code specifically - I have found existing code specific completion tools to be better suited for most of what I code. I do use it for writing code comments.
Of course I've removed a bunch of chords (people's names and email addresses, mostly, as well as some confidential stuff from my work). It's also specific to my keyboard (because keyboards without NKRO have different patterns of keys that can be pressed at the same time).
Perhaps, but almost every birder would look at you strangely if you pronounced it like pl-oh-ver. Plover rhyming with lover is universal amongst birders.
I've always said ploh-ver (rhymes with clover), but that comes not from any birding experience, but from drinking Andytown's Snowy Plovers[1]. I believe the staff there has always said ploh-ver as well, but it's been a while and I could be mistaken.
https://en.m.wikipedia.org/wiki/Steganography