Fix Terminals (leonerd.org.uk)
135 points by smartmic on Jan 27, 2021 | 65 comments



While I don't enjoy being negative, this doesn't sound like a solution that would work - it essentially needs 'boiling the ocean' i.e. convincing everyone else.

(That's literally the plan described in the article)

The first paragraphs of https://clojure.org/community/etiquette nail it: the software community is about making. If you want something done, make it.

What would a "maker-driven" solution look like in this domain? For example:

- make a terminal emulator that accepts the devised keybinding system

- make at least one popular program (e.g. Emacs) play fine with said terminal

- promote that combination, showing the world how the technology can in fact work. Word of mouth would do the rest.


You are certainly not wrong. In this specific case however, I'd like to point out that the author, leonerd, has been busily working on "making stuff" to this end for years now.

They are the lead author of libvterm, a popular modular terminal emulator library that is for example used in neovim and emacs-libvterm.

They have also been working on libtermkey, a library that accepts input from the devised keybinding system, now part of libtickit mentioned in the post.



Cool, so what vemv requested already exists with iTerm and neovim.


Any other terminals support it?


seems cool but it is macOS-only


Can anyone point me to an article that explains why we are still stuck with these standards from the 1970s? I understand backward compatibility can be important, but I don't quite understand why terminal standards are the ONE thing in the computing world that just can not move into the future.


Just be glad that it didn't go the other way with terminals being full HTML 5 and having to watch advert videos between command invocations.


i think a few years ago someone tried to get me to try some terminal app that was Electron based. not AS bad but still... barf.


> just can not move into the future.

It can. How would you like them to move into the future? It's not as if there's been no recent evolution in terminal emulators.

Notice that evolving into the future and using old ideas are not incompatible. Modern cryptography still uses Euclid's algorithm from more than two millennia ago.


Here's one example: Why can't I interact with text at the prompt in the way I do in every other text input context on my modern computer? Selecting text with shift and arrow keys, moving the cursor with the mouse.

Keep in mind that I do not deeply understand the stack of standards and implementations that comprise a terminal. That said, my assumption is that this complaint is an unfortunate, unavoidable byproduct of that stack.

While introducing newcomers to terminal usage, there are simple actions from the gui world that do not work at all, or even seem to break the terminal display. It's extremely confusing for them, they get over it eventually, and come to accept that the terminal just can't do certain things. This is what I mean by being unable to move into the future.

As a point of comparison, consider how much of a mess it is to deal with spaces in file names, in bash. To a newcomer, this seems like a crazy hassle, some antiquated nonsense. It's hard for them to understand how experienced Linux users can stand to deal with it. It's annoying, but it's a direct consequence of the choice to use the space character as the delimiter between tokens in bash. This choice is actually super convenient most of the time, because that is the best use of the space character/key in a terse shell language.
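To make the space-as-delimiter point concrete, here is a minimal sketch using Python's `shlex` module, which tokenizes strings with POSIX-shell-like rules. It is an approximation of how bash splits words, not bash itself, and the file name is made up for illustration.

```python
import shlex

# An unquoted space separates tokens, so a file name containing a space
# is seen as two separate arguments.
unquoted = shlex.split("rm my file.txt")

# Quoting or backslash-escaping the space keeps the name in one argument.
quoted = shlex.split('rm "my file.txt"')
escaped = shlex.split(r"rm my\ file.txt")

print(unquoted)
print(quoted)
print(escaped)
```

The first call yields three tokens, the other two yield two, which is the hassle newcomers run into.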

What is the analogous rationale behind the shortcomings of the terminal paradigm? What would we need to give up, in order to make the terminal interface a little bit more modern?


One of the problems with treating terminals like other text input contexts is that they're inherently different and you're going to constantly run into issues when the context switches to displayed text.

For example, you could get a terminal that allows selecting text with keyboard but what happens when a user inevitably wants to or accidentally selects text from the non-input part of the terminal? Should the input part of the terminal and the display part of the terminal be treated differently? In Firefox as I type this I get a nice big input box where I can do multi-line paragraphs but if I start clicking and dragging to select the text in the input box it'll never let me select text from your comment, likewise in reverse, but on a terminal this would be undesirable behaviour for the common pattern of selecting text to copy output for documenting or troubleshooting. Some terminals have plugins or scripts to allow such selection of text but not for input purposes, if you decide to only allow text selection on the input field as a means of improving text input do you just get two such methods of being able to select text? The sane approach might be to allow selecting all text but you'll end up with paper cuts where a user doesn't care about what is going on before the $ but might end up in situations where text is selected far before the $ because of a reverse search gone wrong, etc.

As an aside, terminals/bash do have some form of text-editor like functionality, readline is the usual library involved (man bash, /^readline) and has reasonable support for moving cursor around words, move cursor to character search, deleting/yanking/pasting words, etc. It's not the best text editing interface but for dealing with a single command line it's usually sufficient. There's even a vi-like editing mode (set -o vi) built into bash if you feel like you need a modal editor for a single line but it seems even less intuitive and harder to grok.

https://readline.kablamo.org/emacs.html

https://catonmat.net/bash-emacs-editing-mode-cheat-sheet

https://catonmat.net/bash-vi-editing-mode-cheat-sheet


> How would you like them to move into the future?

For starters: stop using in-band signaling.


That's a big "for starters."

In-band signaling — or rather, in-band markup — is a big part of what makes a "terminal" (i.e. a TTY/PTY device) semantically a "terminal": a hybrid character-grid / event-log that works both as a sink for streamed-in text, and as a "plotter" for making line-printer art. A device that both streams out as, effectively, an input-event log (where this log can be captured for replay, as with script(1) or most logging systems); but which also maintains a notion of being a ring-buffer "containing" a rectangularly-bounded volume of text (and control-events), such that clients just connecting onto it can begin streaming just from the beginning of that buffer, to end up with one complete "image" of the latest PTY state, minus any scrollback, without needing previous history.

All that is kind of predicated on control-characters being embedded in the text and "following" the text around, such that taking a slice of the text (as the PTY's ring-buffer does every time a line is expunged) will preserve the corresponding slice of events.

What would reading back the contents of a PTY device look like in a world with out-of-band TTY signalling? What would flow over a serial port? Would it even be a character-stream device, or are you imagining a TTY/PTY as operating in something more akin to a structured datagram event-stream mode, such that you'd use sendmsg(2) and recv(2) on it rather than write(2) and read(2)?

I mean, it's not an impossible dream; but that really is a "start over with a whole separate ecosystem that no existing software works with until made compatible" kind of change. Effectively it'd be a separate thing from TTYs, that just happens to have similar functionality. But it wouldn't support any existing software, or any existing hardware, except by virtualization (i.e. running a PTY emulator process inside your modern OoB-signalling terminal emulator.) Kind of like what Windows has been going through to replace its own command-line.

---

Personally, I'd prefer to keep in-band signalling (in the "in-band markup" sense above, not the "you have to recognize conventional escape-code sequences heuristically to even know they're not regular text" sense.)

But I'd rather just make the in-band signalling structured — i.e. to make TTYs into a data-stream containing a variable-length self-synchronizing bit-encoding with clear prefix separation for control- and data- packets.

Y'know, like UTF-8 is for text.
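As a rough illustration of what "self-synchronizing" means for UTF-8 (and only for UTF-8; no such framing exists for terminal control sequences today), the sketch below shows how a reader dropped into the middle of a byte stream can find the next character boundary. The helper names are hypothetical.

```python
def is_continuation(byte: int) -> bool:
    # UTF-8 continuation bytes always have the bit pattern 10xxxxxx.
    return (byte & 0b1100_0000) == 0b1000_0000


def resync(data: bytes, start: int) -> int:
    """Return the index of the first character boundary at or after start."""
    i = start
    while i < len(data) and is_continuation(data[i]):
        i += 1
    return i


data = "naïve → ok".encode("utf-8")

# Index 3 falls inside the two-byte encoding of "ï"; skipping continuation
# bytes lands on the start of the next character.
boundary = resync(data, 3)
print(data[boundary:].decode("utf-8"))
```

A structured in-band protocol with the same property would let a client that joins mid-stream tell control packets from data without heuristics.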

...or, well, speaking of Unicode: we could just use Unicode for this, reserving another block† of control characters to go with the 30-odd ones that sit at the beginning of the BMP. Then "is this a control codepoint, and if so, what does it mean" could just be answered by consulting a Unicode table. (In such a setup, CSI command parameterization would be accomplished with zero-width joiners, variation selectors, and other things. Just picture control-characters as invisible emoji — specifically like the flag emoji that are formed by spelling out country-codes in a sort of "flags meta-alphabet"; or like that family emoji [https://emojipedia.org/family/] with the combinatoric variants.)

† Why not use the Private Use Area? Because this would be an explicitly inter-compatible signalling standard, not a proprietary usage. It's not text, but it is a standard signal within a text document. Just like emoji — or like the existing control codepoints in Unicode.


Thank you. I appreciate the detailed explanation, but you really did sum it up well in the first sentence.

I have some frustrations with terminals, which I have always interpreted as being caused by their adherence to some old standards. After reading this comment, I think I can see how it's more the terminal paradigm itself that is responsible for some of these things.

Now, in the future I will be able to look at these frustrations in a new light, and hopefully understand a little better why it makes sense to continue using the terminal paradigm despite them.

This is something I've struggled to understand for a long time, and your comment, obvious though it may seem to some, is one of the first helpful answers I've seen.


It works okay, unlike most of the rest of the crap in computing. Leave it alone.


You're right! I just want to understand why it's stuck at ok, and has what seems like 50-year-old cruft that can't be touched.


Typing "echo <ctrl-v><esc>c<return>" to fix a garbled VT100-like terminal remains embedded in my brain from long ago, despite simpler things like "stty sane" existing. I don't know if maybe I was an admin on some box with a wonky stty, or why that's in my brain.


I just type `reset` (maybe it's a zsh thing)


I don't remember, though I worked on a lot of oddball stuff. 3B2, Pyramid OSX, DGUX, etc.


Is <ctrl-v><esc> the same as \x1b? Why are they the same?


Yes, 0x1b is an ascii <esc>.
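A small sketch of the bytes involved, assuming a VT100-style terminal where ESC followed by "c" is the full-reset sequence (the one the `echo` trick earlier in the thread sends). The helper name is made up for illustration.

```python
ESC = "\x1b"      # ASCII escape: hex 1b, decimal 27
RIS = ESC + "c"   # "reset to initial state" on VT100-style terminals


def reset_sequence_hex() -> str:
    """Return the bytes of the reset sequence as hex, for inspection."""
    return RIS.encode("ascii").hex(" ")


print(ord(ESC))              # 27
print(reset_sequence_hex())  # 1b 63
```

Writing `RIS` to a real terminal would reset it, so the sketch only prints the byte values.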

<ctrl-v> essentially means "insert the next key I press verbatim". So, like in Vim, <esc> would switch modes, unless you press <ctrl-v> first. Helpful if you're trying to insert a control character into a file, or echo a literal <ctrl-c> etc.


Is the ctrl-v handled by the terminal emulator, the shell, or somewhere in the kernel?


Whichever facility receives and processes the inbound keystrokes. In simple software that uses the terminal in canonical mode, the line discipline (usually in the kernel) handles the batching up of bytes into lines before passing them onto the software, and thus provides the line editing. In more complex software that puts the terminal into raw mode, it's then handled in that software; e.g., by libedit or readline or some bespoke terminal handling.
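The canonical/raw distinction can be observed from Python on a Unix-like system. This is a minimal sketch that opens a throwaway pseudo-terminal pair, so it does not touch your real terminal; it assumes the platform provides `pty` and `termios`, and the helper name is hypothetical.

```python
import os
import pty
import termios
import tty


def is_canonical(fd: int) -> bool:
    # Index 3 of the tcgetattr() result holds the local-mode flags (lflag).
    lflag = termios.tcgetattr(fd)[3]
    return bool(lflag & termios.ICANON)


# A private pty pair stands in for a real terminal.
master_fd, slave_fd = pty.openpty()
try:
    # A freshly opened pty normally starts in canonical (line-buffered) mode.
    before = is_canonical(slave_fd)
    # Roughly what readline-style programs do before handling keys themselves.
    tty.setraw(slave_fd)
    after = is_canonical(slave_fd)
finally:
    os.close(master_fd)
    os.close(slave_fd)

print("canonical before:", before)
print("canonical after setraw:", after)
```

Once `ICANON` is cleared, the line discipline stops batching bytes into lines, and keys such as ctrl-v arrive at the program as plain bytes for it to interpret.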


Why not just `clear` or `reset` terminal?


Wish I could remember why not :). I know now, but the muscle memory sticks.


This proposal is incomplete and has various bugs: https://sw.kovidgoyal.net/kitty/keyboard-protocol.html#fixte...


I agree that using the 8-bit CSI to eliminate the escape ambiguity would be great. The rest I'm a bit "meh" on. In particular I use C-S as the modifier for my terminal shortcuts, so if the terminal could detect it I'd be looking for yet another set of bindings that no terminal programs will ever use.


The only decent solution in my view would be a new, separate kind of terminal spec, maybe even completely focused on terminal emulators. It could make sense to handle input better (maybe even with keyup/keydown events) and maybe display images in a non-hacky way (it can already be done via w3m... kind of).

Changing the existing standards just won't work as it would break a ton of programs, although if I'm being honest - while I like the idea of such a terminal rework - I don't know if it can be done nowadays without it opening the flood gates by developers who just can't restrain themselves from adding so many features it might as well be a new browser engine.

And either way I don't think it's worth the effort as those input limitations only affect a few specific situations. I use hundreds of custom keybindings in Vim and so far the only obstacle I encountered was the `Tab` & `Ctrl-i` overlap. If I need to hold down more than one modifier I did something wrong anyway.
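The `Tab` and `Ctrl-i` overlap falls out of how classic terminals encode Ctrl combinations: the key's ASCII code is masked down to its low five bits. A minimal sketch, with a hypothetical helper name:

```python
def ctrl(key: str) -> int:
    """Byte a classic terminal sends for Ctrl+<key>: low five bits of the ASCII code."""
    return ord(key.upper()) & 0x1F


print(ctrl("i"), ord("\t"))    # Ctrl-i and Tab
print(ctrl("m"), ord("\r"))    # Ctrl-m and Return
print(ctrl("["), ord("\x1b"))  # Ctrl-[ and Escape
```

Each pair prints the same number twice, so a program reading the byte stream has no way to tell the two keys apart.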



Looks great, the only gripe I have with Kitty is how slow it is, especially on startup. I guess mostly because the combination of Python and tons of features has a negative effect in this area. In fact that was the only reason I switched away from it despite its great font ligature support; I use terminals a lot and so the constant delay when opening a new window was a bit much.

However I like those modernised protocols and it would be neat to have widespread support for it.


This! I would love a "purified" version of kitty which is just an xterm with font ligatures. And no silly features like underlining of links, etc. Also, an option to display bold text by using brighter colors (as God intended).


kitty startup time was slow because of a bug in GLFW, fixed a while ago. And you can have its startup time be 0 with --single-instance.


Using the latest kitty release 0.19.3 vs. st, both already loaded:

  ~ >=> time st ls
  real   0m0.048s
  user   0m0.041s
  sys    0m0.008s

  ~ >=> time kitty ls
  real   0m0.239s
  user   0m0.173s
  sys    0m0.059s

If kitty isn't nicely cached it takes over 500ms on my machine. Using your suggested flag it still takes twice as long.


You need to run kitty -1 to start kitty and leave it running. Then all future kitty -1 invocations will open new windows instantaneously.


I use kitty.

"time kitty -1 true" takes about 180ms on my machine. That's more than fast enough for me, but certainly slower than many other terminals.


It's still an order of magnitude slower during startup than other terminals such as xterm, rxvt or even mlterm. On my Intel laptop I can often see the GL context flashing before becoming the final background color, which is annoying. It also requires way more RAM.

kitty is a great terminal, but it's one example of fast not being also lightweight.


I can't reply to your other post, so: you need to run the other kitty instance also with -1. If you do that, you will get the same numbers I got.


I know, I'm actually using kitty regularly.


time kitty -1 false real 0.098 user 0.080 sys 0.017 maxmem 23 MB faults 0

time xterm false real 0.052 user 0.035 sys 0.000 maxmem 9 MB faults 1

Doesn't look like an order of magnitude to me.


xterm false 0.06s user 0.01s system 81% cpu 8Mb mem 0.090 total

mlterm -e false 0.08s user 0.02s system 84% cpu 13Mb mem 0.125 total

kitty -1 false 0.22s user 0.05s system 93% cpu 78Mb mem 0.290 total

(and yes, there's a kitty instance running already..)


Personally I feel like the fix for terminals is not to use them. Emacs for example has a lot of effort put into supporting various different terminals and efficiently displaying text. Some xterm-specific support is for many colours, mouse support, and the control codes to ask xterm to send more precise key escape sequences so TAB and C-i aren’t confused for example. It also has a gui which doesn’t need to pile on layers of hacks and works great so long as you don’t need to go through a terminal (eg running over ssh.) And yet it is still saddled with backwards compatibility from those days (some users expect TAB and C-i to do the same thing.)

To me it feels silly to put a lot of effort into supporting things better in terminal emulators because they are not as flexible as actual user interfaces and I think the main reasons we still have them are historical.

Aside: I’d also like to see a command line shell which runs outside the terminal emulator rather than inside it.


>Aside: I’d also like to see a command line shell which runs outside the terminal emulator rather than inside it.

The problem with such a shell would be that all the simple utilities and tools one likes to use in a shell will not work without a tty (or pty or fake-pty-conhost.exe). So the shell by itself would be pretty useless.

This theoretical shell and its family of utilities would all have to emulate terminal emulators by using something of a common library of wrappers around tty functionality. At that point you've re-invented cygwin for your OS of choice. And you're going to be running the same kinds of stuff you'd run in a real terminal emulator.

I'm not being snarky. I gave this kind of thing a lot of thought many years ago when I was thinking of native GUI unix-style shells on Windows.

Unless you have a different use case in mind for this shell ?


>>Aside: I’d also like to see a command line shell which runs outside the terminal emulator rather than inside it.

Plan 9 did something like that with its rio windows (that replace the terminal emulators) and its rc shell.

https://9p.io/wiki/plan9/using_rio/index.html

http://man.cat-v.org/plan_9/1/rio

> The problem with such a shell would be that all the simple utilities and tools one likes to use in a shell will not work without a tty (or pty or fake-pty-conhost.exe). So the shell by itself would be pretty useless.

Right -- no vi, no emacs, no readline, no curses -- nothing that uses cursor addressing. You have to start all over. Plan 9 can be seen as an experiment to find out how much you can simplify if you can abandon backwards compatibility.


I’m suggesting running outside the terminal by which I mean pulling up new terminal emulators as necessary rather than running inside a single terminal emulator and taking over the whole thing for each command (and allowing applications to get it into a bad state). If you want to run less on the output of one command or watch some thing change while still running other commands you either need to open a new second terminal emulator and go to the same place as the first or you need to use something like tmux (and either have some janky hack or be very careful about when you run tmux if you want to duplicate shells on a remote box for example).

I also think it’s not so useful anymore to just set up a pipeline and let it run off and do its work. Interactive shell usage is often mostly about editing or extending a small suffix of the previous command, so I think a shell should be optimised to do that well.


hmm, so much like the console subsystem on Windows where you can start shells pretty much arbitrarily, detached from any real tty.

Could be do-able in Unixy OSes, with a lot of work to detach the OS from the concept of a tty.

Interesting !


All the issues are just matters of backwards compatibility at this point. As far as I'm concerned, we'll always need a text based REPL oriented interface.


> Aside: I’d also like to see a command line shell which runs outside the terminal emulator rather than inside it.

So how would that work over something like ssh or even a serial terminal? And you'd have a shell containing a lot of very platform specific code. That really breaks everything. I'd prefer to see well thought out evolutions to the ANSI escape sequence standards that solve real problems. The proposal on the original post does that for keys. Other things like bracketed paste and support for more than a few colours have seen some adoption. Some good ideas get less attention like having a stack for titlebar changes. Some things that weren't a great idea security-wise have been dropped (key redefinitions, retrieving the title).
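For readers unfamiliar with bracketed paste: a program enables it by writing `ESC [ ? 2004 h`, after which the terminal wraps pasted text in `ESC [ 200 ~` and `ESC [ 201 ~`. The sketch below separates typed from pasted input in a captured string; the function name and sample input are made up, and a real implementation would work on a byte stream incrementally.

```python
ENABLE_BRACKETED_PASTE = "\x1b[?2004h"
DISABLE_BRACKETED_PASTE = "\x1b[?2004l"
PASTE_START = "\x1b[200~"
PASTE_END = "\x1b[201~"


def split_paste(stream: str) -> list[tuple[str, str]]:
    """Split input into (kind, text) chunks, kind being 'typed' or 'pasted'."""
    chunks = []
    pos = 0
    while pos < len(stream):
        start = stream.find(PASTE_START, pos)
        if start == -1:
            # No further paste markers: the remainder was typed.
            chunks.append(("typed", stream[pos:]))
            break
        if start > pos:
            chunks.append(("typed", stream[pos:start]))
        body = start + len(PASTE_START)
        end = stream.find(PASTE_END, body)
        if end == -1:
            # Unterminated paste: treat everything left as pasted text.
            chunks.append(("pasted", stream[body:]))
            break
        chunks.append(("pasted", stream[body:end]))
        pos = end + len(PASTE_END)
    return chunks


sample = "ls " + PASTE_START + "rm -rf /\n" + PASTE_END + "x"
print(split_paste(sample))
```

Knowing which bytes were pasted lets a shell or editor avoid executing a pasted newline as if the user had pressed Return.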

One of the problems is that the concept behind termcap and terminfo doesn't really scale or allow for innovations. As a user of rxvt-unicode I often suffer the frustration of it being unrecognised on bare OS installs. But I respect the fact that it doesn't just claim to be xterm while emulating it imperfectly like many.


Well no one uses a serial terminal for much so that can be disregarded. For ssh, you can just stop pretending that it is just like any other program and have actual integration with it (if you still care about serial terminals, just make the mechanism that understands ssh generic). On the remote host you either need your shell executable to run in some “remote” mode like rsync does or you need your shell to be able to generate appropriate bash commands.

Take eshell for an example: it connects processes together in emacs and can natively support navigating to places on remote hosts, opening files there, or running (remote) shell commands.

I sympathise with you on the pain of terminfo+ssh where the remote host doesn’t have the appropriate files. But I don’t think that further piling on “standards” to terminal escape sequences is a long-term solution to terminal woes.


You can also learn to use the terminal. This can clear many of the problems that you would otherwise have by thinking that the terminal is just another windows/macos application.


I don’t know how to respond to this as you seem to just think I’m an idiot who doesn’t know how to use computers.

The problem with using a terminal is that control of it is distributed between a few things the user can control and the escape sequences (or just output) produced by the shell and any processes that run in it. These may end up conflicting with each other leaving your terminal in a bad state or you may just get broken output (ever tried piping pv something | ... | less?). One way to regain some control is with something like tmux but this can become unwieldy (and heaven forfend it sees a multibyte Unicode character—terminal emulators don’t really have a way to communicate with applications about how wide a character is going to be when drawn)
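The character-width problem can be sketched with Python's `unicodedata` module. This is a deliberately simplified guess at how many columns a string occupies, in the spirit of `wcwidth`; the function names are hypothetical, and a given terminal emulator may well disagree with the result, which is the issue being described.

```python
import unicodedata


def cell_width(ch: str) -> int:
    """Rough guess at how many terminal columns one code point occupies."""
    if unicodedata.combining(ch):
        return 0  # combining marks are drawn on top of the previous character
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2  # wide and fullwidth characters usually take two columns
    return 1


def display_width(text: str) -> int:
    return sum(cell_width(ch) for ch in text)


for sample in ("abc", "日本", "e\u0301"):
    print(repr(sample), len(sample), display_width(sample))
```

The application has to make this guess on its own, because the byte stream gives it no way to ask the terminal what width it will actually use.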

I think a few things get conflated because there are few text-centric or command-line user interfaces. I put it to you that it is possible to have good composable text-centric user interfaces that don’t rely on pretending to be a VT100 or an ancient single-byte-stream-with-control-sequences protocol.


The problem with GUIs (I assume you mean them by "actual user interfaces") is that they rely too much on the user's memory, and that they do not scale, especially for the cloud systems.

I have many text notes which record various commands. I have them because it is so easy -- once you figured out how to do something, get the history and copy relevant stuff. At work, we share the commands on Slack, put them into Wiki, paste them in the docs, and so on. If the commands grow too complex, they turn into scripts -- after all, scripts are just a file rename away.

I don't think I could do this efficiently if I had to use GUIs. Theoretically, I could write manuals with dozens of steps[0], but this is a much more significant effort, and they'll likely go stale anyway.

[0] https://docs.microsoft.com/en-us/iis/application-frameworks/...


You are confusing command line and terminal here, I believe. Note how OP was specifically using emacs as an example. From your "store and share commands" point of view so called TUIs (like emacs in terminal) are as opaque as GUIs (emacs in X11).


Note that nothing stops a GUI program from also supporting programmability and scripting. While it's true that most of them don't, you'll also find that a surprising number do.

For example, the entire MS Office Suite is programmable - you can write VB Script (or lately, JS?) and achieve most if not all of the functionality supported in the GUI.

Rather more well known on this front, Emacs and many other Lisp systems have always supported the same kind of programmability as a shell from within the GUI environment - both in the form of a simple REPL and more advanced GUI-command interaction (e.g. executing the current selection as elisp code, executing a command with the current selection as input etc).


I'd also really like to just be able to interact with a keyboard like other apps i.e. receive keyup events, reading multiple keys at once etc. Kitty has something like that but popular libraries don't support it. I've had many ideas for terminals games and apps but didn't continue because of these limitations.


The conflation of various control characters, i.e. `Ctrl-i` and `Ctrl-I`, was one of the primary reasons that convinced me to try using Emacs outside of the terminal. I'm glad I switched. Now I can make much more ergonomic keybindings.
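For comparison, the article's proposal reports modified keys as `CSI <codepoint> ; <modifier> u`, with the modifier being 1 plus a bitmask (shift 1, alt 2, ctrl 4). The sketch below is one reading of that scheme, not a reference implementation; the function and constant names are hypothetical, and exactly how shifted letters are reported is left to the spec.

```python
SHIFT, ALT, CTRL = 1, 2, 4


def csi_u(key: str, mods: int = 0) -> str:
    """Encode one key press as ESC [ <codepoint> ; <1 + modifier bits> u."""
    return f"\x1b[{ord(key)};{1 + mods}u"


# Legacy encoding: Ctrl-i collapses to the same byte as Tab.
legacy_ctrl_i = chr(ord("I") & 0x1F)
print(repr(legacy_ctrl_i))

# CSI u style: the key and its modifiers stay distinguishable.
print(repr(csi_u("i", CTRL)))
print(repr(csi_u("i", CTRL | ALT)))
```

With an encoding like this, an editor running in a terminal could bind `Ctrl-i` and `Tab` separately, as the GUI version already can.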


Indeed, terminals have to be fixed. Even DOS/Windows text mode (I don't mean the shell) always felt way better than a Linux terminal.

Perhaps it finally is time to come up with an entirely new terminal technology re-engineered from [almost] scratch based on today's achievements and needs.

For a period of time near 2000 people felt like terminals were going to be abandoned in favor of GUIs so changing them didn't feel reasonable. But now it's clear that was wrong.


> Perhaps it finally is time to come up with an entirely new terminal technology re-engineered from [almost] scratch based on today achievements and needs.

What do you mean, exactly, by that? Do you think that programs designed to work in current terminals (e.g., vim) should stop working with the new terminal technology? That seems a bit tough for progressive adoption. Otherwise, how do you think that these programs should be changed? Or wrapped in some intermediary "terminal emulator" so that they can run unchanged in the newer terminals?


Why not?

Also, the new terminal technology doesn't necessarily have to be fundamentally incompatible with the legacy technology. It probably can be made reasonably easy to compile classic apps for any terminal technology. Many of them are compatible (although not necessarily 100% compatible) with both Linux and DOS/Windows already.


How does this interact with alternative and personalised keyboard layouts and with sticky keys for accessibility?

I don't have a proper overview of how these things work, but I'm a bit worried that there may be multiple layers (reprogrammable keyboard, kernel, X server, terminal, editor) trying to solve similar problems in similar ways and they might end up interfering with each other.


I didn't fully understand the spec; it got hard to read towards the end. Is this supported by curses/terminfo?


I have a few mobile devices with limited keyboards: smartphones with a keyboard, a Chromebook. They lack some keys present on a standard PC keyboard. Will this new input library work with a virtual or limited keyboard? How will I be able to input a missing key or control combination?


Excellent satire!


You can work directly at the keycode level, and watch the key up/down events. Many programs do, especially games.


That doesn't work in a terminal. Terminals can operate over many different channels which don't pass through raw keyboard events (like SSH), or which don't have them at all (like serial links).



