> More often than not a problem that looks simple and has insane complexity once...

maccard · on May 22, 2021

The number 1 assumption is that all text will arrive in the format you're expecting, whether that's an encoding (everyone in the US speaks English, but UTF-8 is dangerously compatible with ASCII) or a character range (do we block input of certain characters?).

Presentation is huge too. How do we render a pasted newline in a single-line block? Where do you put the cursor if someone pastes a block of Arabic text into your multiline input? What do you display if your font doesn't have the character they've copied and pasted from another source?

There's also just the basic stuff of "every keypress/chord equals a new character" - I have an app that I use every day that renders Ctrl + Backspace as a character rather than deleting the word.

Then there are input considerations: macOS uses Alt for per-word navigation, Windows uses Ctrl. Do you support the OS input conventions, or a popular editor's bindings (Emacs)? And how do you differentiate between an input to display and an input to treat as navigation? What about mobile? Most keyboards support input gestures and autocomplete suggestions. How do you know to modify your current context rather than append to your input?

Finally, what about non-renderable input? If you're modifying a rich text string, how do you escape from a block, or how do you insert between two separate blocks?

kayodelycaon · on May 22, 2021

> utf-8 is dangerously compatible with ASCII

This. My name is Kayodé Lycaon. Note the é... how many places don’t support it? Some people consider names to be sacred and changing the spelling is more than a little offensive. Even UTF-8 can’t represent all characters in use. I believe there are Japanese names that can’t be represented by its character set.

I get it’s technically difficult but so many places treat people who are different as edge cases to be optimized away.

amake · on May 22, 2021

UTF-8 can represent any Unicode code point. If there is a character that it can’t represent, then that’s because that character is not encoded in Unicode; it has nothing to do with UTF-8 itself.
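
A rough illustration of that point, using only Python's built-in string encoding (the variable names are just for this sketch): ASCII-only text is byte-identical in UTF-8, which is why ASCII-only code appears to work until an é shows up, and the encoding itself reaches all the way to the top of the Unicode range.

```python
name = "Kayodé Lycaon"

# ASCII-only text produces exactly the same bytes in UTF-8 and ASCII,
# so software that assumes ASCII seems fine until a non-ASCII character arrives.
ascii_prefix_matches = "Kayod".encode("utf-8") == "Kayod".encode("ascii")

# é (U+00E9) is outside ASCII and takes two bytes in UTF-8.
e_bytes = "é".encode("utf-8")

# A strict ASCII encoder rejects the name outright.
try:
    name.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# UTF-8 itself is not the limit: the highest code point, U+10FFFF,
# encodes to four bytes and round-trips cleanly.
top = chr(0x10FFFF).encode("utf-8")
utf8 = name.encode("utf-8")

print(ascii_prefix_matches, e_bytes, ascii_ok, top)
```

So when a name can't be stored, the gap is either in the software's handling or in what Unicode has encoded, not in UTF-8.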

kayodelycaon · on May 22, 2021

Not being in Unicode is what I meant.

microtherion · on May 22, 2021

One of the most ridiculous places to insist on ASCII recently was... the Unicode consortium: https://twitter.com/Laserhedvig/status/1395338394713079810

exporectomy · on May 21, 2021

Perhaps the assumption by Unicode that we need emoji modifiers instead of just more emojis. All ~2000 valid combinations are enumerated anyway and many require different images so they're effectively different isolated characters.
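
For anyone who hasn't looked at how modifiers work, here is a small sketch in plain Python (names are illustrative only): a skin-toned emoji is stored as a base emoji followed by a separate modifier code point, even though most fonts draw the pair as one glyph.

```python
wave = "\U0001F44B"   # waving hand
tone = "\U0001F3FD"   # one of the five skin tone modifiers
combined = wave + tone

# Two code points, although a reader sees a single image.
code_points = [f"U+{ord(c):04X}" for c in combined]
num_code_points = len(combined)

# Each of these code points takes four bytes in UTF-8.
utf8_len = len(combined.encode("utf-8"))

print(code_points, num_code_points, utf8_len)
```

Anything that counts, truncates, or moves a cursor by code point can split the pair, which is part of why the combination approach leaks into text editing.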

madhadron · on May 21, 2021

That text is a sequence that we edit by having a cursor into it as opposed to a specialized form of graphics, maybe? We've inherited the controls of a typewriter (which provided a very simple graphical model for a narrow range of languages), but maybe they aren't what we need?