Windows probably defaults to Latin-1.

bryanrasmussen · on Oct 20, 2021

The default Windows encoding is UTF-16; a long time ago it was Windows-1252: https://en.wikipedia.org/wiki/Windows-1252

deathanatos · on Oct 21, 2021

Given the frequency with which Windows-12* mojibake occurs, either there are a number of holdouts still using Windows 98 SE, or there are a good number of paths in Windows that still use the non-Unicode encodings.
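For readers unfamiliar with how this kind of mojibake arises, here is a minimal Python sketch. The helper name `mojibake` and the sample character are illustrative assumptions, not anything from the thread's original error; the codec names are Python's standard ones.

```python
def mojibake(text, written_as, read_as):
    """Encode text with one encoding, then (wrongly) decode it with another.

    Hypothetical helper for illustration only. errors="replace" keeps the
    sketch from raising when a byte is unassigned in the reading code page.
    """
    return text.encode(written_as).decode(read_as, errors="replace")


# A file written under a Central European locale, read under a Western one.
wrong_codepage = mojibake("ł", "cp1250", "cp1252")

# UTF-8 bytes read by a code path that still assumes Windows-1252.
utf8_as_ansi = mojibake("ł", "utf-8", "cp1252")

print(repr(wrong_codepage), repr(utf8_as_ansi))
```

In both cases the bytes themselves are intact; only the assumed encoding differs, which is why the damage is usually reversible once the original encoding is known.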

GoblinSlayer · on Oct 21, 2021

Windows still supports the Windows 98 API, and it's more natural to use from some languages like C++. No change is planned there. The Windows 98 API is also closer to the Unix API, which can incentivize programmers to use the same approach on Windows and Unix.

account42 · on Oct 21, 2021

All Windows needed to do was support setting that API to UTF-8. It's not like it doesn't already support multi-byte encodings. It's not like they even needed to assign an ID for UTF-8 or implement the conversions; those existed already. All they needed to do was allow programs to set their codepage to UTF-8. This finally became possible two years ago. Better late than never, I guess.

hprotagonist · on Oct 20, 2021

or CP-1251, in some locations.

deathanatos · on Oct 21, 2021

There are a good number of them, all depending on locale.

In this case, I'd guess CP-1250, since the byte 0xb3 from the error decodes to "ł", which appears in the name, in that encoding. (But not in CP-1251 or CP-1252.)

if you want to see how to arrive there: https://news.ycombinator.com/item?id=28939960
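The CP-1250 guess can also be checked directly. The following is a minimal Python sketch, not the method from the linked comment: it decodes the single byte 0xb3 under each candidate code page and compares the results. The helper name `decode_candidates` and the list of candidates are assumptions made for illustration.

```python
def decode_candidates(byte_value, codecs=("cp1250", "cp1251", "cp1252", "latin-1")):
    """Decode one byte under several single-byte encodings.

    Hypothetical helper for illustration. Returns a dict mapping the codec
    name to the decoded character, or None if the byte is unassigned there.
    """
    raw = bytes([byte_value])
    results = {}
    for name in codecs:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


candidates = decode_candidates(0xB3)
for name, char in candidates.items():
    print(f"{name}: {char!r}")

# Keep only the code pages in which the byte decodes to the expected letter.
matches = [name for name, char in candidates.items() if char == "ł"]
print(matches)
```

Only CP-1250 among these candidates yields "ł"; CP-1251 gives the Cyrillic "і", and CP-1252 and Latin-1 both give "³". Other encodings outside this list (ISO-8859-2, for instance) also map 0xb3 to "ł", so a single byte narrows the field rather than proving the encoding.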