
> So much so that many official docs assert this to be true even though it technically isn't.

Do you have any links for that? I've been working with winapi recently and have had a hell of a time getting some clear concrete statements about exactly what encoding (if any) is used in file paths.




https://docs.microsoft.com/en-us/windows/desktop/FileIO/nami...

> the file system treats path and file names as an opaque sequence of WCHARs.

In essence, I think you should use UTF-16 encoded strings when creating file paths. However, when reading them you can't assume any encoding (aside from the special characters mentioned in that article). For accessing the filesystem, just treat paths as an opaque blob of data. When displaying a name to the user, assume UTF-16 encoding but handle any decoding errors (e.g. by using replacement characters where necessary).
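
A rough sketch of that split, in Python for brevity. The helper name `display_name` and the sample code units are made up for illustration and are not a Windows API; the sketch assumes the name arrives as a list of 16-bit code units and that `errors="replace"` is an acceptable way to render ill-formed input.

```python
import struct

def display_name(code_units):
    """Best-effort decode of 16-bit code units, for display only."""
    # Pack the code units as little-endian 16-bit values.
    raw = struct.pack(f"<{len(code_units)}H", *code_units)
    # errors="replace" substitutes U+FFFD for ill-formed sequences
    # (such as an unpaired surrogate) instead of raising.
    return raw.decode("utf-16-le", errors="replace")

# Hypothetical name containing an unpaired high surrogate (0xD800).
name = [0x0061, 0xD800, 0x0062]

# Lossy string, suitable for showing to the user only.
shown = display_name(name)

# Keep `name` itself (the opaque code units) for any later filesystem
# access; `shown` may not round-trip back to the same sequence.
```

The point is that the decoded string is a one-way view: anything that has to reach the filesystem again should be built from the original code units, not from the display string.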


Oh, I meant, did you have any links from official docs that said UTF-16 was used?

Your advice is fine, but when the rest of the world is UTF-8 (including the regex engine), things become quite a bit trickier!


Oh I see. UTF-16 is the preferred encoding for all new applications: https://docs.microsoft.com/en-us/windows/desktop/intl/unicod...

Basically, in Windows land, Unicode means UTF-16 unless code pages are mentioned: https://docs.microsoft.com/en-us/windows/desktop/intl/code-p...



