tialaramex · on Oct 20, 2022

> Thanks for the bug report. I filed it for you: https://issues.dlang.org/show_bug.cgi?id=23405

#23405 was resolved as fixed a week ago. It isn't fixed. I guess at least I didn't waste my time filing the bug.

tialaramex · on Oct 12, 2022

The problem does need solving, but it only needs solving once. D's approach means the programmer needs to make this decision over, and over, and over again, everywhere they have an alleged "string". Or they must track somehow (by convention perhaps?) whether string A is or is not "really" a string.

If you have type safety, you can make the choice just once.

Rust's String::from_{utf8,utf16}_lossy turn valid UTF-8/16 sequences into strings, and "fix" invalid ones with U+FFFD.

Meanwhile String::from_{utf8,utf16} attempt the same but with an Err instead of replacement on failure if that's what the programmer wants.
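To make the two flavours concrete, here is a minimal sketch using only the standard library. The helper names (`decode_lossy`, `decode_strict`, and the UTF-16 pair) are made up for illustration; only the `String::from_*` calls are real API.

```rust
use std::string::{FromUtf16Error, FromUtf8Error};

// Hypothetical helper: always yields a String, replacing each
// invalid sequence with U+FFFD.
fn decode_lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}

// Hypothetical helper: yields a String only if the bytes are valid
// UTF-8, otherwise an Err the caller has to handle.
fn decode_strict(bytes: &[u8]) -> Result<String, FromUtf8Error> {
    String::from_utf8(bytes.to_vec())
}

// The UTF-16 counterparts take 16-bit code units instead of bytes.
fn decode_utf16_lossy(units: &[u16]) -> String {
    String::from_utf16_lossy(units)
}

fn decode_utf16_strict(units: &[u16]) -> Result<String, FromUtf16Error> {
    String::from_utf16(units)
}
```

Either way the decision is made at the call site that first touches the raw data, and everything downstream just sees a String.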

Imagine if all D's numeric functions took the same attitude as its string functions, insisting on being passed arrays of bytes so that each function can parse those bytes, decide if this is actually a 16-bit unsigned integer (for example), and if so do what's expected, otherwise perhaps return an error. We'd spot right away that this was not a practical design.

D's choices here are conventional, but I've come to expect a lot more and so I'm disappointed when I can't have it.

WalterBright · on Oct 12, 2022

I don't see the difference here. D offers the same options when processing a string.

tialaramex · on Oct 12, 2022

That's surely the whole point: every D std.string function is also a string decoder, with varying features. But a suitably decoded "string" is still just the same type, whereas Rust has a distinct type for actual UTF-8 strings.

wtetzner · on Oct 12, 2022

I think the point is that you run the Unicode validation once on your [u8] array, which gives you a &str (or String for the lossy variants). From then on, you know you have valid Unicode and don't need to keep checking.
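A rough sketch of that "validate once at the boundary" shape, assuming safe Rust only (unsafe code can bypass the check). The function names are hypothetical; `std::str::from_utf8` is the real standard library call.

```rust
use std::str::Utf8Error;

// Boundary: the only place raw bytes are inspected. On success the
// returned &str borrows the same bytes, with no copy.
fn parse_input(raw: &[u8]) -> Result<&str, Utf8Error> {
    std::str::from_utf8(raw)
}

// Downstream functions take &str, so they cannot be handed
// unvalidated bytes and have no error case for bad encoding.
fn count_chars(s: &str) -> usize {
    s.chars().count()
}

fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}
```

Passing a `&[u8]` straight to `count_chars` is a compile error, so "is this really a string?" is tracked by the type rather than by convention.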