acehreli on Oct 11, 2022 | on: Introduction to Dlang [video]

D's string is not text by itself because it is an array of UTF-8 code units. However, we have this infamous feature called auto-decoding in the standard library, which presents strings to range algorithms as sequences of Unicode code points (dchar).
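To make the distinction concrete, here is a minimal sketch of the difference between what the language sees and what the standard library's range primitives see. The helper names codeUnitCount and codePointCount are made up for illustration; only .length, front, and walkLength come from D and Phobos.

```d
import std.range.primitives : front, walkLength;

// Hypothetical helpers, named only for this illustration.
size_t codeUnitCount(string s)  { return s.length; }     // counts UTF-8 code units (char)
size_t codePointCount(string s) { return s.walkLength; } // range walk, auto-decoded to dchar

void main()
{
    string s = "\u011F"; // precomposed ğ, stored as two UTF-8 code units

    assert(codeUnitCount(s) == 2);
    assert(codePointCount(s) == 1);

    // Indexing yields a code unit; the range primitive front yields a code point.
    static assert(is(typeof(s[0]) == immutable(char)));
    static assert(is(typeof(s.front) == dchar));
}
```

The same string therefore has two different "lengths" depending on whether you ask the language or the range API.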

On the other hand, D's dstrings are more like text because they are not only UTF-32 but also randomly accessible by code point. (D does not address multiple representations of graphemes at the language level. For example, at the language level, the precomposed ğ is different from "g followed by combining breve", but there are std.uni and std.utf modules that help.)
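A small sketch of both points, assuming the usual Phobos functions std.uni.byGrapheme and std.uni.normalize; graphemeCount and sameText are hypothetical helper names, not part of any library.

```d
import std.range.primitives : walkLength;
import std.uni : byGrapheme, normalize, NFC;

// Hypothetical helpers, named only for this illustration.
size_t graphemeCount(dstring s) { return s.byGrapheme.walkLength; }

// Compares two strings after normalizing both to NFC.
bool sameText(dstring a, dstring b)
{
    return normalize!NFC(a) == normalize!NFC(b);
}

void main()
{
    dstring precomposed = "\u011F"d;  // ğ as a single code point
    dstring combining   = "g\u0306"d; // g followed by combining breve

    assert(precomposed.length == 1);
    assert(combining.length == 2);
    assert(combining[1] == '\u0306');   // constant-time indexing by code point

    assert(precomposed != combining);   // the language compares code points only
    assert(graphemeCount(combining) == 1);
    assert(sameText(precomposed, combining));
}
```

So a dstring gives random access to code points, but treating the two spellings of ğ as equal still takes a library call.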

tialaramex on Oct 12, 2022

> D's string is not text by itself because it is an array of UTF-8 code units.

Bytes. It's an array of bytes. D's char type isn't actually restricted to UTF-8 code units; for example, char x = '\xFF'; works just fine even though 0xFF is never a valid UTF-8 code unit.
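A minimal sketch of that point: the compiler accepts the value, and the invalid data is only caught if something validates or decodes it. isValidUtf8 is a hypothetical wrapper around std.utf.validate, written here only for illustration.

```d
import std.utf : validate, UTFException;

// Hypothetical wrapper: true if s is well-formed UTF-8.
bool isValidUtf8(const(char)[] s)
{
    try
    {
        validate(s); // throws UTFException on malformed input
        return true;
    }
    catch (UTFException e)
    {
        return false;
    }
}

void main()
{
    char x = '\xFF';         // compiles: the type does not enforce UTF-8
    assert(x == char.init);  // 0xFF is in fact char's default value

    char[] raw = [x];
    assert(!isValidUtf8(raw));
    assert(isValidUtf8("ok"));
}
```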

acehreli on Oct 12, 2022

I see what you mean, but an array of bytes is something else in D: byte[].
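A short sketch of how the types differ, assuming std.string.representation as the way to view a string's storage as raw bytes; rawBytes is a hypothetical helper name used only here.

```d
import std.string : representation;

// string is an alias for an array of immutable char, not of byte or ubyte.
static assert(is(string == immutable(char)[]));
static assert(!is(char == byte) && !is(char == ubyte));

// All three are one byte wide, but byte is signed.
static assert(char.sizeof == 1 && byte.sizeof == 1 && ubyte.sizeof == 1);
static assert(byte.min == -128);

// Hypothetical helper: view a string's storage as raw unsigned bytes.
immutable(ubyte)[] rawBytes(string s)
{
    return s.representation;
}

void main()
{
    immutable ubyte[] expected = [0xC4, 0x9F]; // UTF-8 encoding of U+011F
    assert(rawBytes("\u011F") == expected);
}
```

Both readings in this thread fit the code: the storage is plain bytes, while the type system keeps char[] distinct from byte[] and ubyte[].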
